Thread: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

[RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

"MauMau"

Date:

04 July 2014, 13:55:48

Hello,

My customer reported a strange connection hang problem.  He and I couldn't 
reproduce it.  I haven't been able to understand the cause, but I can think 
of one hypothesis.  Could you give me your opinions on whether my hypothesis 
is correct, and a direction on how to fix the problem?  I'm willing to 
submit a patch if necessary.


[Problem]
The customer is using synchronous streaming replication with PostgreSQL 
9.2.8.  The cluster consists of two nodes.

He performed an archive recovery test like this:

1. Take a base backup.  At that time, some notable settings in 
postgresql.conf are:
synchronous_standby_names = 'node2'
autovacuum = on
# synchronous_commit is commented out, so it's on by default

2. Perform some update operations.  I don't know which.

3. Shut down the primary and promote the standby.

4. Shut down the new primary.

5. Perform archive recovery.  That is, restore the base backup, create 
recovery.conf, and do pg_ctl start.

6. Immediately after the archive recovery is complete, connect to the 
database server and perform some queries to check user data.

Steps 5 and 6 are done by a recovery script.

However, the connection attempt in step 6 got stuck for 12 hours, and the 
test was canceled.  The stack trace was:

#0  0x0000003f4badf258 in poll () from /lib64/libc.so.6
#1  0x0000000000619b94 in WaitLatchOrSocket ()
#2  0x0000000000640c4c in SyncRepWaitForLSN ()
#3  0x0000000000491c18 in RecordTransactionCommit ()
#4  0x0000000000491d98 in CommitTransaction ()
#5  0x0000000000493135 in CommitTransactionCommand ()
#6  0x000000000074938a in InitPostgres ()
#7  0x000000000066ddd7 in PostgresMain ()
#8  0x0000000000627d81 in PostmasterMain ()
#9  0x00000000005c4803 in main ()

The connection attempt is waiting for a reply from the standby.  This is 
strange, because we didn't anticipate that the connection establishment (and 
subsequent SELECT queries) would update something and write some WAL.  The 
doc says:

http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION

"When requesting synchronous replication, each commit of a write transaction 
will wait until confirmation is received that the commit has been written to 
the transaction log on disk of both the primary and standby server.
...
Read only transactions and transaction rollbacks need not wait for replies 
from standby servers. Subtransaction commits do not wait for responses from 
standby servers, only top-level commits."


[Hypothesis]
Why does the connection processing emit WAL?

Probably, it did page-at-a-time vacuum during access to pg_database and 
pg_authid for client authentication.  src/backend/access/heap/README.HOT 
describes:

"Effectively, space reclamation happens during tuple retrieval when the
page is nearly full (<10% free) and a buffer cleanup lock can be
acquired.  This means that UPDATE, DELETE, and SELECT can trigger space
reclamation, but often not during INSERT ... VALUES because it does
not retrieve a row."

But the customer could not reproduce the problem when he performed the same 
archive recovery from the same base backup again.  Why?  I guess the 
autovacuum daemon vacuumed the system catalogs before he attempted to 
connect to the database.

Is this correct?
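
The heuristic quoted from README.HOT can be illustrated with a minimal sketch.  This is a simplified model, not PostgreSQL source: the names model_would_prune and MODEL_PAGE_SIZE are hypothetical, the 8192-byte page size is an assumed default, and the real check likely involves more conditions than free space alone (for example, whether the page holds anything prunable).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the README.HOT heuristic.
 * Not PostgreSQL code; the page size is an assumed default. */
#define MODEL_PAGE_SIZE 8192u

/* Returns true when a tuple fetch would attempt to prune the page:
 * the page is nearly full (<10% free) and the buffer cleanup lock
 * can be acquired.  A successful prune writes a WAL record. */
bool
model_would_prune(size_t free_bytes, bool cleanup_lock_available)
{
    size_t threshold = MODEL_PAGE_SIZE / 10;    /* 819 bytes */

    return free_bytes < threshold && cleanup_lock_available;
}
```

Under this model, a catalog page with 500 free bytes is pruned by a plain read whenever the cleanup lock is available, which is the kind of WAL-writing side effect suspected here.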


[How to fix]
Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c 
synchronous_standby_names='" to pg_ctl start in the recovery script would 
prevent the problem.

But isn't there anything to fix in PostgreSQL?  I think the doc needs 
improvement so that users won't misunderstand that only write transactions 
would block at commit.

Do you think something else should be done?  I guess pg_basebackup, 
pg_isready, and PQping() called in pg_ctl -w start/restart would block 
likewise, and I'm afraid users don't anticipate it.  pg_upgrade appears to 
set synchronous_commit to local when starting the database server.

Regards
MauMau

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Amit Kapila

Date:

06 July 2014, 06:37:04

On Fri, Jul 4, 2014 at 7:29 PM, MauMau <maumau307@gmail.com> wrote:
>
> Hello,
>
> "When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server.
> ...
> Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits."
>
>
> [Hypothesis]
> Why does the connection processing emit WAL?
>
> Probably, it did page-at-a-time vacuum during access to pg_database and pg_authid for client authentication. src/backend/access/heap/README.HOT describes:

I agree with your analysis that it can happen during connection
attempt.

> But the customer could not reproduce the problem when he performed the same archive recovery from the same base backup again. Why? I guess the autovacuum daemon vacuumed the system catalogs before he attempted to connect to the database.
>
> Is this correct?

One way to confirm could be to perform the archive recovery by
disabling autovacuum.

>
> [How to fix]
> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c synchronous_standby_names='" to pg_ctl start in the recovery script would prevent the problem.
>
> But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit.

I also think at the very least we should update docs even if we
don't have any solution for this case.

Another thing which I am wondering about is can't the same happen
even for Read Only transaction (in case someone does Select which
prunes the page).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Rajeev rastogi

Date:

07 July 2014, 04:20:26

On 04 July 2014 19:29, MauMau Wrote:

> [How to fix]
> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c
> synchronous_standby_names='" to pg_ctl start in the recovery script
> would prevent the problem.
>
> But isn't there anything to fix in PostgreSQL?  I think the doc needs
> improvement so that users won't misunderstand that only write
> transactions would block at commit.

As of now there is no solution for this in PostgreSQL but I had submitted a patch "Standalone synchronous master" in
9.4 2014-01 CommitFest, which was rejected because of some issues. This patch was meant to degrade the synchronous
level of master, if all synchronous standbys are down.
I plan to resubmit this with better design sometime in 9.5.

Thanks and Regards,
Kumar Rajeev Rastogi

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Andres Freund

Date:

07 July 2014, 05:14:45

On 2014-07-07 04:20:12 +0000, Rajeev rastogi wrote:
> 
> On 04 July 2014 19:29, MauMau Wrote:
> 
> > [How to fix]
> > Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c
> > synchronous_standby_names='" to pg_ctl start in the recovery script
> > would prevent the problem.
> > 
> > But isn't there anything to fix in PostgreSQL?  I think the doc needs
> > improvement so that users won't misunderstand that only write
> > transactions would block at commit.
> 
> As of now there is no solution for this in PostgreSQL but I had submitted a patch "Standalone synchronous master" in
> 9.4 2014-01 CommitFest, which was rejected because of some issues. This patch was meant to degrade the synchronous
> level of master, if all synchronous standbys are down.
> I plan to resubmit this with better design sometime in 9.5.

That seems to be more or less orthogonal to the issue at hand. The problem
here is that a readonly command led to a wait. And even worse it was a
command the user had no influence over.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Andres Freund

Date:

07 July 2014, 07:15:01

Hi,

On 2014-07-04 22:59:15 +0900, MauMau wrote:
> My customer reported a strange connection hang problem.  He and I couldn't
> reproduce it.  I haven't been able to understand the cause, but I can think
> of one hypothesis.  Could you give me your opinions on whether my hypothesis
> is correct, and a direction on how to fix the problem?  I'm willing to
> submit a patch if necessary.

> The connection attempt is waiting for a reply from the standby.  This is
> strange, because we didn't anticipate that the connection establishment (and
> subsequent SELECT queries) would update something and write some WAL.  The
> doc says:
> 
> http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION
> 
> "When requesting synchronous replication, each commit of a write transaction
> will wait until confirmation is received that the commit has been written to
> the transaction log on disk of both the primary and standby server.
> ...
> Read only transactions and transaction rollbacks need not wait for replies
> from standby servers. Subtransaction commits do not wait for responses from
> standby servers, only top-level commits."
> 
> 
> [Hypothesis]
> Why does the connection processing emit WAL?
> 
> Probably, it did page-at-a-time vacuum during access to pg_database and
> pg_authid for client authentication.  src/backend/access/heap/README.HOT
> describes:

> [How to fix]
> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c
> synchronous_standby_names='" to pg_ctl start in the recovery script would
> prevent the problem.

> But isn't there anything to fix in PostgreSQL?  I think the doc needs
> improvement so that users won't misunderstand that only write transactions
> would block at commit.

I think we should rework RecordTransactionCommit() to only wait for the
standby if `markXidCommitted' and not if `wrote_xlog'. There really
isn't a reason to make a readonly transaction's commit wait just because
it did some hot pruning.

Greetings,

Andres Freund
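
The proposal in the message above can be sketched as two decision rules.  This is a minimal model under stated assumptions, not the actual RecordTransactionCommit() code: the struct and function names are hypothetical, and the "current" rule is reduced to what this thread describes (any WAL written makes the commit wait).

```c
#include <stdbool.h>

/* Hypothetical model of a transaction's state at commit time.
 * Not PostgreSQL code. */
typedef struct ModelXact
{
    bool wrote_xlog;            /* emitted at least one WAL record */
    bool mark_xid_committed;    /* was assigned an XID */
} ModelXact;

/* Behavior reported in this thread: any WAL written means the
 * commit waits for the flush and for the synchronous standby. */
bool
model_waits_current(ModelXact x)
{
    return x.wrote_xlog;
}

/* Proposal above: wait only when an XID was assigned. */
bool
model_waits_proposed(ModelXact x)
{
    return x.mark_xid_committed;
}
```

For the login that pruned a catalog page (WAL written, no XID), the first rule waits and the second does not.  A transaction that wrote only a sequence WAL record via nextval() has the same two flag values, so the proposed rule alone would stop waiting for it as well; the replies below take up that case.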

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Fujii Masao

Date:

07 July 2014, 11:10:30

On Mon, Jul 7, 2014 at 4:14 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2014-07-04 22:59:15 +0900, MauMau wrote:
>> My customer reported a strange connection hang problem.  He and I couldn't
>> reproduce it.  I haven't been able to understand the cause, but I can think
>> of one hypothesis.  Could you give me your opinions on whether my hypothesis
>> is correct, and a direction on how to fix the problem?  I'm willing to
>> submit a patch if necessary.
>
>> The connection attempt is waiting for a reply from the standby.  This is
>> strange, because we didn't anticipate that the connection establishment (and
>> subsequent SELECT queries) would update something and write some WAL.  The
>> doc says:
>>
>> http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION
>>
>> "When requesting synchronous replication, each commit of a write transaction
>> will wait until confirmation is received that the commit has been written to
>> the transaction log on disk of both the primary and standby server.
>> ...
>> Read only transactions and transaction rollbacks need not wait for replies
>> from standby servers. Subtransaction commits do not wait for responses from
>> standby servers, only top-level commits."
>>
>>
>> [Hypothesis]
>> Why does the connection processing emit WAL?
>>
>> Probably, it did page-at-a-time vacuum during access to pg_database and
>> pg_authid for client authentication.  src/backend/access/heap/README.HOT
>> describes:
>
>> [How to fix]
>> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c
>> synchronous_standby_names='" to pg_ctl start in the recovery script would
>> prevent the problem.
>
>> But isn't there anything to fix in PostgreSQL?  I think the doc needs
>> improvement so that users won't misunderstand that only write transactions
>> would block at commit.
>
> I think we should rework RecordTransactionCommit() to only wait for the
> standby if `markXidCommitted' and not if `wrote_xlog'. There really
> isn't a reason to make a readonly transaction's commit wait just because
> it did some hot pruning.

Sounds like a good direction. One question is: can RecordTransactionCommit()
safely avoid waiting not only for replication but also for the local WAL
flush in such a read-only transaction case?

Regards,

-- 
Fujii Masao

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Tom Lane

Date:

07 July 2014, 13:57:29

Andres Freund <andres@2ndquadrant.com> writes:
> I think we should rework RecordTransactionCommit() to only wait for the
> standby if `markXidCommitted' and not if `wrote_xlog'. There really
> isn't a reason to make a readonly transaction's commit wait just because
> it did some hot pruning.

Well, see the comment that explains why the logic is like this now:
         * If we didn't create XLOG entries, we're done here; otherwise we
         * should flush those entries the same as a commit record.  (An
         * example of a possible record that wouldn't cause an XID to be
         * assigned is a sequence advance record due to nextval() --- we want
         * to flush that to disk before reporting commit.)

I agree that HOT pruning isn't a reason to make a commit wait, but
nextval() is.

We could perhaps add more flags that would keep track of which sorts of
xlog entries justify a wait at commit, but TBH I'm skeptical of the entire
proposition.  Having synchronous replication on with no live slave *will*
result in arbitrary hangs, and the argument that this particular case
should be exempt seems a bit thin to me.  The sooner the user realizes
he's got a problem, the better.  If read-only transactions don't show a
problem, the user might not realize he's got one until he starts to wonder
why autovac/autoanalyze aren't working.

I think a more useful line of thought would be to see if we can't complain
more loudly when we have no synchronous standby.  Perhaps a "WARNING:
waiting forever for lack of a synchronous standby" could be emitted when
a transaction starts to wait.

        regards, tom lane

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Andres Freund

Date:

07 July 2014, 15:51:19

On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I think we should rework RecordTransactionCommit() to only wait for the
> > standby if `markXidCommitted' and not if `wrote_xlog'. There really
> > isn't a reason to make a readonly transaction's commit wait just because
> > it did some hot pruning.
> 
> Well, see the comment that explains why the logic is like this now:
> 
>          * If we didn't create XLOG entries, we're done here; otherwise we
>          * should flush those entries the same as a commit record.  (An
>          * example of a possible record that wouldn't cause an XID to be
>          * assigned is a sequence advance record due to nextval() --- we want
>          * to flush that to disk before reporting commit.)

I think we should 'simply' make sequences assign a toplevel xid - then
we can get rid of that special case in RecordTransactionCommit(). And I
think the performance benefit of not having to wait on XLogFlush() for
readonly xacts due to hot prunes far outweighs the decrease due to the
xid assignment/commit record.  I don't think that nextval()s are called
overly much without a later xid assigning statement.

> I agree that HOT pruning isn't a reason to make a commit wait, but
> nextval() is.

Agreed.

> We could perhaps add more flags that would keep track of which sorts of
> xlog entries justify a wait at commit, but TBH I'm skeptical of the entire
> proposition.  Having synchronous replication on with no live slave *will*
> result in arbitrary hangs, and the argument that this particular case
> should be exempt seems a bit thin to me.  The sooner the user realizes
> he's got a problem, the better.  If read-only transactions don't show a
> problem, the user might not realize he's got one until he starts to wonder
> why autovac/autoanalyze aren't working.

Well, the user might just want to log in to diagnose the problem. If he
can't even login to see pg_stat_replication it's a pretty screwed up
situation.

> I think a more useful line of thought would be to see if we can't complain
> more loudly when we have no synchronous standby.  Perhaps a "WARNING:
> waiting forever for lack of a synchronous standby" could be emitted when
> a transaction starts to wait.

In the OP's case the session wasn't even started - so proper feedback
isn't that easy...
We could special case that by forcing s_c=off until the session started properly.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Tom Lane

Date:

07 July 2014, 16:06:25

Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
>> Well, see the comment that explains why the logic is like this now:

> I think we should 'simply' make sequences assign a toplevel xid - then
> we can get rid of that special case in RecordTransactionCommit(). And I
> think the performance benefit of not having to wait on XLogFlush() for
> readonly xacts due to hot prunes far outweighs the decrease due to the
> xid assignment/commit record.  I don't think that nextval()s are called
> overly much without a later xid assigning statement.

Yeah, that could well be true.  I'm not sure if there are any other cases
where we have non-xid-assigning operations that are considered part of
what has to be flushed before reporting commit; if there are not, I'd
be okay with changing nextval() this way.

>> I think a more useful line of thought would be to see if we can't complain
>> more loudly when we have no synchronous standby.  Perhaps a "WARNING:
>> waiting forever for lack of a synchronous standby" could be emitted when
>> a transaction starts to wait.

> In the OP's case the session wasn't even started - so proper feedback
> isn't that easy...

Perhaps I'm wrong, but I think a WARNING emitted here would be seen in
psql even though we're still in InitPostgres.  If it isn't, we have a
problem there anyhow, IMO.

> We could special case that by forcing s_c=off until the session started properly.

Ugh.

        regards, tom lane

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Fujii Masao

Date:

07 July 2014, 16:25:57

On Tue, Jul 8, 2014 at 1:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
>>> Well, see the comment that explains why the logic is like this now:
>
>> I think we should 'simply' make sequences assign a toplevel xid - then
>> we can get rid of that special case in RecordTransactionCommit(). And I
>> think the performance benefit of not having to wait on XLogFlush() for
>> readonly xacts due to hot prunes far outweighs the decrease due to the
>> xid assignment/commit record.  I don't think that nextval()s are called
>> overly much without a later xid assigning statement.
>
> Yeah, that could well be true.  I'm not sure if there are any other cases
> where we have non-xid-assigning operations that are considered part of
> what has to be flushed before reporting commit;

Maybe pg_switch_xlog().

> if there are not, I'd
> be okay with changing nextval() this way.

+1

Regards,

-- 
Fujii Masao

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Andres Freund

Date:

07 July 2014, 17:23:09

On 2014-07-07 12:06:14 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
> >> Well, see the comment that explains why the logic is like this now:
> 
> > I think we should 'simply' make sequences assign a toplevel xid - then
> > we can get rid of that special case in RecordTransactionCommit(). And I
> > think the performance benefit of not having to wait on XLogFlush() for
> > readonly xacts due to hot prunes far outweighs the decrease due to the
> > xid assignment/commit record.  I don't think that nextval()s are called
> > overly much without a later xid assigning statement.
> 
> Yeah, that could well be true.  I'm not sure if there are any other cases
> where we have non-xid-assigning operations that are considered part of
> what has to be flushed before reporting commit; if there are not, I'd
> be okay with changing nextval() this way.

I'm not aware of any adhoc, but I think to actually change it someone
would have to iterate over all wal record types to make sure.

> >> I think a more useful line of thought would be to see if we can't complain
> >> more loudly when we have no synchronous standby.  Perhaps a "WARNING:
> >> waiting forever for lack of a synchronous standby" could be emitted when
> >> a transaction starts to wait.
> 
> > In the OP's case the session wasn't even started - so proper feedback
> > isn't that easy...
> 
> Perhaps I'm wrong, but I think a WARNING emitted here would be seen in
> psql even though we're still in InitPostgres.

Yes, it is visible.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

"MauMau"

Date:

08 July 2014, 10:40:10

From: "Amit Kapila" <amit.kapila16@gmail.com>
> On Fri, Jul 4, 2014 at 7:29 PM, MauMau <maumau307@gmail.com> wrote:
>> [Hypothesis]
>> Why does the connection processing emit WAL?
>>
>> Probably, it did page-at-a-time vacuum during access to pg_database and
>> pg_authid for client authentication.  src/backend/access/heap/README.HOT
>> describes:
>
> I agree with your analysis that it can happen during connection
> attempt.

Thank you.  I'm relieved the cause seems correct.



>> But the customer could not reproduce the problem when he performed the
>> same archive recovery from the same base backup again.  Why?  I guess the
>> autovacuum daemon vacuumed the system catalogs before he attempted to
>> connect to the database.
>>
> One way to confirm could be to perform the archive recovery by
> disabling autovacuum.

Yeah, I thought of that too.  Unfortunately, the customer deleted the 
base backup for testing.


> Another thing which I am wondering about is can't the same happen
> even for Read Only transaction (incase someone does Select which
> prunes the page).

I'm afraid about that, too.

Regards
MauMau

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

"MauMau"

Date:

08 July 2014, 10:46:48

From: "Rajeev rastogi" <rajeev.rastogi@huawei.com>
> As of now there is no solution for this in PostgreSQL but I had submitted a
> patch "Standalone synchronous master" in 9.4 2014-01 CommitFest, which was
> rejected because of some issues. This patch was meant to degrade the
> synchronous level of master, if all synchronous standbys are down.
> I plan to resubmit this with better design sometime in 9.5.


Although I only read some mails of that thread, I'm sure your proposal is 
what many people would appreciate.  Your new operation mode is equivalent to 
the maximum availability mode of Oracle Data Guard, isn't it?  I'm looking 
forward to it.  Good luck.


==================================================
Maximum availability
This protection mode provides the highest level of data protection that is 
possible without compromising the availability of a primary database. 
Transactions do not commit until all redo data needed to recover those 
transactions has been written to the online redo log and to at least one 
standby database. If the primary database cannot write its redo stream to at 
least one standby database, it effectively switches to maximum performance 
mode to preserve primary database availability and operates in that mode 
until it is again able to write its redo stream to a standby database.

This protection mode ensures zero data loss except in the case of certain 
double faults, such as failure of a primary database after failure of the 
standby database.

Maximum performance
This is the default protection mode. It provides the highest level of data 
protection that is possible without affecting the performance of a primary 
database. This is accomplished by allowing transactions to commit as soon as 
all redo data generated by those transactions has been written to the online 
log. Redo data is also written to one or more standby databases, but this is 
done asynchronously with respect to transaction commitment, so primary 
database performance is unaffected by delays in writing redo data to the 
standby database(s).

This protection mode offers slightly less data protection than maximum 
availability mode and has minimal impact on primary database performance.
==================================================


Regards
MauMau

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

"MauMau"

Date:

08 July 2014, 10:59:44

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> problem, the user might not realize he's got one until he starts to wonder
> why autovac/autoanalyze aren't working.

In autovacuum.c, autovacuum workers avoid waiting for the standby by doing:
    /*
     * Force synchronous replication off to allow regular maintenance even if
     * we are waiting for standbys to connect. This is important to ensure we
     * aren't blocked from performing anti-wraparound tasks.
     */
    if (synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH)
        SetConfigOption("synchronous_commit", "local",
                        PGC_SUSET, PGC_S_OVERRIDE);

Regards
MauMau

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

"MauMau"

Date:

08 July 2014, 11:14:56

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> Andres Freund <andres@2ndquadrant.com> writes:
>> On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
>>> Well, see the comment that explains why the logic is like this now:
>
>> I think we should 'simply' make sequences assign a toplevel xid - then
>> we can get rid of that special case in RecordTransactionCommit(). And I
>> think the performance benefit of not having to wait on XLogFlush() for
>> readonly xacts due to hot prunes far outweighs the decrease due to the
>> xid assignment/commit record.  I don't think that nextval()s are called
>> overly much without a later xid assigning statement.
>
> Yeah, that could well be true.  I'm not sure if there are any other cases
> where we have non-xid-assigning operations that are considered part of
> what has to be flushed before reporting commit; if there are not, I'd
> be okay with changing nextval() this way.

Thank you all for letting me know your thoughts.  I understand and agree 
that read-only transactions, including the connection establishment one, 
should wait for neither the standby nor the XLOG flush at commit, and that 
the current documentation/specification should not be changed.

I'll consider how to fix this problem, learning the code, then I'll ask for 
review.  I'd like to submit the patch for next CF if possible.

From: "Fujii Masao" <masao.fujii@gmail.com>
> Sounds good direction. One question is: Can RecordTransactionCommit() avoid
> waiting for not only replication but also local WAL flush safely in
> such read-only transaction case?

I'd appreciate any opinion on this, too.

Regards
MauMau

Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running

From

Alvaro Herrera

Date:

26 February 2015, 14:53:46

FWIW a fix for this has been posted to all active branches:

Author: Andres Freund <andres@anarazel.de>
Branch: master [fd6a3f3ad] 2015-02-26 12:50:07 +0100
Branch: REL9_4_STABLE [d72115112] 2015-02-26 12:50:07 +0100
Branch: REL9_3_STABLE [abce8dc7d] 2015-02-26 12:50:07 +0100
Branch: REL9_2_STABLE [d67076529] 2015-02-26 12:50:07 +0100
Branch: REL9_1_STABLE [5c8dabecd] 2015-02-26 12:50:08 +0100
Branch: REL9_0_STABLE [82e0d6eb5] 2015-02-26 12:50:08 +0100

    Reconsider when to wait for WAL flushes/syncrep during commit.

    Up to now RecordTransactionCommit() waited for WAL to be flushed (if
    synchronous_commit != off) and to be synchronously replicated (if
    enabled), even if a transaction did not have a xid assigned. The primary
    reason for that is that sequence's nextval() did not assign a xid, but
    are worthwhile to wait for on commit.

    This can be problematic because sometimes read only transactions do
    write WAL, e.g. HOT page prune records. That then could lead to read only
    transactions having to wait during commit. Not something people expect
    in a read only transaction.

    This lead to such strange symptoms as backends being seemingly stuck
    during connection establishment when all synchronous replicas are
    down. Especially annoying when said stuck connection is the standby
    trying to reconnect to allow syncrep again...

    This behavior also is involved in a rather complicated <= 9.4 bug where
    the transaction started by catchup interrupt processing waited for
    syncrep using latches, but didn't get the wakeup because it was already
    running inside the same overloaded signal handler. Fix the issue here
    doesn't properly solve that issue, merely papers over the problems. In
    9.5 catchup interrupts aren't processed out of signal handlers anymore.

    To fix all this, make nextval() acquire a top level xid, and only wait for
    transaction commit if a transaction both acquired a xid and emitted WAL
    records. If only a xid has been assigned we don't uselessly want to
    wait just because of writes to temporary/unlogged tables; if only WAL
    has been written we don't want to wait just because of HOT prunes.

    The xid assignment in nextval() is unlikely to cause overhead in
    real-world workloads. For one it only happens SEQ_LOG_VALS/32 values
    anyway, for another only usage of nextval() without using the result in
    an insert or similar is affected.

    Discussion: 20150223165359.GF30784@awork2.anarazel.de,
        369698E947874884A77849D8FE3680C2@maumau,
        5CF4ABBA67674088B3941894E22A0D25@maumau

    Per complaint from maumau and Thom Brown

    Backpatch all the way back; 9.0 doesn't have syncrep, but it seems
    better to be consistent behavior across all maintained branches.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
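
The rule described in the commit message above can be sketched as a single predicate.  This is a simplified model, not the committed code: the struct and function names are hypothetical, and each transaction is reduced to the two facts the commit message names (whether an xid was acquired, and whether WAL records were emitted).

```c
#include <stdbool.h>

/* Hypothetical model of the two facts the commit message refers to.
 * Not PostgreSQL code. */
typedef struct ModelCommit
{
    bool has_xid;       /* the transaction acquired a top-level xid */
    bool wrote_wal;     /* the transaction emitted WAL records */
} ModelCommit;

/* Wait at commit only if the transaction both acquired a xid and
 * emitted WAL records. */
bool
model_commit_waits(ModelCommit c)
{
    return c.has_xid && c.wrote_wal;
}
```

In this model a HOT prune during a read-only transaction (WAL but no xid) does not wait, and neither does a write to a temporary or unlogged table (xid but no WAL).  An ordinary write waits, and so does a nextval() call that logs a sequence record, because after the fix it acquires a xid as well.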