On Mon, Jul 7, 2014 at 4:14 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2014-07-04 22:59:15 +0900, MauMau wrote:
>> My customer reported a strange connection hang problem. He and I couldn't
>> reproduce it. I haven't been able to understand the cause, but I can think
>> of one hypothesis. Could you give me your opinions on whether my hypothesis
>> is correct, and a direction on how to fix the problem? I'm willing to
>> submit a patch if necessary.
>
>> The connection attempt is waiting for a reply from the standby. This is
>> strange, because we didn't anticipate that the connection establishment (and
>> subsequent SELECT queries) would update something and write some WAL. The
>> doc says:
>>
>> http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION
>>
>> "When requesting synchronous replication, each commit of a write transaction
>> will wait until confirmation is received that the commit has been written to
>> the transaction log on disk of both the primary and standby server.
>> ...
>> Read only transactions and transaction rollbacks need not wait for replies
>> from standby servers. Subtransaction commits do not wait for responses from
>> standby servers, only top-level commits."
>>
>>
>> [Hypothesis]
>> Why does the connection processing emit WAL?
>>
>> Probably, it did page-at-a-time vacuum during access to pg_database and
>> pg_authid for client authentication. src/backend/access/heap/README.HOT
>> describes:
>
>> [How to fix]
>> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c
>> synchronous_standby_names='" to pg_ctl start in the recovery script would
>> prevent the problem.
>
>> But isn't there anything to fix in PostgreSQL? I think the doc needs
>> improvement so that users won't misunderstand that only write transactions
>> would block at commit.
>
> I think we should rework RecordTransactionCommit() to only wait for the
> standby if `markXidCommitted' and not if `wrote_xlog'. There really
> isn't a reason to make a readonly transaction's commit wait just because
> it did some hot pruning.
Sounds good direction. One question is: Can RecordTransactionCommit() avoid
waiting for not only replication but also local WAL flush safely in
such read-only
transaction case?
Regards,
--
Fujii Masao