Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
Msg-id 20140707155113.GB1136@alap3.anarazel.de
In response to Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2014-07-07 09:57:20 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > I think we should rework RecordTransactionCommit() to only wait for the
> > standby if `markXidCommitted' and not if `wrote_xlog'. There really
> > isn't a reason to make a readonly transaction's commit wait just because
> > it did some hot pruning.
> 
> Well, see the comment that explains why the logic is like this now:
> 
>          * If we didn't create XLOG entries, we're done here; otherwise we
>          * should flush those entries the same as a commit record.  (An
>          * example of a possible record that wouldn't cause an XID to be
>          * assigned is a sequence advance record due to nextval() --- we want
>          * to flush that to disk before reporting commit.)

I think we should 'simply' make sequences assign a toplevel xid - then
we can get rid of that special case in RecordTransactionCommit(). And I
think the performance benefit of not having to wait on XLogFlush() for
read-only xacts due to HOT pruning far outweighs the cost of the extra
xid assignment/commit record.  I don't think that nextval() is called
all that often without a later xid-assigning statement.
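
To make the difference concrete, here is a rough sketch of the two
decision rules. It is a simplified model, not the actual xact.c code:
the CommitState struct and both function names are made up for
illustration; only markXidCommitted and wrote_xlog correspond to the
variables discussed above.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a transaction at commit time. */
typedef struct CommitState
{
    bool markXidCommitted;  /* an xid was assigned, a commit record is written */
    bool wrote_xlog;        /* any WAL was written (HOT pruning, nextval(), ...) */
} CommitState;

/* Rule implied by the quoted comment: any WAL written means we wait. */
static bool
waits_for_standby_current(const CommitState *s)
{
    return s->wrote_xlog;
}

/* Proposed rule: wait only if the transaction actually commits an xid. */
static bool
waits_for_standby_proposed(const CommitState *s)
{
    return s->markXidCommitted;
}
```

With the proposal, a read-only transaction that only did HOT pruning
(wrote_xlog set, markXidCommitted unset) no longer waits. nextval()
would keep waiting only because it would then assign an xid and so set
markXidCommitted.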

> I agree that HOT pruning isn't a reason to make a commit wait, but
> nextval() is.

Agreed.

> We could perhaps add more flags that would keep track of which sorts of
> xlog entries justify a wait at commit, but TBH I'm skeptical of the entire
> proposition.  Having synchronous replication on with no live slave *will*
> result in arbitrary hangs, and the argument that this particular case
> should be exempt seems a bit thin to me.  The sooner the user realizes
> he's got a problem, the better.  If read-only transactions don't show a
> problem, the user might not realize he's got one until he starts to wonder
> why autovac/autoanalyze aren't working.

Well, the user might just want to log in to diagnose the problem. If he
can't even log in to see pg_stat_replication, it's a pretty screwed up
situation.

> I think a more useful line of thought would be to see if we can't complain
> more loudly when we have no synchronous standby.  Perhaps a "WARNING:
> waiting forever for lack of a synchronous standby" could be emitted when
> a transaction starts to wait.

In the OP's case the session wasn't even started - so proper feedback
isn't that easy... We could special-case that by forcing
synchronous_commit=off until the session has started properly.
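
Very roughly, that special case could look like the sketch below. This
is only an illustration of the idea, not existing backend code: the
Session struct, the enum and effective_sync_commit() are hypothetical
names.

```c
#include <stdbool.h>

/* Hypothetical two-level stand-in for the synchronous_commit setting. */
typedef enum
{
    SYNC_COMMIT_OFF,
    SYNC_COMMIT_ON
} SyncCommitLevel;

/* Hypothetical per-backend state. */
typedef struct Session
{
    bool            startup_complete;   /* authentication and init finished */
    SyncCommitLevel configured;         /* value the user/config asked for */
} Session;

/*
 * While the session is still starting up, behave as if
 * synchronous_commit were off, so that startup can't block on a
 * missing standby. Afterwards, honor the configured value.
 */
static SyncCommitLevel
effective_sync_commit(const Session *s)
{
    return s->startup_complete ? s->configured : SYNC_COMMIT_OFF;
}
```

That way the user can at least connect and look at
pg_stat_replication, while anything run after startup still waits as
configured.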

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


