Re: Primary not sending to synchronous standby - Mailing list pgsql-hackers

From Thom Brown
Subject Re: Primary not sending to synchronous standby
Date
Msg-id CAA-aLv6zYz+AFUByZLC0Y2L7DqQu9552iOuNS8UbykqrZ+tiDg@mail.gmail.com
In response to Re: Primary not sending to synchronous standby  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Primary not sending to synchronous standby  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 23 February 2015 at 16:53, Andres Freund <andres@2ndquadrant.com> wrote:
On 2015-02-23 15:48:25 +0000, Thom Brown wrote:
> On 23 February 2015 at 15:42, Andres Freund <andres@2ndquadrant.com> wrote:
>
> > On 2015-02-23 16:38:44 +0100, Andres Freund wrote:
> > > I unfortunately don't remember enough of the thread to reference it
> > > here.
> >
> > Found the right keywords. The threads below
> >
> > http://archives.postgresql.org/message-id/369698E947874884A77849D8FE3680C2%40maumau
> > and
> >
> > http://www.postgresql.org/message-id/5CF4ABBA67674088B3941894E22A0D25@maumau
> >
>
> Yes, this seems to be virtually the same issue reported.  The trace looks
> the same except for RecordTransactionCommit.

So, I proposed in
http://www.postgresql.org/message-id/20140707155113.GB1136@alap3.anarazel.de
that we make sequences assign a xid and only wait for syncrep when a xid
is assigned. The biggest blocker was that somebody would have to do some
code reviewing to find other locations that might need similar
treatment.

I did a quick grep for XLogInsert() and I think we're otherwise
fine. There are some debatable cases:

* XLOG_STANDBY_LOCK doesn't force a xid to be assigned. I think it's
  harmless though, as we really only need to wait for that to be
  replicated if the transaction did something relevant (i.e. catalog
  changes). And those will force xid assignment.
* 2pc records don't assign a xid. But twophase.c does its own waiting,
  so that's fine.
* Plain vacuums will not trigger waits. But I think that's good. There's
  really no need to wait if all that's been done is some cleanup without
  visible consequences.
* Fujii brought up that we might want to wait for XLOG_SWITCH - I don't
  really see why.
* XLOG_RESTORE_POINT is a similar candidate - I don't really see valid
  arguments for making 2pc wait.


The attached, untested, patch changes things so that we
a) only wait for syncrep if we both wrote WAL and had a xid assigned
b) use an async commit if we just had a xid assigned, without having
   written WAL, even if synchronous_commit = on
c) acquire a xid when WAL logging sequence changes (arguably at least
   one of the xid assignments is redundant, but it doesn't cost
   anything, so ...)
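
To make a) to c) concrete, here is a minimal standalone C sketch of the
commit-time decision as described above. It is not the patch and not
PostgreSQL code: TxnState, CommitPlan, plan_commit() and
log_sequence_change() are hypothetical names invented for illustration,
and the two boolean settings are simplifying assumptions standing in for
synchronous_commit and for whether synchronous replication is requested.

```c
#include <stdbool.h>

/* Hypothetical model of per-transaction state; not PostgreSQL's structs. */
typedef struct
{
    bool xid_assigned;       /* the transaction acquired a xid */
    bool wrote_wal;          /* the transaction inserted a WAL record */
    bool synchronous_commit; /* assumption: true means the setting is not "off" */
    bool syncrep_requested;  /* assumption: a synchronous standby is configured */
} TxnState;

/* What the commit path would do under rules a) and b). */
typedef struct
{
    bool flush_wal_sync;   /* flush WAL to disk before reporting commit */
    bool wait_for_syncrep; /* block until the standby has confirmed */
} CommitPlan;

static CommitPlan
plan_commit(TxnState t)
{
    CommitPlan plan = {false, false};

    /* Both decisions key off the same condition: xid assigned AND WAL written. */
    bool durable_work = t.xid_assigned && t.wrote_wal;

    /* b) a xid without any WAL gets an async commit regardless of the setting */
    plan.flush_wal_sync = durable_work && t.synchronous_commit;

    /* a) only wait for syncrep if we both wrote WAL and had a xid assigned */
    plan.wait_for_syncrep = durable_work && t.syncrep_requested;

    return plan;
}

/* c) WAL logging a sequence change also acquires a xid. */
static void
log_sequence_change(TxnState *t)
{
    t->xid_assigned = true;
    t->wrote_wal = true;
}
```

In this model a transaction that only touched a sequence ends up with
both flags set after log_sequence_change(), so plan_commit() flushes
and waits for it like any other transaction that did visible work,
while one that only pruned or touched temporary tables does neither.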

I think it makes sense to change a) and b) that way because there's no
need to wait for WAL flushes/syncrep waits when all that happened is
manipulations of temporary/unlogged tables or HOT pruning. It's slightly
weird that the on-disk flush and the syncrep wait essentially used two
different mechanisms for deciding when to flush.

Comments? This is obviously just a POC, but I think something like this
does make a great deal of sense.

Thom, does that help?

Yeah, this appears to eliminate the problem, at least in the case I reported.

Thanks

--
Thom
