Re: Reporting the commit LSN at commit time - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Reporting the commit LSN at commit time
Date
Msg-id 53E41EC1.5050603@2ndquadrant.com
Whole thread Raw
In response to Re: Reporting the commit LSN at commit time  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Reporting the commit LSN at commit time
Re: Reporting the commit LSN at commit time
List pgsql-hackers
On 08/08/2014 03:54 AM, Tom Lane wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> Hi all
>> To support transparent client-side failover in BDR, it's necessary to
>> know what the LSN of a node was at the time a transaction committed and
>> keep track of that in the client/proxy.
> 
>> I'm thinking about adding a new message type in the protocol that gets
>> sent immediately before CommandComplete, containing the LSN of the
>> commit. Clients would need to enable the sending of this message with a
>> GUC that they set when they connect, so it doesn't confuse clients that
>> aren't expecting it or aware of it.
> 
> FWIW, I think it's a seriously bad idea to expose LSNs in the protocol
> at all.   What happens five years from now when we switch to some other
> implementation that doesn't have LSNs?

Everyone who's relying on them already via pg_stat_replication, etc, breaks.

They're _already_ exposed to users. That ship has sailed.

That's not to say I'm stuck to LSNs as the way to do this, just that I
don't think that particular argument is relevant.

> I don't mind if you expose some other way to inquire about LSNs, but
> let's *not* embed it into the wire protocol.  Not even as an option.

Well, the underlying need here is to have the client able to keep track
of what point in the server's time-line it must see on a replica before
it proceeds to use that replica.

So if the client does some work on server A, then for some reason needs
to / must use server B, it can ask server B "are you replayed up to the
last transaction I performed on server A yet?" and wait until it is.

That's useful for streaming replication (to permit consistent queries
against read replicas) but it's much more so for BDR, where it's
necessary to avoid a variety of multi-master replication anomalies and
conflicts.

I considered LSNs to be the logical mechanism for this as they're
already user-visible, exposed in pg_stat_replication, they can already
be used for just this purpose by hand (just with an extra round-trip), etc.

An obvious alternative is to merge the commit timestamp work, then
expose the timestamp of the last commit replayed in pg_stat_replication.
Then all the client needs to keep track of is the server time of the
last commit.

> This position also obviates the need to complain about having a GUC
> that changes the protocol-level behavior, which is also a seriously
> bad idea.

Well, I'd prefer to be able to negotiate with the server and establish
requirements during the protocol handshake.

As far as I know there isn't an explicit protocol negotiation with
capabilities fields (just a plain version field), but we do have the
startup packet's 'options' field. So I was thinking that requesting the
setting of a PGC_BACKEND GUC in the startup packet would be a logical
way for the client to request use of a protocol extension.

Looking at ProcessStartupPacket(...) in postmaster.c I see that there's
room for special-casing options. Do you think it would be more
appropriate to add a new connection option that's sent by a client to
request reporting of commit timestamps / LSNs / whatever by the server
at commit time?

If not, do you have an alternative suggestion? I can't imagine that
extending the CommandComplete message is a desirable option.

It seems like it'd be useful to expose this as a read-only GUC anyway,
so I don't really see why a PGC_BACKEND GUC isn't exactly the right
thing to use for this, but I'm happy to listen to suggestions.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Minmax indexes
Next
From: Michael Paquier
Date:
Subject: Re: Minmax indexes