Thread: Re: protocol-level wait-for-LSN

Re: protocol-level wait-for-LSN

From
Ants Aasma
Date:
On Mon, 28 Oct 2024 at 17:51, Peter Eisentraut <peter@eisentraut.org> wrote:
> This is something I hacked together on the way back from pgconf.eu.
> It's highly experimental.
>
> The idea is to do the equivalent of pg_wal_replay_wait() on the protocol
> level, so that it is ideally fully transparent to the application code.
> The application just issues queries, and they might be serviced by a
> primary or a standby, but there is always a correct ordering of reads
> after writes.
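
As a rough illustration of the read-after-write behaviour described in the
quoted text, here is a toy Python simulation. The Primary and Standby
classes, the integer LSNs and the wait_for_lsn parameter are all made up for
this example; none of it is the patch's actual code or wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy primary: every write appends one WAL record and returns its LSN."""
    wal: list = field(default_factory=list)    # list of (lsn, key, value)
    data: dict = field(default_factory=dict)
    lsn: int = 0

    def write(self, key, value):
        self.lsn += 1
        self.wal.append((self.lsn, key, value))
        self.data[key] = value
        return self.lsn  # assumption: handed back to the client with the reply


@dataclass
class Standby:
    """Toy standby: replays the primary's WAL one record at a time."""
    primary: Primary
    data: dict = field(default_factory=dict)
    replay_lsn: int = 0

    def replay_one(self):
        # LSNs in this toy are 1, 2, 3, ... so the next record sits at index replay_lsn.
        lsn, key, value = self.primary.wal[self.replay_lsn]
        self.data[key] = value
        self.replay_lsn = lsn

    def read(self, key, wait_for_lsn=0):
        # Stand-in for blocking: keep replaying until the requested LSN is reached.
        while self.replay_lsn < wait_for_lsn:
            self.replay_one()
        return self.data.get(key)


primary = Primary()
standby = Standby(primary)

token = primary.write("balance", 100)     # client remembers the LSN it was given
stale = standby.read("balance")           # no token sent: nothing replayed yet
fresh = standby.read("balance", token)    # token sent: standby catches up first
print(stale, fresh)
```

The read without the token can miss the write, while the read that carries
the token cannot, which is the ordering guarantee the proposal is after.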

The idea is great; I have been wanting something like this for a long
time. For future-proofing it might be a good idea to not require the
communicated-waited value to be an LSN.

In a sharded database a Lamport timestamp would allow for sequential
consistency. A Lamport timestamp is just a monotonically increasing
value that is eagerly shared between all communicating participants,
including clients. For a single cluster, LSNs work fine for this
purpose. But with multiple shards LSNs will not work unless they are
arranged as a vector clock, which is what I think Matthias proposed.
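
A minimal sketch of such a logical clock in Python, to make the "eagerly
shared" part concrete. The LamportClock class and the shard/client names are
hypothetical and only illustrate the general technique, not anything in the
patch.

```python
class LamportClock:
    def __init__(self):
        self.value = 0

    def tick(self):
        """Advance for a local event (e.g. a commit) and return the new timestamp."""
        self.value += 1
        return self.value

    def observe(self, seen):
        """Set-if-greater: merge a timestamp received from another participant."""
        self.value = max(self.value, seen)
        return self.value


shard_a, shard_b, client = LamportClock(), LamportClock(), LamportClock()

# Shard A has already processed several commits; shard B has been idle.
for _ in range(5):
    shard_a.tick()

# The client writes to shard A and remembers the timestamp it gets back.
client.observe(shard_a.tick())

# The client then talks to shard B and sends its timestamp along.
shard_b.observe(client.value)
ts_b = shard_b.tick()      # the next commit on B is ordered after the write on A
client.observe(ts_b)

print(shard_a.value, shard_b.value, client.value)
```

Because the client carries the value between shards, a commit on shard B
that follows the client's write on shard A always gets a larger timestamp,
even though the two shards never talk to each other directly.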

Even without sharding, LSN might not be the final choice. Right now on
the primary the visibility order is not LSN order. So if a connection
commits with synchronous_commit = off, the write location is not even
going to cover the commit record. Publishing the end of the commit
record would be better. But I assume at some point we would like to
have a consistent visibility order, which quite likely means using
something other than LSN as the logical clock.

I see the patch names the field LSN, but on the protocol level and for
the client library this is just an opaque 8-byte token. So basically
I'm thinking the naming could be more generic. And for a complete
Lamport timestamp implementation we would need two more capabilities:
extracting the last seen value, and a set-if-greater update operation.

-- 
Ants Aasma
www.cybertec-postgresql.com



Re: protocol-level wait-for-LSN

From
Jelte Fennema-Nio
Date:
On Wed, 30 Oct 2024 at 18:18, Ants Aasma <ants.aasma@cybertec.at> wrote:
> The idea is great, I have been wanting something like this for a long
> time. For future proofing it might be a good idea to not require the
> communicated-waited value to be a LSN.

Your and Matthias' feedback makes total sense, I think. From an
implementation perspective I think there are a few things necessary to
enable these wider use cases:
1. The token should be considered opaque for clients (should be documented)
2. The token should be defined as variable length in the protocol
3. We should have a hook to allow postgres extensions to override the
default token generation
4. We should have a hook to allow postgres extensions to override
waiting until the token "timestamp"
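
To make the four points above a bit more concrete, here is a rough Python
sketch. Everything in it is an assumption for illustration: the Int32
length prefix, the hooks dictionary, and the function names are invented
here, and in the server the hooks would be C function pointers set by an
extension rather than Python callables.

```python
import struct

current_lsn = 0x16B3748  # stand-in for the server's current WAL position


def default_generate_token() -> bytes:
    # Default token: the LSN as 8 bytes in network byte order.
    return struct.pack("!Q", current_lsn)


def default_wait_for_token(token: bytes) -> bool:
    # A real server would block here; the sketch only reports whether
    # the position named by the token has been reached.
    (target,) = struct.unpack("!Q", token)
    return current_lsn >= target


# Hypothetical hook table that an extension could overwrite.
hooks = {"generate": default_generate_token, "wait": default_wait_for_token}


def encode_token_field(token: bytes) -> bytes:
    """Variable-length field: Int32 length followed by that many bytes."""
    return struct.pack("!i", len(token)) + token


def decode_token_field(buf: bytes, offset: int = 0):
    """Return (token, offset just past the field); the token stays opaque."""
    (length,) = struct.unpack_from("!i", buf, offset)
    start = offset + 4
    return buf[start:start + length], start + length


# Default behaviour: an 8-byte LSN token round-trips through the wire format.
token, _ = decode_token_field(encode_token_field(hooks["generate"]()))
reached = hooks["wait"](token)

# An extension swaps in a 16-byte token, e.g. one LSN per shard.
hooks["generate"] = lambda: struct.pack("!QQ", 10, 20)
token2, end = decode_token_field(encode_token_field(hooks["generate"]()))

print(len(token), reached, len(token2), end)
```

The client-side code only ever copies the bytes around, so it does not need
to change when an extension replaces the token format.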

> Even without sharding LSN might not be a final choice. Right now on
> the primary the visibility order is not LSN order. So if a connection
> does synchronous_commit = off commit, the write location is not even
> going to see the commit. By publishing the end of the commit record it
> would be better. But I assume at some point we would like to have a
> consistent visibility order, which quite likely means using something
> other than LSN as the logical clock.

I was going to say that the default could probably still be LSN, but
this makes me doubt that. Is there some other token that we can send
now that we could "wait" on instead of the LSN, and that would work for
that case too? If not, I think LSN is still probably a good choice as
the default. Or maybe only as a default in case synchronous_commit != off.



Re: protocol-level wait-for-LSN

From
Jesper Pedersen
Date:
Hi,

On 10/30/24 1:45 PM, Jelte Fennema-Nio wrote:
> On Wed, 30 Oct 2024 at 18:18, Ants Aasma <ants.aasma@cybertec.at> wrote:
>> The idea is great, I have been wanting something like this for a long
>> time. For future proofing it might be a good idea to not require the
>> communicated-waited value to be a LSN.
> 
> Yours and Matthias' feedback make total sense I think. From an
> implementation perspective I think there are a few things necessary to
> enable these wider usecases:
> 1. The token should be considered opaque for clients (should be documented)
> 2. The token should be defined as variable length in the protocol
> 3. We should have a hook to allow postgres extensions to override the
> default token generation
> 4. We should have a hook to allow postgres extensions to override
> waiting until the token "timestamp"
> 
>> Even without sharding LSN might not be a final choice. Right now on
>> the primary the visibility order is not LSN order. So if a connection
>> does synchronous_commit = off commit, the write location is not even
>> going to see the commit. By publishing the end of the commit record it
>> would be better. But I assume at some point we would like to have a
>> consistent visibility order, which quite likely means using something
>> other than LSN as the logical clock.
> 
> I was going to say that the default could probably still be LSN, but
> this makes me doubt that. Is there some other token that we can send
> now that we could "wait" on instead of the LSN, which would work for.
> If not, I think LSN is still probably a good choice as the default. Or
> maybe only as a default in case synchronous_commit != off.
> 

There are known wish-lists for a protocol v4, like

  https://github.com/pgjdbc/pgjdbc/blob/master/backend_protocol_v4_wanted_features.md

and a lot of clean-room implementations in drivers and embedded in 
projects/products.

Having LSN would be nice, but to break all existing implementations, no.
Having to specify with startup parameters what a core message format
looks like sounds like a bad idea to me,

  https://www.postgresql.org/docs/devel/protocol-message-formats.html

is it.

If we want to start on a protocol v4 thing then that is ok - but there 
are a lot of feature requests for that one.

Best regards,
  Jesper




Re: protocol-level wait-for-LSN

From
Jelte Fennema-Nio
Date:
On Wed, 30 Oct 2024 at 19:04, Jesper Pedersen
<jesper.pedersen@comcast.net> wrote:
> Having LSN would be nice, but to break all existing implementations, no.
> Having to specify with startup parameters how a core message format
> looks like sounds like a bad idea to me,

It would really help if you would explain why you think it's a bad
idea to use a startup parameter for that, instead of simply stating
that you think it needs a major protocol version bump.

The point of enabling it through a startup parameter (aka protocol
option) is exactly so it will not break any existing implementations.
If clients request the protocol option (which as the name suggests is
optional), then they are expected to be able to parse it. If they
don't, then they will get the old message format. So no existing
implementation will be broken. If some middleware/proxy gets a request
for a startup option it does not support, it can advertise that to the
client using the NegotiateProtocolVersion message, allowing the client
to continue in a mode where the option is not enabled.

So, not bumping the major protocol version and enabling this feature
through a protocol option actually causes less breakage in practice.
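
A small Python sketch of that negotiation from the client's side. The
StartupMessage and NegotiateProtocolVersion layouts follow the documented
protocol 3.0 message formats; the option name _pq_.wait_for_lsn and the
simulated proxy reply are hypothetical and only there for the example.

```python
import struct

OPTION = "_pq_.wait_for_lsn"  # hypothetical protocol option name


def build_startup(params: dict) -> bytes:
    """StartupMessage: Int32 length, Int32 version 3.0, name/value strings, final NUL."""
    body = struct.pack("!i", 3 << 16)
    for name, value in params.items():
        body += name.encode() + b"\0" + value.encode() + b"\0"
    body += b"\0"
    return struct.pack("!i", len(body) + 4) + body


def parse_negotiate_protocol_version(msg: bytes):
    """Return (newest supported minor version, list of unrecognized option names)."""
    if msg[0:1] != b"v":
        raise ValueError("not a NegotiateProtocolVersion message")
    _length, minor, count = struct.unpack_from("!iii", msg, 1)
    names, pos = [], 13
    for _ in range(count):
        end = msg.index(b"\0", pos)
        names.append(msg[pos:end].decode())
        pos = end + 1
    return minor, names


startup = build_startup({"user": "app", OPTION: "on"})

# Simulated reply from a proxy that does not know the option.
payload = struct.pack("!ii", 0, 1) + OPTION.encode() + b"\0"
reply = b"v" + struct.pack("!i", len(payload) + 4) + payload

minor, unsupported = parse_negotiate_protocol_version(reply)
option_enabled = OPTION not in unsupported
print(minor, unsupported, option_enabled)
```

A client that sees its option in the unrecognized list simply carries on
with the old message format, and a client that never asks for the option is
not affected at all.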

Also, regarding the wishlist: I think it's much more likely for any of
those to happen in a minor version bump and/or protocol option than it
is that we'll bump the major protocol version.

P.S. Like I said in another email on this thread: I think for this
specific case I'd also prefer a separate new message, because that
makes it easier to filter that message out when received by PgBouncer.
But I'd still like to understand your viewpoint better on this,
because adding fields to existing message types is definitely one of
the types of changes that I personally think would be fine for some
protocol changes.