Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id 1f96b282-cb90-8302-cee8-7b3f5576a31c@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: logical decoding and replication of sequences, take 2  ("Gregory Stark (as CFM)" <stark.cfm@gmail.com>)
Re: logical decoding and replication of sequences, take 2  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: logical decoding and replication of sequences, take 2  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
On 3/30/23 05:15, Masahiko Sawada wrote:
>
> ...
>
>>>
>>> Perhaps it'd be reasonable to tie the "protocol version" to subscriber
>>> capabilities, so that a protocol version guarantees what message types
>>> the subscriber understands. So we could increment the protocol version,
>>> check it in pgoutput_startup and then error-out in the sequence callback
>>> if the subscriber version is too old.
>>>
>>> That'd be nicer in the sense that we'd generate nicer error message on
>>> the publisher, not an "unknown message type" on the subscriber.
>>>
>>
>> Agreed. So, we can probably formalize this rule such that whenever in
>> a newer version publisher we want to send additional information which
>> the old version subscriber won't be able to handle, the error should
>> be raised at the publisher by using protocol version number.
> 
> +1
> 

OK, I took a stab at this, see the attached 0007 patch which bumps the
protocol version, and allows the subscriber to specify "sequences" when
starting the replication, similar to what we do for the two-phase stuff.

The patch essentially adds 'sequences' to the replication start command,
depending on the server version, but it can be overridden by "sequences"
subscription option. The patch is pretty small, but I wonder how much
smarter this should be ...


I think there are about 4 cases that we need to consider

1) there are no sequences in the publication -> OK

2) publication with sequences, subscriber knows how to apply (and
specifies "sequences on" either automatically or explicitly) -> OK

3) publication with sequences, subscriber explicitly disabled them by
specifying "sequences off" in startup -> OK

4) publication with sequences, subscriber without sequence support (e.g.
older Postgres release) -> PROBLEM (?)


The reason why I think (4) may be a problem is that my opinion is we
shouldn't silently drop stuff that is meant to be part of the
publication. That is, if someone creates a publication and adds a
sequence to it, he wants to replicate the sequence.

But the current behavior is the old subscriber connects, doesn't specify
the 'sequences on' so the publisher disables that and then simply
ignores sequence increments during decoding.

I think we might want to detect this and error out instead of just
skipping the change, but that needs to happen later, only when the
publication actually has any sequences ...

I don't want to over-think / over-engineer this, though, so I wonder
what are your opinions on this?

There's a couple XXX comments in the code, mostly about stuff I left out
when copying the two-phase stuff. For example, we store two-phase stuff
in the replication slot itself - I don't think we need to do that for
sequences, though.

Another thing what to do about ALTER SUBSCRIPTION - at the moment it's
not possible to change the "sequences" option, but maybe we should allow
that? But then we'd need to re-sync all the sequences, somehow ...


Aside from that, I've also added 0005, which does the sync interlock in
a slightly different way - instead of a custom function for locking
sequence, it allows LOCK on sequences. Peter Eisentraut suggested doing
it like this, it's simpler, and I can't see what issues it might cause.
The patch should update LOCK documentation, I haven't done that yet.
Ultimately it should all be merged into 0003, of course.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: O(n) tasks cause lengthy startups and checkpoints
Next
From: Andres Freund
Date:
Subject: Re: regression coverage gaps for gist and hash indexes