Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id CAD21AoDjCD==p76qOp2=iEmzSGaRvOdnhAdjgYJVp_wP8wR2Lw@mail.gmail.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Wed, Mar 29, 2023 at 3:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 3/28/23 18:34, Masahiko Sawada wrote:
> > On Mon, Mar 27, 2023 at 11:46 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >>
> >>
> >> On 3/27/23 03:32, Masahiko Sawada wrote:
> >>> Hi,
> >>>
> >>> On Fri, Mar 24, 2023 at 7:26 AM Tomas Vondra
> >>> <tomas.vondra@enterprisedb.com> wrote:
> >>>>
> >>>> I merged the earlier "fixup" patches into the relevant parts, and left
> >>>> two patches with new tweaks (deducing the corrent "WAL" state from the
> >>>> current state read by copy_sequence), and the interlock discussed here.
> >>>>
> >>>
> >>> Apart from that, how does the publication having sequences work with
> >>> subscribers who are not able to handle sequence changes, e.g. in a
> >>> case where PostgreSQL version of publication is newer than the
> >>> subscriber? As far as I tested the latest patches, the subscriber
> >>> (v15)  errors out with the error 'invalid logical replication message
> >>> type "Q"' when receiving a sequence change. I'm not sure it's sensible
> >>> behavior. I think we should instead either (1) deny starting the
> >>> replication if the subscriber isn't able to handle sequence changes
> >>> and the publication includes that, or (2) not send sequence changes to
> >>> such subscribers.
> >>>
> >>
> >> I agree the "invalid message" error is not great, but it's not clear to
> >> me how to do either (1). The trouble is we don't really know if the
> >> publication contains (or will contain) sequences. I mean, what would
> >> happen if the replication starts and then someone adds a sequence?
> >>
> >> For (2), I think that's not something we should do - silently discarding
> >> some messages seems error-prone. If the publication includes sequences,
> >> presumably the user wanted to replicate those. If they want to replicate
> >> to an older subscriber, create a publication without sequences.
> >>
> >> Perhaps the right solution would be to check if the subscriber supports
> >> replication of sequences in the output plugin, while attempting to write
> >> the "Q" message. And error-out if the subscriber does not support it.
> >
> > It might be related to this topic; do we need to bump the protocol
> > version? The commit 64824323e57d introduced new streaming callbacks
> > and bumped the protocol version. I think the same seems to be true for
> > this change as it adds sequence_cb callback.
> >
>
> It's not clear to me what should be the exact behavior?
>
> I mean, imagine we're opening a connection for logical replication, and
> the subscriber does not handle sequences. What should the publisher do?
>
> (Note: The correct commit hash is 464824323e57d.)

Thanks.

>
> I don't think the streaming is a good match for sequences, because of a
> couple important differences ...
>
> Firstly, streaming determines *how* the changes are replicated, not what
> gets replicated. It doesn't (silently) filter out "bad" events that the
> subscriber doesn't know how to apply. If the subscriber does not know
> how to deal with streamed xacts, it'll still get the same changes
> exactly per the publication definition.
>
> Secondly, the default value is "streming=off", i.e. the subscriber has
> to explicitly request streaming when opening the connection. And we
> simply check it against the negotiated protocol version, i.e. the check
> in pgoutput_startup() protects against subscriber requesting a protocol
> v1 but also streaming=on.
>
> I don't think we can/should do more check at this point - we don't know
> what's included in the requested publications at that point, and I doubt
> it's worth adding because we certainly can't predict if the publication
> will be altered to include/decode sequences in the future.

True. That's a valid argument.

>
> Speaking of precedents, TRUNCATE is probably a better one, because it's
> a new action and it determines *what* the subscriber can handle. But
> that does exactly the thing we do for sequences - if you open a
> connection from PG10 subscriber (truncate was added in PG11), and the
> publisher decodes a truncate, subscriber will do:
>
> 2023-03-28 20:29:46.921 CEST [2357609] ERROR:  invalid logical
>    replication message type "T"
> 2023-03-28 20:29:46.922 CEST [2356534] LOG:  worker process: logical
>    replication worker for subscription 16390 (PID 2357609) exited with
>    exit code 1
>
> I don't see why sequences should do anything else. If you need to
> replicate to such subscriber, create a publication that does not have
> 'sequence' in the publish option ...
>

I didn't check TRUNCATE cases, yes, sequence replication is a good
match for them. So it seems we don't need to do anything.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: "Drouvot, Bertrand"
Date:
Subject: Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Should vacuum process config file reload more often