Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm0eQ3Uy0O97QEUB_r=whGjTnfmZFMKnm70QxKr+mAK-hg@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Logical Replication of sequences
List pgsql-hackers
On Wed, 22 Oct 2025 at 10:36, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 21, 2025 at 8:11 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 21 Oct 2025 at 03:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > ---
> > >  /*
> > > - * Check and log a warning if the publisher has subscribed to the same table,
> > > - * its partition ancestors (if it's a partition), or its partition children (if
> > > - * it's a partitioned table), from some other publishers. This check is
> > > - * required in the following scenarios:
> > > + * Check and log a warning if the publisher has subscribed to the same relation
> > > + * (table or sequence), its partition ancestors (if it's a partition), or its
> > > + * partition children (if it's a partitioned table), from some other
> > > publishers.
> > > + * This check is required in the following scenarios:
> > >   *
> > >   * 1) For CREATE SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> > >   *    statements with "copy_data = true" and "origin = none":
> > >   *    - Warn the user that data with an origin might have been copied.
> > > - *    - This check is skipped for tables already added, as incremental sync via
> > > - *      WAL allows origin tracking. The list of such tables is in
> > > - *      subrel_local_oids.
> > > + *    - This check is skipped for tables and sequences already added, as
> > > + *      incremental sync via WAL allows origin tracking. The list of
> > > such tables
> > > + *      is in subrel_local_oids.
> > >   *
> > >   * 2) For CREATE SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> > >   *    statements with "retain_dead_tuples = true" and "origin = any", and for
> > > @@ -2338,13 +2440,19 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
> > >   *    - Warn the user that only conflict detection info for local changes on
> > >   *      the publisher is retained. Data from other origins may lack sufficient
> > >   *      details for reliable conflict detection.
> > > + *    - This check targets for tables only.
> > >   *    - See comments atop worker.c for more details.
> > > + *
> > > + * 3) For ALTER SUBSCRIPTION ... REFRESH SEQUENCE statements with "origin =
> > > + *    none":
> > > + *    - Warn the user that sequence data from another origin might have been
> > > + *      copied.
> > >   */
> > >
> > > While this function is well documented, I find it's quite complex, and
> > > this patch adds to that complexity. The function has 9 arguments,
> > > making it difficult to understand which combinations of arguments
> > > enable which checks. For example, the function header comment doesn't
> > > explain when to use the only_sequences parameter. At first, I thought
> > > only_sequences should be set to true when checking if the publisher
> > > has subscribed to sequences from other publishers, but looking at the
> > > code, I discovered it doesn't check sequences when check_rdt is true:
> > >
> > > +   if (walrcv_server_version(wrconn) < 190000 || check_rdt)
> > > +       appendStringInfo(&cmd, query,
> > > +                        "(SELECT relid, TRUE as istable FROM
> > > pg_get_publication_tables(P.pubname))");
> > > +   else if (only_sequences)
> > > +       appendStringInfo(&cmd, query,
> > > +                        "(SELECT relid, FALSE as istable FROM
> > > pg_get_publication_sequences(P.pubname))");
> > > +   else
> > > +       appendStringInfo(&cmd, query,
> > > +                        "(SELECT relid, TRUE as istable FROM
> > > pg_get_publication_tables(P.pubname) UNION ALL"
> > > +                        " SELECT relid, FALSE as istable FROM
> > > pg_get_publication_sequences(P.pubname))");
> > > +
> > >
> > > I find that the complexity might stem from checking different cases in
> > > one function,  but I don't have better ideas to improve the logic for
> > > now. I think we can at least describe what the caller can expect from
> > > specifying only_sequence to true.
> >
> > Split this function into check_publications_origin_sequences and
> > check_publications_origin_tables to reduce the complexity. After this
> > change we log two warnings if both tables and sequences are subscriber
> > to the same tables and sequences like:
> >
>
> I think the case where both WARNINGs will be displayed is rare so it
> should be okay as it simplifies the code quite a bit. Another thing is
> we need to query twice but as this happens during DDL and only for
> very specific cases that should also be okay. We can anyway merge
> these later if we see any problem with it but for now it would be
> better to prefer code simplicity.

I agree.

> When check_publications_origin_sequences() is called from Alter
> Subscription ... Refresh Publication ... or Create Subscription ...
> code path then shouldn't we check copy_data as well along with origin
> as none? Because, if copy_data is false, we should have added a
> sequence in the READY state, so we don't need to fetch its values.

Fixed this.

> I have added a few comments in this new function and made a number of
> other cosmetic improvements in the attached.

Thanks for the changes, I merged it.

The attached patch has the changes for the same.
I have also addressed the other comments: a) Shveta's comments at [1]
b) Peter's comments at [2] & [3] c) Shveta's 2nd patch comments at [4]
and d) Chao's comment#12 from [5] which was pending.

[1] - https://www.postgresql.org/message-id/CAJpy0uDesLXjpDiDs6fA8HMr419D2YrXb7tA10e9Bp%2BuCypZ_Q%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPtp%2BFMHgS-kaKZwWEoNeyomecUDGECvoCpfx4_eCUDyUA%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAHut%2BPvx945dJGhMtf2Rv5p8Xn4xQke65MfO-UwK3cRPnsXFRQ%40mail.gmail.com
[4] - https://www.postgresql.org/message-id/CAJpy0uDxms8ynrHWXHGFuxDLA7QzDLLASqpqdWnD%3DLa8UJPt7Q%40mail.gmail.com
[5] - https://www.postgresql.org/message-id/598FC353-8E9A-4857-A125-740BE24DCBEB%40gmail.com


Regards,
Vignesh

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Making pg_rewind faster
Next
From: Alexander Lakhin
Date:
Subject: Re: IO in wrong state on riscv64