Re: Logical Replication of sequences - Mailing list pgsql-hackers

From Shlok Kyal
Subject Re: Logical Replication of sequences
Date
Msg-id CANhcyEWKhHWFzpdAF6czbwq76NRDNCecDqQNtN6Bomn26mqHFw@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Sun, 22 Jun 2025 at 08:05, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 19 Jun 2025 at 11:26, Nisha Moond <nisha.moond412@gmail.com> wrote:
> >
> > Hi,
> >
> > Here are my review comments for v20250610 patches:
> >
> > Patch-0005:sequencesync.c
> >
> > 1) report_error_sequences()
> >
> > In case there are both missing and mismatched sequences, the ERROR
> > message logged is -
> >
> > ```
> > 2025-05-28 14:22:19.898 IST [392259] ERROR:  logical replication
> > sequence synchronization failed for subscription "subs": sequences
> > ("public"."n84") are missing on the publisher. Additionally,
> > parameters differ for the remote and local sequences ("public.n1")
> > ```
> >
> > I feel this error message is quite long. Would it be possible to split
> > it into ERROR and DETAIL? Also, if feasible, we could consider
> > including a HINT, as was done in previous versions.
> >
> > I explored a few possible ways to log this error with a hint. Attached
> > top-up patch has the suggestion implemented. Please see if it seems
> > okay to consider.
>
> This looks good, merged it.
> > ~~~
> >
> > 2) copy_sequences():
> > + /* Retrieve the sequence object fetched from the publisher */
> > + for (int i = 0; i < batch_size; i++)
> > + {
> > + LogicalRepSequenceInfo *sequence_info =
> > lfirst(list_nth_cell(remotesequences, current_index + i));
> > +
> > + if (!strcmp(sequence_info->nspname, nspname) &&
> > + !strcmp(sequence_info->seqname, seqname))
> > + seqinfo = sequence_info;
> > + }
> >
> > The current logic performs a search through the local sequence list
> > for each sequence fetched from the publisher, repeating the traverse
> > of 100(batch size) length of the list per sequence, which may impact
> > performance.
> >
> > To improve efficiency, we can optimize it by sorting the local list
> > and traverses it only once for matching. Kindly review the
> > implementation in the attached top-up patch and consider merging it if
> > it looks good to you.
>
> Looks good, merged it.
>
> > ~~~
> >
> > 3) copy_sequences():
> > + if (message_level_is_interesting(DEBUG1))
> > + ereport(DEBUG1,
> > + errmsg_internal("logical replication synchronization for
> > subscription \"%s\", sequence \"%s\" has finished",
> > + MySubscription->name,
> > + seqinfo->seqname));
> > +
> > + batch_succeeded_count++;
> > + }
> >
> > The current debug log might be a bit confusing when sequences with the
> > same name exist in different schemas. To improve clarity, we could
> > include the schema name in the message, e.g.,
> > " ... sequence "schema"."sequence" has finished".
>
> Modified
>
> > ~~~~
> >
> > Few minor comments in doc - Patch-0006 : logical-replication.sgml
> >
> > 4)
> > +  <para>
> > +   To replicate sequences from a publisher to a subscriber, first publish them
> > +   using <link linkend="sql-createpublication-params-for-all-sequences">
> > +   <command>CREATE PUBLICATION ... FOR ALL SEQUENCES</command></link>.
> > +  </para>
> >
> > I think it would be better to use "To synchronize" instead of "To
> > replicate" here, to maintain consistency and avoid confusion between
> > replication and synchronization.
>
> Modified
>
> > 5)
> > +   <para>
> > +    Update the sequences at the publisher side few times.
> > +<programlisting>
> >
> > /side few /side a few /
> >
> > 6) Can avoid using multiple "or" in the sentences below:
> >
> > 6a)
> > -   a change set or replication set.  Each publication exists in only
> > one database.
> > +   generated from a table or a group of tables or the current state of all
> > +   sequences, and might also be described as a change set or replication set
> >
> > / table or a group of tables/ table, a group of tables/
>
> Modified
>
> > 6b)
> > +   Publications may currently only contain tables or sequences. Objects must be
> > +   added explicitly, except when a publication is created using
> > +   <literal>FOR TABLES IN SCHEMA</literal>, or <literal>FOR ALL
> > TABLES</literal>,
> > +   or <literal>FOR ALL SEQUENCES</literal>. Unlike tables, the current state of
> >
> > / IN SCHEMA</literal>, or <literal>FOR ALL TABLES/ IN
> > SCHEMA</literal>, <literal>FOR ALL TABLES
>
> Modified
>
> Thanks for the comment, the attached v20250622 version patch has the
> changes for the same.
>
Hi Vignesh,

I have reviewed the 0004 patch. Here are my comments:

1. I think we need to update the below comment for function
AlterSubscription_refresh. I feel replacing 'tables' with 'relations'
would be sufficient.
/*
* Build qsorted array of local table oids for faster lookup. This can
* potentially contain all tables in the database so speed of lookup
* is important.
*/

2. Similarly as above comment.
/*
* Walk over the remote tables and try to match them to locally known
* tables. If the table is not known locally create a new state for
* it.
*
* Also builds array of local oids of remote tables for the next step.
*/

3. Similarly as above comment.
/*
* Next remove state for tables we should not care about anymore using
* the data we collected above
*/
Similarly for above comment.

4. Since we are not adding sequences in the list 'sub_remove_rels',
should we only palloc for (the count of no. of tables)? Is it worth
the effort?
/*
* Rels that we want to remove from subscription and drop any slots
* and origins corresponding to them.
*/
sub_remove_rels = palloc(subrel_count * sizeof(SubRemoveRels));

Thanks and Regards,
Shlok Kyal



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view
Next
From: Fujii Masao
Date:
Subject: Re: pg_dump misses comments on NOT NULL constraints