Re: Logical Replication of sequences - Mailing list pgsql-hackers
From | Shlok Kyal |
---|---|
Subject | Re: Logical Replication of sequences |
Date | |
Msg-id | CANhcyEWKhHWFzpdAF6czbwq76NRDNCecDqQNtN6Bomn26mqHFw@mail.gmail.com Whole thread Raw |
In response to | Re: Logical Replication of sequences (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Sun, 22 Jun 2025 at 08:05, vignesh C <vignesh21@gmail.com> wrote: > > On Thu, 19 Jun 2025 at 11:26, Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > Hi, > > > > Here are my review comments for v20250610 patches: > > > > Patch-0005:sequencesync.c > > > > 1) report_error_sequences() > > > > In case there are both missing and mismatched sequences, the ERROR > > message logged is - > > > > ``` > > 2025-05-28 14:22:19.898 IST [392259] ERROR: logical replication > > sequence synchronization failed for subscription "subs": sequences > > ("public"."n84") are missing on the publisher. Additionally, > > parameters differ for the remote and local sequences ("public.n1") > > ``` > > > > I feel this error message is quite long. Would it be possible to split > > it into ERROR and DETAIL? Also, if feasible, we could consider > > including a HINT, as was done in previous versions. > > > > I explored a few possible ways to log this error with a hint. Attached > > top-up patch has the suggestion implemented. Please see if it seems > > okay to consider. > > This looks good, merged it. > > ~~~ > > > > 2) copy_sequences(): > > + /* Retrieve the sequence object fetched from the publisher */ > > + for (int i = 0; i < batch_size; i++) > > + { > > + LogicalRepSequenceInfo *sequence_info = > > lfirst(list_nth_cell(remotesequences, current_index + i)); > > + > > + if (!strcmp(sequence_info->nspname, nspname) && > > + !strcmp(sequence_info->seqname, seqname)) > > + seqinfo = sequence_info; > > + } > > > > The current logic performs a search through the local sequence list > > for each sequence fetched from the publisher, repeating the traverse > > of 100(batch size) length of the list per sequence, which may impact > > performance. > > > > To improve efficiency, we can optimize it by sorting the local list > > and traverses it only once for matching. Kindly review the > > implementation in the attached top-up patch and consider merging it if > > it looks good to you. > > Looks good, merged it. > > > ~~~ > > > > 3) copy_sequences(): > > + if (message_level_is_interesting(DEBUG1)) > > + ereport(DEBUG1, > > + errmsg_internal("logical replication synchronization for > > subscription \"%s\", sequence \"%s\" has finished", > > + MySubscription->name, > > + seqinfo->seqname)); > > + > > + batch_succeeded_count++; > > + } > > > > The current debug log might be a bit confusing when sequences with the > > same name exist in different schemas. To improve clarity, we could > > include the schema name in the message, e.g., > > " ... sequence "schema"."sequence" has finished". > > Modified > > > ~~~~ > > > > Few minor comments in doc - Patch-0006 : logical-replication.sgml > > > > 4) > > + <para> > > + To replicate sequences from a publisher to a subscriber, first publish them > > + using <link linkend="sql-createpublication-params-for-all-sequences"> > > + <command>CREATE PUBLICATION ... FOR ALL SEQUENCES</command></link>. > > + </para> > > > > I think it would be better to use "To synchronize" instead of "To > > replicate" here, to maintain consistency and avoid confusion between > > replication and synchronization. > > Modified > > > 5) > > + <para> > > + Update the sequences at the publisher side few times. > > +<programlisting> > > > > /side few /side a few / > > > > 6) Can avoid using multiple "or" in the sentences below: > > > > 6a) > > - a change set or replication set. Each publication exists in only > > one database. > > + generated from a table or a group of tables or the current state of all > > + sequences, and might also be described as a change set or replication set > > > > / table or a group of tables/ table, a group of tables/ > > Modified > > > 6b) > > + Publications may currently only contain tables or sequences. Objects must be > > + added explicitly, except when a publication is created using > > + <literal>FOR TABLES IN SCHEMA</literal>, or <literal>FOR ALL > > TABLES</literal>, > > + or <literal>FOR ALL SEQUENCES</literal>. Unlike tables, the current state of > > > > / IN SCHEMA</literal>, or <literal>FOR ALL TABLES/ IN > > SCHEMA</literal>, <literal>FOR ALL TABLES > > Modified > > Thanks for the comment, the attached v20250622 version patch has the > changes for the same. > Hi Vignesh, I have reviewed the 0004 patch. Here are my comments: 1. I think we need to update the below comment for function AlterSubscription_refresh. I feel replacing 'tables' with 'relations' would be sufficient. /* * Build qsorted array of local table oids for faster lookup. This can * potentially contain all tables in the database so speed of lookup * is important. */ 2. Similarly as above comment. /* * Walk over the remote tables and try to match them to locally known * tables. If the table is not known locally create a new state for * it. * * Also builds array of local oids of remote tables for the next step. */ 3. Similarly as above comment. /* * Next remove state for tables we should not care about anymore using * the data we collected above */ Similarly for above comment. 4. Since we are not adding sequences in the list 'sub_remove_rels', should we only palloc for (the count of no. of tables)? Is it worth the effort? /* * Rels that we want to remove from subscription and drop any slots * and origins corresponding to them. */ sub_remove_rels = palloc(subrel_count * sizeof(SubRemoveRels)); Thanks and Regards, Shlok Kyal
pgsql-hackers by date: