Re: Logical Replication of sequences - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Logical Replication of sequences |
Date | |
Msg-id | CAJpy0uAtpiczptnO2prS1RLcLEd-gxYCQCc-HruA=UwwkBFX4g@mail.gmail.com Whole thread Raw |
In response to | Re: Logical Replication of sequences (vignesh C <vignesh21@gmail.com>) |
List | pgsql-hackers |
On Mon, Jul 14, 2025 at 10:03 AM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, 11 Jul 2025 at 14:26, shveta malik <shveta.malik@gmail.com> wrote: > > > > On Wed, Jul 9, 2025 at 4:11 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > 3) > > > > SyncFetchRelationStates: > > > > Earlier the name was FetchTableStates. If we really want to use the > > > > 'Sync' keyword, we can name it FetchRelationSyncStates, as we are > > > > fetching sync-status only. Thoughts? > > > > > > Instead of FetchRelationSyncStates, I preferred FetchRelationStates, > > > and changed it to FetchRelationStates. > > > > > > > Okay, LGTM. > > > > > > 5) > > > > + if (!MyLogicalRepWorker->sequencesync_failure_time || > > > > + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time, > > > > + now, wal_retrieve_retry_interval)) > > > > + { > > > > + MyLogicalRepWorker->sequencesync_failure_time = 0; > > > > + > > > > + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC, > > > > + MyLogicalRepWorker->dbid, > > > > + MySubscription->oid, > > > > + MySubscription->name, > > > > + MyLogicalRepWorker->userid, > > > > + InvalidOid, > > > > + DSM_HANDLE_INVALID); > > > > + break; > > > > + } > > > > > > > > We set sequencesync_failure_time to 0, but if logicalrep_worker_launch > > > > is not able to launch the worker due to some reason, next time it will > > > > not even wait for 'wal_retrieve_retry_interval time' to attempt > > > > restarting it again. Is that intentional? > > > > > > > > In other workflows such as while launching table-sync or apply worker, > > > > this scenario does not arise. This is because we maintain start_time > > > > there which can never be 0 instead of failure time and before > > > > attempting to start the workers, we set start_time to current time. > > > > The seq-sync failure-time OTOH is only set to non-null in > > > > logicalrep_seqsyncworker_failure() and it is not necessary that we > > > > will hit that function as the logicalrep_worker_launch() may fail > > > > before that itself. Do you think we shall maintain start-time instead > > > > of failure-time for seq-sync worker as well? Or is there any other way > > > > to handle it? > > > > > > I preferred the suggestion from [1]. Modified it accordingly. > > > > Okay, works for me. > > > > > The attached v20250709 version patch has the changes for the same. > > > > > > > Thank You for the patches. Please find a few comments: > > > > 1) > > Shall we update pg_publication doc as well to indicate that pubinsert, > > pubupdate, pubdelete , pubtruncate, pubviaroot are meaningful only > > when publishing tables. For sequences, these have no meaning. > > Since it is clearly mentioned it is for tables, I felt no need to > mention again it is not applicable for sequences. > > > 2) > > Shall we have walrcv_disconnect() after copy is done in > > LogicalRepSyncSequences() > > There is a cleanup function registered for the worker to handle this > at the worker exit. So this is not required. > > > 3) > > Do we really need for-loop in ProcessSyncingSequencesForApply? I think > > this function is inspired from ProcessSyncingTablesForApply() but > > there we need different tablesync workers for different tables. For > > sequences, that is not the case and thus for-loop can be omitted. If > > we do so, we can amend the comments too where it says " Walk over all > > subscription sequences....." > > Handled in the previous version > > > 4) > > +# Confirm that the warning for parameters differing is logged. > > +$node_subscriber->wait_for_log( > > > > We can drop regress_seq_sub on the publisher now and check for missing > > warnings as the next step. > > Modified > > > 5) > > I am revisiting the test given in [1], I see there is some document change as: > > > > + Incremental sequence changes are not replicated. Although the data in > > + serial or identity columns backed by sequences will be replicated as part > > + of the table, the sequences themselves do not replicate ongoing changes. > > + On the subscriber, a sequence will retain the last value it synchronized > > + from the publisher. If the subscriber is used as a read-only database, > > + then this should typically not be a problem. If, however, some kind of > > + switchover or failover to the subscriber database is intended, then the > > + sequences would need to be updated to the latest values, either by > > + executing <link > > linkend="sql-altersubscription-params-refresh-publication-sequences"> > > + <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION > > SEQUENCES</command></link> > > + or by copying the current data from the publisher (perhaps using > > + <command>pg_dump</command>) or by determining a sufficiently high value > > + from the tables themselves. > > > > But this doc specifically mentions a failover case. It does not > > mention the case presented in [1] i.e. if user is trying to use > > sequence to populate identity column of a "subscribed" table where the > > sequence is also synced originally from publisher, then he may end up > > with corrupted state > > of IDENTITY column, and thus such cases should be used with caution. > > Please review once and see if we need to mention this and the example > > too. > > In this case, the identity column data—as well as the non-identity > columns—will be sent by the publisher as part of the row data. This > behavior appears consistent with how non-sequence objects are handled > in a publication. > The following documentation note should be sufficient, as it already > clarifies that "it will retain the last value it synchronized from the > publisher": Not very sure about it, but let me think more. > On the subscriber, a sequence will retain the last value it > synchronized from the publisher. If the subscriber is used as a > read-only database, this should typically not pose a problem. > But if you have something in mind which should be added let me know. Will let you know if I come up with something better to add here. > The attached patch has the changes for the same. Thank You. Few comments: 1) patch 005 has trailing whitespaces issue. 2) In LogicalRepSyncSequences(), do we really need this: + seq_count = list_length(subsequences);; seq_count is only used at the end to figure out if we really had some sequences. We can simply check subsequences against NIL for that purpose. If we really want to use list_length as a check, then we shall move it at the end where we use it. 3) LogicalRepSyncSequences(): + MemoryContext oldctx; we can move this to a for-loop where it is being used. 4) The only usage of sequence_states_not_ready is this now: + /* No sequences to sync, so nothing to do */ + if (list_length(sequence_states_not_ready) == 0) + return; Now, do we need to have it as a List? 5) + <sect2 id="missing-sequences"> + <title>Missing Sequences</title> + <para> + During sequence synchronization, if a sequence is dropped on the + publisher. An ERROR is logged listing the missing sequences before the + process exits. The apply worker detects this failure and repeatedly + respawns the sequence synchronization worker to continue the + synchronization process until the sequences are created in the publisher. + See also <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_retrieve_retry_interval</varname></link>. + </para> + <para> + To resolve this, either use + <link linkend="sql-createsequence"><command>CREATE SEQUENCE</command></link> + to recreate the missing sequence on the publisher, or, if the sequence are + no longer required, execute <link linkend="sql-altersubscription-params-refresh-publication"> + <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link> + to remove the stale sequence entries from synchronization in the subscriber. + </para> + </sect2> + Please see if this looks appropriate, I have added drop-sequence option as well and corrected few trivial things: During sequence synchronization, if a sequence is dropped on the publisher, an ERROR is logged listing the missing sequences before the process exits. The apply worker detects this failure and repeatedly respawns the sequence synchronization worker to continue the synchronization process until the sequences are either recreated on the publisher, dropped on the subscriber, or removed from the synchronization list. To resolve this issue, either recreate the missing sequence on the publisher using CREATE SEQUENCE, drop the sequences on the subscriber if they are no longer needed using DROP SEQUENCE, or run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to remove these sequences from synchronization on the subscriber. thanks Shveta
pgsql-hackers by date: