Re: Logical Replication of sequences - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Logical Replication of sequences |
Date | |
Msg-id | CAJpy0uDOcjk-k2y1FVeBS0RQeCGDKbE14f7qES6TU+RUVW7eXA@mail.gmail.com Whole thread Raw |
In response to | Re: Logical Replication of sequences (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: Logical Replication of sequences
|
List | pgsql-hackers |
On Fri, Jul 4, 2025 at 3:53 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 1 Jul 2025 at 15:20, shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Jun 30, 2025 at 3:21 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > Please find the attached v20250630 patch set addressing above comments > > > and other comments in [1],[2],[3] and [4]. > > > > Thanks for the patches. I am still in process of reviewing it but > > please find few comments: > > > > 1) > > + if (pset.sversion >= 180000) > > + appendPQExpBuffer(&buf, > > + ",\n puballsequences AS \"%s\"", > > + gettext_noop("All sequences")); > > > > The server version check throughout the patch can be modified to 19000 > > as a new branch is created now. > > Modified > > > 2) > > + bool all_pub; /* Special publication for all tables, > > + * sequecnes */ > > > > a) Typo: sequecnes --> sequences > > Modified > > > b) It is not clear from the comment that when will it be true? Will it > > be set when either of all-tables or all-sequences is given or does it > > need both? > > Updated the comments > > > 3) > > postgres=# create publication pub1 for all sequences WITH ( PUBLISH='delete'); > > CREATE PUBLICATION > > postgres=# create publication pub2 for all tables, sequences WITH > > (PUBLISH='update'); > > CREATE PUBLICATION > > > > For the first command, 'WITH ( publication_parameter..' is useless. > > For the second command, it is applicable only for 'all tables'. > > > > a) I am not sure if we even allow WITH in the first command? > > b) In the second command, even if we allow it, there should be some > > sort of NOTICE informing that it is applicable only to 'TABLES'. > > > > Thoughts? > > > > Also we allowed altering publication_parameter for all-sequences publication: > > > > postgres=# alter publication pub1 set (publish='insert,update'); > > ALTER PUBLICATION > > > > c) Should this be restricted as well? Thoughts? > > We have documented that the parameter of with clause is not applicable > for sequences. I feel that all the above statements are ok with the > documentation mentioned. > > Regarding the comments from [1]. > > 5) > > In LogicalRepSyncSequences, why are we allocating it in a permanent > > memory context? > > > > + /* Allocate the tracking info in a permanent memory context. */ > > + oldctx = MemoryContextSwitchTo(CacheMemoryContext); > > + foreach_ptr(SubscriptionRelState, seq_state, sequences) > > + { > > + SubscriptionRelState *rstate = palloc(sizeof(SubscriptionRelState)); > > + > > + memcpy(rstate, seq_state, sizeof(SubscriptionRelState)); > > + sequences_not_synced = lappend(sequences_not_synced, rstate); > > + } > > + MemoryContextSwitchTo(oldctx); > > > > Same for 'seq_info' allocation. > > When we are in between a transaction we will be using > TopTransactionContext. We can palloc() in TopTransactionContext and > safely use that memory throughout the transaction. But we cannot > cannot access memory allocated in TopTransactionContext after > CommitTransaction() finishes, because TopTransactionContext is > explicitly reset (or deleted) at the end of the transaction. > This is the reason we have to use CacheMemoryContext here. > Okay. I see. Thanks for the details. Can we add this info in comments, something like: Allocate the tracking information in a permanent memory context to ensure it remains accessible across multiple transactions during the sequence copy process. The memory will be released once the copy is finished. > The rest of the comments are fixed. Thank you for the patches. I am not done with review yet, but please find the comments so far: 1) LogicalRepSyncSequences() + /* Allocate the sequences information in a permanent memory context. */ + oldctx = MemoryContextSwitchTo(CacheMemoryContext); + + /* Get the sequences that should be synchronized. */ + subsequences = GetSubscriptionRelations(subid, false, true, true); Here too we are using CacheMemoryContext? 'subsequences' is only used in given function and not in copy_sequeneces. I guess, we start multiple transactions in given function and thus make it essential to allocate subsequences in CacheMemoryContext. But why are we doing StartTransactionCommand twice? Is this intentional? Can we do it once before GetSubscriptionRelations() and commit it once after for-loop is over? IIUC, then we will not need CacheMemoryContext here. 2) logicalrep_seqsyncworker_set_failuretime() a) This function is extern'ed in worker_internal.h. But I do not see its usage outside launcher.c. Is it a mistake? b) Also, do we really need logicalrep_seqsyncworker_set_failuretime? Is it better to move its logic instead in logicalrep_seqsyncworker_set_failuretime()? 3) SyncFetchRelationStates: Earlier the name was FetchTableStates. If we really want to use the 'Sync' keyword, we can name it FetchRelationSyncStates, as we are fetching sync-status only. Thoughts? 4) ProcessSyncingSequencesForApply(): + + if (rstate->state != SUBREL_STATE_INIT) + continue; Why do we expect that rstate->state is not INIT at this time when we have fetched only INIT sequences in sequence_states_not_ready. Shall this be assert? If there is a valid scenario where it can be READY here, please add comments. 5) + if (!MyLogicalRepWorker->sequencesync_failure_time || + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time, + now, wal_retrieve_retry_interval)) + { + MyLogicalRepWorker->sequencesync_failure_time = 0; + + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC, + MyLogicalRepWorker->dbid, + MySubscription->oid, + MySubscription->name, + MyLogicalRepWorker->userid, + InvalidOid, + DSM_HANDLE_INVALID); + break; + } We set sequencesync_failure_time to 0, but if logicalrep_worker_launch is not able to launch the worker due to some reason, next time it will not even wait for 'wal_retrieve_retry_interval time' to attempt restarting it again. Is that intentional? In other workflows such as while launching table-sync or apply worker, this scenario does not arise. This is because we maintain start_time there which can never be 0 instead of failure time and before attempting to start the workers, we set start_time to current time. The seq-sync failure-time OTOH is only set to non-null in logicalrep_seqsyncworker_failure() and it is not necessary that we will hit that function as the logicalrep_worker_launch() may fail before that itself. Do you think we shall maintain start-time instead of failure-time for seq-sync worker as well? Or is there any other way to handle it? thanks Shveta
pgsql-hackers by date: