Re: Logical Replication of sequences - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Logical Replication of sequences |
Date | |
Msg-id | CAJpy0uA6ugJN+tBwv38mG+6vf-uMuQoqxaZCX-mg1qBWX+=Bkw@mail.gmail.com Whole thread Raw |
In response to | Re: Logical Replication of sequences (shveta malik <shveta.malik@gmail.com>) |
List | pgsql-hackers |
On Mon, Jul 7, 2025 at 2:37 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, Jul 4, 2025 at 3:53 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 1 Jul 2025 at 15:20, shveta malik <shveta.malik@gmail.com> wrote: > > > On Mon, Jun 30, 2025 at 3:21 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > Please find the attached v20250630 patch set addressing above comments > > > > and other comments in [1],[2],[3] and [4]. > > > > > > Thanks for the patches. I am still in process of reviewing it but > > > please find few comments: > > > > > > 1) > > > + if (pset.sversion >= 180000) > > > + appendPQExpBuffer(&buf, > > > + ",\n puballsequences AS \"%s\"", > > > + gettext_noop("All sequences")); > > > > > > The server version check throughout the patch can be modified to 19000 > > > as a new branch is created now. > > > > Modified > > > > > 2) > > > + bool all_pub; /* Special publication for all tables, > > > + * sequecnes */ > > > > > > a) Typo: sequecnes --> sequences > > > > Modified > > > > > b) It is not clear from the comment that when will it be true? Will it > > > be set when either of all-tables or all-sequences is given or does it > > > need both? > > > > Updated the comments > > > > > 3) > > > postgres=# create publication pub1 for all sequences WITH ( PUBLISH='delete'); > > > CREATE PUBLICATION > > > postgres=# create publication pub2 for all tables, sequences WITH > > > (PUBLISH='update'); > > > CREATE PUBLICATION > > > > > > For the first command, 'WITH ( publication_parameter..' is useless. > > > For the second command, it is applicable only for 'all tables'. > > > > > > a) I am not sure if we even allow WITH in the first command? > > > b) In the second command, even if we allow it, there should be some > > > sort of NOTICE informing that it is applicable only to 'TABLES'. > > > > > > Thoughts? > > > > > > Also we allowed altering publication_parameter for all-sequences publication: > > > > > > postgres=# alter publication pub1 set (publish='insert,update'); > > > ALTER PUBLICATION > > > > > > c) Should this be restricted as well? Thoughts? > > > > We have documented that the parameter of with clause is not applicable > > for sequences. I feel that all the above statements are ok with the > > documentation mentioned. > > > > Regarding the comments from [1]. > > > 5) > > > In LogicalRepSyncSequences, why are we allocating it in a permanent > > > memory context? > > > > > > + /* Allocate the tracking info in a permanent memory context. */ > > > + oldctx = MemoryContextSwitchTo(CacheMemoryContext); > > > + foreach_ptr(SubscriptionRelState, seq_state, sequences) > > > + { > > > + SubscriptionRelState *rstate = palloc(sizeof(SubscriptionRelState)); > > > + > > > + memcpy(rstate, seq_state, sizeof(SubscriptionRelState)); > > > + sequences_not_synced = lappend(sequences_not_synced, rstate); > > > + } > > > + MemoryContextSwitchTo(oldctx); > > > > > > Same for 'seq_info' allocation. > > > > When we are in between a transaction we will be using > > TopTransactionContext. We can palloc() in TopTransactionContext and > > safely use that memory throughout the transaction. But we cannot > > cannot access memory allocated in TopTransactionContext after > > CommitTransaction() finishes, because TopTransactionContext is > > explicitly reset (or deleted) at the end of the transaction. > > This is the reason we have to use CacheMemoryContext here. > > > > Okay. I see. Thanks for the details. Can we add this info in comments, > something like: > Allocate the tracking information in a permanent memory context to > ensure it remains accessible across multiple transactions during the > sequence copy process. The memory will be released once the copy is > finished. > > > The rest of the comments are fixed. > > Thank you for the patches. I am not done with review yet, but please > find the comments so far: > > > 1) > LogicalRepSyncSequences() > > + /* Allocate the sequences information in a permanent memory context. */ > + oldctx = MemoryContextSwitchTo(CacheMemoryContext); > + > + /* Get the sequences that should be synchronized. */ > + subsequences = GetSubscriptionRelations(subid, false, true, true); > > Here too we are using CacheMemoryContext? 'subsequences' is only used > in given function and not in copy_sequeneces. I guess, we start > multiple transactions in given function and thus make it essential to > allocate subsequences in CacheMemoryContext. But why are we doing > StartTransactionCommand twice? Is this intentional? Can we do it once > before GetSubscriptionRelations() and commit it once after for-loop is > over? IIUC, then we will not need CacheMemoryContext here. > > 2) > logicalrep_seqsyncworker_set_failuretime() > > a) This function is extern'ed in worker_internal.h. But I do not see > its usage outside launcher.c. Is it a mistake? > > b) Also, do we really need logicalrep_seqsyncworker_set_failuretime? > Is it better to move its logic instead in > logicalrep_seqsyncworker_set_failuretime()? > > 3) > SyncFetchRelationStates: > Earlier the name was FetchTableStates. If we really want to use the > 'Sync' keyword, we can name it FetchRelationSyncStates, as we are > fetching sync-status only. Thoughts? > > 4) > ProcessSyncingSequencesForApply(): > + > + if (rstate->state != SUBREL_STATE_INIT) > + continue; > > Why do we expect that rstate->state is not INIT at this time when we > have fetched only INIT sequences in sequence_states_not_ready. Shall > this be assert? If there is a valid scenario where it can be READY > here, please add comments. > > 5) > + if (!MyLogicalRepWorker->sequencesync_failure_time || > + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time, > + now, wal_retrieve_retry_interval)) > + { > + MyLogicalRepWorker->sequencesync_failure_time = 0; > + > + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC, > + MyLogicalRepWorker->dbid, > + MySubscription->oid, > + MySubscription->name, > + MyLogicalRepWorker->userid, > + InvalidOid, > + DSM_HANDLE_INVALID); > + break; > + } > > We set sequencesync_failure_time to 0, but if logicalrep_worker_launch > is not able to launch the worker due to some reason, next time it will > not even wait for 'wal_retrieve_retry_interval time' to attempt > restarting it again. Is that intentional? > > In other workflows such as while launching table-sync or apply worker, > this scenario does not arise. This is because we maintain start_time > there which can never be 0 instead of failure time and before > attempting to start the workers, we set start_time to current time. > The seq-sync failure-time OTOH is only set to non-null in > logicalrep_seqsyncworker_failure() and it is not necessary that we > will hit that function as the logicalrep_worker_launch() may fail > before that itself. Do you think we shall maintain start-time instead > of failure-time for seq-sync worker as well? Or is there any other way > to handle it? > I thought about this more. Another idea could be to capture the return value of logicalrep_worker_launch() and if it is false, then we can set failure_time to current time. Thoughts? thanks Shveta
pgsql-hackers by date: