Re: Logical Replication of sequences - Mailing list pgsql-hackers

From shveta malik
Subject Re: Logical Replication of sequences
Date
Msg-id CAJpy0uA6ugJN+tBwv38mG+6vf-uMuQoqxaZCX-mg1qBWX+=Bkw@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Mon, Jul 7, 2025 at 2:37 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jul 4, 2025 at 3:53 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 1 Jul 2025 at 15:20, shveta malik <shveta.malik@gmail.com> wrote:
> > > On Mon, Jun 30, 2025 at 3:21 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > Please find the attached v20250630 patch set addressing above comments
> > > > and other comments in [1],[2],[3] and [4].
> > >
> > > Thanks for the patches. I am still in process of reviewing it but
> > > please find few comments:
> > >
> > > 1)
> > > + if (pset.sversion >= 180000)
> > > + appendPQExpBuffer(&buf,
> > > +   ",\n  puballsequences AS \"%s\"",
> > > +   gettext_noop("All sequences"));
> > >
> > > The server version check throughout the patch can be modified to 19000
> > > as a new branch is created now.
> >
> > Modified
> >
> > > 2)
> > > + bool all_pub; /* Special publication for all tables,
> > > + * sequecnes */
> > >
> > > a) Typo: sequecnes --> sequences
> >
> > Modified
> >
> > > b) It is not clear from the comment that when will it be true? Will it
> > > be set when either of all-tables or all-sequences is given or does it
> > > need both?
> >
> > Updated the comments
> >
> > > 3)
> > > postgres=# create publication pub1 for all sequences WITH ( PUBLISH='delete');
> > > CREATE PUBLICATION
> > > postgres=# create publication pub2 for all tables, sequences WITH
> > > (PUBLISH='update');
> > > CREATE PUBLICATION
> > >
> > > For the first command, 'WITH ( publication_parameter..' is useless.
> > > For the second command, it is applicable only for 'all tables'.
> > >
> > > a) I am not sure if we even allow WITH in the first command?
> > > b) In the second command, even if we allow it, there should be some
> > > sort of NOTICE informing that it is applicable only to 'TABLES'.
> > >
> > > Thoughts?
> > >
> > > Also we allowed altering publication_parameter for all-sequences publication:
> > >
> > > postgres=# alter publication pub1 set (publish='insert,update');
> > > ALTER PUBLICATION
> > >
> > > c) Should this be restricted as well? Thoughts?
> >
> > We have documented that the parameter of with clause is not applicable
> > for sequences. I feel that all the above statements are ok with the
> > documentation mentioned.
> >
> > Regarding the comments from [1].
> > > 5)
> > > In LogicalRepSyncSequences, why are we allocating it in a permanent
> > > memory context?
> > >
> > > + /* Allocate the tracking info in a permanent memory context. */
> > > + oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> > > + foreach_ptr(SubscriptionRelState, seq_state, sequences)
> > > + {
> > > + SubscriptionRelState *rstate = palloc(sizeof(SubscriptionRelState));
> > > +
> > > + memcpy(rstate, seq_state, sizeof(SubscriptionRelState));
> > > + sequences_not_synced = lappend(sequences_not_synced, rstate);
> > > + }
> > > + MemoryContextSwitchTo(oldctx);
> > >
> > > Same for 'seq_info' allocation.
> >
> > When we are in between a transaction we will be using
> > TopTransactionContext. We can palloc() in TopTransactionContext and
> > safely use that memory throughout the transaction. But we cannot
> > cannot access memory allocated in TopTransactionContext after
> > CommitTransaction() finishes, because TopTransactionContext is
> > explicitly reset (or deleted) at the end of the transaction.
> > This is the reason we have to use CacheMemoryContext here.
> >
>
> Okay. I see. Thanks for the details. Can we add this info in comments,
> something like:
> Allocate the tracking information in a permanent memory context to
> ensure it remains accessible across multiple transactions during the
> sequence copy process. The memory will be released once the copy is
> finished.
>
> > The rest of the comments are fixed.
>
> Thank you for the patches. I am not done with review yet, but please
> find the comments so far:
>
>
> 1)
> LogicalRepSyncSequences()
>
> + /* Allocate the sequences information in a permanent memory context. */
> + oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> +
> + /* Get the sequences that should be synchronized. */
> + subsequences = GetSubscriptionRelations(subid, false, true, true);
>
> Here too we are using CacheMemoryContext? 'subsequences' is only used
> in given function and not in copy_sequeneces. I guess, we start
> multiple transactions in given function and thus make it essential to
> allocate subsequences in CacheMemoryContext. But why are we doing
> StartTransactionCommand twice? Is this intentional? Can we do it once
> before GetSubscriptionRelations() and commit it once after for-loop is
> over? IIUC, then we will not need CacheMemoryContext here.
>
> 2)
> logicalrep_seqsyncworker_set_failuretime()
>
> a) This function is extern'ed in worker_internal.h. But I do not see
> its usage outside launcher.c. Is it a mistake?
>
> b) Also, do we really need logicalrep_seqsyncworker_set_failuretime?
> Is it better to move its logic instead in
> logicalrep_seqsyncworker_set_failuretime()?
>
> 3)
> SyncFetchRelationStates:
> Earlier the name was FetchTableStates. If we really want to  use the
> 'Sync' keyword, we can name it FetchRelationSyncStates, as we are
> fetching sync-status only. Thoughts?
>
> 4)
> ProcessSyncingSequencesForApply():
> +
> + if (rstate->state != SUBREL_STATE_INIT)
> + continue;
>
> Why do we expect that rstate->state is not INIT at this time when we
> have fetched only INIT sequences in sequence_states_not_ready. Shall
> this be assert? If there is a valid scenario where it can be READY
> here, please add comments.
>
> 5)
> + if (!MyLogicalRepWorker->sequencesync_failure_time ||
> + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time,
> +    now, wal_retrieve_retry_interval))
> + {
> + MyLogicalRepWorker->sequencesync_failure_time = 0;
> +
> + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC,
> + MyLogicalRepWorker->dbid,
> + MySubscription->oid,
> + MySubscription->name,
> + MyLogicalRepWorker->userid,
> + InvalidOid,
> + DSM_HANDLE_INVALID);
> + break;
> + }
>
> We set sequencesync_failure_time to 0, but if logicalrep_worker_launch
> is not able to launch the worker due to some reason, next time it will
> not even wait for 'wal_retrieve_retry_interval time' to attempt
> restarting it again. Is that intentional?
>
> In other workflows such as while launching table-sync or apply worker,
>  this scenario does not arise. This is because we maintain start_time
> there which can never be 0 instead of failure time and before
> attempting to start the workers, we set start_time to current time.
> The seq-sync failure-time OTOH is only set to non-null in
> logicalrep_seqsyncworker_failure() and it is not necessary that we
> will hit that function as the  logicalrep_worker_launch() may fail
> before that itself. Do you think we shall maintain start-time instead
> of failure-time for seq-sync worker as well? Or is there any other way
> to handle it?
>

I thought about this more. Another idea could be to capture the return
value of logicalrep_worker_launch() and if it is false, then we can
set failure_time to current time. Thoughts?

thanks
Shveta



pgsql-hackers by date:

Previous
From: Yura Sokolov
Date:
Subject: Re: Add 64-bit XIDs into PostgreSQL 15
Next
From: Matthias van de Meent
Date:
Subject: Re: Limiting overshoot in nbtree's parallel SAOP index scans