Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm3tmVea7P2K4t31wRxZ-sM_gZGZyz-uv+9wwqOTuyfELg@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Logical Replication of sequences
List pgsql-hackers
On Thu, 23 Oct 2025 at 16:47, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 23, 2025 at 11:45 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The attached patch has the changes for the same.
> >
>
> I have pushed 0001 and the following are comments on 0002.
>
> 1.
> @@ -1414,6 +1414,7 @@ CREATE VIEW pg_stat_subscription_stats AS
>          ss.subid,
>          s.subname,
>          ss.apply_error_count,
> +        ss.sequence_sync_error_count,
>          ss.sync_error_count,
>
> The new parameter name is noticeably longer than other columns. Can we
> name it as ss.seq_sync_error_count. We may also want to reconsider
> changing existing column sync_error_count to tbl_sync_error_count. Can
> we extract this in a separate stats patch?

Modified and extracted a separate patch for tbl_sync_error_count

> 2.
>  Datum
>  pg_get_sequence_data(PG_FUNCTION_ARGS)
> @@ -1843,6 +1843,13 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
>
>   values[0] = Int64GetDatum(seq->last_value);
>   values[1] = BoolGetDatum(seq->is_called);
> +
> + /*
> + * The page LSN will be used in logical replication of sequences to
> + * record the LSN of the sequence page in the pg_subscription_rel
> + * system catalog.  It reflects the LSN of the remote sequence at the
> + * time it was synchronized.
> + */
>   values[2] = LSNGetDatum(PageGetLSN(page));
>
> This comment appears out of place. We should mention it somewhere in
> sequencesync worker and give reference of that place/function here.

Moved this comment to the copy_sequence function just above
UpdateSubscriptionRelState.

> 3.
> LogicalRepWorker *
> -logicalrep_worker_find(Oid subid, Oid relid, bool only_running)
> +logicalrep_worker_find(Oid subid, Oid relid, LogicalRepWorkerType wtype,
> +    bool only_running)
> …
> …
>  void
> -logicalrep_worker_stop(Oid subid, Oid relid)
> +logicalrep_worker_stop(Oid subid, Oid relid, LogicalRepWorkerType wtype)
>
> Let's extract changes for these APIs and their callers in a separate
> patch that can be committed prior to the main patch.

Prepared a separated patch

> 4.
> +void
> +logicalrep_reset_seqsync_start_time(void)
> +{
> + LogicalRepWorker *worker;
> +
> + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> +
> + worker = logicalrep_worker_find(MyLogicalRepWorker->subid, InvalidOid,
> + WORKERTYPE_APPLY, true);
>
> Shouldn't WORKERTYPE_SEQUENCESYNC be used? If not, then better to add
> a comment on why a different type of worker is used for resetting the
> seqsync time.

It should be apply worker here. We set the last_seqsync_start_time for
the sequence worker in the apply worker instead of the sequence sync
worker, as the sequence sync worker has finished and is about to exit.

> 5.
> + * XXX: An alternative design was considered where the launcher process would
> …
> ...
> + * c) It utilizes the existing tablesync worker code to start the sequencesync
> + *    process, thus preventing code duplication in the launcher.
> + * d) It simplifies code maintenance by consolidating changes to a single
> + *    location rather than multiple components.
> + * e) The apply worker can access the sequences that need to be synchronized
> + *    from the pg_subscription_rel system catalog. Whereas the launcher process
> + *    operates without direct database access so would need a framework to
> + *    establish connections with the databases to retrieve the sequences for
> + *    synchronization.
>
> Only these three points are sufficient for not going with this
> alternative approach. The point (e) is most important and should be
> mentioned as the first point.

Modified

> 6.
> + /* Get the local sequence object */
> + sequence_rel = try_table_open(seqinfo->localrelid, RowExclusiveLock);
> + tup = SearchSysCache1(SEQRELID, ObjectIdGetDatum(seqinfo->localrelid));
> + if (!sequence_rel || !HeapTupleIsValid(tup))
> + {
> + (*skipped_count)++;
> + elog(LOG, "skip synchronization of sequence \"%s.%s\" because it has
> been dropped concurrently",
> + nspname, seqname);
> + return;
>
> We should close the relation when the tuple is not valid and can't proceed.

Modified slightly to check in validate_sequence and close the relation
in copy_sequence.

> 7.
> + /* Skip if the entry is no longer valid */
> + if (!seqinfo->entry_valid)
> + {
> + ReleaseSysCache(tup);
> + table_close(sequence_rel, RowExclusiveLock);
> + (*skipped_count)++;
> + ereport(LOG, errmsg("skip synchronization of sequence \"%s.%s\"
> because it has been altered concurrently",
> + nspname, seqname));
> + return;
>
> Isn't it better to release cache and close relation just before
> return? Tomorrow, if we need to use something from tuple/relation, it
> would be easier.

Because of another comment, the code is re-structured now. As per
current logic the tuple will be released in validate_sequence and the
logging based on error is done at copy_sequence. For new code
structure releasing tuple in validate_sequence is better

> 8.
> +LogicalRepSyncSequences(void)
> {
> ...
> + ScanKeyInit(&skey[1],
> + Anum_pg_subscription_rel_srsubstate,
> + BTEqualStrategyNumber, F_CHARNE,
> + CharGetDatum(SUBREL_STATE_READY));
>
> As we are using only two states (INIT and READY) for sequences, isn't
> it better to use INIT state here? That should avoid sync-in-progress
> tables.

Modified

> 9. I find the copy_sequences->copy_sequence code can be rearranged to
> make it easy to follow. The point I don't like is that the boundary
> between two makes it hard to follow and requires so many parameters to
> be passed to copy_sequence. The one idea to improve is to move all
> failure checks out of copy_sequence either directly in the caller or
> as a separate function. All the values for each sequence can be
> fetched in the caller and copy_sequence can be used to SetSequence and
> UpdateSubscriptionRelState(). If you have any better ideas to
> rearrange this part of the patch then feel free to try those out and
> share the results.

Modified

The attached v20251024 version patch has the changes for the same.
The comments from [1] have also been addressed in this version.

[1] -
https://www.postgresql.org/message-id/TY4PR01MB169078C625FB8980E6F42F4F994F1A%40TY4PR01MB16907.jpnprd01.prod.outlook.com

Regards,
Vignesh

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: display hot standby state in psql prompt
Next
From: Jeff Davis
Date:
Subject: Re: C11: should we use char32_t for unicode code points?