Re: Logical Replication of sequences - Mailing list pgsql-hackers
| From | vignesh C |
|---|---|
| Subject | Re: Logical Replication of sequences |
| Date | |
| Msg-id | CALDaNm3tmVea7P2K4t31wRxZ-sM_gZGZyz-uv+9wwqOTuyfELg@mail.gmail.com Whole thread Raw |
| In response to | Re: Logical Replication of sequences (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses |
Re: Logical Replication of sequences
|
| List | pgsql-hackers |
On Thu, 23 Oct 2025 at 16:47, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 23, 2025 at 11:45 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The attached patch has the changes for the same.
> >
>
> I have pushed 0001 and the following are comments on 0002.
>
> 1.
> @@ -1414,6 +1414,7 @@ CREATE VIEW pg_stat_subscription_stats AS
> ss.subid,
> s.subname,
> ss.apply_error_count,
> + ss.sequence_sync_error_count,
> ss.sync_error_count,
>
> The new parameter name is noticeably longer than other columns. Can we
> name it as ss.seq_sync_error_count. We may also want to reconsider
> changing existing column sync_error_count to tbl_sync_error_count. Can
> we extract this in a separate stats patch?
Modified and extracted a separate patch for tbl_sync_error_count
> 2.
> Datum
> pg_get_sequence_data(PG_FUNCTION_ARGS)
> @@ -1843,6 +1843,13 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
>
> values[0] = Int64GetDatum(seq->last_value);
> values[1] = BoolGetDatum(seq->is_called);
> +
> + /*
> + * The page LSN will be used in logical replication of sequences to
> + * record the LSN of the sequence page in the pg_subscription_rel
> + * system catalog. It reflects the LSN of the remote sequence at the
> + * time it was synchronized.
> + */
> values[2] = LSNGetDatum(PageGetLSN(page));
>
> This comment appears out of place. We should mention it somewhere in
> sequencesync worker and give reference of that place/function here.
Moved this comment to the copy_sequence function just above
UpdateSubscriptionRelState.
> 3.
> LogicalRepWorker *
> -logicalrep_worker_find(Oid subid, Oid relid, bool only_running)
> +logicalrep_worker_find(Oid subid, Oid relid, LogicalRepWorkerType wtype,
> + bool only_running)
> …
> …
> void
> -logicalrep_worker_stop(Oid subid, Oid relid)
> +logicalrep_worker_stop(Oid subid, Oid relid, LogicalRepWorkerType wtype)
>
> Let's extract changes for these APIs and their callers in a separate
> patch that can be committed prior to the main patch.
Prepared a separated patch
> 4.
> +void
> +logicalrep_reset_seqsync_start_time(void)
> +{
> + LogicalRepWorker *worker;
> +
> + LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
> +
> + worker = logicalrep_worker_find(MyLogicalRepWorker->subid, InvalidOid,
> + WORKERTYPE_APPLY, true);
>
> Shouldn't WORKERTYPE_SEQUENCESYNC be used? If not, then better to add
> a comment on why a different type of worker is used for resetting the
> seqsync time.
It should be apply worker here. We set the last_seqsync_start_time for
the sequence worker in the apply worker instead of the sequence sync
worker, as the sequence sync worker has finished and is about to exit.
> 5.
> + * XXX: An alternative design was considered where the launcher process would
> …
> ...
> + * c) It utilizes the existing tablesync worker code to start the sequencesync
> + * process, thus preventing code duplication in the launcher.
> + * d) It simplifies code maintenance by consolidating changes to a single
> + * location rather than multiple components.
> + * e) The apply worker can access the sequences that need to be synchronized
> + * from the pg_subscription_rel system catalog. Whereas the launcher process
> + * operates without direct database access so would need a framework to
> + * establish connections with the databases to retrieve the sequences for
> + * synchronization.
>
> Only these three points are sufficient for not going with this
> alternative approach. The point (e) is most important and should be
> mentioned as the first point.
Modified
> 6.
> + /* Get the local sequence object */
> + sequence_rel = try_table_open(seqinfo->localrelid, RowExclusiveLock);
> + tup = SearchSysCache1(SEQRELID, ObjectIdGetDatum(seqinfo->localrelid));
> + if (!sequence_rel || !HeapTupleIsValid(tup))
> + {
> + (*skipped_count)++;
> + elog(LOG, "skip synchronization of sequence \"%s.%s\" because it has
> been dropped concurrently",
> + nspname, seqname);
> + return;
>
> We should close the relation when the tuple is not valid and can't proceed.
Modified slightly to check in validate_sequence and close the relation
in copy_sequence.
> 7.
> + /* Skip if the entry is no longer valid */
> + if (!seqinfo->entry_valid)
> + {
> + ReleaseSysCache(tup);
> + table_close(sequence_rel, RowExclusiveLock);
> + (*skipped_count)++;
> + ereport(LOG, errmsg("skip synchronization of sequence \"%s.%s\"
> because it has been altered concurrently",
> + nspname, seqname));
> + return;
>
> Isn't it better to release cache and close relation just before
> return? Tomorrow, if we need to use something from tuple/relation, it
> would be easier.
Because of another comment, the code is re-structured now. As per
current logic the tuple will be released in validate_sequence and the
logging based on error is done at copy_sequence. For new code
structure releasing tuple in validate_sequence is better
> 8.
> +LogicalRepSyncSequences(void)
> {
> ...
> + ScanKeyInit(&skey[1],
> + Anum_pg_subscription_rel_srsubstate,
> + BTEqualStrategyNumber, F_CHARNE,
> + CharGetDatum(SUBREL_STATE_READY));
>
> As we are using only two states (INIT and READY) for sequences, isn't
> it better to use INIT state here? That should avoid sync-in-progress
> tables.
Modified
> 9. I find the copy_sequences->copy_sequence code can be rearranged to
> make it easy to follow. The point I don't like is that the boundary
> between two makes it hard to follow and requires so many parameters to
> be passed to copy_sequence. The one idea to improve is to move all
> failure checks out of copy_sequence either directly in the caller or
> as a separate function. All the values for each sequence can be
> fetched in the caller and copy_sequence can be used to SetSequence and
> UpdateSubscriptionRelState(). If you have any better ideas to
> rearrange this part of the patch then feel free to try those out and
> share the results.
Modified
The attached v20251024 version patch has the changes for the same.
The comments from [1] have also been addressed in this version.
[1] -
https://www.postgresql.org/message-id/TY4PR01MB169078C625FB8980E6F42F4F994F1A%40TY4PR01MB16907.jpnprd01.prod.outlook.com
Regards,
Vignesh
Attachment
pgsql-hackers by date: