Re: Logical Replication of sequences - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: Logical Replication of sequences |
| Date | |
| Msg-id | CAA4eK1LB7u2KQLRFh6xfTSpEB-8gbpR=hqzFOfL9Z1R8rj7Q5g@mail.gmail.com Whole thread Raw |
| In response to | Re: Logical Replication of sequences (vignesh C <vignesh21@gmail.com>) |
| List | pgsql-hackers |
On Thu, Oct 23, 2025 at 11:45 AM vignesh C <vignesh21@gmail.com> wrote:
>
> The attached patch has the changes for the same.
>
I have pushed 0001 and the following are comments on 0002.
1.
@@ -1414,6 +1414,7 @@ CREATE VIEW pg_stat_subscription_stats AS
ss.subid,
s.subname,
ss.apply_error_count,
+ ss.sequence_sync_error_count,
ss.sync_error_count,
The new parameter name is noticeably longer than other columns. Can we
name it as ss.seq_sync_error_count. We may also want to reconsider
changing existing column sync_error_count to tbl_sync_error_count. Can
we extract this in a separate stats patch?
2.
Datum
pg_get_sequence_data(PG_FUNCTION_ARGS)
@@ -1843,6 +1843,13 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
values[0] = Int64GetDatum(seq->last_value);
values[1] = BoolGetDatum(seq->is_called);
+
+ /*
+ * The page LSN will be used in logical replication of sequences to
+ * record the LSN of the sequence page in the pg_subscription_rel
+ * system catalog. It reflects the LSN of the remote sequence at the
+ * time it was synchronized.
+ */
values[2] = LSNGetDatum(PageGetLSN(page));
This comment appears out of place. We should mention it somewhere in
sequencesync worker and give reference of that place/function here.
3.
LogicalRepWorker *
-logicalrep_worker_find(Oid subid, Oid relid, bool only_running)
+logicalrep_worker_find(Oid subid, Oid relid, LogicalRepWorkerType wtype,
+ bool only_running)
…
…
void
-logicalrep_worker_stop(Oid subid, Oid relid)
+logicalrep_worker_stop(Oid subid, Oid relid, LogicalRepWorkerType wtype)
Let's extract changes for these APIs and their callers in a separate
patch that can be committed prior to the main patch.
4.
+void
+logicalrep_reset_seqsync_start_time(void)
+{
+ LogicalRepWorker *worker;
+
+ LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
+
+ worker = logicalrep_worker_find(MyLogicalRepWorker->subid, InvalidOid,
+ WORKERTYPE_APPLY, true);
Shouldn't WORKERTYPE_SEQUENCESYNC be used? If not, then better to add
a comment on why a different type of worker is used for resetting the
seqsync time.
5.
+ * XXX: An alternative design was considered where the launcher process would
…
...
+ * c) It utilizes the existing tablesync worker code to start the sequencesync
+ * process, thus preventing code duplication in the launcher.
+ * d) It simplifies code maintenance by consolidating changes to a single
+ * location rather than multiple components.
+ * e) The apply worker can access the sequences that need to be synchronized
+ * from the pg_subscription_rel system catalog. Whereas the launcher process
+ * operates without direct database access so would need a framework to
+ * establish connections with the databases to retrieve the sequences for
+ * synchronization.
Only these three points are sufficient for not going with this
alternative approach. The point (e) is most important and should be
mentioned as the first point.
6.
+ /* Get the local sequence object */
+ sequence_rel = try_table_open(seqinfo->localrelid, RowExclusiveLock);
+ tup = SearchSysCache1(SEQRELID, ObjectIdGetDatum(seqinfo->localrelid));
+ if (!sequence_rel || !HeapTupleIsValid(tup))
+ {
+ (*skipped_count)++;
+ elog(LOG, "skip synchronization of sequence \"%s.%s\" because it has
been dropped concurrently",
+ nspname, seqname);
+ return;
We should close the relation when the tuple is not valid and can't proceed.
7.
+ /* Skip if the entry is no longer valid */
+ if (!seqinfo->entry_valid)
+ {
+ ReleaseSysCache(tup);
+ table_close(sequence_rel, RowExclusiveLock);
+ (*skipped_count)++;
+ ereport(LOG, errmsg("skip synchronization of sequence \"%s.%s\"
because it has been altered concurrently",
+ nspname, seqname));
+ return;
Isn't it better to release cache and close relation just before
return? Tomorrow, if we need to use something from tuple/relation, it
would be easier.
8.
+LogicalRepSyncSequences(void)
{
...
+ ScanKeyInit(&skey[1],
+ Anum_pg_subscription_rel_srsubstate,
+ BTEqualStrategyNumber, F_CHARNE,
+ CharGetDatum(SUBREL_STATE_READY));
As we are using only two states (INIT and READY) for sequences, isn't
it better to use INIT state here? That should avoid sync-in-progress
tables.
9. I find the copy_sequences->copy_sequence code can be rearranged to
make it easy to follow. The point I don't like is that the boundary
between two makes it hard to follow and requires so many parameters to
be passed to copy_sequence. The one idea to improve is to move all
failure checks out of copy_sequence either directly in the caller or
as a separate function. All the values for each sequence can be
fetched in the caller and copy_sequence can be used to SetSequence and
UpdateSubscriptionRelState(). If you have any better ideas to
rearrange this part of the patch then feel free to try those out and
share the results.
--
With Regards,
Amit Kapila.
pgsql-hackers by date: