On Fri, Jul 18, 2025 at 10:44 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jul 17, 2025 at 4:52 PM vignesh C <vignesh21@gmail.com> wrote:
> >
I was looking at the high level idea of sequence sync worker patch
i.e. 0005, so far I haven't found anything problematic there, but I
haven't completed the review and testing yet. Here are some comments
I have while reading through the patch. I will try to do more
thorough review and testing next week.
1.
+ /*
+ * Count running sync workers for this subscription, while we have the
+ * lock.
+ */
+ nsyncworkers = logicalrep_sync_worker_count(MyLogicalRepWorker->subid);
+
+ /* Now safe to release the LWLock */
+ LWLockRelease(LogicalRepWorkerLock);
+
+ /*
+ * If there is a free sync worker slot, start a new sequencesync worker,
+ * and break from the loop.
+ */
+ if (nsyncworkers < max_sync_workers_per_subscription)
+ {
+ TimestampTz now = GetCurrentTimestamp();
+
+ /*
+ * To prevent starting the sequencesync worker at a high frequency
+ * after a failure, we store its last failure time. We start the
+ * sequencesync worker again after waiting at least
+ * wal_retrieve_retry_interval.
+ */
+ if (!MyLogicalRepWorker->sequencesync_failure_time ||
+ TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time,
+ now, wal_retrieve_retry_interval))
+ {
+ MyLogicalRepWorker->sequencesync_failure_time = 0;
+
+ if (!logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC,
+ MyLogicalRepWorker->dbid,
+ MySubscription->oid,
+ MySubscription->name,
+ MyLogicalRepWorker->userid,
+ InvalidOid,
+ DSM_HANDLE_INVALID))
+ MyLogicalRepWorker->sequencesync_failure_time = now;
+ }
This code seems to duplicate much of the logic found in
ProcessSyncingTablesForApply() within its final else block, with only
minor differences (perhaps 1-2 lines).
To improve code maintainability and avoid redundancy, consider
extracting the common logic into a static function. This function
could then be called from both places.
2.
+/*
+ * Common function to setup the leader apply, tablesync worker and sequencesync
+ * worker.
+ */
Change to "Common function to setup the leader apply, tablesync and
sequencesync worker"
3.
+ /*
+ * To prevent starting the sequencesync worker at a high frequency
+ * after a failure, we store its last failure time. We start the
+ * sequencesync worker again after waiting at least
+ * wal_retrieve_retry_interval.
+ */
We haven't explained what's the rationale behind comparing with the
last failure time for sequence sync worker whereas for table sync
worker we compare with last start time.
--
Regards,
Dilip Kumar
Google