On Fri, Mar 17, 2023 at 5:52 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> I've attached a minimally-updated patch that doesn't yet address the bigger
> topics under discussion.
>
> On Thu, Mar 16, 2023 at 03:30:37PM +0530, Amit Kapila wrote:
> > On Wed, Feb 1, 2023 at 5:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> On Sat, Jan 28, 2023 at 10:26:25AM +0530, Amit Kapila wrote:
> >> > BTW, do we need to do something about wakeups in
> >> > wait_for_relation_state_change()?
> >>
> >> ... and wait_for_worker_state_change(), and copy_read_data(). From a quick
> >> glance, it looks like fixing these would be a more invasive change.
> >
> > What kind of logic do you have in mind to avoid waking up once per
> > second in those cases?
>
> I haven't looked into this too much yet. I'd probably try out Tom's
> suggestions from upthread [0] next and see if those can be applied here,
> too.
>
For the clean exit of tablesync worker, we already wake up the apply
worker in finish_sync_worker(). You probably want to deal with error
cases or is there something else on your mind? BTW, for
wait_for_worker_state_change(), one possibility is to wake all the
sync workers when apply worker exits but not sure if that is a very
good idea.
Few minor comments:
=====================
1.
- if (wal_receiver_timeout > 0)
+ now = GetCurrentTimestamp();
+ if (now >= wakeup[LRW_WAKEUP_TERMINATE])
+ ereport(ERROR,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("terminating logical replication worker due to timeout")));
+
+ /* Check to see if it's time for a ping. */
+ if (now >= wakeup[LRW_WAKEUP_PING])
{
- TimestampTz now = GetCurrentTimestamp();
Previously, we use to call GetCurrentTimestamp() only when
wal_receiver_timeout > 0 but we ignore that part now. It may not
matter much but if possible let's avoid calling GetCurrentTimestamp()
at additional places.
2.
+ for (int i = 0; i < NUM_LRW_WAKEUPS; i++)
+ LogRepWorkerComputeNextWakeup(i, now);
+
+ /*
+ * LogRepWorkerComputeNextWakeup() will have cleared the tablesync
+ * worker start wakeup time, so we might not wake up to start a new
+ * worker at the appropriate time. To deal with this, we set the
+ * wakeup time to right now so that
+ * process_syncing_tables_for_apply() recalculates it as soon as
+ * possible.
+ */
+ if (!am_tablesync_worker())
+ LogRepWorkerUpdateSyncStartWakeup(now);
Can't we avoid clearing syncstart time in the first place?
--
With Regards,
Amit Kapila.