Dear Nathan,
Thank you for the patch! I tested it, and it basically works well.
About the following part:
```
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
+ now = GetCurrentTimestamp();
+ for (int i = 0; i < NUM_LRW_WAKEUPS; i++)
+ LogRepWorkerComputeNextWakeup(i, now);
+
+ /*
+ * If a wakeup time for starting sync workers was set, just set it
+ * to right now. It will be recalculated as needed.
+ */
+ if (next_sync_start != PG_INT64_MAX)
+ next_sync_start = now;
}
```
Do we have to recalculate the next wakeup times when the subscriber receives a
SIGHUP signal? I think this may cause an unexpected change in behavior, as follows.
Assume that wal_receiver_timeout is 60s and wal_sender_timeout on the publisher is
0 (or the network between the nodes is disconnected), and that we send a SIGHUP
signal to the subscriber's postmaster every 20s.
Currently, last_recv_time is updated when the worker receives a message,
and the value is used to decide when to send a ping. The worker exits if the
walsender does not reply.
But with your patch, the apply worker recalculates wakeup[LRW_WAKEUP_PING] and
wakeup[LRW_WAKEUP_TERMINATE] whenever it gets SIGHUP, so the worker never sends
a ping with requestReply = true and never exits due to the timeout.
My case may look extreme, but there may be other issues if this behavior remains.
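To illustrate the scenario, here is a minimal standalone sketch (not PostgreSQL code). It assumes a simplified model: time advances in one-second steps, no message ever arrives from the walsender, the ping is due at half of the timeout, and termination is due at the full timeout. SimResult, simulate_worker, and all parameter names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of one simulated run; not a PostgreSQL type. */
typedef struct SimResult
{
	bool		ping_sent;		/* was a ping with requestReply ever due? */
	bool		terminated;		/* did the worker hit the terminate deadline? */
} SimResult;

/*
 * Simulate an apply worker that never receives anything from the walsender.
 * All times are plain seconds (assumption; the real code uses TimestampTz).
 * If recompute_on_sighup is true, both deadlines are recalculated from "now"
 * each time a SIGHUP arrives, as in the quoted hunk.
 */
SimResult
simulate_worker(int64_t timeout, int64_t sighup_interval,
				int64_t duration, bool recompute_on_sighup)
{
	SimResult	res = {false, false};
	int64_t		ping_at = timeout / 2;	/* last message received at t = 0 */
	int64_t		terminate_at = timeout;

	assert(timeout > 0 && duration > 0);

	for (int64_t now = 1; now <= duration; now++)
	{
		/* A SIGHUP arrives every sighup_interval seconds (0 disables it). */
		if (recompute_on_sighup && sighup_interval > 0 &&
			now % sighup_interval == 0)
		{
			ping_at = now + timeout / 2;
			terminate_at = now + timeout;
		}

		if (now >= ping_at)
			res.ping_sent = true;

		if (now >= terminate_at)
		{
			res.terminated = true;
			break;
		}
	}

	return res;
}
```

In this model, simulate_worker(60, 20, 600, true) never reaches either deadline, because each SIGHUP pushes the ping deadline 30s ahead while the next SIGHUP arrives after only 20s. With simulate_worker(60, 20, 600, false), the ping becomes due at 30s and the worker terminates at 60s.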
Best Regards,
Hayato Kuroda
FUJITSU LIMITED