On Tue, Dec 13, 2022 at 06:32:08PM -0500, Tom Lane wrote:
> Before, there was up to 1 second (with multiple "SELECT count(1) = 0"
> probes from the test script) between the ALTER SUBSCRIPTION command
> and the "apply worker will restart" log entry.  That wait is pretty
> well zapped, but instead now we're waiting hundreds of ms for the
> "apply worker has started" message.
> 
> I've not chased it further than that, but I venture that the apply
> launcher also needs a kick in the pants, and/or there needs to be
> an interlock to ensure that it doesn't wake until after the old
> apply worker quits.

This is probably because the tests set wal_retrieve_retry_interval to
500ms.  Lowering that to 1ms in Cluster.pm seems to wipe out this
particular wait, and the total src/test/subscription test time drops from
119 seconds to 95 seconds on my machine.  This probably lowers the amount
of test coverage we get on the wal_retrieve_retry_interval code paths, but
if that's a concern, perhaps we should write a test specifically for
wal_retrieve_retry_interval.
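
For illustration, the change amounts to a single setting in the
configuration that Cluster.pm writes for each test node.  This is only a
sketch of the resulting postgresql.conf entry, not the actual patch; the
surrounding lines and the exact place in Cluster.pm are left out.

    # was 500ms; 1ms is the smallest value the setting accepts
    wal_retrieve_retry_interval = 1ms

A test written specifically for wal_retrieve_retry_interval could then
override this with a larger value in its own node configuration.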
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com