On Thu, Jul 11, 2024 at 08:50:57PM -0500, Nathan Bossart wrote:
> I'm not following how this is guaranteed to trigger an autovacuum quickly.
> Shouldn't we set autovacuum_vacuum_insert_threshold to 1 so that it is
> eligible for autovacuum?
You are right, this is not going to influence a faster startup of a
worker, so we could remove that entirely. On closer look, the main
bottlebeck is that the test is spending a lot of time waiting on the
naptime even if we are using the minimum value of 1s, as the scan of
pg_stat_activity checking for workers keeps looping.
[ ..thinks.. ]
I have one trick in my sleeve for this one to make the launcher more
responsive in checking the timestamps of the database list. That's
not perfect, but it reduces the wait behind the worker startups by
400ms (1s previously as of the naptime, 600ms now) with a reload to
set the launcher's latch after the injection point has been attached.
The difference in runtime is noticeable.
>> +# Wait for the autovacuum worker to exit before scanning the logs.
>> +$node->poll_query_until('postgres',
>> + "SELECT count(*) = 0 FROM pg_stat_activity "
>> + . "WHERE pid = $av_pid AND backend_type = 'autovacuum worker';");
>
> WFM. Even if the PID is quickly reused, this should work. We just might
> end up waiting a little longer.
>
> Is it worth testing cancellation, too?
The point is to check after pg_signal_backend, so I am not sure it is
worth the extra cycles for the cancellation.
Attaching the idea, with a fix for the comment you have mentioned and
appending "regress_" the role names for the warnings generated by
-DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, while on it.
What do you think?
--
Michael