RE: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE - Mailing list pgsql-hackers

From Aya Iwata (Fujitsu)
Subject RE: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE
Date
Msg-id OS7PR01MB11964AC66E36D8EAE1DAD1440EAF2A@OS7PR01MB11964.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
Hi Peter-san, Michael-san,

Thank you for your comments.
I updated patch to v0009. Please review attached patch.

> -----Original Message-----
> From: Peter Smith <smithpb2250@gmail.com>
> Sent: Monday, October 20, 2025 11:02 AM

> Some comments for the latest v8 patch.
> 
> ======
> src/backend/postmaster/bgworker.c
> 
> TerminateBgWorkersByBbOid:
> 
> 1.
> +void
> +TerminateBgWorkersByDbOid(Oid oid)
> 
> Now the function name is more explicit, but that is not a good reason
> to make the parameter name more vague.
> 
> IMO the parameter should still be "dbOid" or "databaseId" instead of
> just "oid". (ditto for the extern in bgworker.h)

I agree with you. I reverted parameter name to databaseId.


> ======
> src/backend/storage/ipc/procarray.c
> 
> 
> CountOtherDBBackends:
> 
> 2.
> + /*
> + * Usually, we try 50 times with 100ms sleep between tries, making 5 sec
> + * total wait. If requested, it would be reduced to 10 times to shorten the
> + * test time.
> + */
> 
> 
> The comment seemed vague to me. How about more like:
> 
> /*
>  * Retry up to 50 times with 100ms between attempts (max 5s total).
>  * Can be reduced to 10 attempts (max 1s total) to speed up tests.
>  */

Thank you. I updated this comment.

> 3.
> + for (tries = 0; tries < ntries; tries++)
> 
> 'tries' can be declared as a for-loop variable.

I have declared "int" within the for-loop.

> ~~~
> 
> 4.
> Something feels strange about this function name
> (CountOtherDBBackends) which suggests it is just for counting stuff,
> but in reality is more about exiting/terminating the workers. In fact
> retuns a boolean, not a count. Compare this with this similarly named
> "CountUserBackends" which really *is* doing what it says.
> 
> Can we give this function a better name, or is that out of scope for this patch?

I think this is out of scope because existing code have terminated autovacuum process by SIGTERM.
It can be discussed separately.
I just added a comment to this function "background workers would also be terminated".

> ======
> src/test/modules/worker_spi/t/002_worker_terminate.pl
> 
> 5.
> +# Firstly register an injection point to make the test faster. Normally, it
> +# spends more than 5 seconds because the backend retries, counting the
> number
> +# of connecting processes 50 times, but now the counting would be done only
> 10
> +# times. See CountOtherDBBackends().
> +$node->safe_psql('postgres', "CREATE EXTENSION injection_points;");
> +$node->safe_psql('postgres',
> + "SELECT injection_points_attach('reduce-ncounts', 'error');");
> +
> 
> It seemed overkill to give details about what "normally" happens. I
> think it is enough to have a simple comment here:
> 
> SUGGESTION
> The injection point 'reduce-ncounts' reduces the number of backend
> retries, allowing for shorter test runs. See CountOtherDBBackends().

Thank you for your suggestion. I updated this comment.

> -----Original Message-----
> From: Michael Paquier <michael@paquier.xyz>
> Sent: Monday, October 20, 2025 1:33 PM

> On Mon, Oct 20, 2025 at 01:01:31PM +1100, Peter Smith wrote:
> > Some comments for the latest v8 patch.
> 
> The comments of Peter apply to comments and parameters.  I am not
> going down to these details in this message, these can be infinitely
> tuned.
> 
> The injection point integration looks correct.  You are checking the
> compile flag and if the extension is available in the installation
> path, which should be enough.
> 
> +   if (IS_INJECTION_POINT_ATTACHED("reduce-ncounts"))
> +       ntries = 10;
> 
> 1s is much faster than the default of 5s, still I am wondering if this
> cannot be brought down a bit more.  Dropping the worker still around
> after the first test with CREATE DATABASE works here.

Thank you. I updated ntries to 3.

> +# Confirm a background worker is still running
> +$node->safe_psql(
> +    "postgres", qq(
> +        SELECT count(1) FROM pg_stat_activity
> +        WHERE backend_type = 'worker_spi dynamic';));
> 
> This does not check that the worker that does not have the flag set is
> still running: you are not feeding the output of this query to an is()
> test.
> 
> +    is($result, 't', "dynamic bgworker launched");
> 
> In launch_bgworker(), this uses the same test description for all the
> callers of this subroutine.  Let's prefix it with $testcase.

I added $testcase. Is it same as your expectations?

> +void
> +TerminateBgWorkersByDbOid(Oid oid)
> 
> FWIW, while reading this code, I was wondering about one improvement
> that could show benefits for more extension code than only what we are
> discussing here because external code has no access to
> BackgroundWorkerSlot while holding the LWLock BackgroundWorkerLock in
> a single loop, by rewriting this new routine with something like that:
> void TerminateBackgroundWorkerMatchin(
> bool (*do_terminate) (int pid, BackgroundWorker *, Datum))
> 
> Then the per-database termination would be a custom routine, defined
> also in bgworker.c.  Other extension code could define their own
> filtering callback routine.  Just an idea in passing, to let extension
> code take more actions on bgworker slots in use-based on a PGPROC
> entry, like a role ID for example, or it could be a different factor.
> Feel free to dislike such a funky idea if you do not like it and say
> so, of course.

Thank you for your advice.
I'd like to address that, but I couldn't figure out how to do it on my own. 
Could you please describe it more?

Regards,
Aya Iwata
Fujitsu Limited

Attachment

pgsql-hackers by date:

Previous
From: Nazir Bilal Yavuz
Date:
Subject: Re: CI: Add task that runs pgindent
Next
From: Dimitrios Apostolou
Date:
Subject: Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward