Thread: Proposals for making it easier to write correct bgworkers
Hi all
As I've gained experience working on background workers, it's become increasingly clear that they're a bit too different to normal backends for many nontrivial uses.
I thought I'd take a moment to note some of it here, along with some proposals for things we could potentially do to make it much easier to use bgworkers correctly especially when using them to run queries.
This is NOT A PATCH SET. It's a set of discussion proposals and it's also intended as a bit of a helper for people just getting started on bgworkers. There are a lot of subtle differences in the runtime environment a basic bgworker provides vs the runtime environment extension authors will be used to when writing fmgr-callable C functions.
(It looks like pg12 and pg13 have some improvements, so some of the issues I was going to mention with error cleanup paths and locking aren't relevant anymore.)
DIFFERENCES WHEN CODING FOR BGWORKERS
----
Some of the most important differences vs normal backends that I've noticed are:
* There's no supported way to recover from an error and continue execution. Each bgworker has to roll its own logic based on PostgresMain()'s sigsetjmp() handler and maintain it.
* The bgworker infrastructure doesn't set up the the application_name in PGPROC
* Example bgworkers roll their own signal handlers rather than using die, procsignal_sigusr1_handler, and PostgresSigHupHandler. This makes them seem to work OK, but fail to respect CHECK_FOR_INTERRUPTS() and other expected behaviour when they call into the executor etc.* Each bgworker has to roll its own mainloop with ConfigReloadPending / ProcessConfigFile() etc, and this isn't illustrated at all in the examples
* Example workers don't interact with the pgstat subsystem
Managing bgworker lifecycle is also quite difficult. The postmaster doesn't offer any help and we don't expose much help from core. Any extension that spawns workers needs to do its own handling of worker startup registration in shmem, worker exit reaping, handling workers that exit before registering, etc. This can be seen in the logical replication code, which rolls its own management per src/backend/replication/logical/launcher.c .
PROPOSED CHANGES IN EXAMPLE BGWORKERS/CONTRIBS
-----
So I'd like to make these changes to example bgworkers and to contrib extensions:
* Change example bgworkers in contrib to set up normal default signal handlers, not define their own
* Add ConfigReloadPending and ProcessConfigFile() to all example bgworkers
* Show pgstat use in example bgworkers
PROPOSED BGW SETUP AND MAINLOOP HELPERS
----
and I'm wondering if anyone thinks it's a good idea to add some bgworker mainloop helper API, where one call is made at the start of the bgw, before BackgroundWorkerInitializeConnection(), to:
* set application_name
* set process title
* set up default signal handlers and unblock signals
* creates and assigns a BGWMemoryContext named after the bgw
* set state to idle in pgstat
* sets up MyProcPort (or teach InitPostgres to do so for a bgw)
* takes a function pointer for a before_shmem_exit cleanup handler
to encourage ext authors to use this approach
then a mainloop helper that:
* Checks that CurrentMemoryContext == BGWMemoryContext
* Runs CHECK_FOR_INTERRUPTS()
* Checks for postmaster death and exits
* Checks for and performs config file reload
PROPOSED ERROR HANDLING HELPER
----
It'd also be beneficial to have a helper for bgw mainloops with error recovery, generalizing and extracting the pattern I see in all these routines:
src/backend/postmaster/autovacuum.c=AutoVacLauncherMain
src/backend/postmaster/autovacuum.c=AutoVacWorkerMain
src/backend/postmaster/bgworker.c=StartBackgroundWorker
src/backend/postmaster/bgwriter.c=BackgroundWriterMain
src/backend/postmaster/checkpointer.c=CheckpointerMain
src/backend/postmaster/walwriter.c=WalWriterMain
src/backend/tcop/postgres.c=PostgresMain
src/backend/postmaster/autovacuum.c=AutoVacWorkerMain
src/backend/postmaster/bgworker.c=StartBackgroundWorker
src/backend/postmaster/bgwriter.c=BackgroundWriterMain
src/backend/postmaster/checkpointer.c=CheckpointerMain
src/backend/postmaster/walwriter.c=WalWriterMain
src/backend/tcop/postgres.c=PostgresMain
which all do their own error reset loops. There's way too much copy/paste going on there IMO, even before we consider bgworkers, and it's nearly impossible for anyone who isn't quite an experienced Pg developer to write a bgw that can do anything with errors except promote ERROR to FATAL and die.
This would be modeled on PosgresMain(). The worker would only need to add its own logic to the sigsetjmp() error jump path, not duplicate all the core cleanup work.
PROPOSED GENERALISED WORKER MANAGEMENT
----
Finally I'm wondering if there's any interest in generalizing the logical rep worker management for other bgworkers. I've done a ton of work with worker management and it's something I'm sure I could take on but I don't want to write it without knowing there's some chance of acceptance.
The general idea is to provide a way for bgworkers to start up managers for pools / sets of workers. They launch them and have a function they can call in their mainloop that watches their child worker states, invoking callbacks when they fail to launch, launch successfully, exit cleanly after finishing their work, or die with an error. Workers are tracked in a shmem seg where the start of the seg must be a key struct (akin to how the hash API works). We would provide calls to look up a worker shmem struct by key, signal a worker by key, wait for a worker to exit (up to timeout), etc. Like in the logical rep code, access to the worker registration shmem would be controlled by LWLock. The extension code can put whatever it wants in the worker shmem entries after the key, including various unions or whatever - the worker management API won't care.
This abstracts all the low level mess away from bgworker implementations and lets them focus on writing the code they want to run.
I'd probably suggest doing so by extracting the logical rep worker management, and making the logical rep code use the generalized worker management. So it'd be proven, and have in core users.
PROPOSED XACT API WRAPPER
----
Additionally there are some more general postgres API patterns that the team working on pglogical and BDR has landed up writing wrappers for or compensating for. One of the most important is transaction management. In many places in pglogical where we start lightweight transactions in pglogical we use a wrapper that starts the txn if there isn't already one open and remembers whether one was started. Then the end helper closes it only if the start helper opened it. Finally, the memory context is always reset to the memory context before txn start once the txn ends, instead of being left at TopMemoryContext, since we've learned that that's an extremely common source of bugs.
I propose a wrapper for lightweight txn management, like the one we use in many places in pglogical. This is not a copy of that code, it's just a scratched out example of roughly how we do it:
LocalTransactionState txn;
StartTransactionIfNotStarted(&txn, LXACT_WANT_SNAPSHOT);
...
CommitTransactionCommandIfStarted(&txn);
where
static inline void
StartTransactionIfNotStarted(LocalTransactionState *txn, int flags)
{
txn->started = !TransactionInProgress();
txn->save_mctx = CurrentMemoryContext;
if (txn->started)
{
txn->pushed_snapshot = false;
StartTransactionCommand();
if (flags & LXACT_WANT_SNAPSHOT)
{
txn->pushed_snapshot = true;
PushActiveSnapshot(GetTransactionSnapshot());
}
}
}
static inline void
CommitTransactionIfNotStarted(LocalTransactionState *txn)
{
if (txn->started)
{
Assert(TransactionInProgress());
Assert(CurrentMemoryContext == TopTransactionContext);
if (txn->pushed_snapshot)
PopActiveSnapshot();
CommitTransactionCommand();
}
else
{
/* Caller didn't change context between start and end */
Assert(CurrentMemoryContext == txn->save_mctx);
}
(void) MemoryContextSwitchTo(txn->save_mctx);
}
čt 10. 9. 2020 v 5:02 odesílatel Craig Ringer <craig@2ndquadrant.com> napsal:
Hi allAs I've gained experience working on background workers, it's become increasingly clear that they're a bit too different to normal backends for many nontrivial uses.I thought I'd take a moment to note some of it here, along with some proposals for things we could potentially do to make it much easier to use bgworkers correctly especially when using them to run queries.This is NOT A PATCH SET. It's a set of discussion proposals and it's also intended as a bit of a helper for people just getting started on bgworkers. There are a lot of subtle differences in the runtime environment a basic bgworker provides vs the runtime environment extension authors will be used to when writing fmgr-callable C functions.(It looks like pg12 and pg13 have some improvements, so some of the issues I was going to mention with error cleanup paths and locking aren't relevant anymore.)DIFFERENCES WHEN CODING FOR BGWORKERS----Some of the most important differences vs normal backends that I've noticed are:* There's no supported way to recover from an error and continue execution. Each bgworker has to roll its own logic based on PostgresMain()'s sigsetjmp() handler and maintain it.* The bgworker infrastructure doesn't set up the the application_name in PGPROC* Example bgworkers roll their own signal handlers rather than using die, procsignal_sigusr1_handler, and PostgresSigHupHandler. This makes them seem to work OK, but fail to respect CHECK_FOR_INTERRUPTS() and other expected behaviour when they call into the executor etc.* Each bgworker has to roll its own mainloop with ConfigReloadPending / ProcessConfigFile() etc, and this isn't illustrated at all in the examples* Example workers don't interact with the pgstat subsystemManaging bgworker lifecycle is also quite difficult. The postmaster doesn't offer any help and we don't expose much help from core. Any extension that spawns workers needs to do its own handling of worker startup registration in shmem, worker exit reaping, handling workers that exit before registering, etc. This can be seen in the logical replication code, which rolls its own management per src/backend/replication/logical/launcher.c .PROPOSED CHANGES IN EXAMPLE BGWORKERS/CONTRIBS-----So I'd like to make these changes to example bgworkers and to contrib extensions:* Change example bgworkers in contrib to set up normal default signal handlers, not define their own* Add ConfigReloadPending and ProcessConfigFile() to all example bgworkers* Show pgstat use in example bgworkersPROPOSED BGW SETUP AND MAINLOOP HELPERS----and I'm wondering if anyone thinks it's a good idea to add some bgworker mainloop helper API, where one call is made at the start of the bgw, before BackgroundWorkerInitializeConnection(), to:* set application_name* set process title* set up default signal handlers and unblock signals* creates and assigns a BGWMemoryContext named after the bgw* set state to idle in pgstat* sets up MyProcPort (or teach InitPostgres to do so for a bgw)* takes a function pointer for a before_shmem_exit cleanup handlerto encourage ext authors to use this approachthen a mainloop helper that:* Checks that CurrentMemoryContext == BGWMemoryContext* Runs CHECK_FOR_INTERRUPTS()* Checks for postmaster death and exits* Checks for and performs config file reloadPROPOSED ERROR HANDLING HELPER----It'd also be beneficial to have a helper for bgw mainloops with error recovery, generalizing and extracting the pattern I see in all these routines:src/backend/postmaster/autovacuum.c=AutoVacLauncherMain
src/backend/postmaster/autovacuum.c=AutoVacWorkerMain
src/backend/postmaster/bgworker.c=StartBackgroundWorker
src/backend/postmaster/bgwriter.c=BackgroundWriterMain
src/backend/postmaster/checkpointer.c=CheckpointerMain
src/backend/postmaster/walwriter.c=WalWriterMain
src/backend/tcop/postgres.c=PostgresMainwhich all do their own error reset loops. There's way too much copy/paste going on there IMO, even before we consider bgworkers, and it's nearly impossible for anyone who isn't quite an experienced Pg developer to write a bgw that can do anything with errors except promote ERROR to FATAL and die.This would be modeled on PosgresMain(). The worker would only need to add its own logic to the sigsetjmp() error jump path, not duplicate all the core cleanup work.PROPOSED GENERALISED WORKER MANAGEMENT----Finally I'm wondering if there's any interest in generalizing the logical rep worker management for other bgworkers. I've done a ton of work with worker management and it's something I'm sure I could take on but I don't want to write it without knowing there's some chance of acceptance.The general idea is to provide a way for bgworkers to start up managers for pools / sets of workers. They launch them and have a function they can call in their mainloop that watches their child worker states, invoking callbacks when they fail to launch, launch successfully, exit cleanly after finishing their work, or die with an error. Workers are tracked in a shmem seg where the start of the seg must be a key struct (akin to how the hash API works). We would provide calls to look up a worker shmem struct by key, signal a worker by key, wait for a worker to exit (up to timeout), etc. Like in the logical rep code, access to the worker registration shmem would be controlled by LWLock. The extension code can put whatever it wants in the worker shmem entries after the key, including various unions or whatever - the worker management API won't care.This abstracts all the low level mess away from bgworker implementations and lets them focus on writing the code they want to run.I'd probably suggest doing so by extracting the logical rep worker management, and making the logical rep code use the generalized worker management. So it'd be proven, and have in core users.PROPOSED XACT API WRAPPER----Additionally there are some more general postgres API patterns that the team working on pglogical and BDR has landed up writing wrappers for or compensating for. One of the most important is transaction management. In many places in pglogical where we start lightweight transactions in pglogical we use a wrapper that starts the txn if there isn't already one open and remembers whether one was started. Then the end helper closes it only if the start helper opened it. Finally, the memory context is always reset to the memory context before txn start once the txn ends, instead of being left at TopMemoryContext, since we've learned that that's an extremely common source of bugs.I propose a wrapper for lightweight txn management, like the one we use in many places in pglogical. This is not a copy of that code, it's just a scratched out example of roughly how we do it:LocalTransactionState txn;StartTransactionIfNotStarted(&txn, LXACT_WANT_SNAPSHOT);...CommitTransactionCommandIfStarted(&txn);wherestatic inline voidStartTransactionIfNotStarted(LocalTransactionState *txn, int flags){txn->started = !TransactionInProgress();txn->save_mctx = CurrentMemoryContext;if (txn->started){txn->pushed_snapshot = false;StartTransactionCommand();if (flags & LXACT_WANT_SNAPSHOT){txn->pushed_snapshot = true;PushActiveSnapshot(GetTransactionSnapshot());}}}static inline voidCommitTransactionIfNotStarted(LocalTransactionState *txn){if (txn->started){Assert(TransactionInProgress());Assert(CurrentMemoryContext == TopTransactionContext);if (txn->pushed_snapshot)PopActiveSnapshot();CommitTransactionCommand();}else{/* Caller didn't change context between start and end */Assert(CurrentMemoryContext == txn->save_mctx);}(void) MemoryContextSwitchTo(txn->save_mctx);}
any from these proposal looks like good idea
Regards
Pavel
On Thu, Sep 10, 2020 at 5:02 AM Craig Ringer <craig@2ndquadrant.com> wrote:
Hi allAs I've gained experience working on background workers, it's become increasingly clear that they're a bit too different to normal backends for many nontrivial uses.
<snip> a lot of proposals I agree with.
PROPOSED GENERALISED WORKER MANAGEMENT----Finally I'm wondering if there's any interest in generalizing the logical rep worker management for other bgworkers. I've done a ton of work with worker management and it's something I'm sure I could take on but I don't want to write it without knowing there's some chance of acceptance.The general idea is to provide a way for bgworkers to start up managers for pools / sets of workers. They launch them and have a function they can call in their mainloop that watches their child worker states, invoking callbacks when they fail to launch, launch successfully, exit cleanly after finishing their work, or die with an error. Workers are tracked in a shmem seg where the start of the seg must be a key struct (akin to how the hash API works). We would provide calls to look up a worker shmem struct by key, signal a worker by key, wait for a worker to exit (up to timeout), etc. Like in the logical rep code, access to the worker registration shmem would be controlled by LWLock. The extension code can put whatever it wants in the worker shmem entries after the key, including various unions or whatever - the worker management API won't care.This abstracts all the low level mess away from bgworker implementations and lets them focus on writing the code they want to run.I'd probably suggest doing so by extracting the logical rep worker management, and making the logical rep code use the generalized worker management. So it'd be proven, and have in core users.
Yes, there is definitely a lot of interest in this.
It would also be good to somehow generalize away the difference between static bgworkers and dynamic ones. That's something that really annoyed us with the work on the "online checksums" patch, and I've also run into that issue in other cases. I think finding a way to launch a dynamic worker out of the postmaster would be a way to do that -- I haven't looked into the detail, but if we're looking at generalizing the worker management this is definitely something we should include in the consideration.
I haven't looked at the different places we could in theory extract the management out of and reuse, but it makes sense that the logical replication one would be the most appropriate since it's the newest one (vs autovacuum which is the other one that can at least do similar things). And yes, it definitely makes sense to have a generalized set of code for this, because it's certainly a fairly complicated pattern that we shouldn't be re-inventing over and over again with slightly different bugs.
On Thu, Sep 10, 2020 at 11:02:07AM +0800, Craig Ringer wrote: > Hi all > > As I've gained experience working on background workers, it's become > increasingly clear that they're a bit too different to normal backends for many > nontrivial uses. > > I thought I'd take a moment to note some of it here, along with some proposals > for things we could potentially do to make it much easier to use bgworkers > correctly especially when using them to run queries. > > This is NOT A PATCH SET. It's a set of discussion proposals and it's also > intended as a bit of a helper for people just getting started on bgworkers. > There are a lot of subtle differences in the runtime environment a basic > bgworker provides vs the runtime environment extension authors will be used to > when writing fmgr-callable C functions. > > (It looks like pg12 and pg13 have some improvements, so some of the issues I > was going to mention with error cleanup paths and locking aren't relevant > anymore.) > > DIFFERENCES WHEN CODING FOR BGWORKERS Can we put this information somewhere in our docs or source code as a README? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee