Review: Extra Daemons / bgworker - Mailing list pgsql-hackers
From | Markus Wanner |
---|---|
Subject | Review: Extra Daemons / bgworker |
Date | |
Msg-id | 50B61846.8050201@bluegap.ch Whole thread Raw |
Responses |
Re: Review: Extra Daemons / bgworker
|
List | pgsql-hackers |
Hello Álvaro, first of all, thank you for bringing this up again and providing a patch. My first attempt on that was more than two years ago [1]. As the author of a former bgworker patch, I'd like to provide an additional review - KaiGai was simply faster to sing up as a reviewer on the commitfest app... I started my review is based on rev 6d7e51.. from your bgworker branch on github, only to figure out later that it wasn't quite up to date. I upgraded to bgworker-7.patch. I hope that's the most recent version. # Concept I appreciate the ability to run daemons that are not connected to Postgres shared memory. It satisfies a user request that came up several times when I talked about the bgworkers in Postgres-R. Another good point is the flexible interface via extensions, even allowing different starting points for such background workers. One thing I'd miss (for use of that framework in Postgres-R) is the ability to start a registered bgworker only upon request, way after the system started. So that the bgworker process doesn't even exist until it is really needed. I realize that RegisterBackgroundWorker() is still necessary in advance to reserve the slots in BackgroundWorkerList and shared memory. As a use case independent of Postgres-R, think of something akin to a worker_spi, but wanting that to perform a task every 24h on hundreds of databases. You don't want to keep hundreds of processes occupying PGPROC slots just perform a measly task every 24h. (We've discussed the ability to let bgworkers re-connect to another database back then. For one, you'd still have currently unneeded worker processes around all the time. And second, I think that's hard to get right - after all, a terminated process is guaranteed to not leak any stale data into a newly started one, no matter what.) From my point of view, autovacuum is the very first example of a background worker process. And I'm a bit puzzled about it not being properly integrated into this framework. Once you think of autovacuum as a background job which needs access to Postgres shared memory and a database, but no client connection, it looks like a bit of code duplication (and not using code we already have). I realize this kind of needs the above feature being able to request the (re)start of bgworkers at arbitrary points in time. However, it would also be a nice test case for the bgworker infrastructure. I'd be happy to help with extending the current patch into that direction, if you agree it's generally useful. Or adapt my bgworker code accordingly. # Minor technical issues or questions In assign_maxconnections, et al, GetNumRegisteredBackgroundWorkers() is used in relation to MAX_BACKENDS or to calculate MaxBackends. That seems to "leak" the bgworkers that registered with neither BGWORKER_SHMEM_ACCESS nor BGWORKER_BACKEND_DATABASE_CONNECTION set. Or is there any reason to discount such extra daemon processes? The additional contrib modules auth_counter and worker_spi are missing from the contrib/Makefile. If that's intentional, they should probably be listed under "Missing". The auth_counter module leaves worker.bgw_restart_time uninitialized. Being an example, it might make sense for auth_counter to provide a signal that just calls SetLatch() to interrupt WaitLatch() and make auth_counter emit its log line upon request, i.e. show how to use SIGUSR1. The bgw_restart_time doesn't always work (certainly not the way I'm expecting it to). For example, if you forget to pass the BGWORKER_SHMEM_ACCESS flag, trying to LWLockAcquire() leads to the worker being restarted immediately and repeatedly - independent of the bgw_restart_time setting. The same holds true if the bgworker exits with status 0 or in case it segfaults. Not when exiting with code 1, though. Why is that? Especially in case of a segfault or equally "hard" errors that can be expected to occur repeatedly, I don't want the worker to be restarted that frequently. # Documentation There are two working examples in contrib. The auth_counter could use a header comment similar to worker_spi, quickly describing what it does. There's no example of a plain extra daemon, without shmem access. Coding guidelines for bgworker / extra daemon writers are missing. I read these must not use sleep(), with an explanation in both examples. Other questions that come to mind: what about signal handlers? fork()? threads? Can/should it use PG_TRY/PG_CATCH or setjmp()/longjmp(). How to best load other 3rd party libraries, i.e. for OpenGL processing? Especially for extra daemons (i.e. bgw_flags = 0), differences to running an external daemon should be documented. I myself am unclear about all of the implications that running as a child of the postmaster has (OOM and cgroups come to mind - there certainly are other aspects). (I myself still have a hard time finding a proper use case for extra daemons. I don't want the postmaster to turn into a general purpose watchdog for everything and the kitchen sink.) # Tests No additional automated tests are included - hard to see how that could be tested automatically, given the lack of a proper test harness. I hope this review provides useful input. Regards Markus Wanner [1]: That was during commit fest 2010-09. Back then, neither extensions, latches nor the walsender existed. The patches had a dependency on imessages, which was not accepted. Other than that, it's a working, pooled bgworker implementation on top of which autovacuum runs just fine. It lacks extensibility and the extra daemons feature, though. For those who missed it: https://commitfest.postgresql.org/action/patch_view?id=339 http://archives.postgresql.org/message-id/4C3C789C.1040409@bluegap.ch http://git.postgres-r.org/?p=bgworker;a=summary
pgsql-hackers by date: