On Mon, Jul 8, 2024 at 5:38 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Another approach would be to move the responsibility of background
> worker state notifications out of postmaster completely. When a new
> background worker is launched, the worker process itself could send the
> notification that it has started. And similarly, when a worker exits, it
> could send the notification just before exiting. There's a little race
> condition with exiting: if a process is waiting for the bgworker to
> exit, and launches a new worker immediately when the old one exits,
> there will be a brief period when the old and new process are alive at
> the same time. The old worker wouldn't be doing anything interesting
> anymore since it's exiting, but it still counts towards
> max_worker_processes, so launching the new process might fail because of
> hitting the limit. Maybe we should just bump up max_worker_processes. Or
> postmaster could check PMChildFlags and not count processes that have
> already deregistered from PMChildFlags towards the limit.
I can testify that the current system is the result of a lot of trial
and error. I'm not saying it can't be made better, but my initial
attempts at getting this to work (back in the 9.4 era) resembled what
you proposed here, were consequently a lot simpler than what we have
now, and also did not work. Race conditions like you mention here were
part of that. Another consideration is that fork() can fail, and in
that case, the process that tried to register the new background
worker needs to find out that the background worker won't ever be
starting. Yet another problem is that, even if fork() succeeds, the
new process might fail before it executes any of our code, e.g. because
it segfaults very early, a case that actually happened to me -
inadvertently - while I was testing these facilities. I ended up
deciding that we can't rely on the new process to do anything until
it's given us some signal that it is alive and able to carry out its
duties. If it dies before telling us that, or never starts in the
first place, we have to have some other way of finding that out, and
it's difficult to see how that can happen without postmaster
involvement.
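
As a standalone illustration of those two failure modes (this is not
PostgreSQL code, only a minimal sketch under stated assumptions), the
program fragment below has a parent process stand in for the postmaster.
The names launch_worker, WorkerStartResult and demo are hypothetical,
and the one-byte "ready" message over a pipe is an assumed stand-in for
whatever signal the real worker would send. The point it shows is that
a fork() failure and a death before the "ready" message are both
observed by the parent, never reported by the child.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef enum
{
    WORKER_READY,             /* child reported that it is alive */
    WORKER_LAUNCH_FAILED,     /* pipe() or fork() failed; no child exists */
    WORKER_DIED_BEFORE_READY  /* child went away without reporting */
} WorkerStartResult;

/*
 * Hypothetical helper: fork a child and wait for its one-byte "ready"
 * message. If crash_early is nonzero, the child kills itself before it
 * sends anything, standing in for a very early segfault.
 */
WorkerStartResult
launch_worker(int crash_early)
{
    int         fds[2];
    pid_t       pid;
    char        byte;
    ssize_t     n;
    int         status;

    if (pipe(fds) != 0)
        return WORKER_LAUNCH_FAILED;

    pid = fork();
    if (pid < 0)
    {
        /* No child was created, so only the parent can report this. */
        close(fds[0]);
        close(fds[1]);
        return WORKER_LAUNCH_FAILED;
    }

    if (pid == 0)
    {
        /* Child: die early if asked, otherwise announce readiness. */
        close(fds[0]);
        if (crash_early)
            kill(getpid(), SIGKILL);
        if (write(fds[1], "R", 1) != 1)
            _exit(1);
        _exit(0);
    }

    /* Parent: close our write end so EOF means "the child is gone". */
    close(fds[1]);
    do
    {
        n = read(fds[0], &byte, 1);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    /* Reap the child; its exit status is not needed for this sketch. */
    while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
        ;

    return (n == 1) ? WORKER_READY : WORKER_DIED_BEFORE_READY;
}

/* Usage example: exercise the normal path and the early-death path. */
void
demo(void)
{
    assert(launch_worker(0) == WORKER_READY);
    assert(launch_worker(1) == WORKER_DIED_BEFORE_READY);
}
```

In the real system the process that registered the worker is not its
parent, so it cannot do this waiting itself; the sketch only shows why
some process in the parent's position has to be the one that notices.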
--
Robert Haas
EDB: http://www.enterprisedb.com