Re: Making BackgroundWorkerHandle a complete type or offering a worker enumeration API? - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Making BackgroundWorkerHandle a complete type or offering a worker enumeration API?
Msg-id 548F0C34.1070307@2ndquadrant.com
In response to Re: Making BackgroundWorkerHandle a complete type or offering a worker enumeration API?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Making BackgroundWorkerHandle a complete type or offering a worker enumeration API?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 12/16/2014 12:12 AM, Robert Haas wrote:
> On Sat, Dec 13, 2014 at 4:13 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> While working on BDR, I've run into a situation that I think highlights
>> a limitation of the dynamic bgworker API that should be fixed.
>> Specifically, when the postmaster crashes and restarts shared memory
>> gets cleared but dynamic bgworkers don't get unregistered, and that's a
>> mess.
> 
> I've noticed this as well.  What I was thinking of proposing is that
> we change things so that a BGW_NEVER_RESTART worker is unregistered
> when a crash-and-restart cycle happens, but workers with any other
> restart time are retained.

Personally I need workers that get restarted, but are discarded on
crash. They access shared memory, so when shmem is cleared I need them
to be discarded too, but otherwise I wish them to be persistent
until/unless they're intentionally unregistered.

If I have to use BGW_NEVER_RESTART then I end up having to implement
monitoring of worker status and manual re-registration in a supervisor
static worker. Which is a pain, since it's duplicating work the
postmaster would otherwise just be doing for me.

I'd really rather have a separate flag.
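
To make the difference concrete, here is a toy model of the two cleanup rules being discussed. It is not postmaster code: ModelWorker, ModelPolicy, model_crash_restart() and the MODEL_UNREGISTER_AFTER_CRASH bit are all names invented for illustration, and MODEL_NEVER_RESTART merely stands in for BGW_NEVER_RESTART. It only sketches which registrations would survive a crash-and-restart cycle under each rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for BGW_NEVER_RESTART; the model does not include bgworker.h. */
#define MODEL_NEVER_RESTART (-1)

/* Hypothetical flag bit; the value is arbitrary and exists only in this model. */
#define MODEL_UNREGISTER_AFTER_CRASH 0x0100

typedef enum ModelPolicy
{
    POLICY_BY_RESTART_TIME,     /* drop only never-restart workers */
    POLICY_BY_FLAG              /* drop workers that carry the flag */
} ModelPolicy;

typedef struct ModelWorker
{
    const char *name;
    int         flags;
    int         restart_time;   /* seconds, or MODEL_NEVER_RESTART */
    bool        registered;
} ModelWorker;

/* Decide whether one worker is discarded after a crash under a policy. */
static bool
model_should_unregister(const ModelWorker *w, ModelPolicy policy)
{
    if (policy == POLICY_BY_RESTART_TIME)
        return w->restart_time == MODEL_NEVER_RESTART;
    return (w->flags & MODEL_UNREGISTER_AFTER_CRASH) != 0;
}

/*
 * Apply the policy to every registered worker, as a crash-and-restart
 * cycle would. Returns the number of workers that were unregistered.
 */
size_t
model_crash_restart(ModelWorker *workers, size_t n, ModelPolicy policy)
{
    size_t      dropped = 0;

    assert(workers != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (workers[i].registered &&
            model_should_unregister(&workers[i], policy))
        {
            workers[i].registered = false;
            dropped++;
        }
    }
    return dropped;
}
```

In this model a per-database worker with a 5 second restart time survives POLICY_BY_RESTART_TIME even though its shared memory is gone, which is the case I care about. Under POLICY_BY_FLAG it is dropped and its restart interval is left alone.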

> Maybe it would be best to make the per-database workers BGW_NEVER_RESTART
> and have the static bgworker, rather than the postmaster, be
> responsible for starting them.  Then the fix mentioned above would
> suffice.

Yeah... it would, but it involves a bunch of code that duplicates
process management the postmaster already does.

More importantly, if the supervisor worker crashes / is killed it loses
its handles to the other workers and the signals they send no longer go
to the right worker. So it's not robust.
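
A rough way to picture the problem, as a simplified model rather than actual bgworker code: each child remembers the PID it was told to notify at registration time (much like bgw_notify_pid), and a restarted supervisor comes back with a different PID. ModelChild and model_reachable_children() are invented names, and the model ignores whatever cleanup the postmaster does when the notify target exits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a registered child worker; not a PostgreSQL struct. */
typedef struct ModelChild
{
    const char *name;
    int         notify_pid;     /* PID recorded when the child was registered */
} ModelChild;

/*
 * Count the children whose notifications would still reach the process
 * that currently has the given supervisor PID.
 */
size_t
model_reachable_children(const ModelChild *children, size_t n,
                         int supervisor_pid)
{
    size_t      reachable = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (children[i].notify_pid == supervisor_pid)
            reachable++;
    }
    return reachable;
}
```

If two children were registered by a supervisor running as PID 4100 and the supervisor is later restarted as PID 4200, the count for 4200 is zero: the new supervisor neither holds the old handles nor receives the children's notifications.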

> If that's not good for some reason, my second choice is adding a
> BGWORKER_UNREGISTER_AFTER_CRASH flag.  That seems much simpler and
> less cumbersome than your other proposal.

That'd be my preference.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


