From: Robert Haas
Subject: Re: dynamic background workers, round two
Msg-id: CA+TgmoZZPV68y_WL97NcsPWsKPTp463FgbZjhnBVO7dB+q27Jw@mail.gmail.com
In response to: Re: dynamic background workers, round two (Andres Freund <andres@2ndquadrant.com>)

On Wed, Jul 24, 2013 at 5:34 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> This seems like a sensible idea to me. But, in the context of dynamic
> query, don't we also need the reverse infrastructure of notifying a
> bgworker that the client, that requested it to be started, has died?
> Ending up with a dozen bgworkers running because the normal backend
> FATALed doesn't seem nice.

I think that will be desirable for many, although not all, uses of
background workers.  For example, you might want to be able to do
something like SELECT pg_background('CREATE INDEX CONCURRENTLY ...')
and then exit your session and have the background worker continue to
run.  Or maybe somebody writes a trigger that starts a background
worker (if there's not one already running) and queues up work for
it, and the background worker continues processing the queue until it
is empty, and then exits.

But for parallel query specifically, yeah, we'll probably want the
untimely demise of either the original backend or any of its workers
to result in the death of all the workers and an abort of the parent's
transaction.  However, there's no need for the postmaster to be
involved in any of that.  For example, once the worker process is up
and running and the parent has its PID, it can use an on_shmem_exit
hook or a PG_ENSURE_ERROR_CLEANUP() block to send SIGTERM to any
still-living workers.  In the other direction, the parent needs to
know not only whether or not the child has died after startup, but
most likely also needs to receive errors, notices, tuples, or other
data from the child.  If the parent receives an ERROR or FATAL
indication from a parallel-worker child, it should most likely kill
all other workers and then abort the transaction.  But in neither of
those cases does the postmaster need to be involved.
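
To make that concrete, the parent-side cleanup could look something
like this (untested sketch; MyParallelWorkerPids is a hypothetical
list of pid_t values the parent would maintain as it launches
workers):

    /* Sketch: at backend exit, take down any still-living workers. */
    static void
    terminate_parallel_workers(int code, Datum arg)
    {
        ListCell   *lc;

        foreach(lc, MyParallelWorkerPids)       /* hypothetical */
        {
            pid_t       wpid = (pid_t) lfirst_int(lc);

            if (wpid > 0)
                kill(wpid, SIGTERM);
        }
    }

    /* ...registered once, when parallel execution begins... */
    on_shmem_exit(terminate_parallel_workers, (Datum) 0);

The same callback would work with PG_ENSURE_ERROR_CLEANUP() for the
transaction-abort case.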

This particular mechanism is needed because the worker can't report
its own failure to start; only the postmaster can do that.  Once both
processes are up and running, it's up to them to handle their own IPC
needs.

> It really should be pid_t even though we're not consistently doing that
> in the code.

Hmm, I thought I saw we weren't using that type.  But I guess we are,
so I can go make this consistent.

>> +#define InvalidPid                           (-1)
>> +
>
> That value strikes me as a rather bad idea: as a pid, -1 represents
> init's process group, i.e. kill(InvalidPid, sig) will try to kill
> init... I'd rather not spread that outside async.c.

Well, if we ever try to kill(InvalidPid), that's a PostgreSQL bug.
And if the postgres process has permission to send signals to init,
that's an OS bug (or a PostgreSQL bug, since we refuse to run as
root).  So I'm not very worried about it.  I've got two functions that
need distinguishable return values for the not-yet-started case and
the already-dead case, neither of which can be real PIDs.  If we don't
use 0 and -1, the alternative is presumably to complicate the calling
signature with another out parameter.  That does not seem like a net
improvement.
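
In other words, the convention I have in mind is simply this (sketch;
the slot fields here are illustrative, not the actual patch):

    #define InvalidPid              ((pid_t) -1)

    /*
     * Sketch: return InvalidPid if the worker has not yet been
     * started, 0 if it started but has since stopped, and the
     * real PID otherwise.
     */
    pid_t
    GetBackgroundWorkerPid(BackgroundWorkerHandle *handle)
    {
        BackgroundWorkerSlot *slot = lookup_slot(handle);   /* hypothetical */

        if (!slot->started)
            return InvalidPid;
        if (slot->exited)
            return (pid_t) 0;
        return slot->pid;
    }

Since neither 0 nor -1 can ever be a real backend PID, both special
cases stay distinguishable without widening the signature.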

>> +/*
>> + * Report the PID of a newly-launched background worker in shared memory.
>> + *
>> + * This function should only be called from the postmaster.
>> + */
>> +void
>> +ReportBackgroundWorkerPID(RegisteredBgWorker *rw)
>> +{
>> +     BackgroundWorkerSlot *slot;
>> +
>> +     Assert(rw->rw_shmem_slot < max_worker_processes);
>> +     slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
>> +     slot->pid = rw->rw_pid;
>> +
>> +     if (rw->rw_worker.bgw_notify_pid != 0)
>> +             kill(rw->rw_worker.bgw_notify_pid, SIGUSR1);
>> +}
>
> Any reason not to use SendProcSignal() properly here? Just that you
> don't know the BackendId? It seems unclean to reuse SIGUSR1 without
> using the infrastructure built for that.

We're in the postmaster here.  The postmaster currently uses kill
directly in all cases, rather than SendProcSignal(), and I'd rather
not change that.  Any code that the postmaster calls has got to be
safe against arbitrary shared memory corruption, and I'd rather keep
that footprint as small as possible.  For example, if we added
bgw_backend_id, the postmaster would have to bounds check it to make
sure it didn't seg fault.  The more of that kind of stuff we add, the
more chances there are to destabilize the postmaster.  I'd like to
keep it as minimal as possible.
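
Just to illustrate the cost: even the trivial version would force the
postmaster into defensive code along these lines (bgw_backend_id is
hypothetical, per the above):

    /*
     * Can't trust shared memory contents here: bounds check before
     * using the value to index anything, else a corrupt value could
     * crash the postmaster.
     */
    if (rw->rw_worker.bgw_backend_id <= 0 ||
        rw->rw_worker.bgw_backend_id > MaxBackends)
        return;                         /* ignore rather than crash */

Every check like that is one more thing that can be gotten wrong in
the one process we can't afford to lose.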

>> + * If handle != NULL, we'll set *handle to a pointer that can subsequently
>> + * be used as an argument to GetBackgroundWorkerPid().  The caller can
>> + * free this pointer using pfree(), if desired.
>>   */
>>  bool
>> -RegisterDynamicBackgroundWorker(BackgroundWorker *worker)
>> +RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
>> +                                BackgroundWorkerHandle **handle)
>
> I'd just let the caller provide the memory, but whatever ;)

It could, but then the structure couldn't be opaque.  That wouldn't be
a disaster but this provides better protection against future API
violations.
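
Concretely, it's just the usual opaque-struct trick: the header
forward-declares the type, and only bgworker.c sees the definition
(member names here are illustrative):

    /* in the header */
    typedef struct BackgroundWorkerHandle BackgroundWorkerHandle;

    /* in bgworker.c */
    struct BackgroundWorkerHandle
    {
        int         slot;           /* index into the shmem slot array */
        uint64      generation;     /* detects reuse of the slot */
    };

Callers can hold the pointer and pfree() it, but they can't reach
into the fields, so we stay free to change the representation later.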

> None of the containing code should be allowed to raise signals
> afaics. Was there anything specific you were thinking of? Or just
> defense against leaking set_latch_on_sigusr1 = true?

That's it exactly.

>> +             for (;;)
>> +             {
>> +                     pid = GetBackgroundWorkerPid(handle);
>> +                     if (pid != InvalidPid)
>> +                             break;
>> +                     rc = WaitLatch(&MyProc->procLatch,
>> +                                                WL_LATCH_SET | WL_POSTMASTER_DEATH, 0);
>> +                     if (rc & WL_POSTMASTER_DEATH)
>> +                             return InvalidPid;
>
> This basically ignores postmaster death. With unix_latch.c that
> shouldn't be a problem because afaics its implementation should
> correctly report an error in the next WaitLatch without waiting as
> well. I am not sure about the win32 implementation; is that ok there
> as well?

I don't understand what you mean by this.  Why does it ignore postmaster death?

> Also, shouldn't we somewhere check for interrupts in the loop? This
> looks like it would run forever if we start to wait while postmaster is
> about to shut down, because afair postmaster won't start new children in
> that case. But we won't notice that we're asked to get shut down.

Oh, right.  I meant to put a CHECK_FOR_INTERRUPTS() in that loop, but forgot.
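
So the loop ought to look more like this (sketch):

    for (;;)
    {
        CHECK_FOR_INTERRUPTS();

        pid = GetBackgroundWorkerPid(handle);
        if (pid != InvalidPid)
            break;
        rc = WaitLatch(&MyProc->procLatch,
                       WL_LATCH_SET | WL_POSTMASTER_DEATH, 0);
        if (rc & WL_POSTMASTER_DEATH)
            return InvalidPid;
        ResetLatch(&MyProc->procLatch);
    }

That way a pending die() interrupt gets serviced on each iteration
instead of being ignored until the worker shows up.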

> The usual and documented pattern for latches is to do
>  * for (;;)
>  * {
>  *         ResetLatch();
>  *         if (work to do)
>  *                 Do Stuff();
>  *         WaitLatch();
>  * }
> any reason for differing here? I don't really see a problem with the
> pattern used here, but ...

Yeah, that'd be fine, too.  But I actually think it's better to reset
the latch at the bottom of the loop, because that way if the condition
is already satisfied on first entering the loop, we can exit the loop
without resetting the latch.  That saves the next person who sets the
latch the trouble of sending a signal or self-pipe byte, because the
latch is already set.
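
To spell out the difference (sketch):

    /* documented pattern: reset at the top */
    for (;;)
    {
        ResetLatch(&MyProc->procLatch);
        if (condition)
            break;
        WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
    }

    /*
     * reset at the bottom: if the condition already holds on entry,
     * we never reset, the latch stays set, and the next SetLatch()
     * can return without signalling
     */
    for (;;)
    {
        if (condition)
            break;
        WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
        ResetLatch(&MyProc->procLatch);
    }

Both variants are race-free, because the condition is always
re-checked after every reset.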

>> +     set_latch_on_sigusr1 = false;
>
> So, the set_latch_on_sigusr1 logical exists so MyProc->procLatch doesn't
> get SetLatch'ed unless we're in +WaitForBackgroundWorkerStartup()?

Right.  Noah suggested to me that we might do this unconditionally,
which would certainly be simpler and cleaner.  I dunno if it'd risk
too many spurious wake-ups in any scenario, though.  I kind of doubt
it, but I don't know.

>> +/*
>> + * When a backend asks to be notified about worker state changes, we
>> + * set a flag in its backend entry.  The background worker machinery needs
>> + * to know when such backends exit.
>> + */
>> +bool
>> +PostmasterMarkPIDForWorkerNotify(int pid)
>> +{
>> +     dlist_iter      iter;
>> +     Backend    *bp;
>> +
>> +     dlist_foreach(iter, &BackendList)
>> +     {
>> +             bp = dlist_container(Backend, elem, iter.cur);
>> +             if (bp->pid == pid)
>> +             {
>> +                     bp->bgworker_notify = true;
>> +                     return true;
>> +             }
>> +     }
>> +     return false;
>> +}
>
> Maybe we should have MyBackend?

How would that help?  This code runs in the postmaster; the point is
that we only iterate through the workers and clear bgw_notify_pid
when the backend that has just exited actually appeared in some
worker's bgw_notify_pid at some point.

> Btw, you seem to want to support this in bgworkers started by a
> bgworker. That's not going to work without some changes if the
> "intermediate" bgworker is one without a backend since those don't use
> procsignal_sigusr1_handler.

Right.  I think it's OK for now to limit it to cases where the
intermediate bgworker has a backend.  If someone else finds that
restriction unacceptable, they can fix it.

> So, that's the first crop of things that came to mind. I'll sleep over
> it, but so far I like it generally ;)

Thanks for the quick review!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


