Re: [HACKERS] parallel.c oblivion of worker-startup failures - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: [HACKERS] parallel.c oblivion of worker-startup failures
Date	January 24, 2018 23:05:01
Msg-id	CAH2-Wz=3aLj3FcneJBJqk3Qncs8VHHBsXpDJh8epDJ_CmjMgVw@mail.gmail.com
In response to	Re: [HACKERS] parallel.c oblivion of worker-startup failures (Thomas Munro <thomas.munro@enterprisedb.com>)
List	pgsql-hackers

On Wed, Jan 24, 2018 at 1:57 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Jan 24, 2018 at 5:25 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> If there were some way for the postmaster to cause reason
>> PROCSIG_PARALLEL_MESSAGE to be set in the leader process instead of
>> just notification via kill(SIGUSR1) when it fails to fork a parallel
>> worker, we'd get (1) for free in any latch/CFI loop code.  But I
>> understand that we can't do that by project edict.
>
> Based on the above observation, here is a terrible idea you'll all
> hate.  It is pessimistic and expensive: it thinks that every latch
> wake might be the postmaster telling us it's failed to fork() a
> parallel worker, until we've seen a sign of life on every worker's
> error queue.  Untested illustration code only.  This is the only way
> I've come up with to discover fork failure in any latch/CFI loop (ie
> without requiring client code to explicitly try to read either error
> or tuple queues).

The question, I suppose, is how expensive this is in the real world.
If it's actually not a cost that anybody is likely to notice, then I
think we should pursue this approach. I wouldn't put too much weight
on keeping this simple for users of the parallel infrastructure,
though, because something like Amit's WaitForParallelWorkersToAttach()
idea still seems acceptable. "Call this function before trusting the
finality of nworkers_launched" isn't too onerous a rule to have to
follow.
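
To make that rule concrete, here is a toy model in plain C. It is only a sketch under stated assumptions: every name in it (ToyParallelContext, toy_poll, toy_wait_for_workers_to_attach, fail_index) is hypothetical and none of it is PostgreSQL's actual API or Amit's proposed patch. The point it illustrates is that nworkers_launched is provisional until each launched worker has either attached or been reported as failed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; these are NOT PostgreSQL's real types. */
typedef enum
{
    WORKER_STARTING = 0,        /* launched, outcome not yet known */
    WORKER_ATTACHED,            /* worker showed a sign of life */
    WORKER_FAILED               /* e.g. fork() failed in the postmaster */
} ToyWorkerState;

#define TOY_MAX_WORKERS 8

typedef struct
{
    int            nworkers_launched;   /* provisional until attach is confirmed */
    int            fail_index;          /* worker that fails to start, or -1 */
    ToyWorkerState state[TOY_MAX_WORKERS];
} ToyParallelContext;

/*
 * Hypothetical stand-in for one latch wake: resolve the first worker that
 * is still starting, marking it failed if it is the designated failure.
 */
static void
toy_poll(ToyParallelContext *ctx)
{
    for (int i = 0; i < ctx->nworkers_launched; i++)
    {
        if (ctx->state[i] == WORKER_STARTING)
        {
            ctx->state[i] = (i == ctx->fail_index) ? WORKER_FAILED
                                                   : WORKER_ATTACHED;
            return;
        }
    }
}

/*
 * Block until every launched worker has attached or failed.  Returns the
 * number of attached workers, or -1 if any worker failed to start (real
 * code would presumably raise an error here instead).
 */
static int
toy_wait_for_workers_to_attach(ToyParallelContext *ctx)
{
    assert(ctx->nworkers_launched <= TOY_MAX_WORKERS);

    for (;;)
    {
        bool pending = false;
        int  attached = 0;

        for (int i = 0; i < ctx->nworkers_launched; i++)
        {
            if (ctx->state[i] == WORKER_FAILED)
                return -1;
            if (ctx->state[i] == WORKER_STARTING)
                pending = true;
            else
                attached++;
        }
        if (!pending)
            return attached;    /* only now is the count final */
        toy_poll(ctx);
    }
}
```

A caller in this model would launch workers, call toy_wait_for_workers_to_attach(), and only then size its shared state or wait on barriers based on the returned count.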

-- 
Peter Geoghegan
