Re: Wait for parallel workers to attach - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Wait for parallel workers to attach
Msg-id CAA4eK1J_3mBfGReJ76DLavFgXgG7a2EifQ9uTXdBg+cxs+aRtQ@mail.gmail.com
In response to Re: Wait for parallel workers to attach  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Wait for parallel workers to attach
List pgsql-hackers
On Mon, Jan 29, 2018 at 8:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Jan 27, 2018 at 3:14 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> During the recent development of a parallel operation (parallel create
>> index)[1], a need has arisen for $SUBJECT.  The idea is to allow the
>> leader backend to rely on the number of workers that are successfully
>> started.  This API allows the leader to wait for all the workers to
>> start or fail, even if one of the workers fails to attach.  We consider
>> workers started/attached once they are attached to the error queue.
>> This will ensure that any error after the workers are attached won't
>> be silently ignored by the leader.
>
> known_started_workers looks a lot like any_message_received.  Perhaps
> any_message_received should be renamed to known_started_workers and
> reused here.
>

Sure, that sounds good to me.  Do you prefer a separate patch for
renaming any_message_received to known_started_workers or is it okay
to have it along with the main patch?

>  After all, if we know that a worker was started, there's
> no need for WaitForParallelWorkersToFinish to again call
> GetBackgroundWorkerPid() for it.
>

I think in the above sentence you meant to say
WaitForParallelWorkersToAttach, not WaitForParallelWorkersToFinish.
Am I right?

> I think that you shouldn't need the 10ms delay loop; waiting forever
> should work.  If a worker fails to start, the postmaster should send
> SIGUSR1 which should set our latch.
>

I don't quite follow what you are suggesting here.  The wait loop
is intended for the case where some workers have not started yet.  We
want to wait for some time before checking again whether they have
started, so that the leader doesn't busy-loop.  I think in most cases
we won't need to wait at all, but in corner cases where the postmaster
hasn't yet had a chance to start a worker, the delay avoids spinning.
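
To make the two alternatives concrete, here is a toy, standalone sketch
of a leader waiting for a worker's start-or-fail notification, either
with a 10ms timed wait or by blocking until it is woken.  It does not
use PostgreSQL's latch or background worker APIs: ToyState, ToyShared,
toy_starter, toy_wait and toy_run are hypothetical names, a pthread
mutex/condition variable stands in for the latch, and a helper thread
stands in for the postmaster's notification.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical worker states, loosely modeled on a bgworker status. */
typedef enum { TOY_NOT_YET_STARTED, TOY_STARTED, TOY_FAILED } ToyState;

/* Toy stand-in for a latch plus the state it protects. */
typedef struct
{
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    ToyState        state;      /* guarded by mutex */
    ToyState        outcome;    /* what the starter thread will report */
} ToyShared;

/* Plays the postmaster: after 50ms, record the outcome and wake the leader. */
static void *
toy_starter(void *arg)
{
    ToyShared      *sh = arg;
    struct timespec delay = {0, 50 * 1000000L};

    nanosleep(&delay, NULL);
    pthread_mutex_lock(&sh->mutex);
    sh->state = sh->outcome;
    pthread_cond_signal(&sh->cond);     /* analogous to setting the latch */
    pthread_mutex_unlock(&sh->mutex);
    return NULL;
}

/* Leader side: block until the worker is known to have started or failed. */
static ToyState
toy_wait(ToyShared *sh, bool use_timeout, int *wakeups)
{
    ToyState        result;

    pthread_mutex_lock(&sh->mutex);
    while (sh->state == TOY_NOT_YET_STARTED)
    {
        if (use_timeout)
        {
            /* Timed variant: wake at least every 10ms and recheck. */
            struct timespec deadline;

            clock_gettime(CLOCK_REALTIME, &deadline);
            deadline.tv_nsec += 10 * 1000000L;
            if (deadline.tv_nsec >= 1000000000L)
            {
                deadline.tv_sec += 1;
                deadline.tv_nsec -= 1000000000L;
            }
            pthread_cond_timedwait(&sh->cond, &sh->mutex, &deadline);
        }
        else
        {
            /* Wait-forever variant: rely on being woken by the notifier. */
            pthread_cond_wait(&sh->cond, &sh->mutex);
        }
        (*wakeups)++;
    }
    result = sh->state;
    pthread_mutex_unlock(&sh->mutex);
    return result;
}

/* Run one scenario and return the state the leader observed. */
ToyState
toy_run(bool use_timeout, ToyState outcome, int *wakeups)
{
    ToyShared       sh;
    pthread_t       tid;
    ToyState        result;
    int             rc;

    pthread_mutex_init(&sh.mutex, NULL);
    pthread_cond_init(&sh.cond, NULL);
    sh.state = TOY_NOT_YET_STARTED;
    sh.outcome = outcome;
    *wakeups = 0;

    rc = pthread_create(&tid, NULL, toy_starter, &sh);
    assert(rc == 0);
    (void) rc;

    result = toy_wait(&sh, use_timeout, wakeups);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&sh.cond);
    pthread_mutex_destroy(&sh.mutex);
    return result;
}
```

Calling toy_run(false, TOY_FAILED, &n) and toy_run(true, TOY_FAILED, &n)
should both return TOY_FAILED; the difference is only in how often the
leader wakes up while nothing has changed.  Neither variant busy-loops,
since both sleep between checks.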

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

