Re: parallel mode and parallel contexts - Mailing list pgsql-hackers

From Robert Haas
Subject Re: parallel mode and parallel contexts
Msg-id CA+TgmoaAOHSmwJtJQ5unaeR=Z_saQX6SThi7oZtfFEsDsN=G1w@mail.gmail.com
In response to Re: parallel mode and parallel contexts  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: parallel mode and parallel contexts
List pgsql-hackers
On Wed, Jan 21, 2015 at 2:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Jan 20, 2015 at 9:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Jan 20, 2015 at 9:41 AM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>> > It seems [WaitForBackgroundWorkerShutdown] can wait
>> > forever.
>> > Assume one of the workers is not able to start (not able to attach
>> > to shared memory, or for some other reason); then the status returned by
>> > GetBackgroundWorkerPid() will be BGWH_NOT_YET_STARTED,
>> > and after that it can wait forever in WaitLatch.
>>
>> I don't think that's right.  The status only remains
>> BGWH_NOT_YET_STARTED until the postmaster forks the child process.
>
> I think the control flow can reach the above location before the
> postmaster has forked all the workers.  Consider a case where
> we have launched all workers during the ExecutorStart phase,
> and then, before the postmaster can start all of them, an error
> occurs in the master backend, which then tries to abort the transaction
> and destroy the parallel context.  At that moment it will get the
> above status and wait forever in the above code.
>
> I am able to reproduce the above scenario with a debugger by using
> the parallel_seqscan patch.

Hmm.  Well, if you can reproduce it, there clearly must be a bug.  But
I'm not quite sure where.  What should happen in that case is that the
process that started the worker has to wait for the postmaster to
actually start it, and then after that for the new process to die, and
then it should return.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
