Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1JLv+2y1AwjhsQPFisKhBF7jWF_Nzirmzyno9uPBRCpGw@mail.gmail.com
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parallel Seq Scan
Re: Parallel Seq Scan
List pgsql-hackers
On Mon, Mar 30, 2015 at 8:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 18, 2015 at 11:43 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> I think I figured out the problem.  That fix only helps in the case
> >> where the postmaster noticed the new registration previously but
> >> didn't start the worker, and then later notices the termination.
> >> What's much more likely to happen is that the worker is started and
> >> terminated so quickly that both happen before we create a
> >> RegisteredBgWorker for it.  The attached patch fixes that case, too.
> >
> > Patch fixes the problem and now for Rescan, we don't need to Wait
> > for workers to finish.
>
> I realized that there is a problem with this.  If an error occurs in
> one of the workers just as we're deciding to kill them all, then the
> error won't be reported. Also, the new code to propagate
> XactLastRecEnd won't work right, either.  I think we need to find a
> way to shut down the workers cleanly.  The idea generally speaking
> should be:
>
> 1. Tell all of the workers that we want them to shut down gracefully
> without finishing the scan.
>
> 2. Wait for them to exit via WaitForParallelWorkersToFinish().
>
> My first idea about how to implement this is to have the master detach
> all of the tuple queues via a new function TupleQueueFunnelShutdown().
> Then, we should change tqueueReceiveSlot() so that it does not throw
> an error when shm_mq_send() returns SHM_MQ_DETACHED.  We could modify
> the receiveSlot method of a DestReceiver to return bool rather than
> void; a "true" value can mean "continue processing" whereas a "false"
> value can mean "stop early, just as if we'd reached the end of the
> scan".
>

I have implemented this idea (note that I had to expose a new API,
shm_mq_from_handle, as TupleQueueFunnel stores shm_mq_handle* and
we need the shm_mq* to call shm_mq_detach), and apart from this I have
fixed the other problems reported on this thread:

1. Execute the initPlan in the master backend and then pass the
required PARAM_EXEC parameter values to the workers.
2. Avoid consuming DSMs by freeing the parallel context after
the last tuple is fetched.
3. Allow execution of a Result node in the worker backend, as one can
be added as a gating filter on top of a PartialSeqScan.
4. Merged the parallel heap scan descriptor patch.

To apply the patch, please use the following sequence:

HEAD Commit-Id: 4d930eee
parallel-mode-v9.patch [1]
assess-parallel-safety-v4.patch [2]  (don't forget to run fixpgproc.pl in the patch)
parallel_seqscan_v14.patch (Attached with this mail)



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com