Re: Parallel Seq Scan - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: Parallel Seq Scan
Msg-id: CAA4eK1JSSonzKSN=L-DWuCEWdLqkbMUjvfpE3fGW2tn2zPo2RQ@mail.gmail.com
In response to: Re: Parallel Seq Scan (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List: pgsql-hackers
On Mon, Mar 16, 2015 at 12:58 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 16-03-2015 PM 04:14, Amit Kapila wrote:
> > On Mon, Mar 16, 2015 at 9:40 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
> > wrote:
> >> Or if the just-detached queue happens to be the last one, we'll make
> >> shm_mq_receive() read from a potentially already-detached queue in the
> >> very next iteration.
> >
> > Won't the last-queue case already be handled by the below code:
> > else
> > {
> >     --funnel->nqueues;
> >     if (funnel->nqueues == 0)
> >     {
> >         if (done != NULL)
> >             *done = true;
> >         return NULL;
> >     }
> >
>
> Actually I meant "currently the last" or:
>
>     funnel->nextqueue == funnel->nqueues - 1
>
> So the code you quote would only take care of a subset of the cases.
>

Fixed this issue by resetting funnel->nextqueue to zero (as per an
off-list discussion with Robert), so that reading restarts from the first
queue in such a case.

>
> >> I can't seem to really figure out the other problem of waiting forever in
> >> WaitLatch()
> >>
> >
> > The reason seems to be that, for certain scenarios, the way we set the
> > latch before exiting needs some more thought.  Currently we are setting
> > the latch in HandleParallelMessageInterrupt(), but that doesn't seem to
> > be sufficient.
> >
>
> How about shm_mq_detach() called from ParallelQueryMain() right after
> exec_parallel_stmt() returns? Doesn't that do the SetLatch() that needs to be
> done by a worker?
>

Fixed this issue by not waiting on the latch in the case of detached queues.
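
Again purely as a sketch of the idea, not the patch's code: the leader
polls every queue in non-blocking mode and only falls back to WaitLatch()
once all still-attached queues have come up empty; a detached queue is
dropped immediately rather than waited on, because its worker will never
set the leader's latch again.  funnel_drop_current_queue() is the
illustrative helper from the sketch above; the rest is an assumption about
the surrounding executor code.

#include "postgres.h"
#include "miscadmin.h"
#include "storage/latch.h"
#include "storage/shm_mq.h"

/*
 * Sketch of the leader-side read loop; error handling and turning the raw
 * message back into a tuple are omitted.
 */
static void *
funnel_read_next(TupleQueueFunnel *funnel, bool *done)
{
    int     nvisited = 0;       /* queues polled since the last success */

    if (done != NULL)
        *done = false;

    for (;;)
    {
        shm_mq_handle *mqh = funnel->queue[funnel->nextqueue];
        shm_mq_result res;
        Size        nbytes;
        void       *data;

        /* Always poll without blocking; the leader decides when to sleep. */
        res = shm_mq_receive(mqh, &nbytes, &data, true);

        if (res == SHM_MQ_SUCCESS)
            return data;        /* caller rebuilds the tuple from this */

        if (res == SHM_MQ_DETACHED)
        {
            /*
             * Do not WaitLatch() for a detached queue: its worker is gone
             * and will never set our latch.  Just forget the queue.
             */
            if (funnel_drop_current_queue(funnel))
            {
                if (done != NULL)
                    *done = true;
                return NULL;
            }
            nvisited = 0;
            continue;
        }

        /* SHM_MQ_WOULD_BLOCK: move on to the next queue, round robin. */
        funnel->nextqueue = (funnel->nextqueue + 1) % funnel->nqueues;

        /* Sleep only once every still-attached queue has come up empty. */
        if (++nvisited >= funnel->nqueues)
        {
            WaitLatch(MyLatch, WL_LATCH_SET, 0);
            ResetLatch(MyLatch);
            CHECK_FOR_INTERRUPTS();
            nvisited = 0;
        }
    }
}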

Apart from these fixes, the latest patch contains the below changes:

1.  Integrated with assess-parallel-safety-v4.patch [1].  To test
     with this patch, please remember to comment out the below line
     in this patch; otherwise it will always enter parallel mode.
     + glob->parallelModeNeeded = glob->parallelModeOK; /* XXX JUST FOR TESTING */

2.  Handle the case where enough workers are not available for
     execution of the Funnel node.  In such a case, it will run the plan
     with the available number of workers, and if no worker is available,
     it will just run the local partial seq scan node (see the first
     sketch after this list).  We can invent a more advanced solution
     for this problem if there is a strong need after the first version
     goes in.

3.  Support for pg_stat_statements (it will show the stats for parallel
     statements).  To handle this case, we need to share buffer usage
     stats from all the workers (see the second sketch after this list).
     Currently the patch collects buffer usage stats by default (even
     when pg_stat_statements is not enabled), as that is quite cheap;
     we can make it conditional in the future if required.
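
For (2), here is a very rough control-flow sketch of the degrade-gracefully
behaviour at Funnel startup (the first sketch referred to above).  Every
name below is a placeholder chosen for illustration, not a function or
field from the patch.

#include <stdbool.h>

/* Placeholder execution-state fields, only to illustrate the decision. */
typedef struct FunnelRunState
{
    bool    use_parallel;   /* read tuples from workers? */
    int     nworkers;       /* workers actually running */
} FunnelRunState;

/* Placeholder for "register and start as many bgworkers as are free". */
extern int  start_available_workers(int nworkers_requested);

static void
funnel_start(FunnelRunState *node, int nworkers_requested)
{
    /* Ask for the planned number of workers, but settle for fewer. */
    int     nworkers_got = start_available_workers(nworkers_requested);

    if (nworkers_got == 0)
    {
        /*
         * No background workers available right now: run the partial seq
         * scan in this backend alone, which then has to scan every block.
         */
        node->use_parallel = false;
        node->nworkers = 0;
        return;
    }

    /* Otherwise read tuples from however many workers we actually got. */
    node->use_parallel = true;
    node->nworkers = nworkers_got;
}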

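For (3), a sketch of the leader-side accumulation (the second sketch
referred to above): the leader folds each worker's buffer usage counters
into its own pgBufferUsage, which is what pg_stat_statements reads.  How
the per-worker BufferUsage travels through dynamic shared memory is
omitted, and the function below is illustrative rather than the patch's
actual code.

#include "postgres.h"
#include "executor/instrument.h"

/*
 * Sketch only: add one worker's buffer usage (already computed by the
 * worker as end counters minus start counters) into the leader's totals.
 */
static void
funnel_accumulate_worker_buffer_usage(const BufferUsage *w)
{
    pgBufferUsage.shared_blks_hit += w->shared_blks_hit;
    pgBufferUsage.shared_blks_read += w->shared_blks_read;
    pgBufferUsage.shared_blks_dirtied += w->shared_blks_dirtied;
    pgBufferUsage.shared_blks_written += w->shared_blks_written;
    pgBufferUsage.local_blks_hit += w->local_blks_hit;
    pgBufferUsage.local_blks_read += w->local_blks_read;
    pgBufferUsage.local_blks_dirtied += w->local_blks_dirtied;
    pgBufferUsage.local_blks_written += w->local_blks_written;
    pgBufferUsage.temp_blks_read += w->temp_blks_read;
    pgBufferUsage.temp_blks_written += w->temp_blks_written;
    INSTR_TIME_ADD(pgBufferUsage.blk_read_time, w->blk_read_time);
    INSTR_TIME_ADD(pgBufferUsage.blk_write_time, w->blk_write_time);
}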

So the patches have to be applied in the below sequence:
HEAD Commit-id : 8d1f2390
parallel-mode-v8.1.patch [2]
assess-parallel-safety-v4.patch [1]
parallel-heap-scan.patch [3]
parallel_seqscan_v11.patch (Attached with this mail)

The reason for not using the latest commit in HEAD is that the latest
version of the assess-parallel-safety patch did not apply there, so I
generated the patch at the commit-id where I could apply that patch
successfully.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
