Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Msg-id CAA4eK1+S8NSBL9OLJn+8dAF6Hrn0v6DPG435g7iXidwgfTe7=Q@mail.gmail.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan
List pgsql-hackers
On Wed, Mar 4, 2015 at 6:17 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Feb 22, 2015 at 6:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Tue, Feb 17, 2015 at 11:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > > My only "problem" with that description is that I think workers will
> > > have to work on more than one node - it'll be entire subtrees of the
> > > executor tree.
> >
> > Amit and I had a long discussion about this on Friday while in Boston
> > together.  I previously argued that the master and the slave should be
> > executing the same node, ParallelSeqScan.  However, Amit argued
> > persuasively that what the master is doing is really pretty different
> > from what the worker is doing, and that they really ought to be
> > running two different nodes.  This led us to cast about for a better
> > design, and we came up with something that I think will be much
> > better.
> >
> > The basic idea is to introduce a new node called Funnel.  A Funnel
> > node will have a left child but no right child, and its job will be to
> > fire up a given number of workers.  Each worker will execute the plan
> > which is the left child of the funnel.  The funnel node itself will
> > pull tuples from all of those workers, and can also (if there are no
> > tuples available from any worker) execute the plan itself.  
>
> I have modified the patch to introduce a Funnel node (with a
> PartialSeqScan node as its left child).  Apart from that, some other
> noticeable changes based on feedback include:
> a) The master backend forms and sends the planned stmt to each worker;
> the earlier patch used to send individual elements and form the planned
> stmt in each worker.
> b) Passed tuples directly via the tuple queue instead of going via
> the FE-BE protocol.
> c) Removed the restriction on expressions in the target list.
> d) Introduced a parallelmodeneeded flag in the plannerglobal structure
> and set it for the Funnel plan.
>
> There is still some work left, such as integrating with the
> access-parallel-safety patch (using the parallelmodeok flag to decide
> whether a parallel path can be generated; Enter/Exit parallel mode is
> still done during execution of the Funnel node).
>
> I think these are minor points which can be fixed once we decide
> on the other major parts of the patch.  Please find the modified patch
> attached to this mail.
>
> Note -
> This patch is based on Head (commit-id: d1479011) +
> parallel-mode-v6.patch [1] + parallel-heap-scan.patch[2]
>
> [1]
> http://www.postgresql.org/message-id/CA+TgmobCMwFOz-9=hFv=hJ4SH7p=5X6Ga5V=WtT8=huzE6C+Mg@mail.gmail.com
> [2]
> http://www.postgresql.org/message-id/CA+TgmoYJETgeAXUsZROnA7BdtWzPtqExPJNTV1GKcaVMgSdhug@mail.gmail.com
>

Assuming the previous patch is headed in the right direction, I have
enabled join support in the patch and done some minor cleanup, which
results in the attached new version.

It is based on commit-id: 5a2a48f0 + parallel-mode-v7.patch +
parallel-heap-scan.patch
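
For readers following the thread, the Funnel behaviour quoted above can
be sketched roughly as below.  This is only an illustration and not code
from the patch: Python threads stand in for background workers,
queue.Queue stands in for the shared-memory tuple queue, and the names
SharedScan, partial_seq_scan, worker_main and funnel are hypothetical.
It assumes each block is claimed by exactly one participant, and that
the funnel runs the plan itself only when no worker has a tuple ready.

```python
import queue
import threading

DONE = object()  # sentinel: a worker puts this on its queue when it is finished


class SharedScan:
    """Hypothetical shared scan state: hands each block to exactly one caller."""

    def __init__(self, blocks):
        self._blocks = blocks
        self._next = 0
        self._lock = threading.Lock()

    def next_block(self):
        with self._lock:
            if self._next >= len(self._blocks):
                return None
            block = self._blocks[self._next]
            self._next += 1
            return block


def partial_seq_scan(shared, qual):
    """Stand-in for the Funnel's left child: scan only the blocks we claim."""
    while True:
        block = shared.next_block()
        if block is None:
            return
        for tup in block:
            if qual(tup):
                yield tup


def worker_main(shared, qual, out):
    """Each worker runs the same child plan and pushes tuples to its queue."""
    for tup in partial_seq_scan(shared, qual):
        out.put(tup)
    out.put(DONE)


def funnel(blocks, qual, nworkers):
    """Pull tuples from all workers; run the plan locally when none is ready."""
    shared = SharedScan(blocks)
    queues = [queue.Queue() for _ in range(nworkers)]
    threads = [threading.Thread(target=worker_main, args=(shared, qual, q))
               for q in queues]
    for t in threads:
        t.start()

    local = partial_seq_scan(shared, qual)  # the master's own copy of the plan
    local_done = False
    live = list(queues)                     # queues whose worker is not finished
    while live or not local_done:
        got_tuple = False
        for q in list(live):
            try:
                item = q.get_nowait()
            except queue.Empty:
                continue
            if item is DONE:
                live.remove(q)
            else:
                got_tuple = True
                yield item
        # No worker had a tuple ready: make progress on the plan ourselves.
        if not got_tuple and not local_done:
            tup = next(local, DONE)
            if tup is DONE:
                local_done = True
            else:
                yield tup
    for t in threads:
        t.join()


if __name__ == "__main__":
    blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
    print(sorted(funnel(blocks, lambda t: t % 2 == 0, nworkers=2)))
```

The order in which tuples come out depends on scheduling, so the sketch
only promises that every qualifying tuple is returned exactly once.
With nworkers=0 the funnel simply runs the whole scan itself.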

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
