Re: [DESIGN] ParallelAppend - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: [DESIGN] ParallelAppend
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80116EEDC@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: [DESIGN] ParallelAppend  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: [DESIGN] ParallelAppend  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Re: [DESIGN] ParallelAppend  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
I'm now designing the parallel feature of Append...

Here is one challenge. How do we determine whether each sub-plan
allows execution in the background worker context?

The commit f0661c4e8c44c0ec7acd4ea7c82e85b265447398 added
'parallel_aware' flag on Path and Plan structure.
It tells us whether the plan/path-node can perform by multiple
background worker processes concurrently, but also tells us
nothing whether the plan/path-node are safe to run on background
worker process context.

From the standpoint of parallel execution, I understand here are
three types of plan/path nodes.

Type-A) It can be performed on background worker context and  picked up by multiple worker processes concurrently.
(e.g:Parallel SeqScan)
 
Type-B) It can be performed on background worker context but  cannot be picked up by multiple worker processes.  (e.g:
non-parallelaware node)
 
Type-C) It should not be performed on background worker context.  (e.g: plan/path node with volatile functions)

The 'parallel_aware' flag allows to distinguish the type-A and
others, however, we cannot distinguish type-B from type-C.
From the standpoint of parallel append, it makes sense to launch
background workers even if all the sub-plan are type-B, with no
type-A node.

Sorry for late. I'd like to suggest to have 'int num_workers'
in Path and Plan node as a common field.
We give this field the following three meaning.
- If num_workers > 1, it is type-A node, thus, parallel aware Append node shall assign more than one workers on this
node.
- If num_workers == 1, it is type-B node, thus, more than one background worker process shall be never assigned.
- If num_workers == 0, it is type-C node, thus, planner shall give up to run this node on background worker context.

The num_workers state shall propagate to the upper node.
For example, I expect a HashJoin node that takes Parallel
SeqScan with num_workers == 4 as outer input will also have
num_worker == 4, as long as join clauses are safe to run on
background worker side.

How about the idea?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Saturday, October 31, 2015 1:35 AM
> To: Robert Haas
> Cc: pgsql-hackers@postgresql.org; Amit Kapila; Kyotaro HORIGUCHI
> Subject: Re: [HACKERS] [DESIGN] ParallelAppend
> 
> > On Wed, Oct 28, 2015 at 3:55 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > > At PGconf.EU, I could have a talk with Robert about this topic,
> > > then it became clear we have same idea.
> > >
> > >> +--------+
> > >> |sub-plan |       * Sub-Plan 1 ... Index Scan on p1
> > >> |index on *-----> * Sub-Plan 2 ... PartialSeqScan on p2
> > >> |shared   |       * Sub-Plan 2 ... PartialSeqScan on p2
> > >> |memory   |       * Sub-Plan 2 ... PartialSeqScan on p2
> > >> +---------+       * Sub-Plan 3 ... Index Scan on p3
> > >>
> > > In the above example, I put non-parallel sub-plan to use only
> > > 1 slot of the array, even though a PartialSeqScan takes 3 slots.
> > > It is a strict rule; non-parallel aware sub-plan can be picked
> > > up once.
> > > The index of sub-plan array is initialized to 0, then increased
> > > to 5 by each workers when it processes the parallel-aware Append.
> > > So, once a worker takes non-parallel sub-plan, other worker can
> > > never take the same slot again, thus, no duplicated rows will be
> > > produced by non-parallel sub-plan in the parallel aware Append.
> > > Also, this array structure will prevent too large number of
> > > workers pick up a particular parallel aware sub-plan, because
> > > PartialSeqScan occupies 3 slots; that means at most three workers
> > > can pick up this sub-plan. If 1st worker took the IndexScan on
> > > p1, and 2nd-4th worker took the PartialSeqScan on p2, then the
> > > 5th worker (if any) will pick up the IndexScan on p3 even if
> > > PartialSeqScan on p2 was not completed.
> >
> > Actually, this is not exactly what I had in mind.  I was thinking that
> > we'd have a single array whose length is equal to the number of Append
> > subplans, and each element of the array would be a count of the number
> > of workers executing that subplan.  So there wouldn't be multiple
> > entries for the same subplan, as you propose here.  To distinguish
> > between parallel-aware and non-parallel-aware plans, I plan to put a
> > Boolean flag in the plan itself.
> >
> I don't have strong preference here. Both of design can implement the
> requirement; none-parallel sub-plans are never picked up twice, and
> parallel-aware sub-plans can be picked up multiple times.
> So, I'll start with the above your suggestion.
> 
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Kouhei Kaigai
Date:
Subject: Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)
Next
From: Amit Langote
Date:
Subject: Re: [DESIGN] ParallelAppend