Re: [DESIGN] ParallelAppend - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [DESIGN] ParallelAppend
Date
Msg-id CA+Tgmoa49vp61UyC-DmPhiF8bq+Q=sKA7oKg+h53ssVY1FAqxw@mail.gmail.com
In response to Re: [DESIGN] ParallelAppend  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
On Wed, Oct 28, 2015 at 3:55 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> At PGconf.EU, I had a talk with Robert about this topic,
> and it became clear we have the same idea.
>
>> +---------+
>> |sub-plan |       * Sub-Plan 1 ... Index Scan on p1
>> |index on *-----> * Sub-Plan 2 ... PartialSeqScan on p2
>> |shared   |       * Sub-Plan 2 ... PartialSeqScan on p2
>> |memory   |       * Sub-Plan 2 ... PartialSeqScan on p2
>> +---------+       * Sub-Plan 3 ... Index Scan on p3
>>
> In the above example, a non-parallel sub-plan uses only
> 1 slot of the array, whereas a PartialSeqScan takes 3 slots.
> This is a strict rule: a non-parallel-aware sub-plan can be picked
> up only once.
> The index into the sub-plan array is initialized to 0, then incremented
> by each worker, up to 5, when it processes the parallel-aware Append.
> So, once a worker takes a non-parallel sub-plan, no other worker can
> take the same slot again; thus, no duplicate rows will be
> produced by a non-parallel sub-plan in the parallel-aware Append.
> Also, this array structure prevents too many workers from
> picking up a particular parallel-aware sub-plan, because
> PartialSeqScan occupies 3 slots; that means at most three workers
> can pick up this sub-plan. If the 1st worker took the IndexScan on
> p1, and the 2nd-4th workers took the PartialSeqScan on p2, then the
> 5th worker (if any) will pick up the IndexScan on p3 even if
> the PartialSeqScan on p2 has not completed.

Actually, this is not exactly what I had in mind.  I was thinking that
we'd have a single array whose length is equal to the number of Append
subplans, and each element of the array would be a count of the number
of workers executing that subplan.  So there wouldn't be multiple
entries for the same subplan, as you propose here.  To distinguish
between parallel-aware and non-parallel-aware plans, I plan to put a
Boolean flag in the plan itself.
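
To make the difference concrete, here is a rough sketch of the
one-counter-per-subplan idea. It is only an illustration, not code from
any patch: the names (AppendSketch, choose_subplan, release_subplan,
MAX_SUBPLANS) are made up, the "fewest workers first" selection policy
is an assumption the description above does not specify, and the
locking that a real shared-memory structure would need is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SUBPLANS 16         /* hypothetical fixed upper bound */

/* Hypothetical shared state: one entry per Append subplan. */
typedef struct AppendSketch
{
    int     nplans;
    bool    parallel_aware[MAX_SUBPLANS];   /* flag carried by the plan */
    bool    finished[MAX_SUBPLANS];         /* subplan has no more rows */
    int     nworkers[MAX_SUBPLANS];         /* workers now executing it */
} AppendSketch;

/*
 * Choose a subplan for one worker and bump its counter.
 * Returns the subplan index, or -1 if nothing is left to run.
 * A non-parallel-aware subplan is eligible only while no worker
 * has it; among eligible subplans the least-loaded one wins
 * (assumed policy). A real version would hold a lock here.
 */
int
choose_subplan(AppendSketch *st)
{
    int     best = -1;

    for (int i = 0; i < st->nplans; i++)
    {
        if (st->finished[i])
            continue;
        if (!st->parallel_aware[i] && st->nworkers[i] > 0)
            continue;
        if (best < 0 || st->nworkers[i] < st->nworkers[best])
            best = i;
    }
    if (best >= 0)
        st->nworkers[best]++;
    return best;
}

/* A worker stops running subplan i; 'done' means it was exhausted. */
void
release_subplan(AppendSketch *st, int i, bool done)
{
    assert(i >= 0 && i < st->nplans && st->nworkers[i] > 0);
    st->nworkers[i]--;
    if (done)
        st->finished[i] = true;
}
```

With the three subplans from the example (Index Scan on p1,
PartialSeqScan on p2, Index Scan on p3), each subplan appears in the
array exactly once. Under the assumed policy, each index scan is handed
to at most one worker, and every additional worker lands on p2, whose
counter simply keeps growing.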

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


