Home > mailing lists

Re: [DESIGN] ParallelAppend - Mailing list pgsql-hackers

From	Amit Langote
Subject	Re: [DESIGN] ParallelAppend
Date	July 28, 2015 08:50:09
Msg-id	55B717B6.60202@lab.ntt.co.jp Whole thread Raw
In response to	Re: [DESIGN] ParallelAppend (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses	Re: [DESIGN] ParallelAppend (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List	pgsql-hackers

Tree view

KaiGai-san,

On 2015-07-27 PM 11:07, Kouhei Kaigai wrote:
> 
>   Append
>    --> Funnel
>         --> PartialSeqScan on rel1 (num_workers = 4)
>    --> Funnel
>         --> PartialSeqScan on rel2 (num_workers = 8)
>    --> SeqScan on rel3
> 
>  shall be rewritten to
>   Funnel
>     --> PartialSeqScan on rel1 (num_workers = 4)
>     --> PartialSeqScan on rel2 (num_workers = 8)
>     --> SeqScan on rel3        (num_workers = 1)
> 

In the rewritten plan, are respective scans (PartialSeq or Seq) on rel1,
rel2 and rel3 asynchronous w.r.t each other? Or does each one wait for the
earlier one to finish? I would think the answer is no because then it
would not be different from the former case, right? Because the original
premise seems that (partitions) rel1, rel2, rel3 may be on different
volumes so parallelism across volumes seems like a goal of parallelizing
Append.

From my understanding of parallel seqscan patch, each worker's
PartialSeqScan asks for a block to scan using a shared parallel heap scan
descriptor that effectively keeps track of division of work among
PartialSeqScans in terms of blocks. What if we invent a PartialAppend
which each worker would run in case of a parallelized Append. It would use
some kind of shared descriptor to pick a relation (Append member) to scan.
The shared structure could be the list of subplans including the mutex for
concurrency. It doesn't sound as effective as proposed
ParallelHeapScanDescData does for PartialSeqScan but any more granular
might be complicated. For example, consider (current_relation,
current_block) pair. If there are more workers than subplans/partitions,
then multiple workers might start working on the same relation after a
round-robin assignment of relations (but of course, a later worker would
start scanning from a later block in the same relation). I imagine that
might help with parallelism across volumes if that's the case. MergeAppend
parallelization might involve a bit more complication but may be feasible
with a PartialMergeAppend with slightly different kind of coordination
among workers. What do you think of such an approach?

Thanks,
Amit

pgsql-hackers by date:

From: Heikki Linnakangas
Date: 28 July 2015, 08:00:45
Subject: Re: Feature - Index support on an lquery field (from the ltree module)

From: Ashutosh Bapat
Date: 28 July 2015, 09:31:10
Subject: Re: Autonomous Transaction is back

Re: [DESIGN] ParallelAppend - Mailing list pgsql-hackers

Previous

Next