Parallel Append can break run-time partition pruning - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Parallel Append can break run-time partition pruning |
Date | |
Msg-id | CAApHDvqSchs+ubdybcfFaSPB++EA7kqMaoqajtP0GtZvzOOR3g@mail.gmail.com Whole thread Raw |
Responses |
Re: Parallel Append can break run-time partition pruning
|
List | pgsql-hackers |
I had a report from the wilds that run-time partition pruning was not working in certain cases. After some investigation and obtaining the mockup of the actual case, I discovered that the problem was down to accumulate_append_subpath() hitting the case where it does not pullup a Parallel Append where the first parallel node is > 0. What's actually happening is that the plan is left with a nested Append, and in this particular case, the top-level Append only has a single subpath, to which the code for 8edd0e794 (Suppress Append and MergeAppend plan nodes that have a single child) causes the nested Append to be pulled up to become the main Append. This causes run-time pruning to break since we only attach the pruning information to the top-level Append. The most simplified test case I can find to demonstrate this issue is: create table list (a int, b int) partition by list(a); create table list_12 partition of list for values in(1,2) partition by list(a); create table list_12_1 partition of list_12 for values in(1); create table list_12_2 partition of list_12 for values in(2); insert into list select 2,0 from generate_Series(1,1000000) x; vacuum analyze list; explain (analyze on, costs off, timing off, summary off) select * from list where a = (select 1) and b > 0; -- force the 2nd subnode of the Append to be non-parallel. alter table list_12_1 set (parallel_workers=0); explain (analyze on, costs off, timing off, summary off) select * from list where a = (select 1) and b > 0; The results of this in master are: postgres=# explain (analyze on, costs off, timing off, summary off) select * from list where a = (select 1) and b > 0; QUERY PLAN --------------------------------------------------------------------------- Gather (actual rows=0 loops=1) Workers Planned: 2 Params Evaluated: $0 Workers Launched: 2 InitPlan 1 (returns $0) -> Result (actual rows=1 loops=1) -> Parallel Append (actual rows=0 loops=3) -> Parallel Seq Scan on list_12_2 list_2 (never executed) Filter: ((b > 0) AND (a = $0)) -> Parallel Seq Scan on list_12_1 list_1 (actual rows=0 loops=1) Filter: ((b > 0) AND (a = $0)) (11 rows) postgres=# alter table list_12_1 set (parallel_workers=0); ALTER TABLE postgres=# explain (analyze on, costs off, timing off, summary off) select * from list where a = (select 1) and b > 0; QUERY PLAN --------------------------------------------------------------------------- Gather (actual rows=0 loops=1) Workers Planned: 2 Params Evaluated: $0 Workers Launched: 2 InitPlan 1 (returns $0) -> Result (actual rows=1 loops=1) -> Parallel Append (actual rows=0 loops=3) -> Seq Scan on list_12_1 list_1 (actual rows=0 loops=1) Filter: ((b > 0) AND (a = $0)) -> Parallel Seq Scan on list_12_2 list_2 (actual rows=0 loops=3) Filter: ((b > 0) AND (a = $0)) Rows Removed by Filter: 333333 (12 rows) Notice that we don't get "(never executed)" for list_12_2 in the 2nd case. I'm a bit divided on what the correct fix is. If I blame Parallel Append for not trying hard enough to pull up the lower Append in accumulate_append_subpath(), then clearly the parallel append code is to blame. However, perhaps run-time pruning should be tagging on PartitionPruneInfo to more than top-level Appends. Fixing the latter case, code-wise is about as simple as removing the "rel->reloptkind == RELOPT_BASEREL &&" line from create_append_plan(). Certainly, if the outer Append hadn't been a single subpath Append, then we wouldn't have pulled up the lower-level Append, so perhaps we should be run-time pruning lower-level ones too. What do other people think? (copying in Robert and Amit K due to their work on Parallel Append, Tom as I seem to remember him complaining about accumulate_append_subpath() at some point and Amit L because... partitioning...) David
pgsql-hackers by date: