Home > mailing lists

Make Append Cost aware of some run time partition prune case - Mailing list pgsql-hackers

From	Andy Fan
Subject	Make Append Cost aware of some run time partition prune case
Date	November 10, 2020 00:43:59
Msg-id	CAKU4AWp6Z087YCQj4uN69Ct61arCtcABYNgU_7Mv1Gu-grVjHw@mail.gmail.com Whole thread Raw
List	pgsql-hackers

Tree view

Currently the cost model of append path sums the cost/rows for all the
subpaths, it usually works well until we run into the run-time partition prune
case. The first result is that generic plans will rarely be used for some cases.
For instance, SELECT * FROM p WHERE pkey = $1; The custom plan will only

count the cost of one partition, however generic plan will count the cost for all the
partitions even we are sure that only 1 partition will survive. Another impact
is that planners may choose a wrong plan. for example, SELECT * FROM t1, p

WHERE t1.a = p.pkey; The cost/rows of t1 nest loop p is estimated highly
improperly. This patch wants to help this case to some extent.

The basic idea is we need to estimate how many partitions will survive after
considering the runtime partition prune. After that, we will adjust the
cost/rows accordingly. IIUC, this follows Robert's idea at [1].

However there are too many special cases which is hard
to handle, but luckily the most common case would be sth like partykey = $1,
which we can estimate well, this patch is aimed to handle that part specially.
I supposed 75% partitions will survive for other cases arbitrarily, actually I
want to use 100% to be the same as the current situation. If we decide to

handle their special case differently, PartitionedRelQual has to be redefined

a bit (see comments around that.) which is the main data struct in the implementation.

The attached is the minimum runnable patch. There are some obvious issue here:

1. MergeAppend path is not handled on purpose, it will be done at last.
2. The cost for Parallel Append Path is not adjusted, I'm not sure about what is
the best way to do it. So there are 2 test cases in sql/partition_prune.out
failed due to this, I just echo the 'results' to expected' so that you can
know which one is impacted.

Here are some simple test results.

create table a1 partition of a for values in (1) partition by range (b, c);
create table a1_1 partition of a1 for values from (1, 0) to (1, 10);
create table a1_2 partition of a1 for values from (2, 0) to (2, 10);

create table a2 partition of a for values in (2) partition by range (b, c);
create table a2_1 partition of a2 for values from (1, 0) to (1, 10);
create table a2_2 partition of a2 for values from (2, 0) to (2, 10);

insert into a select 1, i%2 + 1, i % 10 from generate_series(1, 10000) i;
insert into a select 2, i%2 + 1, i % 10 from generate_series(1, 10000) i;

analyze a;

set plan_cache_mode to force_generic_plan;

prepare s as select * from a where a = $1;
PREPARE
explain execute s(1);
QUERY PLAN
-------------------------------------------------------------------
Append (cost=0.00..231.00 rows=10000 width=12) << both rows/cost is adjusted.
Subplans Removed: 2
-> Seq Scan on a1_1 a_1 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Seq Scan on a1_2 a_2 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
(6 rows)

prepare s4 as select * from a where a = $1 and b = $2 and c = $3;
PREPARE
explain execute s4(1, 1, 2);
QUERY PLAN
--------------------------------------------------------------------
Append (cost=0.00..120.50 rows=1000 width=12) << Here
Subplans Removed: 3
-> Seq Scan on a1_1 a_1 (cost=0.00..115.50 rows=1000 width=12)
Filter: ((a = $1) AND (b = $2) AND (c = $3))
(4 rows)

prepare s2 as select * from a where a = $1 union all select * from a where a = $2;
PREPARE
explain execute s2(1, 1);
QUERY PLAN
-------------------------------------------------------------------------
Append (cost=0.00..762.00 rows=20000 width=12)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_1 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Seq Scan on a1_2 a_2 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_4 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $2)
-> Seq Scan on a1_2 a_5 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $2)
(13 rows)

prepare s3 as select * from a where a = $1 union select * from a where a = $2;
PREPARE
explain execute s3(1, 1);
QUERY PLAN
-------------------------------------------------------------------------------
HashAggregate (cost=912.00..1112.00 rows=20000 width=12)
Group Key: a.a, a.b, a.c
-> Append (cost=0.00..762.00 rows=20000 width=12)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_1 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Seq Scan on a1_2 a_2 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_4 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $2)
-> Seq Scan on a1_2 a_5 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $2)
(15 rows)

-- add a limit to make sure the runtime partitions prune.
explain select * from generate_series(1, 10) i(i) join lateral (select * from a
where a.a = (i.i % 2 + 1) and a.b = (i.i % 2 + 1) and a.c = (i.i % 10) limit 10000000) m on true;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2030.10 rows=10000 width=16)
-> Function Scan on generate_series i (cost=0.00..0.10 rows=10 width=4)
-> Limit (cost=0.00..183.00 rows=1000 width=12)
-> Append (cost=0.00..183.00 rows=1000 width=12) << Here
-> Seq Scan on a1_1 a_1 (cost=0.00..178.00 rows=1000 width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1)) AND (b = ((i.i % 2) + 1)))
-> Seq Scan on a1_2 a_2 (cost=0.00..178.00 rows=1000 width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1)) AND (b = ((i.i % 2) + 1)))
-> Seq Scan on a2_1 a_3 (cost=0.00..178.00 rows=1000 width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1)) AND (b = ((i.i % 2) + 1)))
-> Seq Scan on a2_2 a_4 (cost=0.00..178.00 rows=1000 width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1)) AND (b = ((i.i % 2) + 1)))
(12 rows)

Any thoughts about this? Thanks

[1] https://www.postgresql.org/message-id/CA%2BTgmoZHYoAL4HYwnGO25B8CxCB%2BvNMdf%2B7rbUzYykR4sU9yUA%40mail.gmail.com

Best Regards

Andy Fan

Attachment

pgsql-hackers by date:

From: Tomas Vondra
Date: 10 November 2020, 00:13:22
Subject: Re: MultiXact\SLRU buffers configuration

From: "tsunakawa.takay@fujitsu.com"
Date: 10 November 2020, 00:45:50
Subject: RE: POC: postgres_fdw insert batching

Make Append Cost aware of some run time partition prune case - Mailing list pgsql-hackers

Attachment

Previous

Next