Thread: [BUGS] BUG #14469: Wrong cost estimates for merge append plan withpartitions.
[BUGS] BUG #14469: Wrong cost estimates for merge append plan withpartitions.
From
maxim.boguk@gmail.com
Date:
The following bug has been logged on the website: Bug reference: 14469 Logged by: Maksym Boguk Email address: maxim.boguk@gmail.com PostgreSQL version: 9.6.1 Operating system: Linux Description: The problem query (which never finishes because of huge tables involved): explain SELECT "feed_actions".* FROM "feed_actions" WHERE (requested_at > '2016-12-17 09:44:17.042659') ORDER BY "feed_actions"."created_at" DESC LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=2.47..853.03 rows=1 width=697) -> Merge Append (cost=2.47..10603118.78 rows=12466 width=697) Sort Key: feed_actions.created_at DESC -> Index Scan Backward using feed_actions_created_at_idx on feed_actions (cost=0.12..8.14 rows=1 width=1184) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan Backward using feed_actions_2016_09_created_at_idx on feed_actions_2016_09 (cost=0.56..3212964.64 rows=1 width=202) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan Backward using feed_actions_2016_10_created_at_idx on feed_actions_2016_10 (cost=0.56..3612797.62 rows=1 width=202) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan Backward using feed_actions_2016_11_created_at_idx on feed_actions_2016_11 (cost=0.56..2998680.21 rows=1 width=214) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan Backward using feed_actions_2016_12_created_at_idx on feed_actions_2016_12 (cost=0.43..778269.49 rows=12442 width=696) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan Backward using feed_actions_2017_01_created_at_idx on feed_actions_2017_01 (cost=0.14..45.19 rows=20 width=1184) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) In the same time (for all partitions except feed_actions_2016_12) there are no rows with requested_at > '2016-12-17 09:44:17.042659': explain analyze select * from feed_actions_2016_09 "feed_actions" WHERE (requested_at > '2016-12-17 09:44:17.042659'); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using feed_actions_2016_09_requested_at_idx on feed_actions_2016_09 feed_actions (cost=0.56..4.60 rows=1 width=202) (actual time=0.006..0.006 rows=0 loops=1) Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) Planning time: 0.094 ms Execution time: 0.024 ms What's more, selecting from the single partition with the same order by/limit - producing expected/fast plan: explain analyze select * from feed_actions_2016_09 "feed_actions" WHERE (requested_at > '2016-12-17 09:44:17.042659') ORDER BY "feed_actions"."created_at" DESC LIMIT 1;; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=4.61..4.61 rows=1 width=202) (actual time=0.012..0.012 rows=0 loops=1) -> Sort (cost=4.61..4.61 rows=1 width=202) (actual time=0.011..0.011 rows=0 loops=1) Sort Key: created_at DESC Sort Method: quicksort Memory: 25kB -> Index Scan using feed_actions_2016_09_requested_at_idx on feed_actions_2016_09 feed_actions (cost=0.56..4.60 rows=1 width=202) (actual time=0.006..0.006 rows=0 loops=1) Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) Planning time: 0.126 ms Execution time: 0.052 ms So statistics - are correct, but somehow the database think that search for the 0 (may by 1) row via scanning whole partition with backward index scan - will be fast. Question - what's wrong with merge append cost estimates in that case? Play with planner cost* setting - have no effect on the plan selection. PS: good/fast plan could be produced via common hack: explain analyze SELECT "feed_actions".* FROM "feed_actions" WHERE (requested_at > '2016-12-17 09:44:17.042659') ORDER BY "feed_actions"."created_at"+'0 second'::interval DESC LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=1212.68..1212.69 rows=1 width=705) (actual time=32.134..32.134 rows=1 loops=1) -> Sort (cost=1212.68..1243.86 rows=12470 width=705) (actual time=32.133..32.133 rows=1 loops=1) Sort Key: ((feed_actions.created_at + '00:00:00'::interval)) DESC Sort Method: top-N heapsort Memory: 25kB -> Result (cost=0.00..1150.33 rows=12470 width=705) (actual time=0.052..24.068 rows=8397 loops=1) -> Append (cost=0.00..994.46 rows=12470 width=697) (actual time=0.047..16.364 rows=8397 loops=1) -> Seq Scan on feed_actions (cost=0.00..0.00 rows=1 width=1184) (actual time=0.003..0.003 rows=0 loops=1) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan using feed_actions_2016_09_requested_at_idx on feed_actions_2016_09 (cost=0.56..4.60 rows=1 width=202) (actual time=0.007..0.007 rows=0 loops=1) Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan using feed_actions_2016_10_requested_at_idx on feed_actions_2016_10 (cost=0.56..4.60 rows=1 width=202) (actual time=0.004..0.004 rows=0 loops=1) Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan using feed_actions_2016_11_requested_at_idx on feed_actions_2016_11 (cost=0.56..6.85 rows=1 width=214) (actual time=0.004..0.004 rows=0 loops=1) Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Index Scan using feed_actions_2016_12_requested_at_idx on feed_actions_2016_12 (cost=0.43..967.66 rows=12446 width=696) (actual time=0.027..13.166 rows=8397 loops=1) Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) -> Seq Scan on feed_actions_2017_01 (cost=0.00..10.75 rows=20 width=1184) (actual time=0.005..0.005 rows=0 loops=1) Filter: (requested_at > '2016-12-17 09:44:17.042659'::timestamp without time zone) Planning time: 1.020 ms Execution time: 32.370 ms -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14469: Wrong cost estimates for merge append planwith partitions.
From
Alexey Ermakov
Date:
Hi, I think there is a problem with estimation of startup_cost for Merge Append node. Currently for it's estimation we're using sum of startup_cost of child nodes (which could be really small but in fact I think we require each node to produce at least 1 row before merge append could pass them to the parent node). https://github.com/postgres/postgres/blob/34ca0905706422c191b3b0afef6e1c5f54399833/src/backend/optimizer/path/costsize.c#L1592 https://github.com/postgres/postgres/blob/0832f2db68cc43524a240db47d0428cc9525723e/src/backend/optimizer/util/pathnode.c#L1259 I think we should change that estimation and in case when child node expected to return only one row (and really 0 rows) we should use it's total_cost as base for estimation of startup_cost. -- Alexey Ermakov On 12/17/16 16:53, maxim.boguk@gmail.com wrote: > The following bug has been logged on the website: > > Bug reference: 14469 > Logged by: Maksym Boguk > Email address: maxim.boguk@gmail.com > PostgreSQL version: 9.6.1 > Operating system: Linux > Description: > > The problem query (which never finishes because of huge tables involved): > > explain SELECT "feed_actions".* FROM "feed_actions" WHERE (requested_at > > '2016-12-17 09:44:17.042659') ORDER BY "feed_actions"."created_at" DESC > LIMIT 1; > QUERY > PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------ > Limit (cost=2.47..853.03 rows=1 width=697) > -> Merge Append (cost=2.47..10603118.78 rows=12466 width=697) > Sort Key: feed_actions.created_at DESC > -> Index Scan Backward using feed_actions_created_at_idx on > feed_actions (cost=0.12..8.14 rows=1 width=1184) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan Backward using feed_actions_2016_09_created_at_idx > on feed_actions_2016_09 (cost=0.56..3212964.64 rows=1 width=202) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan Backward using feed_actions_2016_10_created_at_idx > on feed_actions_2016_10 (cost=0.56..3612797.62 rows=1 width=202) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan Backward using feed_actions_2016_11_created_at_idx > on feed_actions_2016_11 (cost=0.56..2998680.21 rows=1 width=214) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan Backward using feed_actions_2016_12_created_at_idx > on feed_actions_2016_12 (cost=0.43..778269.49 rows=12442 width=696) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan Backward using feed_actions_2017_01_created_at_idx > on feed_actions_2017_01 (cost=0.14..45.19 rows=20 width=1184) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > > > In the same time (for all partitions except feed_actions_2016_12) there are > no rows with requested_at > '2016-12-17 09:44:17.042659': > explain analyze select * from feed_actions_2016_09 "feed_actions" WHERE > (requested_at > '2016-12-17 09:44:17.042659'); > > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Index Scan using feed_actions_2016_09_requested_at_idx on > feed_actions_2016_09 feed_actions (cost=0.56..4.60 rows=1 width=202) > (actual time=0.006..0.006 rows=0 loops=1) > Index Cond: (requested_at > '2016-12-17 09:44:17.042659'::timestamp > without time zone) > Planning time: 0.094 ms > Execution time: 0.024 ms > > What's more, selecting from the single partition with the same order > by/limit - producing expected/fast plan: > explain analyze select * from feed_actions_2016_09 "feed_actions" WHERE > (requested_at > '2016-12-17 09:44:17.042659') ORDER BY > "feed_actions"."created_at" DESC LIMIT 1;; > > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=4.61..4.61 rows=1 width=202) (actual time=0.012..0.012 rows=0 > loops=1) > -> Sort (cost=4.61..4.61 rows=1 width=202) (actual time=0.011..0.011 > rows=0 loops=1) > Sort Key: created_at DESC > Sort Method: quicksort Memory: 25kB > -> Index Scan using feed_actions_2016_09_requested_at_idx on > feed_actions_2016_09 feed_actions (cost=0.56..4.60 rows=1 width=202) > (actual time=0.006..0.006 rows=0 loops=1) > Index Cond: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > Planning time: 0.126 ms > Execution time: 0.052 ms > > > So statistics - are correct, but somehow the database think that search for > the 0 (may by 1) row via scanning whole partition with backward index scan - > will be fast. > Question - what's wrong with merge append cost estimates in that case? > Play with planner cost* setting - have no effect on the plan selection. > > > > PS: good/fast plan could be produced via common hack: > explain analyze SELECT "feed_actions".* FROM "feed_actions" WHERE > (requested_at > '2016-12-17 09:44:17.042659') ORDER BY > "feed_actions"."created_at"+'0 second'::interval DESC LIMIT 1; > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Limit (cost=1212.68..1212.69 rows=1 width=705) (actual time=32.134..32.134 > rows=1 loops=1) > -> Sort (cost=1212.68..1243.86 rows=12470 width=705) (actual > time=32.133..32.133 rows=1 loops=1) > Sort Key: ((feed_actions.created_at + '00:00:00'::interval)) DESC > Sort Method: top-N heapsort Memory: 25kB > -> Result (cost=0.00..1150.33 rows=12470 width=705) (actual > time=0.052..24.068 rows=8397 loops=1) > -> Append (cost=0.00..994.46 rows=12470 width=697) (actual > time=0.047..16.364 rows=8397 loops=1) > -> Seq Scan on feed_actions (cost=0.00..0.00 rows=1 > width=1184) (actual time=0.003..0.003 rows=0 loops=1) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan using > feed_actions_2016_09_requested_at_idx on feed_actions_2016_09 > (cost=0.56..4.60 rows=1 width=202) (actual time=0.007..0.007 rows=0 > loops=1) > Index Cond: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan using > feed_actions_2016_10_requested_at_idx on feed_actions_2016_10 > (cost=0.56..4.60 rows=1 width=202) (actual time=0.004..0.004 rows=0 > loops=1) > Index Cond: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan using > feed_actions_2016_11_requested_at_idx on feed_actions_2016_11 > (cost=0.56..6.85 rows=1 width=214) (actual time=0.004..0.004 rows=0 > loops=1) > Index Cond: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Index Scan using > feed_actions_2016_12_requested_at_idx on feed_actions_2016_12 > (cost=0.43..967.66 rows=12446 width=696) (actual time=0.027..13.166 > rows=8397 loops=1) > Index Cond: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > -> Seq Scan on feed_actions_2017_01 (cost=0.00..10.75 > rows=20 width=1184) (actual time=0.005..0.005 rows=0 loops=1) > Filter: (requested_at > '2016-12-17 > 09:44:17.042659'::timestamp without time zone) > Planning time: 1.020 ms > Execution time: 32.370 ms > > > > > > -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14469: Wrong cost estimates for merge append plan with partitions.
From
Tom Lane
Date:
Alexey Ermakov <alexey.ermakov@postgresql-consulting.com> writes: > I think we should change that estimation and in case when child node > expected to return only one row (and really 0 rows) we should use it's > total_cost as base for estimation of startup_cost. This would make sense as part of the overall rethinking of startup_cost discussed in https://www.postgresql.org/message-id/flat/31065.1481742760%40sss.pgh.pa.us but I'm disinclined to do it in isolation. Redefining startup_cost for just one path type seems likely to create more problems than it solves. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs