Thread: BUG #17560: Planner can not find plan with lowest cost
The following bug has been logged on the website: Bug reference: 17560 Logged by: Stanisław Skonieczny Email address: stanislaw.skonieczny@gmail.com PostgreSQL version: 14.4 Operating system: Linux Description: We have found that one our queries takes hours instead of seconds. The query did a join between huge table (billions of rows) and 2 much smaller temp tables (~10k rows). Huge table was often analyzed. Temp tables used in query were analyzed just before making a query. The planer decided to use a seqscan on one of the temp tables in the loop causing huge number of rows removed by the join filter (`Rows Removed by Join Filter: 97633166212`) instead of using index scan. I was able to create a script which work on smaller data set, uses simplified version of the query and causes similar problem. We have also found that disabling seqscan allows PG to find better plan, even in terms of estimated cost. This is a script: ``` -- prepare data and tables create table ss_d as select i as id, i + 100 as v1, i + 200 as v2, 'asdasdasdasdasdasdasdasdasdasdasdasd' as data from generate_series(0, 100000000) as i; create unique index idx_ss_d on ss_d (id); analyze ss_d; create table ss_temp_table1 as select id, v1, v2 from ss_d order by id desc limit 100000; analyze ss_temp_table1; create index idx_ss_temp_table1 on ss_temp_table1 (id); create table ss_temp_table2 as select id from ss_d order by id desc limit 100000; analyze ss_temp_table2; create index idx_ss_temp_table2 on ss_temp_table2 (id); -- make a query explain analyze SELECT d.id, d.v1, d.v2 FROM ss_d AS d JOIN ss_temp_table1 AS t1 ON t1.id = d.id AND t1.v1 = d.v1 AND t1.v2 = d.v2 WHERE d.id IN (SELECT t2.id FROM ss_temp_table2 AS t2); -- make a query with disabled seqscan set enable_seqscan = 'off'; explain analyze SELECT d.id, d.v1, d.v2 FROM ss_d AS d JOIN ss_temp_table1 AS t1 ON t1.id = d.id AND t1.v1 = d.v1 AND t1.v2 = d.v2 WHERE d.id IN (SELECT t2.id FROM ss_temp_table2 AS t2); ``` Query plan is: ``` QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=0.11..737446.00 rows=1 width=12) (actual time=0.044..542211.940 rows=100000 loops=1) Join Filter: (t1.id = t2.id) Rows Removed by Join Filter: 4999950000 -> Nested Loop (cost=0.11..734953.00 rows=1 width=16) (actual time=0.032..457.484 rows=100000 loops=1) -> Seq Scan on ss_temp_table1 t1 (cost=0.00..1541.00 rows=100000 width=12) (actual time=0.013..13.157 rows=100000 loops=1) -> Index Scan using idx_ss_d on ss_d d (cost=0.11..7.32 rows=1 width=12) (actual time=0.003..0.003 rows=1 loops=100000) Index Cond: (id = t1.id) Filter: ((t1.v1 = v1) AND (t1.v2 = v2)) -> Seq Scan on ss_temp_table2 t2 (cost=0.00..1443.00 rows=100000 width=4) (actual time=0.003..2.732 rows=50000 loops=100000) Planning Time: 1.420 ms Execution Time: 542223.438 ms (11 rows) ``` Query plan with disabled seqscan is: ``` QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=0.23..736160.14 rows=1 width=12) (actual time=0.024..362.647 rows=100000 loops=1) Join Filter: (t1.id = t2.id) -> Nested Loop (cost=0.17..736160.06 rows=1 width=16) (actual time=0.018..224.403 rows=100000 loops=1) -> Index Scan using idx_ss_temp_table1 on ss_temp_table1 t1 (cost=0.06..2748.06 rows=100000 width=12) (actual time=0.008..18.990 rows=100000 loops=1) -> Index Scan using idx_ss_d on ss_d d (cost=0.11..7.32 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=100000) Index Cond: (id = t1.id) Filter: ((t1.v1 = v1) AND (t1.v2 = v2)) -> Index Only Scan using idx_ss_temp_table2 on ss_temp_table2 t2 (cost=0.06..0.07 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=100000) Index Cond: (id = d.id) Heap Fetches: 226 Planning Time: 0.815 ms Execution Time: 368.078 ms (12 rows) ``` It looks like planner by default can not find best plan in terms of estimated cost. Second plan is cheaper. It is also much more effective (368ms vs 542223ms). And I believe that using `Seq Scan` on `ss_temp_table1` but `Index Only Scan` on `ss_temp_table2` should produce even cheaper plan, but I do not know how to force planner to find it. I submit a bug, because this behavior does not match the docs: ``` The task of the planner/optimizer is to create an optimal execution plan. [...] If it is computationally feasible, the query optimizer will examine each of these possible execution plans, ultimately selecting the execution plan that is expected to run the fastest. ``` Another thing is that there is huge number of rows removed by join filter (4999950000) and it looks like it is not taken into account when calculating estimated cost of first plan (737446.00). This might be more important problem, but it does not look like a bug. Original problem happened on PG 13.7. I was able to reproduce it with attached script on freshly installed PG 14.4 on Ubuntu.
PG Bug reporting form <noreply@postgresql.org> writes: > I was able to create a script which work on smaller data set, uses > simplified version of the query and causes similar problem. We have also > found that disabling seqscan allows PG to find better plan, even in terms of > estimated cost. I don't see any bug here, or at least nothing that we can fix in the short run. The problem is pretty obviously that this rowcount estimate is awful: > -> Nested Loop (cost=0.17..736160.06 rows=1 width=16) (actual > time=0.018..224.403 rows=100000 loops=1) and the reason that it is awful is that the planner is not aware of the 100% correlation between the contents of the tables, specifically the fact that the v1 and v2 join conditions add precisely zero selectivity beyond the id join condition. At some point we might be able to deal with that using "extended statistics" created on the redundant column sets, but I believe that right now we don't apply extended stats to join conditions. Fixing that is a future feature, not a bug fix. Right now the only advice I can give you is to try to avoid such strongly correlated join conditions. BTW, the fact that it isn't noticing the marginally-cheaper-estimated- cost plan isn't a bug either. We intentionally treat plans of essentially equal cost as redundant and keep only one, in order to reduce planning time. I've not traced through this example in detail, but I'm sure that some pruning step of that kind eliminated consideration of the slightly cheaper plan. The fact that in reality it's a *lot* cheaper escapes the planner because of the poor selectivity estimates. regards, tom lane
Stanislaw: On Thu, 28 Jul 2022 at 20:57, PG Bug reporting form <noreply@postgresql.org> wrote: > create table ss_temp_table1 as select id, v1, v2 from ss_d order by id desc > limit 100000; ... > FROM ss_d AS d > JOIN ss_temp_table1 AS t1 ON t1.id = d.id AND t1.v1 = d.v1 AND t1.v2 > = d.v2 Do your real tables/queries have the same kind of redudant condition ( i.e., "AND t1.v1 = d.v1 AND t1.v2" is redundant due to id being a "PK" )? If so you could try omitting it ( and maybe try w & w/o seqscan enabled too ) to see if this condition is leading the optimizer to believe a plan is much cheaper than it really is. If not you may be hitting a different problem with similar appearance. Francisco Olarte.
On Fri, Jul 29, 2022 at 3:16 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
and the reason that it is awful is that the planner is not aware of
the 100% correlation between the contents of the tables, specifically
the fact that the v1 and v2 join conditions add precisely zero
selectivity beyond the id join condition.
Yeah, and the planner thinks the three join conditions have independent
probabilities and estimates them by taking the product of their
selectivities.
probabilities and estimates them by taking the product of their
selectivities.
BTW, the fact that it isn't noticing the marginally-cheaper-estimated-
cost plan isn't a bug either. We intentionally treat plans of essentially
equal cost as redundant and keep only one, in order to reduce planning
time. I've not traced through this example in detail, but I'm sure that
some pruning step of that kind eliminated consideration of the slightly
cheaper plan. The fact that in reality it's a *lot* cheaper escapes
the planner because of the poor selectivity estimates.
Right. The planner does fuzzy cost comparison with a factor
STD_FUZZ_FACTOR. So they are fuzzily the same on total cost, while the
second plan is fuzzily worse on startup.
Thanks
Richard
STD_FUZZ_FACTOR. So they are fuzzily the same on total cost, while the
second plan is fuzzily worse on startup.
Thanks
Richard
On Fri, Jul 29, 2022 at 9:31 AM Francisco Olarte <folarte@peoplecall.com> wrote:
Stanislaw:
On Thu, 28 Jul 2022 at 20:57, PG Bug reporting form
<noreply@postgresql.org> wrote:
> create table ss_temp_table1 as select id, v1, v2 from ss_d order by id desc
> limit 100000;
...
> FROM ss_d AS d
> JOIN ss_temp_table1 AS t1 ON t1.id = d.id AND t1.v1 = d.v1 AND t1.v2
> = d.v2
Do your real tables/queries have the same kind of redudant condition (
i.e., "AND t1.v1 = d.v1 AND t1.v2" is redundant due to id being a "PK"
)?
It is not redundant. There might be rows in ss_d and ss_temp_table1 that have same id, but different v1, v2 values and this join should reject it, however usually that is a minority.
Francisco Olarte.
Stanisław Skonieczny