BUG #17560: Planner can not find plan with lowest cost - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17560: Planner can not find plan with lowest cost
Date
Msg-id 17560-002af2afa26117b6@postgresql.org
Whole thread Raw
Responses Re: BUG #17560: Planner can not find plan with lowest cost  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: BUG #17560: Planner can not find plan with lowest cost  (Francisco Olarte <folarte@peoplecall.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17560
Logged by:          Stanisław Skonieczny
Email address:      stanislaw.skonieczny@gmail.com
PostgreSQL version: 14.4
Operating system:   Linux
Description:

We have found that one our queries takes hours instead of seconds. The query
did a join between huge table (billions of rows) and 2 much smaller temp
tables (~10k rows). Huge table was often analyzed. Temp tables used in query
were analyzed just before making a query. The planer decided to use a
seqscan on one of the temp tables in the loop causing huge number of rows
removed by the join filter (`Rows Removed by Join Filter: 97633166212`)
instead of using index scan.

I was able to create a script which work on smaller data set, uses
simplified version of the query and causes similar problem. We have also
found that disabling seqscan allows PG to find better plan, even in terms of
estimated cost.

This is a script:
```
-- prepare data and tables
create table ss_d as select i as id, i + 100 as v1, i + 200 as v2,
'asdasdasdasdasdasdasdasdasdasdasdasd' as data from generate_series(0,
100000000) as i;
create unique index idx_ss_d on ss_d (id);
analyze ss_d;

create table ss_temp_table1 as select id, v1, v2 from ss_d order by id desc
limit 100000;
analyze ss_temp_table1;
create index idx_ss_temp_table1 on ss_temp_table1 (id);

create table ss_temp_table2 as select id from ss_d order by id desc limit
100000;
analyze ss_temp_table2;
create index idx_ss_temp_table2 on ss_temp_table2 (id);

-- make a query
explain analyze SELECT d.id, d.v1, d.v2
        FROM ss_d AS d
        JOIN ss_temp_table1 AS t1 ON t1.id = d.id AND t1.v1 = d.v1 AND t1.v2
= d.v2
        WHERE
        d.id IN
            (SELECT t2.id
             FROM ss_temp_table2 AS t2);

-- make a query with disabled seqscan
set enable_seqscan = 'off';
explain analyze SELECT d.id, d.v1, d.v2
        FROM ss_d AS d
        JOIN ss_temp_table1 AS t1 ON t1.id = d.id AND t1.v1 = d.v1 AND t1.v2
= d.v2
        WHERE
        d.id IN
            (SELECT t2.id
             FROM ss_temp_table2 AS t2);
```

Query plan is:
```
                                                              QUERY PLAN
                                                         

--------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.11..737446.00 rows=1 width=12) (actual
time=0.044..542211.940 rows=100000 loops=1)
   Join Filter: (t1.id = t2.id)
   Rows Removed by Join Filter: 4999950000
   ->  Nested Loop  (cost=0.11..734953.00 rows=1 width=16) (actual
time=0.032..457.484 rows=100000 loops=1)
         ->  Seq Scan on ss_temp_table1 t1  (cost=0.00..1541.00 rows=100000
width=12) (actual time=0.013..13.157 rows=100000 loops=1)
         ->  Index Scan using idx_ss_d on ss_d d  (cost=0.11..7.32 rows=1
width=12) (actual time=0.003..0.003 rows=1 loops=100000)
               Index Cond: (id = t1.id)
               Filter: ((t1.v1 = v1) AND (t1.v2 = v2))
   ->  Seq Scan on ss_temp_table2 t2  (cost=0.00..1443.00 rows=100000
width=4) (actual time=0.003..2.732 rows=50000 loops=100000)
 Planning Time: 1.420 ms
 Execution Time: 542223.438 ms
(11 rows)
```

Query plan with disabled seqscan is:
```

QUERY PLAN
         

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.23..736160.14 rows=1 width=12) (actual
time=0.024..362.647 rows=100000 loops=1)
   Join Filter: (t1.id = t2.id)
   ->  Nested Loop  (cost=0.17..736160.06 rows=1 width=16) (actual
time=0.018..224.403 rows=100000 loops=1)
         ->  Index Scan using idx_ss_temp_table1 on ss_temp_table1 t1
(cost=0.06..2748.06 rows=100000 width=12) (actual time=0.008..18.990
rows=100000 loops=1)
         ->  Index Scan using idx_ss_d on ss_d d  (cost=0.11..7.32 rows=1
width=12) (actual time=0.002..0.002 rows=1 loops=100000)
               Index Cond: (id = t1.id)
               Filter: ((t1.v1 = v1) AND (t1.v2 = v2))
   ->  Index Only Scan using idx_ss_temp_table2 on ss_temp_table2 t2
(cost=0.06..0.07 rows=1 width=4) (actual time=0.001..0.001 rows=1
loops=100000)
         Index Cond: (id = d.id)
         Heap Fetches: 226
 Planning Time: 0.815 ms
 Execution Time: 368.078 ms
(12 rows)
```

It looks like planner by default can not find best plan in terms of
estimated cost. Second plan is cheaper. It is also much more effective
(368ms vs 542223ms). And I believe that using `Seq Scan` on `ss_temp_table1`
but `Index Only Scan` on `ss_temp_table2` should produce even cheaper plan,
but I do not know how to force planner to find it.

I submit a bug, because this behavior does not match the docs:
```
The task of the planner/optimizer is to create an optimal execution plan.
[...] If it is computationally feasible, the query optimizer will examine
each of these possible execution plans, ultimately selecting the execution
plan that is expected to run the fastest.
```

Another thing is that there is huge number of rows removed by join filter
(4999950000) and it looks like it is not taken into account when calculating
estimated cost of first plan (737446.00). This might be more important
problem, but it does not look like a bug.

Original problem happened on PG 13.7. 
I was able to reproduce it with attached script on freshly installed PG 14.4
on Ubuntu.


pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands
Next
From: Tom Lane
Date:
Subject: Re: BUG #17560: Planner can not find plan with lowest cost