BUG #18834: Query planer is choosing the sub-optimal plan when limit is present - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18834: Query planer is choosing the sub-optimal plan when limit is present
Date
Msg-id 18834-ec88fdda949458d0@postgresql.org
Whole thread Raw
Responses Re: BUG #18834: Query planer is choosing the sub-optimal plan when limit is present
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18834
Logged by:          Michael Stoeckel
Email address:      michael@joincandidhealth.com
PostgreSQL version: 15.10
Operating system:   x86_64-pc-linux-gnu
Description:

We have a table, `fn_ledger`, containing approximately **80 million
records**. The table includes a **primary key (`id`)** of type **UUID** and
a **`provenance`** column of type **JSONB**, which is **GIN indexed**, along
with other fields.  
We observed **poor query performance** when using an **`ORDER BY`** clause
combined with **`LIMIT`**, which led us to investigate further.  
Consider the following simple query:
```
select *
from fn_ledger
where provenance @> '{"locator": "52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'
order by id
```
In the query above, the `WHERE` clause is **highly selective**, meaning it
will return at most a handful of records.  
Through our investigation, we found that:  
- **When `LIMIT` is not present**, the planner correctly chooses to use the
**bitmap index scan** on `ix_fn_ledger_provenance`, resulting in efficient
query execution.  
- **When `LIMIT` is introduced**, the planner **incorrectly opts** for a
sequential scan on `fn_ledger_pkey` while filtering on the `provenance`
value, leading to poor performance.  

See the query plans below:
```
postgres=> explain analyze
select *
from fn_ledger
where provenance @> '{"locator": "52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'
order by id;
                                                                 QUERY PLAN
                                                               

--------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=32625.36..32645.41 rows=8020 width=333) (actual
time=2.943..2.944 rows=1 loops=1)
   Sort Key: id
   Sort Method: quicksort  Memory: 25kB
   ->  Bitmap Heap Scan on fn_ledger  (cost=1230.16..32105.29 rows=8020
width=333) (actual time=2.939..2.939 rows=1 loops=1)
         Recheck Cond: (provenance @> '{"locator":
"52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'::jsonb)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on ix_fn_ledger_provenance
(cost=0.00..1228.15 rows=8020 width=0) (actual time=2.923..2.924 rows=1
loops=1)
               Index Cond: (provenance @> '{"locator":
"52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'::jsonb)
 Planning Time: 0.115 ms
 Execution Time: 2.978 ms
(10 rows)

postgres=> explain analyze
select *
from fn_ledger
where provenance @> '{"locator": "52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'
order by id
limit 1;
                                                                      QUERY
PLAN                                                                      

------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..1816.54 rows=1 width=333) (actual
time=317508.674..317508.675 rows=1 loops=1)
   ->  Index Scan using fn_ledger_pkey on fn_ledger  (cost=0.57..14564073.97
rows=8020 width=333) (actual time=317508.673..317508.673 rows=1 loops=1)
         Filter: (provenance @> '{"locator":
"52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'::jsonb)
         Rows Removed by Filter: 49197422
 Planning Time: 0.148 ms
 Execution Time: 317508.742 ms
(6 rows)
```
Furthermore, since the planner incorrectly chooses to perform an **index
scan** on `fn_ledger_pkey`, the query performance varies significantly
depending on **where the candidate record's `id` appears** within the
`fn_ledger_pkey` index.  
To illustrate this, I selected the **first and last records** based on the
ordering of the `id` column.
```
select id, provenance->'locator'
from fn_ledger
where id in (
'00000025-fb46-46cd-b7eb-19e111fa1e87',
'ffffffdb-23f6-4a6e-9385-86e5e32a9309'
)
===================
00000025-fb46-46cd-b7eb-19e111fa1e87
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"
ffffffdb-23f6-4a6e-9385-86e5e32a9309
"f2d20507-a543-4c38-b3e3-c977ac407b42"
```
then, I executed the following queries, both with `LIMIT`
```
postgres=> explain analyze
select *
from fn_ledger
where fn_ledger.provenance @> '{"locator":
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"}'
order by id
limit 1;
                                                                 QUERY PLAN
                                                               

--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..1817.68 rows=1 width=333) (actual time=0.016..0.017
rows=1 loops=1)
   ->  Index Scan using fn_ledger_pkey on fn_ledger  (cost=0.57..14478744.69
rows=7968 width=333) (actual time=0.015..0.016 rows=1 loops=1)
         Filter: (provenance @> '{"locator":
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"}'::jsonb)
 Planning Time: 0.184 ms
 Execution Time: 0.031 ms
(5 rows)

postgres=> explain analyze
select *
from fn_ledger
where fn_ledger.provenance @> '{"locator":
"f2d20507-a543-4c38-b3e3-c977ac407b42"}'
order by id
limit 1;
                                                                     QUERY
PLAN                                                                     

----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..1817.75 rows=1 width=333) (actual
time=57515.333..57515.334 rows=1 loops=1)
   ->  Index Scan using fn_ledger_pkey on fn_ledger  (cost=0.57..14479340.89
rows=7968 width=333) (actual time=57515.331..57515.332 rows=1 loops=1)
         Filter: (provenance @> '{"locator":
"f2d20507-a543-4c38-b3e3-c977ac407b42"}'::jsonb)
         Rows Removed by Filter: 30847879
 Planning Time: 0.147 ms
 Execution Time: 57515.418 ms
(6 rows)
```
As you can see, both queries are **identical**, yet the first query is
**orders of magnitude faster** than the second. This difference occurs
because the **candidate record** for the first query appears **towards the
beginning** of the `fn_ledger_pkey` index, allowing it to be found quickly.
In contrast, the candidate record for the second query appears **towards the
end** of the index, requiring significantly more scanning.

Beyond this inconsistency, if the **`LIMIT` value exceeds the actual number
of records found by the query**, the index scan will have to **traverse the
entire index**, regardless of the candidate record’s position. This further
degrades performance.
```
postgres=> explain analyze
select *
from fn_ledger
where fn_ledger.provenance @> '{"locator":
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"}'
order by id
limit 1;
                                                                 QUERY PLAN
                                                               

--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..1816.41 rows=1 width=333) (actual time=0.013..0.014
rows=1 loops=1)
   ->  Index Scan using fn_ledger_pkey on fn_ledger  (cost=0.57..14564849.49
rows=8021 width=333) (actual time=0.013..0.013 rows=1 loops=1)
         Filter: (provenance @> '{"locator":
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"}'::jsonb)
 Planning Time: 0.145 ms
 Execution Time: 0.027 ms
(5 rows)

postgres=> explain analyze
select *
from fn_ledger
where fn_ledger.provenance @> '{"locator":
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"}'
order by id
limit 2;
                                                                   QUERY
PLAN                                                                    

-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..3632.25 rows=2 width=333) (actual time=0.017..126786.189
rows=1 loops=1)
   ->  Index Scan using fn_ledger_pkey on fn_ledger  (cost=0.57..14564849.49
rows=8021 width=333) (actual time=0.016..126786.187 rows=1 loops=1)
         Filter: (provenance @> '{"locator":
"9cdf7902-ba7d-4c07-9f92-451aeebe6e2d"}'::jsonb)
         Rows Removed by Filter: 80472490
 Planning Time: 0.148 ms
 Execution Time: 126786.246 ms
(6 rows)
```
As you can see from the above queries, the only difference is that the
second query's `LIMIT` is 2 instead of 1.

In conclusion, it seems that the query planner is **placing too much
emphasis on the `LIMIT` clause**, prioritizing it over selecting the more
efficient `ix_fn_ledger_provenance` index. As a result, it **ignores the
optimal indexing strategy** and instead opts for a less efficient scan on
`fn_ledger_pkey`, leading to inconsistent and suboptimal query performance.


pgsql-bugs by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Re: Window Functions with identical PARTITION BY and ORDER BY clauses evaluated separately
Next
From: PG Bug reporting form
Date:
Subject: BUG #18835: spgist index fails to accept point with NaN