Re: BUG #18834: Query planer is choosing the sub-optimal plan when limit is present - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18834: Query planer is choosing the sub-optimal plan when limit is present
Date
Msg-id 1144176.1741390599@sss.pgh.pa.us
Whole thread Raw
In response to BUG #18834: Query planer is choosing the sub-optimal plan when limit is present  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> We observed **poor query performance** when using an **`ORDER BY`** clause
> combined with **`LIMIT`**, which led us to investigate further.  

This is a well-known syndrome.  The fundamental problem is the poor
selectivity estimate:

>          ->  Bitmap Index Scan on ix_fn_ledger_provenance 
> (cost=0.00..1228.15 rows=8020 width=0) (actual time=2.923..2.924 rows=1
> loops=1)
>                Index Cond: (provenance @> '{"locator":
> "52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'::jsonb)

If the thing thinks there are 8020 matches for the @> condition when
there's only one, it's really pure luck if you get an optimal plan.
All its cost estimates will be off by a factor of more than 1000,
and these decisions are by no means linear.  They're particularly
not linear in the presence of LIMIT, but query plans can go far
astray even without that.

So what you need to do about this is get a better selectivity
estimate.  You might be able to get somewhere with custom statistics
(see CREATE STATISTICS), or just by cranking up the statistics target
for the "provenance" column to the maximum.  But I'm not sure how
much that will help.  Fundamentally, putting stuff into unstructured
JSON storage and expecting to get efficient searches of it is an
antipattern.  A more reliable answer is to change your query.
You could, for example, make an expression index on
(provenance ->> 'locator') and then write

WHERE (provenance ->> 'locator') = '52ca2f03-8184-448f-9b9e-7a0ed99e6922'

That will provide both fast searches and reasonably trustworthy stats
(once ANALYZE has seen the index).

            regards, tom lane



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #18836: Cannot import foreign schema under different schema name when foreign table uses row types
Next
From: Masahiko Sawada
Date:
Subject: Re: BUG #18828: Crash when pg_get_logical_snapshot_meta() passed empty string