Thread: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

I keep running into problems like these:

Devs are using an ORM. It really likes to produce queries like:

SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;

I don't know why they do this. Usually it's more like 50 for pagination, which makes more sense. But for whatever reason this keeps coming up.

The table has nearly 29 million records. 5069 of them match shipment_import_id = 5090609. There is an index on shipment_import_id, which the planner happily uses when there is no LIMIT. Yet with it the query planner will always do something like:

# explain SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;         
                                                        QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------------
Limit  (cost=0.44..873.35 rows=1 width=243)
  ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122227.70 rows=5868 width=243)
        Filter: (shipment_import_id = 5090609)

.. which takes minutes.

I know I can work around this. Generally I would just drop the index on shipment_import_id and create one on (shipment_import_id, id). Or, if I can get the devs to wrap their query in an inner select with a fake offset to fool the query planner, that works too. But both seem hacky.
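Concretely, the two workarounds look roughly like this (the composite index name here is made up for illustration; the single-column index name is the one from the plans below):

    -- replace the single-column index with a composite one
    CREATE INDEX CONCURRENTLY shipment_import_records_import_id_id_idx
        ON shipment_import_records (shipment_import_id, id);
    DROP INDEX CONCURRENTLY index_shipment_import_records_on_shipment_import_id;

    -- or wrap the query; OFFSET 0 acts as an optimization fence, so the
    -- planner can't push the ORDER BY/LIMIT down onto the pkey index
    SELECT * FROM (
        SELECT * FROM shipment_import_records
        WHERE shipment_import_id = 5090609
        OFFSET 0
    ) AS x
    ORDER BY id ASC LIMIT 1;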

Just wondering if there's a knob I can turn to make these more likely to work without constantly implementing workarounds?

Thanks for any help.
What is your default_statistics_target and how accurate is that estimate of 5868 rows? What is random_page_cost set to, by the way?

More importantly, what is the better plan that you'd like the planner to use with your existing indexes? It would seem logical to me to scan for the matching shipment_import_id, given the estimate says 5868 out of 29 million rows should match, and then sort and take only the smallest ID. Doing an index scan on ID and looking up each row in the table to see if shipment_import_id matches, when the planner expects about a .0001 chance of a match... I can't imagine that plan performing well at all.

Certainly a composite index would be very helpful here. Using explain analyze and sharing the output would give more info to go on.
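For reference, those can be checked with something like:

    SHOW default_statistics_target;
    SHOW random_page_cost;

    -- compare the planner's row estimate against reality
    SELECT count(*)
    FROM shipment_import_records
    WHERE shipment_import_id = 5090609;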
On 12/6/21 10:03 AM, Alan Hodgson wrote:
> I keep running into problems like these:
>
> Devs are using an ORM. It really likes to produce queries like:
>
> SELECT "shipment_import_records".* FROM "shipment_import_records" 
> WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY 
> "shipment_import_records"."id" ASC LIMIT 1;
>
> I don't know why they do this. Usually it's more like 50 for 
> pagination, which makes more sense. But for whatever reason this keeps 
> coming up.
>
To be clear, is it the devs or the ORM that's adding the ORDER and the 
LIMIT?  I'm betting on devs.  Do they need the smallest id (first 
occurrence?) or do they need data common to all 5069 entries (Name?) and 
any record will do?  For the former they might be better off asking for 
just the attributes they need, and for the latter you need to provide an 
option which gets them that single record.  Of course, if they have the 
"smallest id" in hand they should request that.




On Mon, 6 Dec 2021 at 18:03, Alan Hodgson <ahodgson@lists.simkin.ca> wrote:
...
> The table has nearly 29 million records. 5069 of them match shipment_import_id = 5090609. There is an index on
> shipment_import_id, which the planner happily uses when there is no LIMIT. Yet with it the query planner will
> always do something like:
>
> # explain SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
>                                                         QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=0.44..873.35 rows=1 width=243)
>   ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122227.70 rows=5868 width=243)
>         Filter: (shipment_import_id = 5090609)
> .. which takes minutes.

Can you post an explain analyze? To me it seems like the planner
thinks shipment_import_id is randomly distributed and the table is
well correlated with its PK, so scanning it for the first id should
be fast.

But from the names of the fields you may have a correlation between
shipment_import_id and id hidden somewhere ( like they are two serials
growing together, you query for the latest shipment ids, and it scans
all the table ). An explain analyze should show that ( or three: one
for that shipment import id, one for 1, one for a really big one ).

> Just wondering if there's a knob I can turn to make these more likely to work without constantly implementing workarounds?

You may try a composite index.

Francisco Olarte.



On Mon, 2021-12-06 at 10:18 -0700, Michael Lewis wrote:
> What is your default_statistics_target and how accurate is that
> estimate of 5868 rows? What is random_page_cost set to, by the way?
> 
> 
> 

default_statistics_target = 1000
random_page_cost = 2.0 (it's on AWS on a 9000 iops gp2 volume)

PostgreSQL 13.5, btw.

The estimate was reasonably accurate; there were 5069 actual rows
matching.


> More importantly, what is the better plan that you'd like the planner
> to use with your existing indexes?

Well, it takes a few ms to grab all ~5000 rows by shipment_import_id
and then sort/limit them. It takes 30 seconds to do what it is doing
instead, and that's only when the table is mostly cached already; it's
more like 4-5 minutes otherwise.

         
   # explain analyze SELECT "shipment_import_records".* FROM
   "shipment_import_records" WHERE
   "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY
   "shipment_import_records"."id" ASC LIMIT 1;
   -----------------------------------------------------------------------------
    Limit  (cost=0.44..873.08 rows=1 width=243) (actual time=31689.725..31689.726 rows=1 loops=1)
      ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122405.71 rows=5870 width=243) (actual time=31689.723..31689.724 rows=1 loops=1)
            Filter: (shipment_import_id = 5090609)
            Rows Removed by Filter: 28710802
    Planning Time: 0.994 ms
    Execution Time: 31689.744 ms
   (6 rows)

Just with a kludge to force the better index:

   # explain analyze SELECT * FROM (SELECT "shipment_import_records".*
   FROM "shipment_import_records" WHERE
   "shipment_import_records"."shipment_import_id" = 5090609 OFFSET 0) AS
   x ORDER BY "id" ASC LIMIT 1;
                                  QUERY PLAN
   -----------------------------------------------------------------------------
    Limit  (cost=10655.34..10655.34 rows=1 width=243) (actual time=4.868..4.869 rows=1 loops=1)
      ->  Sort  (cost=10655.34..10670.02 rows=5870 width=243) (actual time=4.867..4.868 rows=1 loops=1)
            Sort Key: shipment_import_records.id
            Sort Method: top-N heapsort  Memory: 27kB
            ->  Index Scan using index_shipment_import_records_on_shipment_import_id on shipment_import_records  (cost=0.44..10567.29 rows=5870 width=243) (actual time=0.037..3.560 rows=5069 loops=1)
                  Index Cond: (shipment_import_id = 5090609)
    Planning Time: 0.135 ms
    Execution Time: 4.885 ms
   (8 rows)


> 
> Certainly a composite index would be very helpful here. Using explain
> analyze and sharing the output would give more info to go on.
> 

Yeah, I am going to just do the composite index for now, but I was 
hoping for a more generic option.

Thanks for looking at it.






On Mon, 2021-12-06 at 18:20 +0100, Francisco Olarte wrote:

Can you post an explain analyze? To me it seems like the planner
thinks shipment_import_id is randomly distributed and the table is
well correlated with its PK, so scanning it for the first id should
be fast.

   # explain analyze SELECT "shipment_import_records".* FROM
   "shipment_import_records" WHERE
   "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY
   "shipment_import_records"."id" ASC LIMIT 1;
   -----------------------------------------------------------------------------
    Limit  (cost=0.44..873.08 rows=1 width=243) (actual time=31689.725..31689.726 rows=1 loops=1)
      ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122405.71 rows=5870 width=243) (actual time=31689.723..31689.724 rows=1 loops=1)
            Filter: (shipment_import_id = 5090609)
            Rows Removed by Filter: 28710802
    Planning Time: 0.994 ms
    Execution Time: 31689.744 ms
   (6 rows)


The biggest one (but yes, "earlier"):

# explain analyze SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 1247888 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------------------------------
Limit  (cost=0.44..426.59 rows=1 width=243) (actual time=8007.069..8007.070 rows=1 loops=1)
  ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5126628.40 rows=12030 width=243) (actual time=8007.068..8007.068 rows=1 loops=1)
        Filter: (shipment_import_id = 1247888)
        Rows Removed by Filter: 10929193
Planning Time: 0.584 ms
Execution Time: 8007.086 ms
(6 rows)


And the smallest/latest, which actually uses the "right" index:

# explain analyze SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5116174 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
                                  QUERY PLAN
------------------------------------------------------------------------------------------------------
Limit  (cost=145.44..145.44 rows=1 width=243) (actual time=0.018..0.018 rows=1 loops=1)
  ->  Sort  (cost=145.44..145.64 rows=79 width=243) (actual time=0.017..0.018 rows=1 loops=1)
        Sort Key: id
        Sort Method: quicksort  Memory: 26kB
        ->  Index Scan using index_shipment_import_records_on_shipment_import_id on shipment_import_records  (cost=0.44..145.05 rows=79 width=243) (actual time=0.013..0.014 rows=1 loops=1)
              Index Cond: (shipment_import_id = 5116174)
Planning Time: 0.104 ms
Execution Time: 0.032 ms
(8 rows)


But from the names of the fields you may have a correlation between
shipment_import_id and id hidden somewhere ( like they are two serials
growing together, you query for the latest shipment ids, and it scans
all the table ). An explain analyze should show that ( or three: one
for that shipment import id, one for 1, one for a really big one ).


This is definitely the case. And we are generally looking for newer data for most operations.

Thanks for looking at it.
On Mon, 2021-12-06 at 10:19 -0700, Rob Sargent wrote:
To be clear, is it the devs or the ORM that's adding the ORDER and the 
LIMIT?  I'm betting on devs.  Do they need the smallest id (first
occurrence?) or do they need data common to all 5069 entries (Name?) and
any record will do?  For the former they might be better off asking for
just the attributes they need, and for the latter you need to provide an
option which gets them that single record.  Of course, if they have the
"smallest id" in hand they should request that.

That assumes I could figure out what bit of ORM code is generating this, talk to them, and then get them to actually think about what data they're looking for and its impact on the database. :/ Given my 25-year track record with devs, I'm thinking of that as plan B. Hopefully, though, if they're looking for something common to all the records they would look at the parent table instead.

I do expect the dev actually specified the order/limit for some reason.

Thank you for the suggestions.
On 12/6/21 10:02, Alan Hodgson wrote:
> On Mon, 2021-12-06 at 10:19 -0700, Rob Sargent wrote:
>> To be clear, is it the devs or the ORM that's adding the ORDER and the
>> LIMIT?  I'm betting on devs.  Do they need the smallest id (first
>> occurrence?) or do they need data common to all 5069 entries (Name?) and
>> any record will do?  For the former they might be better off asking for
>> just the attributes they need, and for the latter you need to provide an
>> option which gets them that single record.  Of course, if they have the
>> "smallest id" in hand they should request that.
> 
> That assumes I could figure out what bit of ORM code is generating this, 
> talk to them, and then get them to actually think about what data 
> they're looking for and its impact on the database. :/ Given my 25-year 
> track record with devs, I'm thinking of that as plan B. Hopefully, though, 
> if they're looking for something common to all the records they would 
> look at the parent table instead.
> 
> I do expect the dev actually specified the order/limit for some reason.

Maybe I'm silly, but why is asking them a Plan B?

> 
> Thank you for the suggestions.


-- 
Adrian Klaver
adrian.klaver@aklaver.com





po 6. 12. 2021 v 18:21 odesílatel Francisco Olarte <folarte@peoplecall.com> napsal:
On Mon, 6 Dec 2021 at 18:03, Alan Hodgson <ahodgson@lists.simkin.ca> wrote:
...
> The table has nearly 29 million records. 5069 of them match shipment_import_id = 5090609. There is an index on shipment_import_id, which the planner happily uses when there is no LIMIT. Yet with it the query planner will always do something like:
>
> # explain SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
>                                                         QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=0.44..873.35 rows=1 width=243)
>   ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122227.70 rows=5868 width=243)
>         Filter: (shipment_import_id = 5090609)
> .. which takes minutes.

Can you post an explain analyze? To me it seems like the planner
thinks shipment_import_id is randomly distributed and the table is
well correlated with its PK, so scanning it for the first id should
be fast.

But from the names of the fields you may have a correlation between
shipment_import_id and id hidden somewhere ( like they are two serials
growing together, you query for the latest shipment ids, and it scans
all the table ). An explain analyze should show that ( or three: one
for that shipment import id, one for 1, one for a really big one ).

> Just wondering if there's a knob I can turn to make these more likely to work without constantly implementing workarounds?

You may try a composite index.

+1. These issues can be solved by composite indexes. The low LIMIT clause distorts the costs, and when the data are not really random, the index scan can run too long.





On 12/6/21 11:02 AM, Alan Hodgson wrote:
On Mon, 2021-12-06 at 10:19 -0700, Rob Sargent wrote:
To be clear, is it the devs or the ORM that's adding the ORDER and the 
LIMIT?  I'm betting on devs.  Do they need the smallest id (first
occurrence?) or do they need data common to all 5069 entries (Name?) and
any record will do?  For the former they might be better off asking for
just the attributes they need, and for the latter you need to provide an
option which gets them that single record.  Of course, if they have the
"smallest id" in hand they should request that.

That assumes I could figure out what bit of ORM code is generating this, talk to them, and then get them to actually think about what data they're looking for and its impact on the database. :/ Given my 25-year track record with devs, I'm thinking of that as plan B. Hopefully, though, if they're looking for something common to all the records they would look at the parent table instead.

I do expect the dev actually specified the order/limit for some reason.
Until you know what they're after it's hard to make the correct adjustment.  Another index is more overhead and may not be necessary, or even the best solution, for getting that one tuple they're after.

Thank you for the suggestions.
Upthread you hoped for a general solution.  The devs writing better queries is your best bet.
If you cannot, or choose not to, talk with the devs or at least their manager, you'll be chasing these for a long time.  Those dealing with ORM code are server-side: they are supposed to be the smart ones!  Maybe they can be taught.





On Mon, 2021-12-06 at 19:22 +0100, Pavel Stehule wrote:

> po 6. 12. 2021 v 18:21 odesílatel Francisco Olarte <folarte@peoplecall.com> napsal:
> > On Mon, 6 Dec 2021 at 18:03, Alan Hodgson <ahodgson@lists.simkin.ca> wrote:
> > > # explain SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
> > >                                                         QUERY PLAN
> > > -----------------------------------------------------------------------------------------------------------------------------
> > > Limit  (cost=0.44..873.35 rows=1 width=243)
> > >   ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122227.70 rows=5868 width=243)
> > >         Filter: (shipment_import_id = 5090609)
> > > .. which takes minutes.
> > >
> > > Just wondering if there's a knob I can turn to make these more likely to work without constantly implementing workarounds?
> > 
> > You may try a composite index.
>
> +1. These issues can be solved by composite indexes. The low LIMIT clause distorts the costs, and when the data are not really random, the index scan can run too long.

An ugly alternative is to use "ORDER BY id + 0", which prevents PostgreSQL
from using the index.
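Applied to the query from this thread, that would be something like:

    -- "id + 0" is an expression, so the planner can no longer satisfy
    -- the ORDER BY by walking the primary key index; it has to fetch
    -- the matching rows and sort them
    SELECT "shipment_import_records".*
    FROM "shipment_import_records"
    WHERE "shipment_import_records"."shipment_import_id" = 5090609
    ORDER BY "shipment_import_records"."id" + 0 ASC
    LIMIT 1;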

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




On 12/6/21 22:16, Laurenz Albe wrote:
> An ugly alternative is to use "ORDER BY id + 0", which prevents PostgreSQL
> from using the index.

That was actually the earliest form of Oracle hints. I remember doing 
exactly that in Oracle 5.1.22 on VAX/VMS.

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com




Alan:

On Mon, 6 Dec 2021 at 18:58, Alan Hodgson <ahodgson@lists.simkin.ca> wrote:
> On Mon, 2021-12-06 at 18:20 +0100, Francisco Olarte wrote:
> Can you post an explain analyze? To me it seems like the planner
> thinks shipment_import_id is randomly distributed and the table is
> well correlated with its PK, so scanning it for the first id should
> be fast.
>    # explain analyze SELECT "shipment_import_records".* FROM
>    "shipment_import_records" WHERE
>    "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY
>    "shipment_import_records"."id" ASC LIMIT 1;
>    ---------------------------------------------------------------------
>     Limit  (cost=0.44..873.08 rows=1 width=243) (actual time=31689.725..31689.726 rows=1 loops=1)
>       ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5122405.71 rows=5870 width=243) (actual time=31689.723..31689.724 rows=1 loops=1)
>             Filter: (shipment_import_id = 5090609)
>             Rows Removed by Filter: 28710802

This seems to be the problem. IIRC you said you had 29M rows; it is
index-scanning nearly all the table.

I'm not an explain guru, but it seems to me you have a correlation
undetected by the planner.


>       The biggest one (but yes, "earlier"):
>
> # explain analyze SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 1247888 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
> --------------------------------------------------------------------------------
> Limit  (cost=0.44..426.59 rows=1 width=243) (actual time=8007.069..8007.070 rows=1 loops=1)
>   ->  Index Scan using shipment_import_records_pkey on shipment_import_records  (cost=0.44..5126628.40 rows=12030 width=243) (actual time=8007.068..8007.068 rows=1 loops=1)
>         Filter: (shipment_import_id = 1247888)
>         Rows Removed by Filter: 10929193
> Planning Time: 0.584 ms
> Execution Time: 8007.086 ms
> (6 rows)

mmm, the times seem to correlate with rows removed by filter ~ rows
read. To me it seems the planner thinks they are well distributed but
they are not ( from the names it seems like you insert a lot of
shipment_import_records with the same shipment_import_id, and they get
a serial or similar id, so they are bunched together ). The planner
thinks they are evenly distributed and tries to read the table in id
order, probably because the table correlates well with the id index,
but when you select one of the later-inserted imports it fails
miserably and has to scan the whole table, or a lot of it.
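One way to see what the planner believes here is the per-column correlation statistic (a sketch; pg_stats is the standard stats view):

    -- correlation near 1.0 means the column's values track the table's
    -- physical row order, which is what makes the pkey scan look cheap
    SELECT attname, n_distinct, correlation
    FROM pg_stats
    WHERE tablename = 'shipment_import_records'
      AND attname IN ('id', 'shipment_import_id');

Note these are per-column stats only; the planner keeps no statistic for the cross-column bunching itself.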


> And the smallest/latest, which actually uses the "right" index:
> # explain analyze SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5116174 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
> --------------------------------------------------------------------------------
> Limit  (cost=145.44..145.44 rows=1 width=243) (actual time=0.018..0.018 rows=1 loops=1)
>   ->  Sort  (cost=145.44..145.64 rows=79 width=243) (actual time=0.017..0.018 rows=1 loops=1)
>         Sort Key: id
>         Sort Method: quicksort  Memory: 26kB
>         ->  Index Scan using index_shipment_import_records_on_shipment_import_id on shipment_import_records  (cost=0.44..145.05 rows=79 width=243) (actual time=0.013..0.014 rows=1 loops=1)
>               Index Cond: (shipment_import_id = 5116174)
> Planning Time: 0.104 ms
> Execution Time: 0.032 ms

This one found few rows, and probably the planner knew it ( from the
rows=79 ), so it went the shipment_import_id index route.

> But from the names of the field you may have correlation between
> shipment_import_id and id hidden somewhere ( like they are two serial
> growing together, you query for the latest shipment ids and it scans
> all the table ). An explain analyze should show that ( or three, one
> for that shipment import id, one for 1, one for a really big one )
> This is definitely the case. And we are generally looking for newer data for most operations.
> Thanks for looking at it.

Well, I have no solution for that, but at least I think the problem
is identified.

You could try building an index on (shipment_import_id, id). It would
solve this kind of lookup, and it would certainly pull its weight if
your ORM makes lots of these queries. I'm not sure of a way to fix the
planner's behavior with only the existing indexes, so the composite
one seems worth testing.

Francisco Olarte.