Home > mailing lists

Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT - Mailing list pgsql-general

From	Alan Hodgson
Subject	Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT
Date	December 6, 2021 20:03:02
Msg-id	bc47e9d0c490942eebb01dc1129097012005e1d2.camel@lists.simkin.ca Whole thread Raw
Responses	Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT (Michael Lewis <mlewis@entrata.com>) Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT (Rob Sargent <robjsargent@gmail.com>) Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT (Francisco Olarte <folarte@peoplecall.com>)
List	pgsql-general

Tree view

I keep running into problems like these:

Devs are using an ORM. It really likes to produce queries like:

SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;

I don't know why they do this. Usually it's more like 50 for pagination which make more sense. But for whatever reason this keeps coming up.

The table has nearly 29 million records. 5069 of them match shipment_import_id = 5090609. There is an index on shipment_import_id, which the planner happily uses without the LIMIT specifically. Yet with it the query planner will always do something like:

# explain SELECT "shipment_import_records".* FROM "shipment_import_records" WHERE "shipment_import_records"."shipment_import_id" = 5090609 ORDER BY "shipment_import_records"."id" ASC LIMIT 1;
                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..873.35 rows=1 width=243)
  -> Index Scan using shipment_import_records_pkey on shipment_import_records (cost=0.44..5122227.70 rows=5868 width=243)
        Filter: (shipment_import_id = 5090609)

.. which takes minutes.

I know I can work around this. Generally I would just drop the index on shipment_import_id and create one on shipment_import_id,id. Or if I can get the devs to wrap their query in an inner select with a fake offset to fool the query planner that works too. But both seem hacky.

Just wondering if there's a knob I can turn to make these more likely to work without constantly implementing workarounds?

Thanks for any help.

pgsql-general by date:

From: Avi Weinberg
Date: 06 December 2021, 17:50:35
Subject: Are Foreign Key Disabled During Logical Replication Initial Sync?

From: Michael Lewis
Date: 06 December 2021, 20:18:11
Subject: Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT - Mailing list pgsql-general

Previous

Next