Home > mailing lists

BUG #18036: Query planner chooses sub-optimal query path given enough IN tuples. - Mailing list pgsql-bugs

From	PG Bug reporting form
Subject	BUG #18036: Query planner chooses sub-optimal query path given enough IN tuples.
Date	July 26, 2023 07:34:29
Msg-id	18036-f168e30b92352489@postgresql.org Whole thread Raw
Responses	Re: BUG #18036: Query planner chooses sub-optimal query path given enough IN tuples.
List	pgsql-bugs

Tree view

The following bug has been logged on the website:

Bug reference:      18036
Logged by:          Patrick Leamon
Email address:      patrick@redbubble.com
PostgreSQL version: 14.3
Operating system:   Unsure - AWS Aurora PostgreSQL
Description:

Hi,
  I'm hitting some odd query plans.  The structure of the query is like
this:

SELECT field1, field2, field3
FROM my_table
WHERE (field1, field2) IN (("a", "b"), ("c", "b"))

There is a composite index on field1 + field2 in that order.

When the query is working well, I end up with a query plan that looks
like:

BitmapOr
 -> Bitmap Index Scan
    -> Index Cond (field1="a" AND field2="b")
 -> Bitmap Index Scan
    -> Index Cond (field1="c" AND field2="d")

This is great and super fast.

When I give "too many" tuples in the IN clause, I end up with a query plan
like this:

BitmapAnd
  BitmapOr
   -> Bitmap Index Scan
      -> Index Cond (field1="a" AND field2="b")
   -> Bitmap Index Scan
      -> Index Cond (field1="c" AND field2="b")
  Bitmap Index Scan
    ->  Index Cond (field2="b")
    
This is not great and very slow.

The example above is simplified, in reality this is a 4 billion row table.
field 2 has only around 100 different values, where field 1 has millions.
So scanning the index on the "wrong side" of the composite index is very
costly and ends up timing out.

Over time the value of "too many" tuples is trending down.  100 used to be
fine, now that's having issues too.

Why would the query planner ever choose to scan a composite index on the
secondary column?

pgsql-bugs by date:

From: Tom Lane
Date: 26 July 2023, 06:54:33
Subject: Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used

From: Masahiko Sawada
Date: 26 July 2023, 08:55:23
Subject: Re: BUG #18031: Segmentation fault after deadlock within VACUUM's parallel worker

BUG #18036: Query planner chooses sub-optimal query path given enough IN tuples. - Mailing list pgsql-bugs

Previous

Next