Re: Trying to understand why a query is filtering when there is a composite index - Mailing list pgsql-performance

From Shiv Iyer
Subject Re: Trying to understand why a query is filtering when there is a composite index
Date
Msg-id CAALLqt-xk0ckt9s_9OvsNfcFZOxgo4QdpkxoWz6xne7Y7OGDdg@mail.gmail.com
Whole thread Raw
In response to Trying to understand why a query is filtering when there is a composite index  ("Stephen Samuel (Sam)" <sam@sksamuel.com>)
List pgsql-performance
Hello, 

The query's behavior is expected due to how PostgreSQL handles composite indexes and MVCC. The index on `(a, b)` is used efficiently for the `a` condition, but the `b IN (<ids>)` filter is more complex, leading to additional filtering rather than direct index usage. Although the index-only scan is utilized, heap fetches still occur to verify tuple visibility, a necessary step when the visibility map doesn’t confirm visibility or to apply the `b` filter accurately. This is standard in PostgreSQL’s handling of such queries, ensuring data consistency and accuracy. Performance remains good, but these heap fetches could be optimized if needed by reconsidering the index structure or query design. Thank you! 




On Mon, Aug 19, 2024 at 7:26 AM Stephen Samuel (Sam) <sam@sksamuel.com> wrote:
Hi folks.

I have a table with 4.5m rows per partition (16 partitions) (I know, very small, probably didn't need to be partitioned).

The table has two columns, a bigint and b text.
There is a unique index on (a,b)
The query is:

SELECT b
FROM table
WHERE a = <id>
AND b IN (<ids>)

The visibility map is almost exclusively true.
This table gets few updates.

The planner says index only scan, but is filtering on b.

Index Only Scan using pkey on table  (cost=0.46..29.09 rows=1 width=19) (actual time=0.033..0.053 rows=10 loops=1)
  Index Cond: (a = 662028765)
"  Filter: (b = ANY ('{634579987:662028765,561730945:662028765,505555183:662028765,472806302:662028765,401361055:662028765,363587258:662028765,346093772:662028765,314369897:662028765,289498328:662028765,217993946:662028765}'::text[]))"
  Rows Removed by Filter: 1
  Heap Fetches: 11
Planning Time: 0.095 ms
Execution Time: 0.070 ms

My question is, why isn't it using the index for column b? Is this expected? And why is it doing heap lookups for every row,.
Performance is still good, but I am curious.
Thanks in advance!


--

Best Regards 
Shiv Iyer 

pgsql-performance by date:

Previous
From: "Stephen Samuel (Sam)"
Date:
Subject: Re: Trying to understand why a query is filtering when there is a composite index
Next
From: Peter Geoghegan
Date:
Subject: Re: Trying to understand why a query is filtering when there is a composite index