Re: Trying to understand why a query is filtering when there is a composite index - Mailing list pgsql-performance

From	Shiv Iyer
Subject	Re: Trying to understand why a query is filtering when there is a composite index
Date	August 19, 2024 11:05:40
Msg-id	CAALLqt-xk0ckt9s_9OvsNfcFZOxgo4QdpkxoWz6xne7Y7OGDdg@mail.gmail.com
In response to	Trying to understand why a query is filtering when there is a composite index ("Stephen Samuel (Sam)" <sam@sksamuel.com>)
List	pgsql-performance

Hello,

The query's behavior is expected, given how PostgreSQL handles composite indexes and MVCC. The index on `(a, b)` is used for the `a` condition, but the planner applies `b IN (<ids>)` as a filter on the index entries it scans rather than as an index condition. Although an index-only scan is used, heap fetches still occur to verify tuple visibility, which is necessary whenever the visibility map does not mark a heap page as all-visible. The filter on `b` itself does not need the heap, because `b` is stored in the index. This is standard PostgreSQL behavior and ensures consistent results. Performance remains good, but the heap fetches could be reduced if needed by reconsidering the index structure or query design. Thank you!
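
As a rough illustration of the visibility check described above, here is a toy Python model of an index-only scan. It is only a sketch, not PostgreSQL's executor code: the names `IndexEntry`, `index_only_scan`, and `all_visible_pages`, the `id0`..`id10` values, and the heap page number 7 are all made up for the example, and a real B-tree descends to the matching range instead of looping over every entry. The point is only to show how the plan's counters relate: the `a` condition selects the entries, a heap fetch happens for each entry whose heap page is not marked all-visible, and the `b` filter is then evaluated on the value stored in the index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    a: int
    b: str
    heap_page: int  # hypothetical: heap page the entry's tuple lives on


def index_only_scan(index, all_visible_pages, a_value, b_values):
    """Toy model: `a` acts as the index condition, `b` as a filter."""
    wanted = set(b_values)
    rows, heap_fetches, removed_by_filter = [], 0, 0
    for entry in index:
        if entry.a != a_value:  # index condition (a real B-tree seeks here)
            continue
        if entry.heap_page not in all_visible_pages:
            heap_fetches += 1  # visibility must be checked in the heap
        if entry.b in wanted:  # filter, evaluated on the indexed value
            rows.append(entry.b)
        else:
            removed_by_filter += 1
    return rows, heap_fetches, removed_by_filter


# 11 index entries share the same `a`; 10 of them match the IN list.
index = [IndexEntry(662028765, f"id{i}", heap_page=7) for i in range(11)]
wanted = [f"id{i}" for i in range(10)]

# No page is marked all-visible: every matching entry costs a heap fetch.
rows, fetches, removed = index_only_scan(index, set(), 662028765, wanted)
print(len(rows), fetches, removed)  # prints: 10 11 1
```

With `all_visible_pages={7}` the same call reports zero heap fetches while still returning 10 rows and removing 1 by the filter. In PostgreSQL itself the visibility map is maintained by VACUUM, so one reading consistent with the posted plan is that the pages touched by this particular query were not marked all-visible when it ran.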

On Mon, Aug 19, 2024 at 7:26 AM Stephen Samuel (Sam) <sam@sksamuel.com> wrote:

Hi folks.

I have a table with 4.5m rows per partition (16 partitions) (I know, very small, probably didn't need to be partitioned).

The table has two columns, a bigint and b text.

There is a unique index on (a,b)

The query is:

SELECT b
FROM table
WHERE a = <id>
  AND b IN (<ids>)


The visibility map is almost exclusively true. 
This table gets few updates.

The planner says index only scan, but is filtering on b.

Index Only Scan using pkey on table  (cost=0.46..29.09 rows=1 width=19) (actual time=0.033..0.053 rows=10 loops=1)
  Index Cond: (a = 662028765)
"  Filter: (b = ANY ('{634579987:662028765,561730945:662028765,505555183:662028765,472806302:662028765,401361055:662028765,363587258:662028765,346093772:662028765,314369897:662028765,289498328:662028765,217993946:662028765}'::text[]))"
  Rows Removed by Filter: 1
  Heap Fetches: 11
Planning Time: 0.095 ms
Execution Time: 0.070 ms

My question is, why isn't it using the index for column b? Is this expected? And why is it doing heap lookups for every row?

Performance is still good, but I am curious.

Thanks in advance!

Best Regards

Shiv Iyer
