Home > mailing lists

Re: Understanding partial index selection - Mailing list pgsql-general

From	David Rowley
Subject	Re: Understanding partial index selection
Date	November 29, 2023 02:08:27
Msg-id	CAApHDvrV6Nx0CVzwdyCW6K60m9O_LLmyANc=NFM+4RucjhniGw@mail.gmail.com Whole thread Raw
In response to	Understanding partial index selection (Owen Nelson <onelson@gmail.com>)
List	pgsql-general

Tree view

On Wed, 29 Nov 2023 at 11:23, Owen Nelson <onelson@gmail.com> wrote:
>     "message_payload_not_null_pidx" btree (expiration) WHERE payload IS NOT NULL

> I periodically run a query like this:
> ```
> UPDATE message SET payload = NULL WHERE id IN (

> Update on message  (cost=1773.41..44611.36 rows=5000 width=283) (actual time=20913.192..20913.194 rows=0 loops=1)
>   ->  Nested Loop  (cost=1773.41..44611.36 rows=5000 width=283) (actual time=20881.320..20886.541 rows=51 loops=1)
>         ->  HashAggregate  (cost=1772.85..1822.85 rows=5000 width=88) (actual time=20881.286..20882.052 rows=51
loops=1)
>               Group Key: ("ANY_subquery".id)::text
>               ->  Subquery Scan on "ANY_subquery"  (cost=0.00..1760.35 rows=5000 width=88) (actual
time=8425.022..20881.244rows=51 loops=1)
 
>                     ->  Limit  (cost=0.00..1710.35 rows=5000 width=38) (actual time=8425.017..20881.219 rows=51
loops=1)
>                           ->  LockRows  (cost=0.00..2112304.92 rows=6175068 width=38) (actual
time=8425.016..20881.212rows=51 loops=1)
 
>                                 ->  Seq Scan on message message_1  (cost=0.00..2050554.24 rows=6175068 width=38)
(actualtime=8424.977..20880.945 rows=65 loops=1)
 
>                                       Filter: ((payload IS NOT NULL) AND (expiration <= now()))
>                                       Rows Removed by Filter: 37772897
>         ->  Index Scan using pk_message on message  (cost=0.56..8.56 rows=1 width=191) (actual time=0.073..0.073
rows=1loops=51)
 
>               Index Cond: ((id)::text = ("ANY_subquery".id)::text)
> Planning Time: 0.237 ms
> Execution Time: 20913.310 ms

I think the most likely cause is that the index has just become
bloated from all the updates.  If you run the query after running SET
enable_seqscan TO off; then, proving the planner opts to use the
message_payload_not_null_pidx, you'll see what the planner's estimated
cost of that scan is. If you see the index being used, then that'll at
least confirm the index was not picked due to costs.

If the index gets used, then I'd check the size of the
message_payload_not_null_pidx index.  You could also consider using
pgstatindex() [1] to check the state of the index and if it's bloated,
reindex it.

David

[1] https://www.postgresql.org/docs/current/pgstattuple.html

pgsql-general by date:

From: Tom Lane
Date: 29 November 2023, 02:07:57
Subject: Re: Understanding partial index selection

From: Owen Nelson
Date: 29 November 2023, 04:12:51
Subject: Re: Understanding partial index selection

Re: Understanding partial index selection - Mailing list pgsql-general

Previous

Next