Re: Understanding partial index selection - Mailing list pgsql-general

From David Rowley
Subject Re: Understanding partial index selection
Date
Msg-id CAApHDvrV6Nx0CVzwdyCW6K60m9O_LLmyANc=NFM+4RucjhniGw@mail.gmail.com
Whole thread Raw
In response to Understanding partial index selection  (Owen Nelson <onelson@gmail.com>)
List pgsql-general
On Wed, 29 Nov 2023 at 11:23, Owen Nelson <onelson@gmail.com> wrote:
>     "message_payload_not_null_pidx" btree (expiration) WHERE payload IS NOT NULL

> I periodically run a query like this:
> ```
> UPDATE message SET payload = NULL WHERE id IN (

> Update on message  (cost=1773.41..44611.36 rows=5000 width=283) (actual time=20913.192..20913.194 rows=0 loops=1)
>   ->  Nested Loop  (cost=1773.41..44611.36 rows=5000 width=283) (actual time=20881.320..20886.541 rows=51 loops=1)
>         ->  HashAggregate  (cost=1772.85..1822.85 rows=5000 width=88) (actual time=20881.286..20882.052 rows=51
loops=1)
>               Group Key: ("ANY_subquery".id)::text
>               ->  Subquery Scan on "ANY_subquery"  (cost=0.00..1760.35 rows=5000 width=88) (actual
time=8425.022..20881.244rows=51 loops=1)
 
>                     ->  Limit  (cost=0.00..1710.35 rows=5000 width=38) (actual time=8425.017..20881.219 rows=51
loops=1)
>                           ->  LockRows  (cost=0.00..2112304.92 rows=6175068 width=38) (actual
time=8425.016..20881.212rows=51 loops=1)
 
>                                 ->  Seq Scan on message message_1  (cost=0.00..2050554.24 rows=6175068 width=38)
(actualtime=8424.977..20880.945 rows=65 loops=1)
 
>                                       Filter: ((payload IS NOT NULL) AND (expiration <= now()))
>                                       Rows Removed by Filter: 37772897
>         ->  Index Scan using pk_message on message  (cost=0.56..8.56 rows=1 width=191) (actual time=0.073..0.073
rows=1loops=51)
 
>               Index Cond: ((id)::text = ("ANY_subquery".id)::text)
> Planning Time: 0.237 ms
> Execution Time: 20913.310 ms

I think the most likely cause is that the index has just become
bloated from all the updates.  If you run the query after running SET
enable_seqscan TO off; then, proving the planner opts to use the
message_payload_not_null_pidx, you'll see what the planner's estimated
cost of that scan is. If you see the index being used, then that'll at
least confirm the index was not picked due to costs.

If the index gets used, then I'd check the size of the
message_payload_not_null_pidx index.  You could also consider using
pgstatindex() [1] to check the state of the index and if it's bloated,
reindex it.

David

[1] https://www.postgresql.org/docs/current/pgstattuple.html



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Understanding partial index selection
Next
From: Owen Nelson
Date:
Subject: Re: Understanding partial index selection