Re: Question about behavior of conditional indexes - Mailing list pgsql-general

From Koen De Groote
Subject Re: Question about behavior of conditional indexes
Date
Msg-id CAGbX52EyNLqW2W9mwVdE3zh3DqKjghqZA87rYrTz+zEEfC1F7w@mail.gmail.com
Whole thread Raw
In response to Re: Question about behavior of conditional indexes  (Michael Lewis <mlewis@entrata.com>)
Responses Re: Question about behavior of conditional indexes  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
Thanks for the replies, everyone.

Gavin - I can't upgrade to a more recent version, at least not for the foreseeable future. From what I'm reading, it's the best path forward, but there's considerations to be made that I can't overrule.

Ninad - As I suspected about VACUUM and VACUUM FULL. Thanks for confirming

Michael - Yes, my mistakes in manually obfuscating.

Going over the documentation, is seems after pg 11, several additions are made, like "vacuum_index_cleanup" and "VACUUM INDEX_CLEANUP".
From what I'm reading, these did not exist in pg11.Googling a bit shows me this: https://www.depesz.com/2019/05/01/waiting-for-postgresql-12-allow-vacuum-to-be-run-with-index-cleanup-disabled/

From which I assume that index cleaning is the default in pg11 and cannot be controlled. Though it appears to not be "enough" for my use case.

There is already a cronjob that does "VACUUM ANALYZE" during the night. Though it seems the index keeps growing, regardless of cleanup... I'm not sure what to think of that, or what to look for, apart from more agressive autovacuum.

I'll look into dropping and recreating this index without those fields, and the fillfactor you mention.







On Tue, Sep 21, 2021 at 6:15 PM Michael Lewis <mlewis@entrata.com> wrote:
You likely need to tune your autovacuum settings such that the index bloat reaches a steady-state and does not continue to increase. When vacuum runs, it will remove dead (and no longer visible to any running transactions) tuples aka row versions from each page (8KB block by default) in the file for that table's data. It will also update the index, except in newer versions of Postgres where that behavior becomes optional (see manual for which version and whether it still runs by default). If you are I/O bound and cannot tune the system defaults to autovacuum more proactively (when a smaller percent of rows are dead), then perhaps just change the settings for that table as it seems to be functioning a bit like a queue. Or you might consider setting up a daily job to vacuum analyze on all tables, if there is a period of low system activity. If you didn't have the index on the columns you are updating, then reducing fillfactor would be an option to increase HOT updates and thereby prevent the index bloat. Alas, that's not an option with the index needing to reference those values that are being changed.

"index002" btree (action_performed, should_still_perform_action, action_performed_at DESC) WHERE should_still_perform_action = false AND action_performed = true
That index seems odd to me. Why index a single value for the first two columns? I would recreate it with those skipped. Also, based on the names, I'd expect the opposite for true and false. That is, the "interesting" rows are where the action has NOT yet been performed yet and it is needed. I'd expect the index as defined to cover most of the table rather than a small fraction. Perhaps just a typo from manual obfuscation.
For what it's worth, you can create new concurrently, drop old concurrently, then rename new to old. That would be the same result as a reindex concurrently.

pgsql-general by date:

Previous
From: Gavin Flower
Date:
Subject: Re: Timestamp with vs without time zone.
Next
From: Michael Lewis
Date:
Subject: Re: Question about behavior of conditional indexes