Re: Frequetly updated partial index leads to bloat on index for Postresql 11 - Mailing list pgsql-general

From Peter Geoghegan
Subject Re: Frequetly updated partial index leads to bloat on index for Postresql 11
Date
Msg-id CAH2-WzmnfOyY2=orNS6RRecEj6o=MqXwRn3Ej2Quc=Sx75Aizg@mail.gmail.com
Whole thread Raw
In response to Re: Frequetly updated partial index leads to bloat on index for Postresql 11  (Tom Dearman <tom.dearman@gmail.com>)
List pgsql-general
On Fri, Jul 16, 2021 at 9:19 AM Tom Dearman <tom.dearman@gmail.com> wrote:
> Other indexes do bloat, but the percentage bloat is a lot less

I have to imagine that the remaining problems have a lot to do with
the fact that this is a partial index -- the partial index naturally
gets vacuumed much less frequently than what would generally be
considered ideal for the index itself. In general VACUUM scheduling
makes the naive assumption that the indexes have the same needs as the
table, which is far from the case with this partial index, for your
workload. It's all of the specifics, taken together.

It sounds like this is a case where bottom-up index deletion won't
help -- it will only trigger in those indexes that are not "logically
modified" by updates. But you're logically modifying these values. Or
you're causing them to not need to be in the index anymore, by
modifying the predicate. But that won't trigger bottom-up deletion.
It's a bit like a delete, as far as the physical index structure is
concerned -- the index won't be eagerly modified by the executor.

The overall picture is that you cycle through all of the values in the
table, and no cleanup can take place other than plain VACUUM (at least
not to any significant extent). Although only a few hundred values are
logically required to be indexed by the partial index at any one time,
in practice no cleanup can run for long stretches of time (autovacuum
just doesn't know about cases like this). This is why the partial
index inevitably exceeds its theoretical pristine/high watermark size,
which is actually more than 1 page/8KB, but still probably a lot less
than what you actually see -- the partial index "falls through the
cracks", even with recent enhancements that made cleanup more eager
and more dynamic in certain other cases.

I am afraid that I don't have a good suggestion right now. I can think
of incremental improvements that would address this case, but for now
they're just ideas. Fundamentally, we need to get to the partial index
much more frequently than the other indexes, either within VACUUM or
within some other mechanism. For example a conservative implementation
of retail index tuple deletion might totally fix the issue. It does
very much look like a problem in these partial indexes in particular
-- it's quite possible that the other indexes won't grow at all due to
garbage index tuples, especially on Postgres 14.

--
Peter Geoghegan



pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Getting "ERROR: upper bound of FOR loop cannot be null" when trying to use a stored proc in a check constraint def.
Next
From: "David G. Johnston"
Date:
Subject: Re: Getting "ERROR: upper bound of FOR loop cannot be null" when trying to use a stored proc in a check constraint def.