Thread: Partial index "microvacuum"

Partial index "microvacuum"

From: Marko Tiikkaja <marko@joh.to>
Date: Wed, Sep 15, 2021
So I've been looking at issues we used to have in production some time
ago, which eventually led us to migrate away from partial indexes in
some cases.  In the end, I'm surprised how easy this (or at least a
similar case) was to reproduce.  The attached program does some
UPDATEs where around every third update deletes the row from the
partial index since it doesn't match indpred anymore.  In that case
the row is immediately UPDATEd back to match the index WHERE clause
again.  This roughly emulates what some of our processes do in
production.
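
Roughly, the shape of the workload is something like this (a
simplified sketch; the schema and names here are illustrative, not the
actual ones from the attached program):

    CREATE TABLE queue (
        id        bigint PRIMARY KEY,
        processed boolean NOT NULL DEFAULT false,
        payload   text
    );
    CREATE INDEX queue_unprocessed_idx ON queue (id) WHERE NOT processed;

    -- most iterations just modify the payload
    UPDATE queue SET payload = 'x' WHERE id = 42;

    -- but roughly every third update flips "processed", so the row drops
    -- out of the partial index, and the very next statement flips it
    -- back so that it re-enters the index
    UPDATE queue SET processed = true  WHERE id = 42;
    UPDATE queue SET processed = false WHERE id = 42;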

Today, running the program for a few minutes (until the built-in
262144 iteration limit), I usually end up with a partial index where
fetching the only matching row takes milliseconds on a cold cache, and
over a millisecond on a hot one.  Finding the row through the primary
key is still fast, because the bloat there gets cleaned up.  As far as
I can tell, after the index has gotten into this state, there's no way
to clean it up except VACUUMing the entire table or a REINDEX.  Both
solutions are pretty bad.
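
(For reference, I mean roughly the following, with the illustrative
names from the sketch above:

    VACUUM queue;                                      -- scans the whole table
    REINDEX INDEX queue_unprocessed_idx;               -- locks out writes
    REINDEX INDEX CONCURRENTLY queue_unprocessed_idx;  -- v12+, slower but non-blocking

none of which is attractive on a large, busy table.)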

My working theory was that this has to do with the fact that
HeapTupleSatisfiesMVCC doesn't set the HEAP_XMAX_COMMITTED bit here,
but I'm not so sure anymore.  Has anyone seen something like this?  If
that really is what's happening here, then I can see why we wouldn't
want to slow down SELECTs with expensive visibility checks.  But that
really leaves me wishing for something like VACUUM INDEX partial_idx.
Otherwise your elephant just keeps getting slower and slower until
you get called at 2 AM to play REINDEX.

(I've tested this on 9.6, v11 and v13.  13 seems to be a bit better
here, but not "fixed", I think.)


.m


Re: Partial index "microvacuum"

From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, Sep 15, 2021
On Wed, Sep 15, 2021 at 7:18 AM Marko Tiikkaja <marko@joh.to> wrote:
> So I've been looking at issues we used to have in production some time
> ago, which eventually led us to migrate away from partial indexes in
> some cases.  In the end, I'm surprised how easy this (or at least a
> similar case) was to reproduce.

> (I've tested this on 9.6, v11 and v13.  13 seems to be a bit better
> here, but not "fixed", I think.)

What about v14? There were significant changes to the
microvacuum/index deletion stuff in that release:

https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION

-- 
Peter Geoghegan



Re: Partial index "microvacuum"

From: Marko Tiikkaja <marko@joh.to>
Date: Thu, Sep 16, 2021
On Wed, Sep 15, 2021 at 7:25 PM Peter Geoghegan <pg@bowt.ie> wrote:
> What about v14? There were significant changes to the
> microvacuum/index deletion stuff in that release:
>
> https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION

Huh.  Interesting.  I'm sorry, I wasn't aware of this work and didn't
have version 14 at hand.  But it looks like both the partial index
and the secondary index on (id::text) get cleaned up nicely there.
I even tried a version where I have a snapshot open for the entire
run, and the subsequent SELECTs clean the bloat up.  I'll need to
read up on the details a bit to understand exactly what changed, but
it appears that at least this particular pattern has already been
fixed.
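
For what it's worth, this is the kind of thing one can use to watch
the partial index while the test runs (the names are again just
illustrative):

    -- physical size of the partial index
    SELECT pg_size_pretty(pg_relation_size('queue_unprocessed_idx'));

    -- leaf density / fragmentation, via the pgstattuple extension
    CREATE EXTENSION IF NOT EXISTS pgstattuple;
    SELECT avg_leaf_density, leaf_fragmentation
      FROM pgstatindex('queue_unprocessed_idx');

    -- and the actual cost of fetching the matching row(s) through it
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM queue WHERE NOT processed;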

Thank you so much for your work on this!


.m



Re: Partial index "microvacuum"

From: Peter Geoghegan <pg@bowt.ie>
Date:
On Thu, Sep 16, 2021 at 4:45 AM Marko Tiikkaja <marko@joh.to> wrote:
> Huh.  Interesting.  I'm sorry, I wasn't aware of this work and didn't
> have version 14 at hand.  But it looks like both the partial index
> and the secondary index on (id::text) get cleaned up nicely there.

That's great.

I understand why other hackers see partial indexes as a special case,
but I don't really see them that way. The only substantive difference
is the considerations for HOT safety in your scenario, versus a
scenario with an equivalent non-partial index. By equivalent I mean an
index that is the same in every way, but doesn't have a predicate. And
with the same workload. In other words, an index that really should
have been partial (because the "extra" index tuples are useless in
practice), but for whatever reason wasn't defined that way.
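
Concretely, the comparison I have in mind is a pair like this (reusing
the illustrative names from upthread):

    -- the partial index: only the rows that are interesting in practice
    CREATE INDEX queue_unprocessed_idx ON queue (id) WHERE NOT processed;

    -- the "equivalent" non-partial index: same key, same workload, no
    -- predicate; the extra entries for processed rows are just dead weight
    CREATE INDEX queue_id_idx ON queue (id);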

If you look at what's going on at the level of the constantly modified
leaf pages in each scenario, then you'll see no differences -- none at
all. The problem of VACUUM running infrequently is really no worse
with the partial index. VACUUM runs infrequently relative to the small
useful working set in *either* scenario. The useless extra index
tuples in the non-partial-index scenario only *hide* the problem --
obviously they're not protective in any way.

> I even tried a version where I have a snapshot open for the entire
> run, and the subsequent SELECTs clean the bloat up.  I'll need to
> read up on the details a bit to understand exactly what changed, but
> it appears that at least this particular pattern has already been
> fixed.

Bottom-up index deletion tends to help even when a snapshot holds back
cleanup. For example:

https://www.postgresql.org/message-id/CAGnEbogATZS1mWMVX8FzZHMXzuDEcb10AnVwwhCtXtiBpg3XLQ@mail.gmail.com

It's hard to explain exactly why this happens. The short version is
that there is a synergy between deduplication and bottom-up index
deletion. As bottom-up index deletion starts to fail (because it
fundamentally isn't possible to delete any more index tuples on the
page due to the basic invariants for cleanup not allowing it),
deduplication "takes over for the page". Deduplication can "absorb"
extra versions from non-hot updates. A deduplication pass could easily
buy us enough time for the old snapshot to naturally go away.  The
next time a bottom-up index deletion pass is attempted for the same
page, we'll probably find something to delete.
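
If you want to watch this at the page level, the pageinspect extension
(the relevant columns are there in v13+) shows both the LP_DEAD hints
left behind by index scans and the posting list tuples created by
deduplication.  Something like this, picking a leaf block of the index
in question:

    CREATE EXTENSION IF NOT EXISTS pageinspect;

    -- per-page summary: live vs. LP_DEAD items, remaining free space
    SELECT live_items, dead_items, free_size
      FROM bt_page_stats('queue_unprocessed_idx', 1);

    -- per-item detail: "dead" is the LP_DEAD hint, and a non-NULL "tids"
    -- array means the tuple is a deduplicated posting list
    SELECT itemoffset, dead, htid, tids IS NOT NULL AS is_posting_list
      FROM bt_page_items('queue_unprocessed_idx', 1);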

Just accepting version-driven page splits was always a permanent
solution to a temporary problem.

> Thank you so much for your work on this!

Thanks Marko!

-- 
Peter Geoghegan