Re: decoupling table and index vacuum - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: decoupling table and index vacuum
Date
Msg-id CAH2-WzmRvMrnhO4aRcd1vnoFkriQKSZqtP20xLi9L9YMGkRZ1A@mail.gmail.com
In response to Re: decoupling table and index vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: decoupling table and index vacuum  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Fri, Apr 23, 2021 at 1:04 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I think that a simple heuristic could work very well here, but it
> needs to be at least a little sensitive to the extremes. And I mean
> all of the extremes, not just the one from my example -- every
> variation exists and will cause problems if given zero weight.

To expand on this a bit, my objection to counting the number of live
tuples in the index (as a means to determining how aggressively each
individual index needs to be vacuumed) is this: it's driven by
positive feedback, not negative feedback. We should focus on *extreme*
adverse events (e.g., version-driven page splits) instead. We don't
even need to understand ordinary adverse events (e.g., how many dead
tuples are in the index).

The cost of accumulating dead tuples in an index (could be almost any
index AM) grows very slowly at first, and then suddenly explodes
(actually it's more like a cascade of correlated explosions, but for
the purposes of this explanation that doesn't matter). In a way, this
makes life easy for us. The cost of accumulating dead tuples rises so
dramatically at a certain inflection point that we can reasonably
assume that that's all that matters -- just stop the explosions. An
extremely simple heuristic that prevents these extreme adverse events
can work very well because that's where almost all of the possible
downside is. We can be sure that these extreme adverse events are
universally very harmful (workload doesn't matter). Note that the same
is not true for an approach driven by positive feedback -- it'll be
fragile because it depends on workload characteristics in unfathomably
many ways. We should focus on what we can understand with a high
degree of confidence.

We just need to identify what the extreme adverse event is in each
index AM, count its occurrences, and focus on those (could be a VACUUM
thing, could be local to the index AM like bottom-up deletion is). We need to
notice when things are *starting* to go really badly and intervene
aggressively. So we need to be willing to try a generic index
vacuuming strategy first, and then notice that it has just failed, or
is just about to fail. Something like version-driven page splits
really shouldn't ever happen, so even a very crude approach will
probably work very well.
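
To make the shape of that heuristic concrete, here is a minimal sketch in
Python. It is only an illustration of "count the extreme adverse events and
escalate when any show up", not a proposal for actual code. Every name in it
(IndexStats, record_extreme_event, choose_strategy, tolerated_events,
after_vacuum) is hypothetical and does not correspond to anything in
PostgreSQL, and the default threshold of zero is an assumption that simply
reflects the claim that these events shouldn't ever happen.

```python
from dataclasses import dataclass


@dataclass
class IndexStats:
    """Per-index counter of extreme adverse events (hypothetical structure)."""
    name: str
    # Extreme adverse events seen since the last vacuum of this index,
    # e.g. version-driven page splits in a B-Tree index.
    extreme_events: int = 0

    def record_extreme_event(self) -> None:
        # Would be called by the index AM (or by VACUUM) whenever it
        # detects the AM-specific extreme event.
        self.extreme_events += 1


def choose_strategy(stats: IndexStats, tolerated_events: int = 0) -> str:
    """Stay with the generic strategy until it has visibly failed.

    Returns "aggressive" once more extreme events have been counted than
    we are willing to tolerate, and "generic" otherwise. Ordinary dead
    tuple counts are deliberately not consulted at all.
    """
    if stats.extreme_events > tolerated_events:
        return "aggressive"
    return "generic"


def after_vacuum(stats: IndexStats) -> None:
    # Once the index has been vacuumed, start counting from zero again.
    stats.extreme_events = 0
```

The point of the sketch is that the decision is driven by a single, crude,
workload-independent signal per index, rather than by an estimate of live or
dead tuples.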

-- 
Peter Geoghegan
