Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New IndexAM API controlling index vacuum strategies
Date
Msg-id CAH2-Wzm+6sMeggGnhbWvU8MdmqT1sSTOaZ-9jJvfVyTV4Nn3Dg@mail.gmail.com
In response to Re: New IndexAM API controlling index vacuum strategies  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Mar 22, 2021 at 7:05 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I agree. I was having trouble before understanding exactly what you
> are proposing, but this makes sense to me and I agree it's a good
> idea.

Our ambition is for this to be one big multi-release umbrella project,
with several individual enhancements that each deliver a user-visible
benefit on their own. The fact that we're talking about a few things
at once is confusing, but I think that you need a "grand bargain" kind
of discussion for this. I believe that it actually makes sense to do
it that way, difficult though it may be.

Sometimes the goal is to improve performance, other times the goal is
to improve robustness, although the distinction gets blurry at the
margins. If VACUUM were infinitely fast (say because of sorcery), then
performance would be *unbeatable* -- plus we'd never have to worry
about anti-wraparound vacuums not completing in time!

> I'm not 100% sure whether we need a new GUC for this or not. I think
> that if by default this triggers at 90% of the hard-shutdown
> limit, it would be unlikely, and perhaps unreasonable, for users to
> want to raise the limit. However, I wonder whether some users will
> want to lower the limit. Would it be reasonable for someone to want to
> trigger this at 50% or 70% of XID exhaustion rather than waiting until
> things get really bad?

I wanted to avoid inventing a GUC for the mechanism in the third patch
(not the emergency mechanism we're discussing right now, which was
posted separately by Masahiko). I think that a GUC to control skipping
index vacuuming purely because there are very few index tuples to
delete will become a burden before long. In particular, we
should eventually be able to vacuum some indexes but not others (on
the same table) based on the local needs of each index.
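
To make this a bit more concrete, here is roughly the shape of the
check I have in mind, as a standalone C sketch -- the function name,
the inputs, and the 2% figure are all invented here for illustration,
not taken from the patch:

#include <stdbool.h>

/*
 * Hypothetical sketch only: skip index vacuuming when so few heap
 * pages have LP_DEAD items that a full scan of every index would cost
 * far more than the handful of index tuples it could reclaim.  Names
 * and the 2% threshold are placeholders, not the actual patch.
 */
static bool
should_skip_index_vacuuming(long rel_pages, long lpdead_item_pages)
{
    double      dead_frac;

    if (rel_pages == 0)
        return true;        /* empty table: nothing for indexes to gain */

    /* Fraction of heap pages containing at least one LP_DEAD item */
    dead_frac = (double) lpdead_item_pages / (double) rel_pages;

    /* Leave the (tiny amount of) work to some future VACUUM instead */
    return dead_frac < 0.02;
}

A per-index version of the same idea would have to consider each
index's own characteristics rather than one table-level fraction,
which is exactly why I don't want to bake the behavior into a GUC now.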

As I keep pointing out, bottom-up index deletion has created a
situation where there can be dramatically different needs among
indexes on the same table -- it can literally prevent 100% of all page
splits from version churn in those indexes that are never subject to
logical changes from non-HOT updates. Bottom-up index deletion does
nothing for any index that *is* logically updated, for the obvious
reason. The upshot is that there is now frequently a sharp qualitative
difference among indexes that vacuumlazy.c currently imagines have
basically the same needs. Vacuuming these remaining indexes is a cost
that users will actually understand and accept, too.

But that has nothing to do with the emergency mechanism we're talking
about right now. I actually like your idea of making the emergency
mechanism a GUC. It's equivalent to index_cleanup, except that it is
continuous and dynamic (not discrete and static). The fact that this
GUC expresses what VACUUM should do in terms of the age of the target
table's relfrozenxid (and nothing else) seems like exactly
the right thing. As I said before: What else could possibly matter? So
I don't see any risk of such a GUC becoming a burden. I also think
that it's a useful knob to be able to tune. It's also going to help a
lot with testing the feature. So let's have a GUC for that.
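
Just to sketch what I mean (nothing here is final -- the GUC name and
its default are placeholders for discussion), the check could be as
simple as:

#include <stdbool.h>
#include <stdint.h>

/*
 * Placeholder GUC: the table age (in XIDs) at which the emergency
 * mechanism kicks in.  Name and default are illustrative only.
 */
static int  vacuum_emergency_age = 1600000000;

/*
 * Hypothetical sketch: the failsafe triggers purely on the age of the
 * target table's relfrozenxid -- nothing else factors into it.
 */
static bool
emergency_mechanism_triggered(int64_t relfrozenxid_age)
{
    /*
     * Past the threshold, VACUUM abandons everything non-essential
     * (index vacuuming, cost-based delays, heap truncation) so that
     * relfrozenxid can be advanced as quickly as possible.
     */
    return relfrozenxid_age > (int64_t) vacuum_emergency_age;
}

Because the check is continuous and dynamic, the same VACUUM can start
out doing all the usual work and only cross over into the emergency
behavior once the table's age actually crosses the threshold.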

> Also, one thing that I dislike about the current system is that, from
> a user perspective, when something goes wrong, nothing happens for a
> while and then the whole system goes bananas. It seems desirable to me
> to find ways of gradually ratcheting up the pressure, like cranking up
> the effective cost limit if we can somehow figure out that we're not
> keeping up. If, with your mechanism, there's an abrupt point when we
> switch from never doing this for any table to always doing this for
> every table, that might not be as good as something which does this
> "sometimes" and then, if that isn't enough to avoid disaster, does it
> "more," and eventually ramps up to doing it always, if trouble
> continues. I don't know whether that's possible here, or what it would
> look like, or even whether it's appropriate at all in this particular
> case, so I just offer it as food for thought.

That is exactly the kind of thing that some future highly evolved
version of the broader incremental/dynamic VACUUM design might do.
Your thoughts about the effective delay/throttling lessening as
conditions change are in line with the stuff that we're thinking of
doing. Though I don't believe Masahiko and I have talked about the
delay stuff specifically in any of our private discussions about it.
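
For example (and this is purely a sketch of the general idea, with
made-up numbers, not something anybody has proposed), the effective
cost limit could be scaled by how close the table is to the emergency
threshold:

#include <limits.h>

/*
 * Illustrative only: scale VACUUM's cost limit up as the table's
 * relfrozenxid age (expressed as a fraction of the emergency
 * threshold) grows, so that throttling fades out gradually rather
 * than all at once.
 */
static int
effective_cost_limit(int base_cost_limit, double age_fraction)
{
    double      scaled;

    if (age_fraction >= 1.0)
        return INT_MAX;         /* past the threshold: no throttling */
    if (age_fraction < 0.5)
        return base_cost_limit; /* no urgency yet: behave as configured */

    /* Ramp the limit up between 50% and 100% of the threshold */
    scaled = base_cost_limit / (1.0 - (age_fraction - 0.5) * 2.0);
    return (scaled > INT_MAX) ? INT_MAX : (int) scaled;
}

Whether it's the cost limit, the delay itself, or which phases get
skipped, the point is the same: the response should get stronger as
the risk gets closer, rather than flipping from "nothing" to
"everything" at a single point.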

I am a big believer in the idea that we should have a variety of
strategies that are applied incrementally and dynamically, in response
to an immediate local need (say at the index level). VACUUM should be
able to organically figure out the best strategy (or combination of
strategies) itself, over time. Sometimes it will be very important to
recognize that most indexes have been able to use techniques like
bottom-up index deletion, and so really don't need to be vacuumed at
all. Other times the cost delay stuff will matter much more. Maybe
it's both together, even. The system ought to discover the best
approach dynamically. There will be tremendous variation across tables
and over time -- much too much for anybody to predict and understand
as a practical matter. The intellectually respectable term for what
I'm describing is a complex system.

My work on B-Tree index bloat led me to the idea that sometimes a
variety of strategies can be the real strategy. Take the example of
the benchmark that Victor Yegorov performed, which consisted of a
queue-based workload with deletes, inserts, and updates, plus
constantly holding snapshots for multiple minutes:

https://www.postgresql.org/message-id/CAGnEbogATZS1mWMVX8FzZHMXzuDEcb10AnVwwhCtXtiBpg3XLQ@mail.gmail.com

Bottom-up index deletion appeared to practically eliminate index bloat
here. When we only had deduplication (without bottom-up deletion) the
indexes still ballooned in size. But I don't believe that that's a
100% accurate account. I think that it's more accurate to characterize
what we saw there as a case where deduplication and bottom-up deletion
complemented each other to great effect. If deduplication can buy you
time until the next page split (by reducing the space required for
recently dead but not totally dead index tuples caused by version
churn), and if bottom-up index deletion can avoid page splits (by
deleting now-totally-dead index tuples), then we shouldn't be too
surprised to see complementary effects. Though I have to admit that I
was quite surprised at how true this was in the case of Victor's
benchmark -- it worked very well with the workload, without any
designer predicting or understanding anything specific.
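
To see the complementary structure, it might help to lay out the order
in which a full leaf page tries to avoid splitting. This is only a toy
rendering with made-up names -- the real logic in nbtinsert.c checks
after each step whether enough space was actually freed, and each step
can fail:

#include <stdbool.h>

/*
 * How a full nbtree leaf page avoided splitting, in the order the
 * techniques are plausibly tried.  Toy model for illustration only.
 */
typedef enum
{
    FREED_BY_SIMPLE_DELETION,   /* tuples already marked LP_DEAD removed */
    FREED_BY_BOTTOMUP_DELETION, /* version-churn duplicates proven dead in heap */
    FREED_BY_DEDUPLICATION,     /* duplicates merged into posting list tuples */
    PAGE_MUST_SPLIT             /* nothing helped: split after all */
} SplitAvoidance;

static SplitAvoidance
how_page_avoided_split(bool lp_dead_items_on_page,
                       bool incoming_tuple_is_version_churn,
                       bool duplicates_on_page)
{
    if (lp_dead_items_on_page)
        return FREED_BY_SIMPLE_DELETION;
    if (incoming_tuple_is_version_churn)
        return FREED_BY_BOTTOMUP_DELETION;
    if (duplicates_on_page)
        return FREED_BY_DEDUPLICATION;
    return PAGE_MUST_SPLIT;
}

Deduplication is what keeps the page from splitting before one of the
deletion passes gets a chance to remove the dead versions; bottom-up
deletion is what actually removes them. That's the complementary
effect I'm describing.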

--
Peter Geoghegan


