Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Robert Haas
Subject Re: New IndexAM API controlling index vacuum strategies
Msg-id CA+TgmoYNgmqz0QZ+=qiYb+_+2f2xWo9BTUMhe05Ccn9Yrz6r1A@mail.gmail.com
In response to Re: New IndexAM API controlling index vacuum strategies  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New IndexAM API controlling index vacuum strategies
List pgsql-hackers
On Thu, Mar 18, 2021 at 9:42 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The fact that we can *continually* reevaluate if an ongoing VACUUM is
> at risk of taking too long is entirely the point here. We can in
> principle end index vacuuming dynamically, whenever we feel like it
> and for whatever reasons occur to us (hopefully these are good reasons
> -- the point is that we get to pick and choose). We can afford to be
> pretty aggressive about not giving up, while still having the benefit
> of doing that when it *proves* necessary. Because: what are the
> chances of the emergency mechanism ending index vacuuming being the
> wrong thing to do if we only do that when the system clearly and
> measurably has no more than about 10% of the possible XID space to go
> before the system becomes unavailable for writes?

I agree. I was having trouble before understanding exactly what you
are proposing, but this makes sense to me and I agree it's a good
idea.

> > But ... should the thresholds for triggering these kinds of mechanisms
> > really be hard-coded with no possibility of being configured in the
> > field? What if we find out after the release is shipped that the
> > mechanism works better if you make it kick in sooner, or later, or if
> > it depends on other things about the system, which I think it almost
> > certainly does? Thresholds that can't be changed without a recompile
> > are bad news. That's why we have GUCs.
>
> I'm fine with a GUC, though only for the emergency mechanism. The
> default really matters, though -- it shouldn't be necessary to tune
> (since we're trying to address a problem that many people don't know
> they have until it's too late). I still like 1.8 billion XIDs as the
> value -- I propose that that be made the default.

I'm not 100% sure whether we need a new GUC for this or not. I think
that if by default this triggers at 90% of the hard-shutdown
limit, it would be unlikely, and perhaps unreasonable, for users to
want to raise the limit. However, I wonder whether some users will
want to lower the limit. Would it be reasonable for someone to want to
trigger this at 50% or 70% of XID exhaustion rather than waiting until
things get really bad?
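
To make the question concrete, here is a minimal sketch of a trigger
expressed as a configurable fraction of the XID budget. It is
illustrative only and is not PostgreSQL code: the function name, the
trigger_fraction parameter, and the use of 2^31 as a stand-in for the
hard-shutdown limit are all assumptions made for the example.

```python
# Illustrative sketch only; none of these names exist in PostgreSQL.

# Assumption: 2^31 XIDs stands in for the hard-shutdown limit. The real
# limit sits somewhat below this value.
XID_AGE_LIMIT = 2**31


def emergency_triggered(table_xid_age: int, trigger_fraction: float = 0.9) -> bool:
    """Return True once the table's XID age reaches the chosen fraction
    of the limit. trigger_fraction plays the role of a hypothetical GUC."""
    if not 0.0 < trigger_fraction <= 1.0:
        raise ValueError("trigger_fraction must be in (0, 1]")
    return table_xid_age >= trigger_fraction * XID_AGE_LIMIT


if __name__ == "__main__":
    age = 1_600_000_000
    # The same table age trips a 70% threshold but not a 90% one.
    print(emergency_triggered(age, 0.9))
    print(emergency_triggered(age, 0.7))
```

With a setting like this, a user who wants the mechanism to kick in
earlier lowers the fraction; raising it above the default buys very
little headroom.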

Also, one thing that I dislike about the current system is that, from
a user perspective, when something goes wrong, nothing happens for a
while and then the whole system goes bananas. It seems desirable to me
to find ways of gradually ratcheting up the pressure, like cranking up
the effective cost limit if we can somehow figure out that we're not
keeping up. If, with your mechanism, there's an abrupt point when we
switch from never doing this for any table to always doing this for
every table, that might not be as good as something which does this
"sometimes" and then, if that isn't enough to avoid disaster, does it
"more," and eventually ramps up to doing it always, if trouble
continues. I don't know whether that's possible here, or what it would
look like, or even whether it's appropriate at all in this particular
case, so I just offer it as food for thought.
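
One possible shape for such a ramp, purely as a sketch: instead of a
single cutoff, compute a value between 0 and 1 that rises linearly
between two points and use it as the share of tables (or the
probability) for which the mechanism applies. The function name, the
ramp_start and ramp_end parameters, and the 2^31 limit are
hypothetical and chosen only for illustration; nothing here reflects
an actual proposal or PostgreSQL code.

```python
# Illustrative sketch only; none of these names exist in PostgreSQL.

# Assumption: 2^31 XIDs stands in for the hard-shutdown limit.
XID_AGE_LIMIT = 2**31


def bypass_pressure(xid_age: int, ramp_start: float = 0.5, ramp_end: float = 0.9) -> float:
    """Return 0.0 below ramp_start, 1.0 at or above ramp_end, and a
    linearly increasing value in between. Both bounds are fractions of
    XID_AGE_LIMIT."""
    if not 0.0 <= ramp_start < ramp_end <= 1.0:
        raise ValueError("need 0 <= ramp_start < ramp_end <= 1")
    used = xid_age / XID_AGE_LIMIT
    if used <= ramp_start:
        return 0.0  # "never": no pressure yet
    if used >= ramp_end:
        return 1.0  # "always": full emergency behavior
    # "sometimes", then "more": linear interpolation between the bounds
    return (used - ramp_start) / (ramp_end - ramp_start)


if __name__ == "__main__":
    for fraction in (0.4, 0.6, 0.8, 0.95):
        age = int(fraction * XID_AGE_LIMIT)
        print(fraction, round(bypass_pressure(age), 3))
```

A linear ramp is only the simplest choice; a stepped or steeper curve
would fit the same interface.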

> > On another note, I cannot say enough bad things about the function
> > name two_pass_strategy(). I sincerely hope that you're not planning to
> > create a function which is a major point of control for VACUUM whose
> > name gives no hint that it has anything to do with vacuum.
>
> You always hate my names for things. But that's fine by me -- I'm
> usually not very attached to them. I'm happy to change it to whatever
> you prefer.

My taste in names may be debatable, but including the subsystem name
in the function name seems like a pretty bare-minimum requirement,
especially when the other words in the function name could refer to
just about anything.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


