Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New IndexAM API controlling index vacuum strategies
Date
Msg-id CAH2-WznH1NcrvGOzXSr++KNQMUicvvkTvmAVe4NvPwdX3qpqzA@mail.gmail.com
In response to Re: New IndexAM API controlling index vacuum strategies  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On Sun, Mar 21, 2021 at 1:24 AM Greg Stark <stark@mit.edu> wrote:
> What I've seen is an application that regularly ran ANALYZE on a
> table. This worked fine as long as vacuums took less than the interval
> between analyzes (in this case 1h) but once vacuum started taking
> longer than that interval autovacuum would cancel it every time due to
> the conflicting lock.
>
> That would have just continued until the wraparound vacuum which
> wouldn't self-cancel except that there was also a demon running which
> would look for sessions stuck on a lock and kill the blocker -- which
> included killing the wraparound vacuum.

That's a new one! Though clearly it's an example of what I described.
I do agree that sometimes the primary cause is the special rules for
cancellations with anti-wraparound autovacuums.

> And yes, this demon is obviously a terrible idea but of course it was
> meant for killing buggy user queries. It wasn't expecting to find
> autovacuum jobs blocking things.  The real surprise for that user was
> that VACUUM could be blocked by things that someone would reasonably
> want to run regularly like ANALYZE.

The infrastructure from my patch to eliminate the tupgone special case
(the patch that fully decouples index and heap vacuuming from pruning
and freezing) ought to enable smarter autovacuum cancellations. It
should be possible to make "canceling" an autovacuum worker instead
instruct the worker to consider finishing off the VACUUM operation
very quickly, by simply ending index vacuuming (and heap vacuuming)
early. A true cancellation should only be necessary when that
strategy won't work out, because we haven't finished all required
pruning and freezing yet -- which are the only truly essential tasks
of any "successful" VACUUM operation.
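
To make the decision concrete, here is a minimal sketch in C. It is
only an illustration: VacuumProgress, CancelResponse and
respond_to_cancel_request() are hypothetical names made up for this
example, not actual PostgreSQL code and not code from the patch.

```c
#include <stdbool.h>

/* Hypothetical model of a VACUUM's progress; not a real PostgreSQL struct. */
typedef struct VacuumProgress
{
    bool        prune_freeze_done;  /* all required pruning/freezing finished? */
} VacuumProgress;

/* Hypothetical set of reactions to a "cancel" request. */
typedef enum CancelResponse
{
    RESPONSE_FINISH_EARLY,      /* end index and heap vacuuming, then complete */
    RESPONSE_CANCEL             /* conventional cancellation of the VACUUM */
} CancelResponse;

/*
 * Decide how a worker might react when asked to get out of the way.
 * Pruning and freezing are treated as the only essential work: once they
 * are done, the VACUUM can finish early instead of being thrown away.
 */
CancelResponse
respond_to_cancel_request(const VacuumProgress *progress)
{
    if (progress->prune_freeze_done)
        return RESPONSE_FINISH_EARLY;

    return RESPONSE_CANCEL;
}
```

The point of the sketch is that the work already done is only
discarded in the second case.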

Maybe it would only be appropriate to do something like that for
anti-wraparound VACUUMs, which, as you say, don't get canceled when
they block the acquisition of a lock (which is a sensible design,
though only because of the specific risk of not managing to advance
relfrozenxid). There wouldn't be a question of canceling an
anti-wraparound VACUUM in the conventional sense with this mechanism.
It would simply instruct the anti-wraparound VACUUM to finish as
quickly as possible by skipping the indexes. Naturally the
implementation wouldn't really need to consider whether that meant the
anti-wraparound VACUUM could end almost immediately, or some time
later -- the point is that it completes ASAP.
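
A rough sketch of that variant in C, under the same caveat: the
AntiWrapVacuum struct and both functions are hypothetical names
invented for this example, and the loop is a toy stand-in for the
real heap scan. The request never aborts anything. It only sets a
flag that makes the VACUUM skip the indexes once pruning and freezing
are done.

```c
#include <stdbool.h>

/* Hypothetical state of an anti-wraparound VACUUM; not a real PostgreSQL struct. */
typedef struct AntiWrapVacuum
{
    int         heap_blocks_total;  /* blocks that must be pruned and frozen */
    int         heap_blocks_done;   /* blocks pruned and frozen so far */
    bool        skip_indexes;       /* set when a lock waiter asks us to hurry */
    bool        indexes_vacuumed;   /* did index vacuuming actually run? */
    bool        finished;           /* did the VACUUM complete? */
} AntiWrapVacuum;

/* Called on behalf of a blocked session: ask for a fast finish, never a cancel. */
void
request_fast_finish(AntiWrapVacuum *vac)
{
    vac->skip_indexes = true;
}

/* Toy driver: pruning and freezing always run to completion. */
void
run_antiwrap_vacuum(AntiWrapVacuum *vac)
{
    while (vac->heap_blocks_done < vac->heap_blocks_total)
        vac->heap_blocks_done++;    /* stand-in for pruning/freezing a block */

    /* Index vacuuming is the optional part, so it is what gets dropped. */
    if (!vac->skip_indexes)
        vac->indexes_vacuumed = true;

    vac->finished = true;
}
```

Note that the code never asks how much heap work remains when the
flag is set. The VACUUM finishes either almost immediately or some
time later, but it always finishes.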

-- 
Peter Geoghegan


