Re: do only critical work during single-user vacuum? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: do only critical work during single-user vacuum?
Date
Msg-id CAH2-WzkpZcat1YmEVAKpcMBaWRXUd2vGVKHtrBjzBGDL=j6mbA@mail.gmail.com
In response to Re: do only critical work during single-user vacuum?  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: do only critical work during single-user vacuum?  (Peter Geoghegan <pg@bowt.ie>)
Re: do only critical work during single-user vacuum?  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Mon, Feb 14, 2022 at 10:04 PM John Naylor
<john.naylor@enterprisedb.com> wrote:
> Well, the point of inventing this new vacuum mode was because I
> thought that upon reaching xidStopLimit, we couldn't issue commands,
> period, under the postmaster. If it was easier to get a test instance
> to xidStopLimit, I certainly would have discovered this sooner.

I did notice from my own testing of the failsafe (by artificially
inducing wraparound failure using an XID-burning C function) that
autovacuum seemed to totally correct the problem, even when the system
had already crossed xidStopLimit - it came back on its own. I wasn't
completely sure how robust this effect was, though.

> When
> Andres wondered about getting away from single user mode, I assumed
> that would involve getting into areas too deep to tackle for v15. As
> Robert pointed out, lazy_truncate_heap is the only thing that can't
> happen for vacuum at this point, and fully explains why in versions <
> 14 our client's attempts to vacuum resulted in error. Since the
> failsafe mode turns off truncation, vacuum should now *just work* near
> wraparound. If there is any doubt, we can tighten the check for
> entering failsafe.

Obviously having to enter single-user mode is horrid. If we can now
update the advice to something more reasonable, that would help users
who find themselves in this situation a great deal.

> Now, it's certainly possible that autovacuum is either not working at
> all because of something broken, or is not working on the oldest
> tables at the moment, so one thing we could do is to make VACUUM [with
> no tables listed] get the tables from pg_class in reverse order of
> max(xid age, mxid age). That way, the horizon will eventually pull
> back over time and the admin can optionally cancel the vacuum at some
> point. Since the order is harmless when it's not needed, we can do
> that unconditionally.

My ongoing work on freezing/relfrozenxid tends to make the age of
relfrozenxid much more indicative of the amount of work that VACUUM
would have to do when run -- not limited to freezing. You could
probably do this anyway, but it's nice that that'll be true.
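
To make the proposed ordering concrete, here is a rough sketch (in Python
rather than backend C, and not taken from any patch) of sorting tables in
reverse order of max(xid age, mxid age). The TableAge type, the
vacuum_order helper, and all of the numbers are hypothetical; the two age
fields merely stand in for what age(relfrozenxid) and
mxid_age(relminmxid) would report for each pg_class entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableAge:
    # Hypothetical record; not a PostgreSQL structure.
    name: str
    xid_age: int   # stand-in for age(relfrozenxid)
    mxid_age: int  # stand-in for mxid_age(relminmxid)


def vacuum_order(tables):
    """Return tables sorted so the one holding back the horizon most comes first."""
    return sorted(tables, key=lambda t: max(t.xid_age, t.mxid_age), reverse=True)


# Made-up ages, purely for illustration.
tables = [
    TableAge("small_lookup", 5_000_000, 10),
    TableAge("big_history", 2_100_000_000, 40_000),
    TableAge("queue", 150_000_000, 1_900_000_000),
]

for t in vacuum_order(tables):
    print(t.name, max(t.xid_age, t.mxid_age))
```

With this ordering the tables holding the horizon back are processed
first, so an admin who cancels the VACUUM partway through has still
gained the most important progress. When no table is close to
wraparound, the order makes no practical difference.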

-- 
Peter Geoghegan


