Re: do only critical work during single-user vacuum? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: do only critical work during single-user vacuum?
Date
Msg-id CAH2-Wzk0ZrsYx7FxQZ2dGm4XLDNwp2jpNDXwjpASiJNs4HBhDQ@mail.gmail.com
In response to Re: do only critical work during single-user vacuum?  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: do only critical work during single-user vacuum?
List pgsql-hackers
On Wed, Feb 16, 2022 at 12:43 AM John Naylor
<john.naylor@enterprisedb.com> wrote:
> I'll put some effort in finding any way that it might not be robust.
> After that, changing the message and docs is trivial.

It would be great to be able to totally drop the idea of using
single-user mode before Postgres 15 feature freeze. How's that going?

I suggest that we apply the following patch as part of that work. It
adds one last failsafe check at the point where VACUUM makes its
final decision on rel truncation.

It seems unlikely that the patch will ever make the crucial difference
in a wraparound scenario -- in practice it's very likely that we'd
have triggered the wraparound failsafe by that point if we run into
trouble with the target rel's relfrozenxid age. And even if we do get
to that point without it, it would still be possible for the
autovacuum launcher to launch another autovacuum -- this time around
we will avoid rel truncation, restoring the system to normal operation
(i.e. no more xidStopLimit state).

On the other hand, it's possible that lazy_cleanup_all_indexes() will
take a very long time to run, and it runs after the current final
failsafe check. An index AM's amvacuumcleanup() routine can sometimes
take a long time to run, especially with GIN indexes. And so it's
just about possible that we won't have triggered the failsafe by the
time lazy_cleanup_all_indexes() is called, which then spends a long
time doing index cleanup -- long enough for the system to reach
xidStopLimit because the target rel's relfrozenxid age crosses the
xidStopLimit threshold.

This patch makes this problem scenario virtually impossible. Right now
I'm only prepared to say it's very unlikely. I don't see a reason to
take any chances, though.
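
To make the ordering concrete, here is a toy model of the tail end of
VACUUM as described above. It is only a sketch, not the patch itself:
the class, the function names, and the single threshold constant are
all made up for illustration (the constant is meant to mirror the
default vacuum_failsafe_age), and index cleanup is reduced to "other
backends consume some number of XIDs while we wait".

```python
# Toy model only. None of these names exist in PostgreSQL; they are
# hypothetical stand-ins for the control flow discussed in this thread.
from dataclasses import dataclass

# Assumed threshold, meant to mirror the default vacuum_failsafe_age.
FAILSAFE_AGE = 1_600_000_000


@dataclass
class ToyVacuum:
    relfrozenxid_age: int
    failsafe_triggered: bool = False

    def check_failsafe(self) -> bool:
        """Latch the failsafe once the table's age exceeds the threshold."""
        if self.relfrozenxid_age > FAILSAFE_AGE:
            self.failsafe_triggered = True
        return self.failsafe_triggered


def should_attempt_truncation(age_before_cleanup: int,
                              xids_consumed_during_cleanup: int,
                              recheck_before_truncate: bool) -> bool:
    """Return True if this toy VACUUM would go on to attempt rel truncation."""
    vac = ToyVacuum(age_before_cleanup)

    # The current final check, which happens before index cleanup.
    vac.check_failsafe()

    if not vac.failsafe_triggered:
        # Stand-in for lazy_cleanup_all_indexes(): it may run for a long
        # time, during which other backends keep consuming XIDs.
        vac.relfrozenxid_age += xids_consumed_during_cleanup

    if recheck_before_truncate:
        # The extra check proposed here, made right before the
        # truncation decision.
        vac.check_failsafe()

    # Truncation is bypassed whenever the failsafe has been triggered.
    return not vac.failsafe_triggered
```

With a table that is just under the threshold when index cleanup
starts, and a slow cleanup during which 200 million XIDs are consumed,
should_attempt_truncation(1_500_000_000, 200_000_000, False) returns
True, while the same call with recheck_before_truncate=True returns
False.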

-- 
Peter Geoghegan

Attachment
