Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date
Msg-id CAH2-Wzmi-viO==nTirX-juvkfXHVCTgrAShUd25YtsU2vdckSw@mail.gmail.com
In response to Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-hackers
On Sun, Nov 27, 2022 at 8:54 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> That's exactly what I was trying to debate.  Wouldn't it make sense to
> trigger VACUUM earlier so that it has a chance of being less heavy?
> On the other hand, if there are not sufficiently many modifications
> on the table to trigger autovacuum, perhaps it doesn't matter in many
> cases.

Maybe. There is a deeper problem here, though: table age is a really
terrible proxy for whether or not it's appropriate for VACUUM to
freeze preexisting all-visible pages. It's not obvious that half
autovacuum_freeze_max_age is much better than
autovacuum_freeze_max_age if your concern is avoiding getting too far
into debt on freezing. After all, this is debt that must be paid back
by freezing some number of physical heap pages, which in general has
approximately zero relationship with table age (we need physical units
for this, not logical units).
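
To make the units mismatch concrete, here is a toy sketch in Python. It
is not PostgreSQL code: the TableState class, the table names, and all
of the numbers are made up for illustration. Two hypothetical tables
have the same age in XIDs, yet the freezing work they owe differs by
orders of magnitude.

```python
from dataclasses import dataclass


@dataclass
class TableState:
    """Hypothetical snapshot of a table; not a PostgreSQL structure."""
    name: str
    age_xids: int      # logical units: what age(relfrozenxid) would report
    total_pages: int   # physical units: heap pages in the table
    frozen_pages: int  # heap pages already marked all-frozen

    def unfrozen_pages(self) -> int:
        return self.total_pages - self.frozen_pages


def freeze_debt_pages(t: TableState) -> int:
    """Upper bound on pages a relfrozenxid-advancing VACUUM must still freeze."""
    return t.unfrozen_pages()


# Same table age, very different freeze debt (illustrative numbers only).
small = TableState("small_hot", 100_000_000, 1_000, 900)
big = TableState("big_append_only", 100_000_000, 10_000_000, 0)

for t in (small, big):
    print(f"{t.name}: age={t.age_xids} xids, debt={freeze_debt_pages(t)} pages")
```

Any threshold expressed purely in XIDs treats these two tables
identically, which is the sense in which table age is a poor proxy.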

This is a long-standing problem that I hope and expect will be fixed
in 16, by my ongoing work to completely remove the concept of
aggressive mode VACUUM:

https://commitfest.postgresql.org/40/3843/

This makes VACUUM care about both table age and the number of unfrozen
heap pages (mostly the latter). It weighs everything at the start of
each VACUUM, and decides on how it must advance relfrozenxid based on
the conditions in the table and the picture over time. Note that
performance stability is the main goal; we will not just keep
accumulating unfrozen pages for no good reason. All of the behaviors
previously associated with aggressive mode are retained, but are
individually applied on a timeline that is attuned to the needs of the
table (we can still wait for a cleanup lock, but that happens much
later than the point that the same page first becomes eligible for
freezing, not at exactly the same time).

In short, "aggressiveness" becomes a continuous thing, rather than a
discrete mode of operation, improving performance stability. We go
back to having only one kind of lazy vacuum, which is how things
worked prior to the introduction of the visibility map. (We did have
antiwraparound autovacuums in 8.3, but we did not have
aggressive/scan_all VACUUMs at the time.)
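
As a rough illustration of "discrete mode" versus "continuous", here is
a toy Python sketch. It is not the patch's algorithm: the function
names, the linear blend, and the 0.75 weighting are all assumptions
chosen only to show the shape of the idea (a score that varies smoothly
and leans mostly on unfrozen pages, versus a flag that flips at an age
cutoff).

```python
def discrete_aggressive(table_age: int, freeze_table_age: int) -> bool:
    """Mode-style decision: flips from False to True at a single age cutoff."""
    return table_age >= freeze_table_age


def continuous_eagerness(table_age: int, freeze_max_age: int,
                         unfrozen_pages: int, total_pages: int,
                         page_weight: float = 0.75) -> float:
    """Hypothetical score in [0, 1] blending physical and logical pressure.

    page_weight is an invented knob; it only encodes "mostly the
    number of unfrozen pages, partly table age".
    """
    age_term = min(table_age / freeze_max_age, 1.0)
    page_term = unfrozen_pages / total_pages if total_pages else 0.0
    return page_weight * page_term + (1.0 - page_weight) * age_term


# The discrete flag says nothing until the cutoff; the score moves gradually.
for age in (0, 50_000_000, 100_000_000, 150_000_000, 200_000_000):
    flag = discrete_aggressive(age, 150_000_000)
    score = continuous_eagerness(age, 200_000_000, 50, 100)
    print(age, flag, round(score, 3))
```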

> Is that really so much less aggressive?  Will that autovacuum run want
> to process all pages that are not all-frozen?  If not, it probably won't
> do much good.  If yes, it will be just as heavy as an anti-wraparound
> autovacuum (except that it won't block other sessions).

Even if we assume that my much bigger patch set won't make it into 16,
it'll probably still be a good idea to do this in 16. I admit that I
haven't really given that question enough thought to be sure of that,
though. Naturally my goal is to get everything in. Hopefully I'll
never have to make that call.

It is definitely true that this patch is "the autovacuum side" of the
work from the other much larger patchset (which handles "the VACUUM
side" of things). This antiwraparound patch should probably be
considered in that context, even though it's theoretically independent
work. It just worked out that way.

> True.  On the other hand, it might happen that after this, people start
> worrying about normal autovacuum runs because they occasionally experience
> a table age autovacuum that is much heavier than the other ones.  And
> they can no longer tell the reason, because it doesn't show up anywhere.

But you can tell the reason, just by looking at the autovacuum log
reports. The only thing you can't do is see "(to prevent wraparound)"
in pg_stat_activity. That, and the autocancellation behavioral change,
are the only differences.

The big picture is that users really will have no good reason to care
very much about autovacuums that were triggered to advance
relfrozenxid (at least in the common case where we haven't needed to
make them antiwraparound autovacuums). They could almost (though not
quite) now be explained as "an autovacuum that takes place because
it's been a while since we did an autovacuum to deal with bloat and/or
tuple inserts". That will at least be reasonable if you assume all of
the patches get in.

-- 
Peter Geoghegan


