Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id CAH2-Wzk0C1O-MKkOrj4YAfsGRru2=cA2VQpqM-9R1HNuG3nFaQ@mail.gmail.com
Whole thread Raw
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
List pgsql-hackers
On Sun, Mar 13, 2022 at 9:05 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached is v10. While this does still include the freezing patch,
> it's not in scope for Postgres 15. As I've said, I still think that it
> makes sense to maintain the patch series with the freezing stuff,
> since it's structurally related.

Attached is v11. Changes:

* No longer includes the patch that adds page-level freezing. It was
making it harder to assess code coverage for the patches that I'm
targeting Postgres 15 with. And so including it with each new revision
no longer seems useful. I'll pick it up for Postgres 16.

* Extensive isolation tests added to v11-0001-*, exercising a lot of
hard-to-hit code paths that are reached when VACUUM is unable to
immediately acquire a cleanup lock on some heap page. In particular,
we now have test coverage for the code in heapam.c that handles
tracking the oldest extant XID and MXID in the presence of MultiXacts
(on a no-cleanup-lock heap page).

* v11-0002-* (which is the patch that avoids missing out on advancing
relfrozenxid in non-aggressive VACUUMs due to a race condition on
HEAD) now moves even more of the logic for deciding how VACUUM will
skip using the visibility map into its own helper routine. Now
lazy_scan_heap just does what the state returned by the helper routine
tells it about the current skippable range -- it doesn't make any
decisions itself anymore. This is far simpler than what we do
currently, on HEAD.

There are no behavioral changes here, but this approach could be
pushed further to improve performance. We could easily determine
*every* page that we're going to scan (not skip) up-front in even the
largest tables, very early, before we've even scanned one page. This
could enable things like I/O prefetching, or capping the size of the
dead_items array based on our final scanned_pages (not on rel_pages).

* A new patch (v11-0003-*) alters the behavior of VACUUM's
DISABLE_PAGE_SKIPPING option. DISABLE_PAGE_SKIPPING no longer forces
aggressive VACUUM -- now it only forces the use of the visibility map,
since that behavior is totally independent of aggressiveness.

I don't feel too strongly about the DISABLE_PAGE_SKIPPING change. It
just seems logical to decouple no-vm-skipping from aggressiveness --
it might actually be helpful in testing the work from the patch series
in the future. Any page counted in scanned_pages has essentially been
processed by VACUUM with this work in place -- that was the idea
behind the lazy_scan_noprune stuff from commit 44fa8488. Bear in mind
that the relfrozenxid tracking stuff from v11-0001-* makes it almost
certain that a DISABLE_PAGE_SKIPPING-without-aggressiveness VACUUM
will still manage to advance relfrozenxid -- usually by the same
amount as an equivalent aggressive VACUUM would anyway. (Failing to
acquire a cleanup lock on some heap page might result in the final
older relfrozenxid being appreciably older, but probably not, and we'd
still almost certainly manage to advance relfrozenxid by *some* small
amount.)

Of course, anybody that wants both an aggressive VACUUM and a VACUUM
that never skips even all-frozen pages in the visibility map will
still be able to get that behavior quite easily. For example,
VACUUM(DISABLE_PAGE_SKIPPING, FREEZE) will do that. Several of our
existing tests must already use both of these options together,
because the tests require an effective vacuum_freeze_min_age of 0 (and
vacuum_multixact_freeze_min_age of 0) -- DISABLE_PAGE_SKIPPING alone
won't do that on HEAD, which seems to confuse the issue (see commit
b700f96c for an example of that).

In other words, since DISABLE_PAGE_SKIPPING doesn't *consistently*
force lazy_scan_noprune to refuse to process a page on HEAD (it all
depends on FreezeLimit/vacuum_freeze_min_age), it is logical for
DISABLE_PAGE_SKIPPING to totally get out of the business of caring
about that -- better to limit it to caring only about the visibility
map (by no longer making it force aggressiveness).

-- 
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: ubsan
Next
From: Peter Eisentraut
Date:
Subject: Re: Proposal: Support custom authentication methods using hooks