Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id CAH2-Wz=c8LUnMuE1ioU=eLbfB4-7hNPu_yBpPyA1WRrHXLRaOQ@mail.gmail.com
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
List pgsql-hackers
On Tue, Nov 30, 2021 at 11:52 AM Peter Geoghegan <pg@bowt.ie> wrote:
> I haven't had time to work through any of your feedback just yet --
> though it's certainly a priority for me. I won't get to it until I return
> home from PGConf NYC next week.

Attached is v3, which works through most of your (Andres') feedback.

Changes in v3:

* While the first patch still gets rid of the "pinskipped_pages"
instrumentation, the second patch adds back a replacement that's
better targeted: it tracks and reports "missed_dead_tuples". This
means that log output will show the number of fully DEAD tuples with
storage that could not be pruned away because doing so would have
required waiting for a cleanup lock. But we *don't* generally
report the number of pages that we couldn't get a cleanup lock on,
because that in itself doesn't mean that we skipped any useful work
(which is very much the point of all of the refactoring in the first
patch).
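
To make the "missed_dead_tuples" accounting concrete, here is a minimal
sketch of the idea, not code from the patch. Everything except the name
missed_dead_tuples is hypothetical (the TupleState enum, the VacCounters
struct, scan_page, and the tuples_pruned counter are made up for
illustration). The point it shows: a page that we fail to get a cleanup
lock on only shows up in the counters if it actually holds DEAD tuples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-tuple state; a stand-in, not PostgreSQL's HTSV_Result. */
typedef enum { TUPLE_LIVE, TUPLE_RECENTLY_DEAD, TUPLE_DEAD } TupleState;

/* Hypothetical counters accumulated over one VACUUM operation. */
typedef struct {
    long tuples_pruned;      /* DEAD tuples removed under a cleanup lock */
    long missed_dead_tuples; /* DEAD tuples left behind: no cleanup lock */
} VacCounters;

/*
 * Account for one heap page. When the cleanup lock was not available we
 * do not wait for it; we only record the DEAD tuples we had to leave.
 */
void
scan_page(VacCounters *c, const TupleState *tuples, size_t ntuples,
          bool got_cleanup_lock)
{
    long dead = 0;

    assert(c != NULL);

    for (size_t i = 0; i < ntuples; i++)
    {
        if (tuples[i] == TUPLE_DEAD)
            dead++;
    }

    if (got_cleanup_lock)
        c->tuples_pruned += dead;       /* pruning was possible */
    else
        c->missed_dead_tuples += dead;  /* adds zero if nothing was missed */
}
```

Note that there is deliberately no "pages we could not lock" counter
here: failing to get the lock on a page with no DEAD tuples changes
nothing in the output.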

* We now have FSM processing in the lazy_scan_noprune case, which more
or less matches the standard lazy_scan_prune case.

* Many small tweaks, based on suggestions from Andres, and other
things that I noticed.

* Further simplification of the "consider skipping pages using
visibility map" logic -- now we never skip the last block in
the relation, and no longer call should_attempt_truncation() to check
that we have a reason to read it.

Note that this means that we'll always read the final page during
VACUUM, even when doing so is provably unhelpful. I'd prefer to keep
the code that deals with skipping pages using the visibility map as
simple as possible. There isn't much downside to always doing that
once my refactoring is in place: there is no risk that we'll wait for
a cleanup lock (on the final page in the rel) for no good reason.
We're only wasting one page access, at most.

(I'm not 100% sure that this is the right trade-off, actually, but
it's at least worth considering.)
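
The simplified rule for the last block can be sketched as follows. This
is an illustration under stated assumptions, not the patch itself:
can_skip_block is a hypothetical helper, and the all_visible argument
stands in for whatever the visibility map lookup returns for the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;   /* 32-bit block number, as in PostgreSQL */

/*
 * Hypothetical skip decision. The final block of the relation is never
 * skipped, with no separate check of whether truncation will be tried.
 */
bool
can_skip_block(BlockNumber blkno, BlockNumber nblocks, bool all_visible)
{
    assert(nblocks > 0 && blkno < nblocks);

    if (blkno == nblocks - 1)
        return false;           /* always read the last block */

    return all_visible;         /* otherwise defer to the visibility map */
}
```

The cost of the unconditional rule is at most one extra page read per
VACUUM, in exchange for one fewer special case in the skipping logic.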

Not included in v3:

* Still haven't added the isolation test for rel truncation, though
it's on my TODO list.

* I'm still working on the optimization that we discussed on this
thread: the optimization that allows the final relfrozenxid (that we
set in pg_class) to be determined dynamically, based on the actual
XIDs we observed in the table (we don't just naively use FreezeLimit).

I'm not ready to post that today, but it shouldn't take too much
longer to be good enough to review.
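
In the meantime, here is a rough sketch of the general idea only. It is
not the pending patch, and all names in it are hypothetical (Xid,
FrozenXidTracker, tracker_init, tracker_observe). It assumes we start
from some upper bound and lower the candidate each time we see an older
XID that will remain in the table. The comparison is the usual circular
one for 32-bit XIDs, which assumes both values are normal XIDs within
2^31 of each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* stand-in for a 32-bit transaction ID */

/* Circular comparison: true if a is logically older than b. */
bool
xid_precedes(Xid a, Xid b)
{
    /* relies on two's complement conversion of the wrapped difference */
    return (int32_t) (a - b) < 0;
}

/* Hypothetical tracker for the oldest XID left behind in the table. */
typedef struct {
    Xid new_relfrozenxid;
} FrozenXidTracker;

/* Start from an upper bound; the choice of bound is an assumption here. */
void
tracker_init(FrozenXidTracker *t, Xid upper_bound)
{
    assert(t != NULL);
    t->new_relfrozenxid = upper_bound;
}

/* Call for every unfrozen XID that will remain after this VACUUM. */
void
tracker_observe(FrozenXidTracker *t, Xid xid)
{
    assert(t != NULL);
    if (xid_precedes(xid, t->new_relfrozenxid))
        t->new_relfrozenxid = xid;
}
```

If nothing older than the starting bound is observed, the final value
is simply the bound, which can be far newer than a fixed cutoff such as
FreezeLimit.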

Thanks
-- 
Peter Geoghegan

Attachment
