Re: should vacuum's first heap pass be read-only? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: should vacuum's first heap pass be read-only?
Msg-id CAH2-Wz=03-qc0c467KdDikN=Kmrc4G7NoK6uJTBVoU263KkcdQ@mail.gmail.com
In response to Re: should vacuum's first heap pass be read-only?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Apr 1, 2022 at 11:39 AM Robert Haas <robertmhaas@gmail.com> wrote:
> So I'm completely confused here. If we always start a vacuum with
> lazy_scan_heap(), as you said you wanted, then we will not save any
> heap scanning.

The term "start a VACUUM" becomes ambiguous with the conveyor belt.

What I addressed in a nearby email back in February [1] was the
idea of doing heap vacuuming of the last run (or several runs) of dead
TIDs on top of heap pruning to create the next run/runs of dead TIDs.

> What am I missing?

There is a certain sense in which we are bound to always "start a
vacuum" in lazy_scan_prune(), with any design based on the current
one. How else are we ever going to make a basic initial determination
about which heap LP_DEAD items need their TIDs deleted from indexes,
sooner or later? Obviously that information must always have
originated in lazy_scan_prune() (or in lazy_scan_noprune()).

With the conveyor belt, and a non-HOT-update heavy workload, we'll
eventually need to exhaustively do index vacuuming of all indexes
(even those that don't need it for their own sake) to make it safe to
remove heap line pointer bloat (to set heap LP_DEAD items to
LP_UNUSED). This will happen least often of all, and is the one
dependency that the conveyor belt can't help with.
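
A minimal sketch of the dependency described above, written in Python
purely for illustration (not PostgreSQL's C). The ConveyorBelt class,
its method names, and the per-index position counters are hypothetical
and do not come from any patch. The only point shown is that heap
LP_DEAD items can be set to LP_UNUSED just up to the position that
every index has already vacuumed.

```python
class ConveyorBelt:
    """Hypothetical append-only log of dead TIDs (illustration only)."""

    def __init__(self, index_names):
        self.dead_tids = []                               # all dead TIDs, in arrival order
        self.index_pos = {name: 0 for name in index_names}  # TIDs each index has deleted
        self.heap_pos = 0                                 # TIDs already set LP_UNUSED

    def append_run(self, tids):
        """Pruning found more LP_DEAD items; record their TIDs as a new run."""
        self.dead_tids.extend(tids)

    def vacuum_index(self, name, upto=None):
        """One index deletes dead TIDs up to position `upto` (default: all)."""
        total = len(self.dead_tids)
        end = total if upto is None else min(upto, total)
        self.index_pos[name] = max(self.index_pos[name], end)

    def safe_heap_horizon(self):
        """Heap vacuuming may only cover TIDs that EVERY index has processed."""
        return min(self.index_pos.values(), default=len(self.dead_tids))

    def vacuum_heap(self):
        """Return the TIDs that can now go from LP_DEAD to LP_UNUSED."""
        horizon = self.safe_heap_horizon()
        reclaimed = self.dead_tids[self.heap_pos:horizon]
        self.heap_pos = max(self.heap_pos, horizon)
        return reclaimed


belt = ConveyorBelt(["idx_a", "idx_b"])
belt.append_run([(1, 1), (1, 2), (2, 5)])
belt.vacuum_index("idx_a")           # idx_b has not been vacuumed yet
first_pass = belt.vacuum_heap()      # nothing is safe to reclaim
belt.vacuum_index("idx_b")           # now every index has caught up
second_pass = belt.vacuum_heap()     # all three TIDs become reclaimable
```

In this toy model an index that needs no vacuuming for its own sake
still holds back the heap horizon until it is vacuumed, which is the
dependency the paragraph above describes.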

To answer your question: when heap vacuuming does finally happen, we
at least don't need to call lazy_scan_prune() for any pages first
(neither the pages we're vacuuming, nor any other heap pages). Plus
the decision to finally clean up line pointer bloat can be made based
on known facts about line pointer bloat, without tying that to other
processing done by lazy_scan_prune() -- so there's greater separation
of concerns.
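
As a rough illustration of deciding on heap vacuuming from line
pointer bloat alone, here is a sketch in Python. The function name, its
inputs, and the 10% threshold are all made up for the example; nothing
in this thread proposes a specific rule.

```python
def should_vacuum_heap(lp_dead_items, total_line_pointers, bloat_threshold=0.10):
    """Hypothetical trigger: clean up line pointers once LP_DEAD items
    make up at least `bloat_threshold` of all line pointers.

    The inputs are assumed to be already-known counts, so no call to
    lazy_scan_prune() is needed to make this decision.
    """
    if total_line_pointers == 0:
        return False
    return lp_dead_items / total_line_pointers >= bloat_threshold
```

The point is only that the trigger depends on facts about line pointer
bloat, not on whatever other processing happens during pruning.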

That having been said, maybe it would make sense to also call
lazy_scan_prune() right after these relatively rare calls to
lazy_vacuum_heap_page(), opportunistically (since we already dirtied
the page once). But that would be an additional optimization, at best; it
wouldn't be the main way that we call lazy_scan_prune().

[1] https://www.postgresql.org/message-id/CAH2-WzmG%3D_vYv0p4bhV8L73_u%2BBkd0JMWe2zHH333oEujhig1g%40mail.gmail.com
--
Peter Geoghegan


