Re: should vacuum's first heap pass be read-only? - Mailing list pgsql-hackers

From Greg Stark
Subject Re: should vacuum's first heap pass be read-only?
Date
Msg-id CAM-w4HPqAk-W3Uwep-dJ+MOk3je71NSZG5q0KaMtcCXTjCOjJg@mail.gmail.com
In response to should vacuum's first heap pass be read-only?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: should vacuum's first heap pass be read-only?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, 3 Feb 2022 at 12:21, Robert Haas <robertmhaas@gmail.com> wrote:
>
> VACUUM's first pass over the heap is implemented by a function called
> lazy_scan_heap(), while the second pass is implemented by a function
> called lazy_vacuum_heap_rel(). This seems to imply that the first pass
> is primarily an examination of what is present, while the second pass
> does the real work. This used to be more true than it now is.

I've been out of touch for a while but I'm trying to catch up with the
progress of the past few years.

Whatever happened to the idea of "rotating" the work of vacuum, so
that all the work of the second pass would actually be deferred until
the first pass of the next vacuum cycle?

That would also have the effect of eliminating the duplicate work,
both the writes, with their WAL generation, and the scan itself.
The only heap scan would be "remove line pointers previously cleaned
from indexes, and prune dead tuples, recording them to be cleaned from
indexes in future". The index scan would remove the index entries for
those line pointers and record them to be removed from the heap in a
future heap scan.
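
To make the rotation concrete, here is a rough toy model of one cycle.
It is only a sketch under stated assumptions, not PostgreSQL code: the
names ToyTable, rotated_vacuum and cleaned_from_index are hypothetical,
the heap is reduced to a dict of TID states, and the index to a set of
TIDs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    # Hypothetical model: TID -> "live", "dead_tuple", or "lp_dead".
    heap: dict
    # TIDs that still have an index entry pointing at them.
    index: set
    # TIDs whose index entries were removed by the previous cycle.
    # Assumption: in a real system this would have to persist on disk.
    cleaned_from_index: set = field(default_factory=set)


def rotated_vacuum(table: ToyTable) -> set:
    """Run one rotated cycle: a single heap pass, then one index pass."""
    pending = set()

    # Single heap pass: finish the previous cycle's work and start this one's.
    for tid in list(table.heap):
        state = table.heap[tid]
        if state == "lp_dead" and tid in table.cleaned_from_index:
            # No index entry can reference this TID any more: reclaim it.
            del table.heap[tid]
        elif state == "dead_tuple":
            # Prune the tuple, leaving a dead line pointer behind.
            table.heap[tid] = "lp_dead"
            pending.add(tid)
        elif state == "lp_dead":
            # Dead line pointer not yet cleaned from the index.
            pending.add(tid)

    # Index pass: drop entries for everything recorded in the heap pass.
    table.index -= pending

    # Remember what was cleaned so the next cycle's heap pass can reclaim it.
    table.cleaned_from_index = pending
    return pending


table = ToyTable(heap={1: "live", 2: "dead_tuple", 3: "dead_tuple"},
                 index={1, 2, 3})
first = rotated_vacuum(table)    # prunes TIDs 2 and 3, cleans the index
table.heap[4] = "dead_tuple"     # a tuple dies between the two cycles
table.index.add(4)
second = rotated_vacuum(table)   # reclaims 2 and 3, prunes 4
```

In this model TIDs 2 and 3 only disappear from the heap during the
second cycle, and TID 4 is left as a dead line pointer until a third.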

The downside would mainly be the latency before the actual tuples
get cleaned up from the table. That is not so much of an issue for
space these days, thanks to tuple pruning, but it is more and more of
an issue for xid wraparound. Also, the line pointers that have been
cleaned from indexes would have to be recorded somewhere on disk for
the subsequent vacuum. That is extra on-disk state, and we've learned
that means extra complexity.

-- 
greg


