Re: Show various offset arrays for heap WAL records - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Show various offset arrays for heap WAL records
Date
Msg-id CAH2-Wz=0ztNEy+8G3_Xj56qPfZ7eJzrHtfOtGdLavjrZJJMmeA@mail.gmail.com
Whole thread Raw
In response to Re: Show various offset arrays for heap WAL records  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Show various offset arrays for heap WAL records  (Peter Geoghegan <pg@bowt.ie>)
Recording whether Heap2/PRUNE records are from VACUUM or from opportunistic pruning (Was: Show various offset arrays for heap WAL records)  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Mon, Mar 13, 2023 at 6:41 PM Peter Geoghegan <pg@bowt.ie> wrote:
> There are several different things that seem important to me
> personally. These are in tension with each other, to a degree. These
> are:
>
> 1. Like Andres, I'd really like to have some way of inspecting things
> like heapam PRUNE, VACUUM, and FREEZE_PAGE records in significant
> detail. These record types happen to be very important in general, and
> the ability to see detailed information about the WAL record would
> definitely help with some debugging scenarios. I've really missed
> stuff like this while debugging serious issues under time pressure.

One problem that I often run into when performing analysis of VACUUM
using pg_walinspect is the issue of *who* pruned which heap page, for
any given PRUNE record. Was it VACUUM/autovacuum, or was it
opportunistic pruning? There is no way of knowing for sure right now.
You *cannot* rely on an xid of 0 as an indicator of a given PRUNE
record coming from VACUUM; it could just have been an opportunistic
prune operation that happened to take place when a SELECT query ran,
before any XID was ever allocated.

I think that we should do something like the attached, to completely
avoid this ambiguity. This patch adds a new XLOG_HEAP2 bit that's
similar to XLOG_HEAP_INIT_PAGE -- XLOG_HEAP2_BYVACUUM. This allows all
XLOG_HEAP2 record types to indicate that they took place during
VACUUM, by XOR'ing the flag with the record type/info when
XLogInsert() is called. For now this is only used by PRUNE records.
Tools like pg_walinspect will report a separate "Heap2/PRUNE+BYVACUUM"
record_type, as well as the unadorned Heap2/PRUNE record_type, which
we'll now know must have been opportunistic pruning.

The approach of using a bit in the style of the heapam init bit makes
sense to me, because the bit is available, and works in a way that is
minimally invasive. Also, one can imagine needing to resolve a similar
ambiguity in the future, when (say) opportunistic freezing is added.

I think that it makes sense to treat this within the scope of
Melanie's ongoing work to improve the instrumentation of these records
-- meaning that it's in scope for Postgres 16. Admittedly this is a
slightly creative interpretation, so if others disagree then I won't
argue. This is quite a small patch, though, which makes debugging
significantly easier. I think that there could be a great deal of
utility in being able to easily "pair up" corresponding
"Heap2/PRUNE+BYVACUUM" and "Heap2/VACUUM" records in debugging
scenarios. I can imagine linking these to "Heap2/FREEZE_PAGE" and
"Heap2/VISIBLE" records, too, since they're all closely related record
types.

--
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Save a few bytes in pg_attribute
Next
From: Peter Smith
Date:
Subject: Re: Allow logical replication to copy tables in binary format