Re: Improving the "Routine Vacuuming" docs - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Improving the "Routine Vacuuming" docs
Msg-id CAKFQuwbGWVmYQqp=Oujieb5pVcy+pRh-8N95kX1XWMhGWxBzVw@mail.gmail.com
In response to Improving the "Routine Vacuuming" docs  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Improving the "Routine Vacuuming" docs
List pgsql-hackers
On Tue, Apr 12, 2022 at 2:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
Recent work on VACUUM and relfrozenxid advancement required that I
update the maintenance.sgml VACUUM documentation ("Routine
Vacuuming"). It was tricky to keep things current, due in part to
certain structural problems. Many of these problems are artifacts of
how the document evolved over time.

"Routine Vacuuming" ought to work as a high level description of how
VACUUM keeps the system going over time. The intended audience is
primarily DBAs, so low level implementation details should either be
given much less prominence, or not even mentioned. We should keep it
practical -- without going too far in the direction of assuming that
we know the limits of what information might be useful.

+1

I've attached some off-the-cuff thoughts on reworking the first three paragraphs and the note.

It's hopefully useful for providing perspective if nothing else.


My high level concerns are:

* Instead of discussing FrozenTransactionId (and then explaining how
that particular magic value is not really used anymore anyway), why
not describe freezing in terms of the high level rules?

Agreed and considered

Something along the lines of the following seems more useful: "A tuple
whose xmin is frozen (and xmax is unset) is considered visible to
every possible MVCC snapshot. In other words, the transaction that
inserted the tuple is treated as if it ran and committed at some point
that is now *infinitely* far in the past."

When I read this section I assume, and care about, only visible rows. Maybe we need to make that explicit: only xmin matters (along with the invisible frozen flag)?
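
To make the proposed rule concrete, here is a minimal sketch of it as a toy model. It is not PostgreSQL's visibility code: the `HeapTuple` class, the `xmin_frozen` field, and the `sees` callback are all hypothetical names invented for illustration, and xmax handling is reduced to "a deleter the snapshot can see".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HeapTuple:
    """Toy stand-in for a heap tuple header (hypothetical, not PostgreSQL's struct)."""
    xmin: int                    # XID of the inserting transaction
    xmin_frozen: bool = False    # assumed flag: set once xmin has been frozen
    xmax: Optional[int] = None   # XID of a deleting transaction, if any


def is_visible(tup: HeapTuple, sees: Callable[[int], bool]) -> bool:
    """`sees(xid)` models a snapshot: True if it treats xid as committed and visible."""
    # A frozen xmin bypasses the snapshot check: the insert counts as
    # committed infinitely far in the past.
    inserted = tup.xmin_frozen or sees(tup.xmin)
    # The tuple counts as deleted only if the snapshot sees the deleter.
    deleted = tup.xmax is not None and sees(tup.xmax)
    return inserted and not deleted


frozen = HeapTuple(xmin=500, xmin_frozen=True)
unfrozen = HeapTuple(xmin=500)


def sees_nothing(xid: int) -> bool:
    """A snapshot that considers no transaction visible at all."""
    return False
```

Under this model `is_visible(frozen, sees_nothing)` holds even for a snapshot that sees no transactions, which is the "visible to every possible MVCC snapshot" property, while `is_visible(unfrozen, sees_nothing)` does not.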


It might also be useful to describe freezing all of a live tuple's
XIDs as roughly the opposite process as completely physically removing
a dead tuple. It follows that we don't necessarily need to freeze
anything to advance relfrozenxid (especially not on Postgres 15).

I failed to pick up on how this and the "mod-2^32" math interplay, and I'm not sure I care when reading this. It made more sense to me to consider the "shortest path" along the "circle".
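
For what it's worth, the "shortest path along the circle" view and the mod-2^32 view can be sketched as the same computation. The following is an illustrative model only: the function names are made up here, and it ignores the special handling of the reserved (non-normal) XID values.

```python
XID_SPACE = 2 ** 32   # XIDs are 32-bit, so they live on a circle of this size
HALF_SPACE = 2 ** 31  # anything further than this is "shorter the other way"


def signed_distance(a: int, b: int) -> int:
    """Signed shortest-path distance from b to a along the XID circle.

    Positive means a is ahead of (newer than) b; negative means a is behind.
    """
    diff = (a - b) % XID_SPACE
    # Reinterpret the unsigned difference as a signed 32-bit value.
    return diff - XID_SPACE if diff >= HALF_SPACE else diff


def xid_precedes(a: int, b: int) -> bool:
    """True if a is older than b, judged by the shorter way around the circle."""
    return signed_distance(a, b) < 0
```

For example, `xid_precedes(4_294_967_000, 100)` is true: although 4,294,967,000 is numerically larger, the short way around the circle puts it 396 XIDs behind 100, that is, just before the counter wrapped.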


Currently, "25.1.5. Preventing Transaction ID Wraparound Failures"
says this, right up-front:

"But since transaction IDs have limited size (32 bits) a cluster that
runs for a long time (more than 4 billion transactions) would suffer
transaction ID wraparound"

I both agree and disagree - where I settled (as of now) is reflected in the patch.
 
* The description of wraparound sounds terrifying, implying that data
corruption can result.

Agreed, though I only skimmed the material beyond what the patch covers.

* XID space isn't really a precious resource -- it isn't even a
resource at all IMV.

Agreed

* We don't cleanly separate discussion of anti-wraparound autovacuums,
and aggressive vacuums, and the general danger of wraparound (by which
I actually mean the danger of having the xidStopLimit stop limit kick
in).

Didn't really get this far.

For the more technical details, I am wondering: is there an existing place to point xrefs at, do you plan to create one, or is it likely unnecessary?

David J.