On Tue, Apr 12, 2022 at 2:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
Recent work on VACUUM and relfrozenxid advancement required that I update the maintenance.sgml VACUUM documentation ("Routine Vacuuming"). It was tricky to keep things current, due in part to certain structural problems. Many of these problems are artifacts of how the document evolved over time.
"Routine Vacuuming" ought to work as a high level description of how VACUUM keeps the system going over time. The intended audience is primarily DBAs, so low level implementation details should either be given much less prominence, or not even mentioned. We should keep it practical -- without going too far in the direction of assuming that we know the limits of what information might be useful.
+1
I've attached some off-the-cuff thoughts on reworking the first three paragraphs and the note.
Hopefully it's useful for providing perspective, if nothing else.
My high level concerns are:
* Instead of discussing FrozenTransactionId (and then explaining how that particular magic value is not really used anymore anyway), why not describe freezing in terms of the high level rules?
Agreed, and considered.
Something along the lines of the following seems more useful: "A tuple whose xmin is frozen (and xmax is unset) is considered visible to every possible MVCC snapshot. In other words, the transaction that inserted the tuple is treated as if it ran and committed at some point that is now *infinitely* far in the past."
When I read this section I assume, and care about, only visible rows. Maybe we need to make that explicit: only xmin matters (along with the invisible frozen flag)?
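To make the quoted rule concrete, here is a minimal toy model of it. It is a sketch under stated assumptions, not PostgreSQL's actual visibility code: the `Snapshot` and `HeapTuple` classes, the `xmin_frozen` field, and the `inserter_visible` function are all hypothetical names, xmax is assumed to be unset, and XIDs are compared as plain integers (no wraparound handling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified MVCC snapshot."""
    xmax: int                               # first XID not yet assigned at snapshot time
    in_progress: frozenset = frozenset()    # XIDs still running at snapshot time


@dataclass(frozen=True)
class HeapTuple:
    """Hypothetical tuple header; xmax is assumed unset throughout."""
    xmin: int
    xmin_frozen: bool = False               # stand-in for the "frozen" flag


def inserter_visible(tup, snap, committed):
    """Toy check: does `snap` see the transaction that inserted `tup`?"""
    if tup.xmin_frozen:
        # Frozen: treated as committed infinitely far in the past,
        # so no snapshot-specific test is needed at all.
        return True
    if tup.xmin not in committed:
        return False                        # inserter aborted or still running
    if tup.xmin >= snap.xmax:
        return False                        # inserter started after the snapshot
    return tup.xmin not in snap.in_progress


snap = Snapshot(xmax=105, in_progress=frozenset({102}))
committed = {101, 102, 110}
for t in (HeapTuple(101), HeapTuple(102), HeapTuple(110), HeapTuple(110, xmin_frozen=True)):
    print(t, inserter_visible(t, snap, committed))
```

The point of the sketch is the first branch: once xmin is frozen, the answer no longer depends on the snapshot, which is the "visible to every possible MVCC snapshot" wording in the proposed text.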
It might also be useful to describe freezing all of a live tuple's XIDs as roughly the opposite process as completely physically removing a dead tuple. It follows that we don't necessarily need to freeze anything to advance relfrozenxid (especially not on Postgres 15).
I failed to pick up on how this and the "mod-2^32" math interplay, and I'm not sure I care when reading this. It made more sense to me to consider the "shortest path" along the "circle".
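For what it's worth, the two framings are the same thing. The following is a minimal sketch of how "mod-2^32" comparison reduces to "shortest path along the circle". The function names are hypothetical, and the sketch deliberately ignores the special (non-normal) XIDs that the server handles separately.

```python
XID_MODULUS = 1 << 32           # XIDs are 32-bit, so the space is a circle of 2^32 points
HALF_CIRCLE = XID_MODULUS // 2


def signed_distance(a, b):
    """Signed shortest-path distance from XID b to XID a along the circle.

    Negative means a is "behind" b (older); positive means a is "ahead" (newer).
    This mirrors reinterpreting the 32-bit difference as a signed integer.
    """
    diff = (a - b) % XID_MODULUS
    if diff >= HALF_CIRCLE:
        diff -= XID_MODULUS     # going the other way around is shorter
    return diff


def xid_precedes(a, b):
    """True if a is older than b under circular (mod-2^32) comparison."""
    return signed_distance(a, b) < 0


print(xid_precedes(100, 200))            # ordinary case, no wraparound involved
print(xid_precedes(4294967290, 5))       # a sits just before the wrap point, b just after
print(signed_distance(5, 4294967290))    # short way around the circle
```

In this model, whichever direction around the circle is shorter decides which XID counts as older, so any two XIDs are only comparable in a meaningful way while they are less than half the circle apart.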
Currently, "25.1.5. Preventing Transaction ID Wraparound Failures" says this, right up-front:
"But since transaction IDs have limited size (32 bits) a cluster that runs for a long time (more than 4 billion transactions) would suffer transaction ID wraparound"
I both agree and disagree - where I settled (as of now) is reflected in the patch.
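As a quick back-of-the-envelope check on the "32 bits" and "4 billion" figures in the quoted sentence, here is a small sketch of the arithmetic. The `days_to_consume` helper and the transaction rate passed to it are made-up illustrations, not measurements or anything taken from the docs.

```python
XID_SPACE = 2 ** 32             # distinct 32-bit XID values: 4,294,967,296
HALF_SPACE = XID_SPACE // 2     # 2,147,483,648: how far apart two XIDs can be
                                # and still compare unambiguously on the circle


def days_to_consume(xids, xacts_per_second):
    """Days needed to consume `xids` transaction IDs at a steady rate."""
    seconds = xids / xacts_per_second
    return seconds / 86_400     # seconds per day


print(f"full XID space: {XID_SPACE:,}")
print(f"half the space: {HALF_SPACE:,}")
# Hypothetical workload: 1,000 XID-consuming transactions per second.
print(f"days to burn half the space: {days_to_consume(HALF_SPACE, 1_000):.1f}")
```

The "more than 4 billion" in the current text is the full circle, while the figure that matters for comparisons is half of that, roughly 2 billion.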
* The description of wraparound sounds terrifying, implying that data corruption can result.
Agreed, though I only skimmed the material past what the patch covers.
* XID space isn't really a precious resource -- it isn't even a resource at all IMV.
Agreed
* We don't cleanly separate discussion of anti-wraparound autovacuums, and aggressive vacuums, and the general danger of wraparound (by which I actually mean the danger of having the xidStopLimit stop limit kick in).
Didn't really get this far.
For the more technical details, I am wondering: is there an existing place to point xrefs at, do you plan to create one, or is it likely unnecessary?