Improving the "Routine Vacuuming" docs - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Improving the "Routine Vacuuming" docs
Msg-id CAH2-WznjPXEdVBX2ZaTs5BWfUY8ow6yrX2B9uruf64To-cqwLA@mail.gmail.com
List pgsql-hackers
Recent work on VACUUM and relfrozenxid advancement required that I
update the maintenance.sgml VACUUM documentation ("Routine
Vacuuming"). It was tricky to keep things current, due in part to
certain structural problems. Many of these problems are artifacts of
how the document evolved over time.

"Routine Vacuuming" ought to work as a high level description of how
VACUUM keeps the system going over time. The intended audience is
primarily DBAs, so low level implementation details should either be
given much less prominence, or not even mentioned. We should keep it
practical -- without going too far in the direction of assuming that
we know the limits of what information might be useful.

My high level concerns are:

* Instead of discussing FrozenTransactionId (and then explaining how
that particular magic value is not really used anymore anyway), why
not describe freezing in terms of the high level rules?

Something along the lines of the following seems more useful: "A tuple
whose xmin is frozen (and xmax is unset) is considered visible to
every possible MVCC snapshot. In other words, the transaction that
inserted the tuple is treated as if it ran and committed at some point
that is now *infinitely* far in the past."

It might also be useful to describe freezing all of a live tuple's
XIDs as roughly the opposite of completely physically removing
a dead tuple. It follows that we don't necessarily need to freeze
anything to advance relfrozenxid (especially not on Postgres 15).
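
To make the high level rule concrete, here is a minimal Python sketch
of a toy visibility check. It is not PostgreSQL's implementation: the
HeapTuple and Snapshot classes, the horizon field, and the function
names are all hypothetical, and the model ignores in-progress and
aborted transactions, XID wraparound, and tuples with xmax set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeapTuple:
    xmin: int                    # XID of the inserting transaction
    xmin_frozen: bool = False    # hypothetical flag: xmin has been frozen
    xmax: Optional[int] = None   # None means "xmax is unset"


@dataclass
class Snapshot:
    # Toy snapshot (assumption): every XID below `horizon` committed
    # before the snapshot was taken; everything else is invisible.
    horizon: int


def inserter_visible(tup: HeapTuple, snap: Snapshot) -> bool:
    """Is the inserting transaction visible to this toy snapshot?"""
    if tup.xmin_frozen:
        # Frozen: treated as committed infinitely far in the past,
        # so the snapshot is never consulted.
        return True
    return tup.xmin < snap.horizon


def tuple_visible(tup: HeapTuple, snap: Snapshot) -> bool:
    """Only models the case from the rule above: xmax must be unset."""
    if tup.xmax is not None:
        raise NotImplementedError("toy model covers unset xmax only")
    return inserter_visible(tup, snap)
```

The point of the sketch is the first branch: once xmin is frozen, the
answer no longer depends on which snapshot is asking.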

* The general description of how the XID space works similarly places
way too much emphasis on low level details that are of very little
relevance.

These details would even seem totally out of place if I were the
intended audience. The problem isn't really that the information is
too technical. The problem is that we emphasize mechanistic stuff
while never quite explaining the point of it all.

Currently, "25.1.5. Preventing Transaction ID Wraparound Failures"
says this, right up-front:

"But since transaction IDs have limited size (32 bits) a cluster that
runs for a long time (more than 4 billion transactions) would suffer
transaction ID wraparound"

This is way too mechanistic. We totally muddle things by even
mentioning 4 billion XIDs in the first place. It seems like a
confusing artifact of a time before freezing was invented, back when
you really could have XIDs that were more than 2 billion XIDs apart.

This statement has another problem: it's flat-out untrue. The
xidStopLimit stuff will reliably kick in at about 2 billion XIDs.
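
The 2 billion figure follows from how 32-bit XIDs are compared. The
sketch below is a simplified Python model of a circular comparison
(the function names are hypothetical, and special reserved XIDs are
ignored): one XID precedes another if their signed 32-bit difference
is negative, which only gives a meaningful answer while the two XIDs
are less than 2^31 apart.

```python
XID_MODULUS = 1 << 32      # XIDs are 32-bit unsigned integers
HALF_SPACE = 1 << 31       # about 2.1 billion


def xid_advance(xid: int, n: int) -> int:
    """Return the XID that is n transactions after xid, with wraparound."""
    return (xid + n) % XID_MODULUS


def xid_precedes(a: int, b: int) -> bool:
    """True if a is logically older than b under circular comparison."""
    diff = (a - b) % XID_MODULUS
    # Same as interpreting diff as a signed 32-bit integer and testing < 0.
    return diff >= HALF_SPACE


old = 4_000_000_000                       # close to the 32-bit limit
near = xid_advance(old, 500_000_000)      # wraps past zero
far = xid_advance(old, HALF_SPACE + 1)    # more than half the space away

print(xid_precedes(old, near))   # True: ordering survives the wrap
print(xid_precedes(old, far))    # False: ordering has flipped
```

Wrapping past zero is harmless in itself. What breaks the comparison
is two XIDs being more than about 2 billion apart.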

* The description of wraparound sounds terrifying, implying that data
corruption can result.

The alarming language isn't proportionate to the true danger
(something I complained about in a dedicated thread last year [1]).

* XID space isn't really a precious resource -- it isn't even a
resource at all IMV.

ISTM that we should be discussing wraparound as an issue about the
maximum *difference* between any two unfrozen XIDs in a
cluster/installation.
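
As a rough illustration of this framing, the sketch below measures how
far the oldest unfrozen XID lags behind the next XID to be assigned,
and how much slack remains before a stop limit of about 2^31 XIDs. The
table names, XID values, and the stop_margin parameter are made up for
the example; this is a model under stated assumptions, not how
PostgreSQL computes xidStopLimit.

```python
XID_MODULUS = 1 << 32
HALF_SPACE = 1 << 31


def xid_age(next_xid: int, frozenxid: int) -> int:
    """Distance from frozenxid forward to next_xid, modulo 2^32."""
    return (next_xid - frozenxid) % XID_MODULUS


def remaining_slack(next_xid: int, relfrozenxids: dict, stop_margin: int = 0) -> int:
    """XIDs left before the oldest table reaches the (modelled) stop limit."""
    oldest_age = max(xid_age(next_xid, x) for x in relfrozenxids.values())
    return HALF_SPACE - stop_margin - oldest_age


# Hypothetical per-table relfrozenxid values.
tables = {"orders": 900_000_000, "events": 150_000_000, "users": 990_000_000}
next_xid = 1_000_000_000

ages = {name: xid_age(next_xid, x) for name, x in tables.items()}
print(ages)
print(remaining_slack(next_xid, tables))
```

Only the largest difference matters: here the "events" table alone
determines the slack, regardless of how recently the other tables had
their relfrozenxid advanced.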

Talking about an abstract-sounding XID space seems to me to be quite
counterproductive. The logical XID space is practically infinite,
after all. We should move away from the idea that physical XID space
is a precious resource. Sure, users are often concerned that the
xidStopLimit mechanism might kick in, effectively resulting in an
outage. That makes perfect sense. But it doesn't follow that XIDs are
precious, and implying that they are intrinsically valuable just
confuses matters.

First of all, physical XID space is usually abundantly available. A
"distance" of ~2 billion XIDs is a vast distance in just about any
application (barring those with pathological problems, such as a
leaked replication slot).

Second of all, the age of datfrozenxid predicts very little about how
close we are to having the dreaded xidStopLimit mechanism kick in. The
amount of physical freezing required to advance relfrozenxid by any
given number of XIDs varies enormously, and is not even predictable
for a given table (because individual tables don't get their own
physical XID space). We do need some XID-wise slack, but that's just a
way of absorbing shocks -- it's ballast, usually only really needed
for one or two very large tables.

Third of all, and most importantly, the whole idea that we can just
put off freezing indefinitely and actually reduce the pain (rather
than having a substantial increase in problems) seems to have just
about no basis in reality, at least once you get into the tens of
millions range (though usually well before that).

Why should you be better off if all of your freezing occurs in one big
balloon payment? Sometimes getting into debt for a while is useful,
but why should it make sense to keep delaying freezing? And if it
doesn't make sense, then why does it still make sense to treat XID
space as a precious resource?

* We don't cleanly separate discussion of anti-wraparound autovacuums,
aggressive vacuums, and the general danger of wraparound (by which I
actually mean the danger of having the xidStopLimit mechanism kick
in).

I think that we should move towards a world in which we explicitly
treat the autovacuum anti-wraparound criteria as not all that
different to any of the standard criteria (so we probably still have
the behavior with autovacuums not being cancellable, but it would be a
dynamic thing that didn't depend on the original reason why
autovacuum.c launched an autovacuum worker). But even now we aren't
clear enough about the fact that anti-wraparound autovacuums really
aren't all that special, which makes them seem scarier than they
should be.

[1] https://postgr.es/m/CAH2-Wzk_FxfJvs4TnUtj=DCsokbiK0CxfjZ9jjrfSx8sTWkeUg@mail.gmail.com
-- 
Peter Geoghegan


