Improving the "Routine Vacuuming" docs - Mailing list pgsql-hackers

From: Peter Geoghegan
Subject: Improving the "Routine Vacuuming" docs
Msg-id: CAH2-WznjPXEdVBX2ZaTs5BWfUY8ow6yrX2B9uruf64To-cqwLA@mail.gmail.com
Responses:
  Re: Improving the "Routine Vacuuming" docs
  Re: Improving the "Routine Vacuuming" docs
List: pgsql-hackers
Recent work on VACUUM and relfrozenxid advancement required that I update the maintenance.sgml VACUUM documentation ("Routine Vacuuming"). It was tricky to keep things current, due in part to certain structural problems. Many of these problems are artifacts of how the document evolved over time.

"Routine Vacuuming" ought to work as a high-level description of how VACUUM keeps the system going over time. The intended audience is primarily DBAs, so low-level implementation details should either be given much less prominence, or not be mentioned at all. We should keep it practical -- without going too far in the direction of assuming that we know the limits of what information might be useful.

My high-level concerns are:

* Instead of discussing FrozenTransactionId (and then explaining how that particular magic value is not really used anymore anyway), why not describe freezing in terms of the high-level rules? Something along the lines of the following seems more useful:

"A tuple whose xmin is frozen (and whose xmax is unset) is considered visible to every possible MVCC snapshot. In other words, the transaction that inserted the tuple is treated as if it ran and committed at some point that is now *infinitely* far in the past."

It might also be useful to describe freezing all of a live tuple's XIDs as roughly the opposite of completely physically removing a dead tuple. It follows that we don't necessarily need to freeze anything to advance relfrozenxid (especially not on Postgres 15).

* The general description of how the XID space works similarly places way too much emphasis on low-level details that are of very little relevance. These details would seem totally out of place even if I were the intended audience. The problem isn't really that the information is too technical. The problem is that we emphasize mechanistic stuff while never quite explaining the point of it all. Currently, "25.1.5. Preventing Transaction ID Wraparound Failures" says this, right up front:

"But since transaction IDs have limited size (32 bits) a cluster that runs for a long time (more than 4 billion transactions) would suffer transaction ID wraparound"

This is way too mechanistic. We totally muddle things by even mentioning 4 billion XIDs in the first place. It seems like a confusing artefact of a time before freezing was invented, back when you really could have XIDs that were more than 2 billion XIDs apart. This statement has another problem: it's flat-out untrue. The xidStopLimit mechanism will reliably kick in at about 2 billion XIDs.

* The description of wraparound sounds terrifying, implying that data corruption can result. The alarming language isn't proportionate to the true danger (something I complained about in a dedicated thread last year [1]).

* XID space isn't really a precious resource -- it isn't even a resource at all IMV. ISTM that we should be discussing wraparound as an issue about the maximum *difference* between any two unfrozen XIDs in a cluster/installation. Talking about an abstract-sounding XID space seems to me to be quite counterproductive. The logical XID space is practically infinite, after all.

We should move away from the idea that physical XID space is a precious resource. Sure, users are often concerned that the xidStopLimit mechanism might kick in, effectively resulting in an outage. That makes perfect sense. But it doesn't follow that XIDs are precious, and implying that they are intrinsically valuable just confuses matters.

First of all, physical XID space is usually abundantly available. A "distance" of ~2 billion XIDs is a vast distance in just about any application (barring those with pathological problems, such as a leaked replication slot).
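To make the "maximum difference" framing concrete, here is a minimal Python sketch of circular XID comparison (the real logic is C code in the server; this is only an illustration of the modulo-2^32 arithmetic). Two XIDs can only be meaningfully compared while they are less than 2^31 (~2.1 billion) apart, which is where the ~2 billion figure really comes from:

```python
def xid_precedes(a: int, b: int) -> bool:
    """Circular 32-bit XID comparison: 'a' logically precedes 'b' when the
    signed 32-bit difference (a - b) is negative. This is only well defined
    while a and b are less than 2**31 apart -- which is why wraparound is
    about the maximum *difference* between unfrozen XIDs, not about the
    total number of XIDs ever consumed."""
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000  # signed interpretation: (int32)(a - b) < 0

# Ordinary case: the numerically smaller XID is older.
assert xid_precedes(100, 200)
assert not xid_precedes(200, 100)
# Post-wraparound case: an XID just below 2**32 still precedes a small,
# recently assigned XID, because the signed difference wraps.
assert xid_precedes(2**32 - 5, 10)
```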
Second of all, since the amount of physical freezing required to advance relfrozenxid by any given amount (any number of XIDs) varies enormously, and is not even predictable for a given table (because individual tables don't get their own physical XID space), the age of datfrozenxid predicts very little about how close we are to having the dreaded xidStopLimit mechanism kick in. We do need some XID-wise slack, but that's just a way of absorbing shocks -- it's ballast, usually only really needed for one or two very large tables.

Third of all, and most importantly, the whole idea that we can just put off freezing indefinitely and actually reduce the pain (rather than seeing a substantial increase in problems) seems to have just about no basis in reality, at least once you get into the tens of millions range (though usually well before that). Why should you be better off if all of your freezing occurs in one big balloon payment? Sometimes getting into debt for a while is useful, but why should it make sense to keep delaying freezing? And if it doesn't make sense, then why does it still make sense to treat XID space as a precious resource?

* We don't cleanly separate discussion of anti-wraparound autovacuums, aggressive vacuums, and the general danger of wraparound (by which I actually mean the danger of having the xidStopLimit stop limit kick in). I think that we should move towards a world in which we explicitly treat the autovacuum anti-wraparound criteria as not all that different from any of the standard criteria (so we'd probably still have the behavior where such autovacuums are not cancellable, but it would be a dynamic thing that didn't depend on the original reason why autovacuum.c launched an autovacuum worker). But even now we aren't clear enough about the fact that anti-wraparound autovacuums really aren't all that special. Which makes them seem scarier than they should be.
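The point about age(datfrozenxid) being only loose guidance can be shown with a back-of-the-envelope sketch. This is a simplification, not server code: the exact safety margin the server reserves before xidStopLimit is an internal detail, and the ~3 million figure used here is an approximation. The headroom is a fixed XID-wise *distance*, while the freezing work needed to reclaim it varies per table:

```python
# Hedged sketch: XID-wise headroom before xidStopLimit would kick in.
XID_COMPARE_HORIZON = 2**31    # two unfrozen XIDs must stay less than this far apart
STOP_LIMIT_MARGIN = 3_000_000  # approximate safety margin reserved before the wrap limit

def xids_until_stop_limit(next_xid: int, oldest_datfrozenxid: int) -> int:
    """XIDs that can still be consumed before the stop limit, given the
    oldest datfrozenxid in the cluster. Simplified: ignores the reserved
    special XIDs at the low end of the space, and uses an approximate
    margin -- the real computation lives inside the server."""
    age = (next_xid - oldest_datfrozenxid) & 0xFFFFFFFF  # circular distance
    return (XID_COMPARE_HORIZON - STOP_LIMIT_MARGIN) - age
```

Note what the function does *not* tell you: two clusters with identical ages can need wildly different amounts of freezing work to restore the same headroom.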
[1] https://postgr.es/m/CAH2-Wzk_FxfJvs4TnUtj=DCsokbiK0CxfjZ9jjrfSx8sTWkeUg@mail.gmail.com

--
Peter Geoghegan