Re: What is "wraparound failure", really? - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: What is "wraparound failure", really? |
Date | |
Msg-id | 9b18e359-1183-a45b-4c99-ca93655edced@dunslane.net |
In response to | What is "wraparound failure", really? (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: What is "wraparound failure", really? |
List | pgsql-hackers |
On 6/27/21 4:36 PM, Peter Geoghegan wrote:
> The wraparound failsafe mechanism added by commit 1e55e7d1 had minimal
> documentation -- just a basic description of how the GUCs work. I
> think that it certainly merits some discussion under "25.1. Routine
> Vacuuming" -- more specifically under "25.1.5. Preventing Transaction
> ID Wraparound Failures". One reason why this didn't happen in the
> original commit was that I just didn't know where to start with it.
> The docs in question have said this since 2006's commit 48188e16 first
> added autovacuum_freeze_max_age:
>
> "The sole disadvantage of increasing autovacuum_freeze_max_age (and
> vacuum_freeze_table_age along with it) is that the pg_xact and
> pg_commit_ts subdirectories of the database cluster will take more
> space..."
>
> This sentence seems completely unreasonable to me. It seems to just
> ignore the huge disadvantage of increasing autovacuum_freeze_max_age:
> the *risk* that the system will stop being able to allocate new XIDs
> because GetNewTransactionId() errors out with "database is not
> accepting commands to avoid wraparound data loss...". Sure, it's
> possible to take a lot of risk here without it ever blowing up in your
> face. And if it doesn't blow up then the downside really is zero. This
> is hardly a sensible way to talk about this important risk. Or any
> risk at all.
>
> At first I thought that the sentence was not just misguided -- it
> seemed downright bizarre. I thought that it was directly at odds with
> the title "Preventing Transaction ID Wraparound Failures". I thought
> that the whole point of this section was how not to have a wraparound
> failure (as I understand the term), and yet we seem to deliberately
> ignore the single most important practical aspect of making sure that
> that doesn't happen. But I now suspect that the basic definitions have
> been mixed up in a subtle but important way.
>
> What the documentation calls a "wraparound failure" seems to be rather
> different to what I thought that that meant. As I said, I thought that
> that meant the condition of being unable to get new transaction IDs
> (at least until the DBA runs VACUUM in single user mode). But the
> documentation in question seems to actually define it as "the
> condition of an old MVCC snapshot failing to see a version from the
> distant past, because somehow an XID wraparound suddenly makes it look
> as if it's in the distant future rather than in the past". It's
> actually talking about a subtly different thing, so the "sole
> disadvantage" sentence is not actually bizarre. It does still seem
> impractical and confusing, though.
>
> I strongly suspect that my interpretation of what "wraparound failure"
> means is actually the common one. Of course the system is never under
> any circumstances allowed to give totally wrong answers to queries, no
> matter what -- users should be able to take that much for granted.
> What users care about here is sensibly managing XIDs as a resource --
> preventing "XID exhaustion" while being conservative, but not
> ridiculously conservative. Could the documentation be completely
> misleading users here?
>
> I have two questions:
>
> 1. Do I have this right? Is there really confusion about what a
> "wraparound failure" means, or is the confusion mine alone?
>
> 2. How do I go about integrating discussion of the failsafe here?
> Anybody have thoughts on that?
>

AIUI, actual wraparound (i.e. an xid crossing the event horizon so it
appears to be in the future) is no longer possible. But it once was a
very real danger. Maybe the docs haven't quite caught up.

In practical terms, there is an awful lot of head room between the
default for autovacuum_freeze_max_age and any danger of major
anti-wraparound measures. Say you increase it to 1bn from the default
200m. That still leaves you ~1bn transactions of headroom.
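
To make the headroom arithmetic concrete, here is a rough sketch in Python. It assumes only that XIDs are 32-bit and compared modulo 2^32, so that about 2^31 of them can be "in the past" at any time. The names XID_HORIZON and headroom are hypothetical, chosen for illustration, and the calculation ignores the small safety margin the server keeps before it stops handing out XIDs.

```python
# Illustrative arithmetic only; XID_HORIZON and headroom() are hypothetical
# names, not PostgreSQL identifiers.

# Assumption: 32-bit XIDs compared modulo 2^32, so roughly 2^31 XIDs
# can be "in the past" relative to the current one.
XID_HORIZON = 2**31


def headroom(autovacuum_freeze_max_age: int) -> int:
    """XIDs left between the forced-autovacuum threshold and the horizon."""
    if not 0 <= autovacuum_freeze_max_age <= XID_HORIZON:
        raise ValueError("age must lie between 0 and 2^31")
    return XID_HORIZON - autovacuum_freeze_max_age


# The default (200m) and the raised value (1bn) from the message above.
for age in (200_000_000, 1_000_000_000):
    print(f"{age:>13,} -> {headroom(age):,} XIDs of headroom")
```

With the setting at 1bn this gives 1,147,483,648, which matches the "~1bn" figure; at the default of 200m it gives 1,947,483,648.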

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com