Re: visibility map corruption - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: visibility map corruption |
Date | |
Msg-id | CAH2-WznU9L5K8PFAKdaPuFgPB9wUWq6Ps_OQm=KNPgR+Rkxk4A@mail.gmail.com |
In response to | Re: visibility map corruption (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: visibility map corruption
|
List | pgsql-hackers |
On Fri, Jul 23, 2021 at 5:08 PM Bruce Momjian <bruce@momjian.us> wrote:
> However, I am now stuck on the commit message text, and I think this is
> the point Peter Geoghegan was trying to make earlier --- while we know
> that preserving the oldest xid in pg_control is the right thing to do,
> and that setting it to the current xid - 2 billion (the old behavior)
> causes vacuum freeze to run on all tables, but what else does this patch
> affect?

As far as I know the only other thing that it might affect is the
traditional use of pg_resetwal: recovering likely-corrupt data.
Getting the database to limp along for long enough to pg_dump. That is
the only interpretation that makes sense, because the code in question
predates pg_upgrade.

AFAICT that was the original spirit of the code that we're changing here.

> As far as I know, seeing a very low oldest xid causes autovacuum to
> check all objects and make sure their relfrozenxid is less than
> autovacuum_freeze_max_age, but isn't that just a check? Would that
> cause any table scans? I would think not. And would this cause
> incorrect truncation of pg_xact or fsm or vm files? I would think not
> too.

Tom actually wrote this code. I believe that he questioned the whole
basis of it himself quite recently.

Whether or not it's okay to change the behavior in contexts outside of
pg_upgrade (contexts where the user invokes pg_resetwal -x to get the
system to start) is perhaps debatable. It probably doesn't matter very
much if you preserve that behavior for non-pg_upgrade cases -- hard to
say. At the same time it's now easy to see that pg_upgrade shouldn't
be doing this.

> Even if the old and new cluster had mismatched autovacuum_freeze_max_age
> values, I don't see how that would cause any corruption either.

Sometimes the pg_control value for oldest XID is used as the oldest
non-frozen XID that's expected in the table. Other times it's
relfrozenxid itself IIRC.

> I could perhaps see corruption happening if pg_control's oldest xid
> value was closer to the current xid value than it should be, but I can't
> see how having it 2-billion away could cause harm, unless perhaps
> pg_upgrade itself used enough xids to cause the counter to wrap more
> than 2^31 away from the oldest xid recorded in pg_control.
>
> What I am basically asking is how to document this and what it fixes.

ISTM that this is a little like commits 78db307bb2 and a61daa14. Maybe
take a look at those?

--
Peter Geoghegan
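
For readers following the "current xid - 2 billion" and "2^31 away" arithmetic in the quoted text, here is a minimal sketch in Python. It is not PostgreSQL source code. It assumes 32-bit XIDs compared circularly (a signed 32-bit difference) and that the first three XID values are reserved; the function names `xid_precedes` and `old_reset_oldest_xid` are hypothetical and exist only for this illustration.

```python
XID_MASK = 0xFFFFFFFF      # XIDs are treated as unsigned 32-bit counters
HALF_RANGE = 1 << 31       # 2^31, the circular comparison horizon
FIRST_NORMAL_XID = 3       # assumption: values below this are reserved


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    Emulates a signed 32-bit difference: a precedes b when (a - b),
    taken modulo 2^32, falls in the upper half of the range.
    """
    return ((a - b) & XID_MASK) >= HALF_RANGE


def old_reset_oldest_xid(next_xid: int) -> int:
    """Sketch of the old behavior: oldest xid = current xid - 2 billion."""
    oldest = (next_xid - 2_000_000_000) & XID_MASK
    if oldest < FIRST_NORMAL_XID:
        # Step past the reserved values (illustrative assumption).
        oldest += FIRST_NORMAL_XID
    return oldest


# Headroom left before the guessed oldest xid stops looking "older":
# 2^31 - 2 billion XIDs.
HEADROOM = HALF_RANGE - 2_000_000_000

next_xid = 1000
oldest = old_reset_oldest_xid(next_xid)
print(oldest, xid_precedes(oldest, next_xid), HEADROOM)
```

Under these assumptions the guessed oldest xid compares as older than the current xid only while fewer than about 147 million further XIDs are consumed, which is the window the quoted "wrap more than 2^31 away" remark refers to.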