Re: Incomplete freezing when truncating a relation during vacuum - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Incomplete freezing when truncating a relation during vacuum |
Date | |
Msg-id | 5295DE0A.6070909@vmware.com Whole thread Raw |
In response to | Re: Incomplete freezing when truncating a relation during vacuum (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: Incomplete freezing when truncating a relation during
vacuum
|
List | pgsql-hackers |
On 11/27/13 11:15, Andres Freund wrote: > On 2013-11-27 11:01:55 +0200, Heikki Linnakangas wrote: >> On 11/27/13 01:21, Andres Freund wrote: >>> On 2013-11-26 13:32:44 +0100, Andres Freund wrote: >>>> This seems to be the case since >>>> b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using >>>> scan_all to determine whether we can set new_frozen_xid. That's a slight >>>> pessimization when we scan a relation fully without explicitly scanning >>>> it in its entirety, but given this isn't the first bug around >>>> scanned_pages/rel_pages I'd rather go that way. The aforementioned >>>> commit wasn't primarily concerned with that. >>>> Alternatively we could just compute new_frozen_xid et al before the >>>> lazy_truncate_heap. >>> >>> I've gone for the latter in this preliminary patch. Not increasing >>> relfrozenxid after an initial data load seems like a bit of a shame. >>> >>> I wonder if we should just do scan_all || vacrelstats->scanned_pages < >>> vacrelstats->rel_pages? >> >> Hmm, you did (scan_all || vacrelstats->scanned_pages < >> vacrelstats->rel_pages) for relminmxid, and just (vacrelstats->scanned_pages >> < vacrelstats->rel_pages) for relfrozenxid. That was probably not what you >> meant to do, the thing you did for relfrozenxid looks good to me. > > I said it's a preliminary patch ;), really, I wasn't sure what of both > to go for. > >> Does the attached look correct to you? > > Looks good. Ok, committed and backpatched that. > I wonder if we need to integrate any mitigating logic? Currently the > corruption may only become apparent long after it occurred, that's > pretty bad. And instructing people run a vacuum after the ugprade will > cause the corrupted data being lost if they are already 2^31 xids. Ugh :-(. Running vacuum after the upgrade is the right thing to do to prevent further damage, but you're right that it will cause any already-wrapped around data to be lost forever. Nasty. > But integrating logic to fix things into heap_page_prune() looks > somewhat ugly as well. I think any mitigating logic we might add should go into vacuum. It should be possible for a DBA to run a command, and after it's finished, be confident that you're safe. That means vacuum. > Afaics the likelihood of the issue occuring on non-all-visible pages is > pretty low, since they'd need to be skipped due to lock contention > repeatedly. Hmm. If a page has its visibility-map flag set, but contains a tuple that appears to be dead because you've wrapped around, vacuum will give a warning: "page containing dead tuples is marked as all-visible in relation \"%s\" page %u". So I think if a manual VACUUM FREEZE passes without giving that warning, that vacuum hasn't destroyed any data. So we could advise to take a physical backup of the data directory, and run a manual VACUUM FREEZE on all databases. If it doesn't give a warning, you're safe from that point onwards. If it does, you'll want to recover from an older backup, or try to manually salvage just the lost rows from the backup, and re-index. Ugly, but hopefully rare in practice. Unfortunately that doesn't mean that you haven't already lost some data. Wrap-around could've already happened, and vacuum might already have run and removed some rows. You'll want to examine your logs and grep for that warning. - Heikki
pgsql-hackers by date: