Re: Incomplete freezing when truncating a relation during vacuum - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Incomplete freezing when truncating a relation during vacuum
Date
Msg-id 20131127091510.GC28863@alap2.anarazel.de
In response to Re: Incomplete freezing when truncating a relation during vacuum  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Incomplete freezing when truncating a relation during vacuum  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 2013-11-27 11:01:55 +0200, Heikki Linnakangas wrote:
> On 11/27/13 01:21, Andres Freund wrote:
> >On 2013-11-26 13:32:44 +0100, Andres Freund wrote:
> >>This seems to be the case since
> >>b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
> >>scan_all to determine whether we can set new_frozen_xid. That's a slight
> >>pessimization when we scan a relation fully without explicitly scanning
> >>it in its entirety, but given this isn't the first bug around
> >>scanned_pages/rel_pages I'd rather go that way. The aforementioned
> >>commit wasn't primarily concerned with that.
> >>Alternatively we could just compute new_frozen_xid et al before the
> >>lazy_truncate_heap.
> >
> >I've gone for the latter in this preliminary patch. Not increasing
> >relfrozenxid after an initial data load seems like a bit of a shame.
> >
> >I wonder if we should just do scan_all || vacrelstats->scanned_pages <
> >vacrelstats->rel_pages?
> 
> Hmm, you did (scan_all || vacrelstats->scanned_pages <
> vacrelstats->rel_pages) for relminmxid, and just (vacrelstats->scanned_pages
> < vacrelstats->rel_pages) for relfrozenxid. That was probably not what you
> meant to do, the thing you did for relfrozenxid looks good to me.

I said it's a preliminary patch ;) Really, I wasn't sure which of the
two to go for.

> Does the attached look correct to you?

Looks good.
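
To spell out why the ordering relative to lazy_truncate_heap matters, here is a minimal sketch. It is not the patch itself: the struct and all function names are hypothetical simplifications of the vacuum bookkeeping, and the only thing modelled is that truncation shrinks rel_pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the vacuum statistics. */
typedef struct VacStatsSketch
{
    uint32_t rel_pages;     /* pages currently in the relation */
    uint32_t scanned_pages; /* pages this vacuum actually scanned */
} VacStatsSketch;

/* relfrozenxid may only advance if no page was skipped. */
static bool
may_advance_frozen_xid(const VacStatsSketch *s)
{
    return s->scanned_pages >= s->rel_pages;
}

/* Models truncation: empty tail pages go away, so rel_pages shrinks. */
static void
truncate_sketch(VacStatsSketch *s, uint32_t new_rel_pages)
{
    if (new_rel_pages < s->rel_pages)
        s->rel_pages = new_rel_pages;
}

/* Problematic order: truncate first, then decide. */
static bool
decide_after_truncate(VacStatsSketch s, uint32_t new_rel_pages)
{
    truncate_sketch(&s, new_rel_pages);
    return may_advance_frozen_xid(&s);
}

/* Order discussed in this thread: decide first, then truncate. */
static bool
decide_before_truncate(VacStatsSketch s, uint32_t new_rel_pages)
{
    bool ok = may_advance_frozen_xid(&s);

    truncate_sketch(&s, new_rel_pages);
    return ok;
}

/* 100 pages, 60 scanned (40 skipped), then truncated down to 50 pages. */
void
ordering_sketch_self_check(void)
{
    VacStatsSketch s = {100, 60};

    assert(decide_after_truncate(s, 50));   /* wrongly allows advancing */
    assert(!decide_before_truncate(s, 50)); /* correctly refuses */
}
```

In this toy model the skipped pages are hidden once rel_pages drops below scanned_pages, which is the kind of mix-up the scanned_pages < rel_pages test is exposed to when it runs after the truncation.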

I wonder if we need to integrate any mitigating logic? Currently the
corruption may only become apparent long after it occurred, which is
pretty bad. And instructing people to run a vacuum after the upgrade will
cause the corrupted data to be lost if it is already 2^31 xids in the
past. But integrating logic to fix things into heap_page_prune() looks
somewhat ugly as well.

Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.
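
For reference, the 2^31 limit mentioned above comes from comparing xids modulo 2^32. The following is a minimal sketch of that comparison, assuming only normal xids (the special permanent xids are ignored); XidSketch and both function names are hypothetical and stand in for the real TransactionId machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t XidSketch; /* hypothetical stand-in for TransactionId */

/*
 * Circular comparison: a precedes b if the signed 32-bit difference is
 * negative. Only meaningful while a and b are less than 2^31 apart.
 */
static bool
xid_precedes_sketch(XidSketch a, XidSketch b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}

void
xid_wraparound_self_check(void)
{
    XidSketch tuple_xmin = 1000;

    /* 2 billion xids later the unfrozen tuple still looks old. */
    assert(xid_precedes_sketch(tuple_xmin, tuple_xmin + 2000000000u));

    /* Past 2^31 xids it suddenly appears to be in the future. */
    assert(!xid_precedes_sketch(tuple_xmin, tuple_xmin + 2200000000u));
}
```

Once an unfrozen xmin is more than 2^31 xids behind, it compares as being in the future, so the tuple is no longer treated as an old committed row.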

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services