Re: Incomplete freezing when truncating a relation during vacuum - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Incomplete freezing when truncating a relation during vacuum
Date
Msg-id 20131127231802.GA1088462@tornado.leadboat.com
Whole thread Raw
In response to Re: Incomplete freezing when truncating a relation during vacuum  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Incomplete freezing when truncating a relation during vacuum  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Wed, Nov 27, 2013 at 10:43:05PM +0100, Andres Freund wrote:
> On 2013-11-27 14:53:27 -0500, Noah Misch wrote:
> > How would you characterize the chances of this happening with default
> > *vacuum_freeze_*_age settings?  Offhand, it seems you would need to encounter
> > this bug during each of ~10 generations of autovacuum_freeze_max_age before
> > the old rows actually become invisible.
> 
> I think realistically, to actually trigger the bug, it needs to happen
> quite a bit more often. But in some workloads it's pretty darn easy to
> hit. E.g. if significant parts of the table are regularly deleted, lots,
> if not most, of your vacuums will spuriously increase relfrozenxid above
> the actual value. Each time only by a small amount, but due to that
> small increase there never will be an actual full table vacuum since
> freeze_table_age will never even remotely be reached.

That makes sense.

> > Maximizing detection is valuable, and the prognosis for automated repair is
> > poor.  I would want a way to extract tuples having xmin outside the range of
> > CLOG that are marked HEAP_XMIN_COMMITTED or appear on an all-visible
> > page.
> 
> I think the likelihood of the problem affecting !all-visible pages is
> close to zero. Each vacuum will try to clean those, so they surely will
> get vacuumed at some point. I think the only way that could happen is if
> the ConditionalLockBufferForCleanup() fails in each vacuum. And that
> seems a bit unlikely.

The page could have sat all-visible (through multiple XID epochs, let's say)
until a recent UPDATE.

> > At first, I supposed we could offer a tool to blindly freeze such tuples.
> > However, there's no guarantee that they are in harmony with recent changes to
> > the database; transactions that wrongly considered those tuples invisible may
> > have made decisions incompatible with their existence.  For example, reviving
> > such a tuple could violate a UNIQUE constraint if the user had already
> > replaced the missing row manually.
> 
> Good point, although since they are all on all-visible pages sequential
> scans will currently already find those. It's primarily index scans that
> won't. So it's not really reviving them...

True.  Since a dump/reload of the database would already get the duplicate key
violation, the revival is not making anything clearly worse.  And if we hope
for manual repair, many DBAs just won't do that at all.

> The primary reason why I think it might be a good idea to "revive"
> automatically is, that an eventual full-table/freeze vacuum will
> currently delete them which seems bad.

Will it?  When the page became all-visible, the tuples were all hinted.  They
will never be considered dead.  Every 2B transactions, they will alternate
between live and not-yet-committed.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Modify the DECLARE CURSOR command tag depending on the scrollable flag
Next
From: Andres Freund
Date:
Subject: Re: MultiXact bugs