Re: B-tree page deletion boundary cases - Mailing list pgsql-hackers

From Noah Misch
Subject Re: B-tree page deletion boundary cases
Date
Msg-id 20120424130052.GC20512@tornado.leadboat.com
Whole thread Raw
In response to Re: B-tree page deletion boundary cases  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Tue, Apr 24, 2012 at 09:08:36AM +0100, Simon Riggs wrote:
> On Sat, Apr 21, 2012 at 5:52 PM, Noah Misch <noah@leadboat.com> wrote:
> > The fix is to compare the stored XID to RecentGlobalXmin, not RecentXmin. ?We
> > already use RecentGlobalXmin when wal_level = hot_standby. ?If no running
> > transaction has an XID and all running transactions began since the last
> > transaction that did bear an XID, RecentGlobalXmin == ReadNewTransactionId().
> > Therefore, the correct test is btpo.xact < RecentGlobalXmin, not btpo.xact <=
> > RecentGlobalXmin as we have today. ?This also cleanly removes the need for the
> > bit of code in _bt_getbuf() that decrements btpo.xact before sending it down
> > for ResolveRecoveryConflictWithSnapshot(). ?I suggested[2] that decrement on
> > an unprincipled basis; it was just masking the off-by-one of using "<=
> > RecentGlobalXmin" instead of "< RecentGlobalXmin" in _bt_page_recyclable().
> 
> Looks like the right fix. I'll apply this to 9.0/9.1/HEAD.

Thanks.  8.3 and 8.4 need it, too, albeit simplified due to some of the
affected code not existing yet.

> > This change makes empty B-tree pages wait through two generations of running
> > transactions before reuse, so some additional bloat will arise.
> 
> Could arise, in some circumstances. But that assumes VACUUMs are
> fairly frequent and that they would be delayed/rendered less effective
> by this, which I don't think will be the case.
> 
> I note that we don't take any account of the number of pages that may
> be reused when we VACUUM, so when HOT avoids a VACUUM we may
> accumulate pages for a longer period. Looks like there is more work to
> be done yet in cleaning indexes.

Agreed.  Taking one VACUUM visit to mark the page BTP_DELETED and another to
add it to the FSM is fairly pessimistic; perhaps we can do better.

nm


pgsql-hackers by date:

Previous
From: "Albe Laurenz"
Date:
Subject: Re: Foreign table scan estimates
Next
From: Robert Haas
Date:
Subject: Re: DOMAINs and CASTs