Re: On-the-fly index tuple deletion vs. hot_standby - Mailing list pgsql-hackers

From Noah Misch
Subject Re: On-the-fly index tuple deletion vs. hot_standby
Date
Msg-id 20110616144746.GA13694@tornado.leadboat.com
In response to Re: On-the-fly index tuple deletion vs. hot_standby  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: On-the-fly index tuple deletion vs. hot_standby
List pgsql-hackers
On Thu, Jun 16, 2011 at 12:02:47AM +0100, Simon Riggs wrote:
> On Tue, Jun 14, 2011 at 5:28 AM, Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Jun 13, 2011 at 04:16:06PM +0100, Simon Riggs wrote:
> >> On Mon, Jun 13, 2011 at 3:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> > On Sun, Jun 12, 2011 at 3:01 PM, Noah Misch <noah@leadboat.com> wrote:
> >> >> Assuming that conclusion, I do think it's worth starting
> >> >> with something simple, even if it means additional bloat on the master in the
> >> >> wal_level=hot_standby + vacuum_defer_cleanup_age / hot_standby_feedback case.
> >> >> In choosing those settings, the administrator has taken constructive steps to
> > >> >> accept master-side bloat in exchange for delaying recovery conflict.  What's
> >> >> your opinion?
> >> >
> >> > I'm pretty disinclined to go tinkering with 9.1 at this point, too.
> >>
> >> Not least because a feature already exists in 9.1 to cope with this
> >> problem: hot standby feedback.
> >
> > A standby's receipt of an XLOG_BTREE_REUSE_PAGE record implies that the
> > accompanying latestRemovedXid preceded or equaled the master's RecentXmin at the
> > time of issue (see _bt_page_recyclable()).  Neither hot_standby_feedback nor
> > vacuum_defer_cleanup_age affect RecentXmin.  Therefore, neither facility delays
> > conflicts arising directly from B-tree page reuse.  See attached test script,
> > which yields a snapshot conflict despite active hot_standby_feedback.
> 
> OK, agreed. Bug. Good catch, Noah.
> 
> Fix is to use RecentGlobalXmin for the cutoff when in Hot Standby
> mode, so that it is under user control.
> 
> Attached patch will be applied to head and backpatched to 9.1 and 9.0
> to fix this.

Thanks.  We still hit a conflict when btpo.xact == RecentGlobalXmin and the
standby has a transaction older than any master transaction.  This happens
because the tests at nbtpage.c:704 and procarray.c:1843 both pass when the xid
is exactly that of the oldest standby transaction (line numbers as of git
cb94db91b).  I only know this because the test script from my last message hits
this case; it might never get hit in real usage.  Still, seems like a hole not
worth leaving.  I think the most-correct fix is to TransactionIdRetreat the
btpo.xact before using it as xl_btree_reuse_page.latestRemovedXid.  btpo.xact
is the first known-safe xid, but latestRemovedXid is the last known-unsafe xmin.
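
The off-by-one can be illustrated with a small standalone model.  This is a
sketch under assumptions, not the PostgreSQL source: it assumes the master-side
test amounts to "btpo.xact <= RecentGlobalXmin" and the standby-side test to
"snapshot xmin <= latestRemovedXid".  The helper names (xid_precedes_or_equals,
xid_retreat, page_recyclable, snapshot_conflicts) are hypothetical; xid_retreat
imitates what TransactionIdRetreat does, and the comparison ignores the special
xids that the real comparison functions handle separately.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Xids below this value are reserved (invalid, bootstrap, frozen). */
#define FirstNormalTransactionId ((TransactionId) 3)

/* Wraparound-aware "a <= b"; simplified to assume both xids are normal. */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/* Step back one xid, skipping the reserved xids (hypothetical helper). */
static TransactionId
xid_retreat(TransactionId xid)
{
    do
    {
        xid--;
    } while (xid < FirstNormalTransactionId);
    return xid;
}

/* Assumed master-side test: page is reusable once btpo.xact <= cutoff. */
static bool
page_recyclable(TransactionId btpo_xact, TransactionId recent_global_xmin)
{
    return xid_precedes_or_equals(btpo_xact, recent_global_xmin);
}

/* Assumed standby-side test: conflict when snapshot xmin <= latestRemovedXid. */
static bool
snapshot_conflicts(TransactionId snapshot_xmin, TransactionId latest_removed_xid)
{
    return xid_precedes_or_equals(snapshot_xmin, latest_removed_xid);
}
```

With btpo.xact == RecentGlobalXmin == 100 and an oldest standby snapshot xmin of
100, page_recyclable(100, 100) is true and snapshot_conflicts(100, 100) is also
true, which is the hole described above.  Passing xid_retreat(100) as the
latestRemovedXid instead makes the same snapshot pass without conflict, while a
snapshot with xmin 99 still conflicts.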

nm