On Thu, Jan 10, 2013 at 02:45:36AM +0000, Simon Riggs wrote:
> On 8 January 2013 02:49, Noah Misch <noah@leadboat.com> wrote:
> > There is a bug in lazy_scan_heap()'s
> > bookkeeping for the xid to place in that WAL record. Each call to
> > heap_page_prune() simply overwrites vacrelstats->latestRemovedXid, but
> > lazy_scan_heap() expects it to only ever increase the value. I have
> > attached a minimal fix to be backpatched. It has lazy_scan_heap() ignore
> > heap_page_prune()'s actions for the purpose of this conflict xid, because
> > heap_page_prune() emitted an XLOG_HEAP2_CLEAN record covering them.
>
> Interesting. Yes, a bug, and one of mine also.
>
> ISTM the right fix is to correctly initialize on pruneheap.c line 176
> prstate.latestRemovedXid = *latestRemovedXid;
> better to make it work than to just leave stuff hanging.

That works, too.

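For illustration, the difference between overwriting and advancing the conflict xid can be sketched as a standalone model. This is not the pruneheap.c code: PruneState, advance(), prune_page() and simulate_vacuum() are hypothetical stand-ins, the xid values are made up, and the wraparound comparison is simplified.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Simplified modulo-2^32 "a is newer than b"; special xids are ignored. */
static int
xid_follows(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) > 0;
}

/* Hypothetical stand-in for the per-page prune state. */
typedef struct
{
    TransactionId latestRemovedXid;
} PruneState;

/* Only ever moves the tracked xid forward. */
static void
advance(PruneState *st, TransactionId xid)
{
    if (st->latestRemovedXid == InvalidTransactionId ||
        xid_follows(xid, st->latestRemovedXid))
        st->latestRemovedXid = xid;
}

/*
 * Model of one page prune.  With seed_from_caller == 0 the state starts at
 * InvalidTransactionId, so the final store overwrites the caller's value
 * (the reported bug).  With seed_from_caller != 0 the state starts from the
 * caller's value, as in the suggested initialization.
 */
static void
prune_page(const TransactionId *removed, int n,
           TransactionId *latestRemovedXid, int seed_from_caller)
{
    TransactionId before = *latestRemovedXid;
    PruneState  st;
    int         i;

    st.latestRemovedXid = seed_from_caller ? before : InvalidTransactionId;
    for (i = 0; i < n; i++)
        advance(&st, removed[i]);
    *latestRemovedXid = st.latestRemovedXid;

    /* When seeded, the caller's value must never move backwards. */
    if (seed_from_caller && before != InvalidTransactionId)
        assert(!xid_follows(before, *latestRemovedXid));
}

/* Two pages: the first removes xids 100 and 120, the second removes 90. */
static TransactionId
simulate_vacuum(int seed_from_caller)
{
    const TransactionId page1[] = {100, 120};
    const TransactionId page2[] = {90};
    TransactionId latest = InvalidTransactionId;

    prune_page(page1, 2, &latest, seed_from_caller);
    prune_page(page2, 1, &latest, seed_from_caller);
    return latest;
}
```

In this model, simulate_vacuum(0) ends with the second page's older xid, while simulate_vacuum(1) keeps the newest xid seen across both pages.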
> I very much like your patch to remove all that cruft altogether; good
> riddance. I think you're missing removing a few calls to
> HeapTupleHeaderAdvanceLatestRemovedXid(), perhaps even that routine as
> well.

Thanks. Did you have a particular HeapTupleHeaderAdvanceLatestRemovedXid()
call site in mind? The three calls remaining in the tree look right to me.

> Not sure about the skipping WAL records and share locking part, that's
> too radical for me.

Certainly a fair point of discussion. In particular, using a plain exclusive
lock wouldn't be much worse.