Re: [HACKERS] A design for amcheck heapam verification - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] A design for amcheck heapam verification
Msg-id CAH2-Wz=JTEtU4n26LyHZGgW_X1u+KM2vgnXGTj28sKPW3WAfUw@mail.gmail.com
In response to Re: [HACKERS] A design for amcheck heapam verification  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] A design for amcheck heapam verification  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Mon, May 1, 2017 at 2:10 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> Actually, I guess amcheck would need to use its own scan's snapshot
> xmin instead. This is true because it cares about visibility in a way
> that's "backwards" relative to existing code that tests something
> against RecentGlobalXmin. Is there any existing thing that works that
> way?

Looks like pg_visibility has a similar set of concerns, and so
sometimes calls GetOldestXmin() to "recompute" what it calls
OldestXmin (which I gather is like RecentGlobalXmin, but comes from
calling GetOldestXmin() at least once). This happens within
pg_visibility's collect_corrupt_items(). So, I could either follow
that approach, or, more conservatively, call GetOldestXmin()
immediately after each "amcheck whole index scan" finishes, for use
later on, when we go to the heap. Within the heap, we expect that any
committed tuple whose xmin precedes FooIndex.OldestXmin should be
present in that index's bloom filter. Of course, when there are
multiple indexes, we might only arrive at the heap much later. (I
guess we'd also want to check whether the MVCC snapshot's xmin
precedes FooIndex.OldestXmin, and use the snapshot's xmin as
FooIndex.OldestXmin when it does.)
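
To make the bookkeeping concrete, here is a rough standalone sketch of
the per-index cutoff described above. It is not amcheck code: Xid,
IndexCheckState, xid_precedes(), index_scan_finished() and
tuple_must_be_in_index() are all hypothetical names, the comparison
only imitates modulo-2^32 XID ordering and ignores special XIDs, and
the oldest_xmin_now argument stands in for whatever GetOldestXmin()
would return at that moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TransactionId; not the real PostgreSQL type. */
typedef uint32_t Xid;

/*
 * Wraparound-aware "a precedes b", modeled on modulo-2^32 XID comparison.
 * Assumption: both arguments are normal XIDs (special XIDs are ignored).
 */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical per-index state ("FooIndex" in the text above). */
typedef struct IndexCheckState
{
    Xid oldest_xmin;    /* cutoff captured when this index's scan finished */
} IndexCheckState;

/*
 * Called once, right after the whole-index scan of one index finishes.
 * oldest_xmin_now is assumed to be the value GetOldestXmin() would give
 * at that point. If our own snapshot's xmin is older, use that instead.
 */
static void
index_scan_finished(IndexCheckState *st, Xid oldest_xmin_now, Xid snapshot_xmin)
{
    assert(st != NULL);

    st->oldest_xmin = oldest_xmin_now;
    if (xid_precedes(snapshot_xmin, st->oldest_xmin))
        st->oldest_xmin = snapshot_xmin;
}

/*
 * Later, in the heap: a committed tuple whose xmin precedes the index's
 * cutoff is expected to be found in that index's bloom filter.
 */
static bool
tuple_must_be_in_index(const IndexCheckState *st, Xid tuple_xmin,
                       bool xmin_committed)
{
    assert(st != NULL);

    return xmin_committed && xid_precedes(tuple_xmin, st->oldest_xmin);
}
```

The point of keeping one cutoff per index is that the heap pass may
start long after a given index was scanned, so each index needs its
own record of what it could have been expected to contain.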

Anyone have an opinion on any of this? Offhand, I think that calling
GetOldestXmin() once per index when its "amcheck whole index scan"
finishes would be safe, and yet provide appreciably better test
coverage than only expecting things visible to our original MVCC
snapshot to be present in the index. I don't see a great reason to be
more aggressive and call GetOldestXmin() more often than once per
whole index scan, though.

--
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/


