Home > mailing lists

Re: [HACKERS] A design for amcheck heapam verification - Mailing list pgsql-hackers

From	Pavan Deolasee
Subject	Re: [HACKERS] A design for amcheck heapam verification
Date	March 27, 2018 19:48:08
Msg-id	CABOikdMDr4r_MWwRSKhSZxNuQKi0JTHX6ANGWtAZMkZEDKBkEw@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] A design for amcheck heapam verification (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: [HACKERS] A design for amcheck heapam verification (Peter Geoghegan <pg@bowt.ie>)
List	pgsql-hackers

Tree view

On Tue, Mar 27, 2018 at 8:50 AM, Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Mar 23, 2018 at 7:13 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> I've just flipped patch to WoA. But if above issues will be fixed I think that patch is ready for committer.

Attached is v7, which has the small tweaks that you suggested.

Thank you for the review. I hope that this can be committed shortly.

Sorry for coming a bit too late on this thread, but I started looking at 0002 patch.

+ * When index-to-heap verification is requested, a Bloom filter is used to

+ * fingerprint all tuples in the target index, as the index is traversed to

+ * verify its structure. A heap scan later verifies the presence in the heap

+ * of all index tuples fingerprinted within the Bloom filter.

+ *

Is that correct? Aren't we actually verifying the presence in the index of all

heap tuples?

@@ -116,37 +140,47 @@ static inline bool invariant_leq_nontarget_offset(BtreeCheckState *state,

static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);

- * bt_index_check(index regclass)

+ * bt_index_check(index regclass, heapallindexed boolean)

Can we come up with a better name for heapallindexed? May be "check_heapindexed"?

+ /*

+ * Register our own snapshot in !readonly case, rather than asking

+ * IndexBuildHeapScan() to do this for us later. This needs to happen

+ * before index fingerprinting begins, so we can later be certain that

+ * index fingerprinting should have reached all tuples returned by

+ * IndexBuildHeapScan().

+ */

+ if (!state->readonly)

+ snapshot = RegisterSnapshot(GetTransactionSnapshot());

+ }

So this looks safe. In !readonly mode, we take the snapshot *before*

fingerprinting the index. Since we're using MVCC snapshot, any tuple which is

visible to heapscan must be reachable via indexscan too. So we must find the

index entry for every heap tuple returned by the scan.

What I am not sure about is whether we can really examine an index which is

valid but whose indcheckxmin hasn't crossed our xmin horizon? Notice that this

amcheck might be running in a transaction block, probably in a repeatable-read

isolation level and hence GetTransactionSnapshot() might actually return a

snapshot which can't yet read the index consistently. In practice, this is

quite unlikely, but I think we should check for that case if we agree that it

could be a problem.

The case with readonly mode is also interesting. Since we're scanning heap with

SnapshotAny, heapscan may return us tuples which are RECENTLY_DEAD. So the

question is: can this happen?

- some concurrent index scan sees a heaptuple as DEAD and marks the index

entry as LP_DEAD

- our index fingerprinting sees index tuple as LP_DEAD

- our heap scan still sees the heaptuple as RECENTLY_DEAD

Now that looks improbable given that we compute OldestXmin after our index

fingerprinting was done i.e between step 2 and 3 and hence if a tuple looked

DEAD to some OldestXmin/RecentGlobalXmin computed before we computed our

OldestXmin, then surely our OldestXmin should also see the tuple DEAD. Or is

there a corner case that we're missing?

Are there any interesting cases around INSERT_IN_PROGRESS/DELETE_IN_PROGRESS

tuples, especially if those tuples were inserted/deleted by our own

transaction? It probably worth thinking.

Apart from that, comments in IndexBuildHeapRangeScan() claim that the function

is called only with ShareLock and above, which is no longer true. We should

check if that has any side-effects. I can't think of any, but better to verify

and update the comments to reflect new reality,

The partial indexes look fine since the non-interesting tuples never get called

back.

One thing that worth documenting/noting is the fact that a !readonly check will

run with a long-duration registered snapshot, thus holding OldestXmin back. Is

there anything we can do to lessen that burden like telling other backends to

ignore our xmin while computing OldestXmin (like vacuum does)?

Thanks,

Pavan

Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

From: Teodor Sigaev
Date: 27 March 2018, 19:34:24
Subject: Re: PATCH: Exclude temp relations from base backup

From: "Jonathan S. Katz"
Date: 27 March 2018, 19:51:34
Subject: PostgreSQL 11 Release Management Team & Feature Freeze

Re: [HACKERS] A design for amcheck heapam verification - Mailing list pgsql-hackers

Previous

Next