Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CAH2-WzkfZWHq7f4UgeZRY6qM_0EmusRusq-T+t_e_--yfBzO+A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
List pgsql-hackers
On Sun, Mar 19, 2017 at 12:15 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>> It seems like an important invariant for WARM is that any duplicate
>> index values ought to have different TIDs (actually, it's a bit
>> stricter than that, since btrecheck() cares about simple binary
>> equality).
>
> Yes. I think in the current code, indexes can never duplicate TIDs (at least
> for btrees and hash). With WARM, indexes can have duplicate TIDs, but iff
> index values differ. In addition there can only be one more duplicate and
> one of them must be a Blue pointer (or a non-WARM pointer if we accept the
> new nomenclature proposed a few mins back).

It looks like those additional Red/Blue details are available right
from the IndexTuple, which makes the check a good fit for amcheck (no
need to bring the heap into it).

>> You wouldn't have to teach amcheck about the heap, because a TID that
>> points to the heap can only be duplicated within a B-Tree index
>> because of WARM. So, if we find that two adjacent tuples are equal,
>> check if the TIDs are equal. If they are also equal, check for strict
>> binary equality. If strict binary equality is indicated, throw an
>> error due to invariant failing.
>>
>
> Wouldn't this be much more expensive for non-unique indexes?

Only in the worst case, where there are many many duplicates, and only
if you insisted on being completely comprehensive, rather than merely
very comprehensive. That is, you can store the duplicate TIDs in local
memory up to a quasi-arbitrary budget, since you do have to make sure
that any local buffer cannot grow in an unbounded fashion. Certainly,
if you stored 10,000 TIDs, there is always going to be a theoretical
case where that wasn't enough. But you can always say something like
that. We are defending against Murphy here, not Machiavelli.

You're going to have to qsort() a particular value's duplicate TIDs
once you encounter a distinct value, and therefore need to evaluate
the invariant. That's not a big deal, because sorting less than 1,000
items is generally very fast. It's well worth it. I'd probably choose
a generic budget for storing TIDs in local memory, and throw out half
of the TIDs when that budget is exceeded.

I see no difficulty with race conditions when you have only an
AccessShareLock on target. Concurrent page splits won't hurt, because
you reliably skip over those by always moving right. I'm pretty sure
that VACUUM killing IndexTuples that you've already stored with the
intention of sorting later is also not a complicating factor, since
you know that the heap TIDs that are WARM root pointers are not going
to be recycled in the lifetime of the amcheck query such that you get
a false positive.

A WARM check seems like a neat adjunct to what amcheck does already.
It seems like a really good idea for WARM to buy into this kind of
verification. It is, at worst, cheap insurance.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Todd Sedano
Date:
Subject: [HACKERS] [PATCH] Removes uninitialized variable compiler warning
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables