Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY - Mailing list pgsql-hackers

From Keith Fiske
Subject Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
Date
Msg-id CAG1_KcC8EYd0M00WXtTVmAF495NKbu+1e0TZn=RBF2KXAKcy9Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers

On Mon, Feb 6, 2017 at 10:17 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 6, 2017 at 10:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> Hmm.  Consider that the first time relcahe invalidation occurs while
>> computing id_attrs, so now the retry logic will compute the correct
>> set of attrs (considering two indexes, if we take the example given by
>> Alvaro above.).  However, the attrs computed for hot_* and key_* will
>> be using only one index, so this will lead to a situation where two of
>> the attrs (hot_attrs and key_attrs) are computed with one index and
>> id_attrs is computed with two indexes. I think this can lead to
>> Assertion in HeapSatisfiesHOTandKeyUpdate().
>
> That seems like something we'd be best off to fix by changing the
> assertion.
>

The above-mentioned problem doesn't exist in your version of the patch
(which is now committed) because the index attrs are cached after
invalidation and I have mentioned the same in my yesterday's followup
mail.   The guarantee of safety is that we don't handle invalidation
between two consecutive calls to RelationGetIndexAttrBitmap() in
heap_update, but if we do handle due to any reason which is not
apparent to me, then I think there is some chance of hitting the
assertion.

>  I doubt it's really going to be practical to guarantee that
> those bitmapsets are always 100% consistent throughout a transaction.
>

Agreed.  As the code stands, I think it is only expected to be
guaranteed across three consecutive calls to
RelationGetIndexAttrBitmap() in heap_update. Now, if the assertion in
HeapSatisfiesHOTandKeyUpdate() is useless and we can remove it, then
probably we don't need a guarantee to be consistent in three
consecutive calls as well.

>> Okay, please find attached patch which is an extension of Tom's and
>> Andres's patch and I think it fixes the above problem noted by me.
>
> I don't like this patch at all.  It only fixes the above issue if the
> relcache inval arrives while RelationGetIndexAttrBitmap is executing.
> If the inval arrives between two such calls, you still have the problem
> of the second one delivering a bitmapset that might be inconsistent
> with the first one.
>

I think it won't happen between two consecutive calls to
RelationGetIndexAttrBitmap in heap_update.  It can happen during
multi-row update, but that should be fine.  I think this version of
patch only defers the creation of fresh bitmapsets where ever
possible.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Was just curious if anyone was able to come up with any sort of method to test whether an index was corrupted by this bug, other than just waiting for bad query results? We've used concurrent index rebuilding quite extensively over the years to remove bloat from busy systems, but reindexing the entire database "just in case" is unrealistic in many of our cases.

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] pg_recvlogical.c doesn't build with--disable-integer-datetimes
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Instability in select_parallel regression test