Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
Date
Msg-id CABOikdP88gE9Ekrw0f3c5tMqnMDgp5qp583mSyFxpxk1=6at9A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers


On Mon, Feb 6, 2017 at 8:15 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 6, 2017 at 8:01 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
>
> On Mon, Feb 6, 2017 at 1:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>
>>
>> > 2. In the second patch, we tried to recompute attribute lists if a
>> > relcache
>> > flush happens in between and index list is invalidated. We've seen
>> > problems
>> > with that, especially it getting into an infinite loop with
>> > CACHE_CLOBBER_ALWAYS. Not clear why it happens, but apparently relcache
>> > flushes keep happening.
>>
>> Well, yeah: the point of CLOBBER_CACHE_ALWAYS is to make a relcache flush
>> happen whenever it possibly could.
>
>
> Ah, ok. That explains why the retry logic as originally proposed was causing
> infinite loop.
>
>>
>> The way to deal with that without
>> looping is to test whether the index set *actually* changed, not whether
>> it just might have changed.
>
>
> I like this approach. I'll run the patch on a build with
> CACHE_CLOBBER_ALWAYS, but I'm pretty sure it will be ok. I also like the
> fact that retry logic is now limited to only when index set changes, which
> fits in the situation we're dealing with.
>

Don't you think it also has the same problem as mentioned by me in
Alvaro's patch and you also agreed on it? 

No, I don't think so. There can't be any more relcache flushes occurring between the first RelationGetIndexAttrBitmap() and the subsequent RelationGetIndexAttrBitmap() calls. The problem with the earlier proposed approach was that we were not caching, yet returning stale information. That implied the next call will again recompute a possibly different value. The retry logic is trying to close a small race yet important race condition where index set invalidation happens, but we cache stale information.
 
Do you see any need to fix
it locally in
RelationGetIndexAttrBitmap(), why can't we clear at transaction end?


We're trying to fix it in RelationGetIndexAttrBitmap() because that's where the race exists. I'm not saying there aren't other and better ways of handling it, but we don't know (I've studied Andres's proposal yet).

Thanks,
Pavan

-- 
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Josh Soref
Date:
Subject: Re: [HACKERS] Possible spelling fixes
Next
From: Pavan Deolasee
Date:
Subject: Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY