Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
Date
Msg-id CAA4eK1JoPKOgAO+aq6XEmgGi+3vj64_d_RkeNj9JfOYEC=J0mA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
List pgsql-hackers
On Mon, Feb 6, 2017 at 6:27 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2017-02-05 16:37:33 -0800, Andres Freund wrote:
>> >  RelationGetIndexList(Relation relation)
>> > @@ -4746,8 +4747,10 @@ RelationGetIndexPredicate(Relation relat
>> >   * we can include system attributes (e.g., OID) in the bitmap representation.
>> >   *
>> >   * Caller had better hold at least RowExclusiveLock on the target relation
>> > - * to ensure that it has a stable set of indexes.  This also makes it safe
>> > - * (deadlock-free) for us to take locks on the relation's indexes.
>> > + * to ensure it is safe (deadlock-free) for us to take locks on the relation's
>> > + * indexes.  Note that since the introduction of CREATE INDEX CONCURRENTLY,
>> > + * that lock level doesn't guarantee a stable set of indexes, so we have to
>> > + * be prepared to retry here in case of a change in the set of indexes.
>>
>> I've not yet read the full thread, but I'm a bit confused so far. We
>> obviously can get changing information about indexes here, but isn't
>> that something we have to deal with anyway?  If we guarantee that we
>> don't loose knowledge that there's a pending invalidation, why do we
>> have to retry?  Pretty much by definition the knowledge in a relcache
>> entry can be outdated as soon as returned unless locking prevents that
>> from being possible - which is not the case here.
>>
>> ISTM it'd be better not to have retry logic here, but to follow the more
>> general pattern of making sure that we know whether the information
>> needs to be recomputed at the next access.  We could either do that by
>> having an additional bit of information about the validity of the
>> bitmaps (so we have invalid, building, valid - and only set valid at the
>> end of computing the bitmaps when still building and not invalid again),
>> or we simply move the bitmap computation into the normal relcache build.
>
> To show what I mean here's an *unpolished* and *barely tested* patch
> implementing the first of my suggestions.
>

+ * Signal that the attribute bitmaps are being built. If there's any
+ * relcache invalidations while building them, rd_bitmapsvalid will be
+ * reset to 0.  In that case return the bitmaps, but don't mark them as
+ * valid.
+ */

I don't see in your patch that you are setting rd_bitmapsvalid to 0.
Also, I think this suffers from the same problem as the patch proposed
by Alvaro which is that different attrs (hot_attrs, key_attrs and
id_attrs) will get computed based on different index lists which can
cause a problem as mentioned in my mail up thread.  However, I think
your general approach and idea that we have to set the things up for
next time is on spot. I am not sure but I think your solution can be
extended to make it somewhat similar to what we do with rd_indexvalid
(basically, we should set rd_bitmapsvalid to 2 (bitmap is temporarily
forced) at rel cache invalidation and then clean it up transaction
end).  I can try something on those lines.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Pavan Deolasee
Date:
Subject: Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
Next
From: Pavan Deolasee
Date:
Subject: Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY