Re: Duplicate Item Pointers in Gin index - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Duplicate Item Pointers in Gin index
Date
Msg-id CAD21AoD=SPHEx56YtWg2GQmrWGqe9+zjy-Ha7=f+iMCRdM26qA@mail.gmail.com
In response to Re: Duplicate Item Pointers in Gin index  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Duplicate Item Pointers in Gin index
List pgsql-hackers
On Thu, Feb 22, 2018 at 10:26 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Feb 22, 2018 at 8:28 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>> On Wed, Feb 21, 2018 at 3:02 PM, R, Siva <sivasubr@amazon.com> wrote:
>>> Did you mean pin on the metapage buffer during ginInsertCleanup and not lock
>>> during addition of tuples to the accumulator? The exclusive lock on metapage
>>> buffer is released after reading/locking head of pending list and before we
>>> process pages/add tuples to the accumulator in ginInsertCleanup [1].
>>
>> AFAICT, nobody ever holds just a pin on the metapage as some kind of
>> interlock (since nobody else ever acquires a "super exclusive lock" on
>> the metapage -- if anybody else ever did that, then simply holding a
>> pin might make sense as a way of blocking the "super exclusive" lock
>> acquisition). Maybe you're thinking of the root page of posting trees?
>>
>> I think that Sawada-san simply means that holding an ExclusiveLock on
>> the metapage makes writers block each other, and concurrent VACUUMs.
>> At least, for as long as they're in ginInsertCleanup().
>
> Yes, but I realized my previous mail was wrong, sorry. Insertion to
> pending list doesn't acquire ExclusiveLock on metapage. So we can
> insert tuples to pending list while cleaning up.
>

Sorry for the very late response.

FWIW, I've looked at this again. I think the situation Siva reported
in the first mail can happen on versions without commit 3b2787e. That
is, gin indexes had a data corruption bug. I reproduced the situation
with PostgreSQL 10.1 and observed that a gin index can become
corrupted. However, gingetbitmap (fortunately?) returned a correct
result even when the gin index was corrupted.

The minimal case I reproduced is one where each gin entry has two
pointers to the same TID, as follows.

gin-entry 1      gin-entry 2
(1, 147)         (1, 147)
(1, 147)         (1, 147)

This state is clearly corrupted; I got it by executing all the steps
Siva described in the first mail. The first TID of each entry points
to an already-vacuumed item pointer (the tuple was inserted, deleted
and vacuumed), whereas the second TID of each entry points to a live
item pointer on the heap. In entryGetItem, since we check advancePast,
the second TID is not returned in either the posting list case or the
posting tree case. Even in the partial match case, since the TIDBitmap
eliminates the duplicate, entryGetItem can return a correct result.
So the corrupted gin index actually returned a correct result, and no
assertion failure happened.
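
To illustrate why the duplicate is invisible to a scan, here is a
minimal Python model, not the PostgreSQL code. The function name
scan_posting_list and the tuple representation of TIDs are assumptions
made for this sketch; it only mimics the idea that items less than or
equal to advancePast are skipped, and that a bitmap keeps each TID
once.

```python
# Simplified model only: TIDs are (block, offset) tuples, compared
# lexicographically, which matches block-then-offset ordering.

def scan_posting_list(items):
    """Yield TIDs from a posting list, skipping anything <= advance_past.

    This imitates the advancePast check described above; it is not the
    real entryGetItem logic.
    """
    advance_past = (0, 0)  # offsets start at 1, so this sorts before any valid TID
    for tid in items:
        if tid <= advance_past:
            continue  # already returned (or a duplicate of it): skip
        advance_past = tid
        yield tid


# The corrupted entry from the reproduction: the same TID stored twice.
entry = [(1, 147), (1, 147)]

# Posting list / posting tree style scan: the second copy is skipped.
returned = list(scan_posting_list(entry))

# Partial match style: a bitmap behaves like a set, so the duplicate
# collapses into a single member.
bitmap = set(entry)

print(returned, bitmap)
```

In both paths the caller sees (1, 147) exactly once, so nothing in the
result hints that the entry is corrupted.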

I'm not sure how you found this duplicated item pointer issue, but
what I learned from investigating it is that a gin index can return a
correct result without any assertion failure even if it is somewhat
corrupted. So maybe having amcheck support for gin indexes would
address part of the problem.
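
As a rough idea of what such a check could verify, here is a
hypothetical Python sketch. It is not an existing amcheck function,
and find_out_of_order_tids is a name invented for this example. It
assumes the invariant that the TIDs within one entry must be strictly
increasing, and reports every position where that does not hold.

```python
# Hypothetical invariant check: TIDs are (block, offset) tuples and
# must be strictly ascending within one entry.

def find_out_of_order_tids(posting_list):
    """Return (index, previous_tid, current_tid) for each violation."""
    problems = []
    for i in range(1, len(posting_list)):
        prev, cur = posting_list[i - 1], posting_list[i]
        if cur <= prev:  # equal means duplicate, smaller means misordered
            problems.append((i, prev, cur))
    return problems


# The duplicated entry from the reproduction is flagged at index 1.
print(find_out_of_order_tids([(1, 147), (1, 147)]))
```

A scan would still return correct results for this entry, but a check
like this would report the duplicate.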

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

