Re: Duplicate Item Pointers in Gin index - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Duplicate Item Pointers in Gin index
Msg-id CAPpHfdv=KjkAs4CdOaf-SvG6=dQp1CGyBApq9j_C9AZeEFYdYA@mail.gmail.com
In response to Re: Duplicate Item Pointers in Gin index  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Duplicate Item Pointers in Gin index
List pgsql-hackers
On Wed, Jun 13, 2018 at 11:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 13, 2018 at 3:32 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> > On Tue, Jun 12, 2018 at 11:01 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >> FWIW, I've looked at this again. I think that the situation Siva
> >> reported in the first mail can happen before commit 3b2787e.
> >> That is, gin indexes had a data corruption bug. I've reproduced
> >> the situation with PostgreSQL 10.1 and observed that a gin index can
> >> become corrupted.
> >
> > So, you've recreated the problem with Postgres from before 3b2787e,
> > but not after 3b2787e? Are you suggesting that 3b2787e might have
> > fixed it, or that it only hid the problem, or something else?
>
> I meant 3b2787e fixed it. I checked that at least the situation
> doesn't happen after 3b2787e.

I also think that 3b2787e should fix such problems.  After 3b2787e,
vacuum is forced to clean up all pending list entries that were
inserted before the vacuum started.  So, vacuum should have everything
that needs to be vacuumed merged into posting lists/trees.
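
The effect on the pending list can be observed with a rough sketch like
the one below.  It assumes a scratch database with the pageinspect
extension available and reuses the table and index names from the quoted
test case; it only shows the pending list being flushed by a manual
vacuum, not the race that led to the corruption.

```sql
-- Sketch only: assumes a scratch database where pageinspect can be installed.
CREATE EXTENSION IF NOT EXISTS pageinspect;
CREATE TABLE g (c int[]);
CREATE INDEX g_idx ON g USING gin (c);          -- fastupdate is on by default
ALTER TABLE g SET (autovacuum_enabled = off);   -- keep autovacuum out of the way

-- Rows inserted after the index exists go to the pending list first.
INSERT INTO g SELECT ARRAY[1] FROM generate_series(1, 500);

-- Pending list size before vacuum.
SELECT n_pending_pages, n_pending_tuples
FROM gin_metapage_info(get_raw_page('g_idx', 0));

-- A manual vacuum merges pending entries into the posting lists/trees.
VACUUM g;

-- Pending list size after vacuum.
SELECT n_pending_pages, n_pending_tuples
FROM gin_metapage_info(get_raw_page('g_idx', 0));
```

With no concurrent insertions, both counters in the second query would
be expected to drop to zero.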

> > How did you recreate the problem? Do you have a test case you can share?
>
> I recreated it by executing each step one at a time under gdb. So I
> can share the test case, but it might not help.
>
> create extension pageinspect;
> create table g (c int[]);
> insert into g select ARRAY[1] from generate_series(1,1000);
> create index g_idx on g using gin (c);
> alter table g set (autovacuum_enabled = off);
> -- 408 items fit in exactly one page of pending list
> insert into g select ARRAY[1] from generate_series(1, 408);
> -- insert into 2nd page of pending list
> insert into g select ARRAY[1] from generate_series(1, 100);
> select n_pending_pages, n_pending_tuples from
> gin_metapage_info(get_raw_page('g_idx', 0));
> insert into g select ARRAY[999]; -- insert into 2nd pending list page
> delete from g where c = ARRAY[999];
> -- At this point, the gin entry for 'ARRAY[999]' exists on the 2nd page of
> -- the pending list and has been deleted.

Is this test case complete?  It looks like there should be a
continuation with concurrent vacuum and insertions managed by gdb...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

