Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows - Mailing list pgsql-bugs

From Pawel Kudzia
Subject Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Date
Msg-id CAJYBUS8gnevDjh3WRB65M4SfTMJZVAdc+qTs+zHiPfdWcCJaTg@mail.gmail.com
Whole thread Raw
In response to Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows  (Pawel Kudzia <kudzia@gmail.com>)
List pgsql-bugs
On Thu, Jun 17, 2021 at 4:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Pawel Kudzia <kudzia@gmail.com> writes:
> > with help from IRC we've found that decreasing work_mem from 1MB to 256kB
> > or less makes the problem go away:
>
> Hmm.  So that suggests that the index itself is *not* corrupt,
> but the problem is associated with a bug in the indexscan
> algorithms.
>
> Have you experimented with different index opclasses?  Your
> original report said you were using gin__int_ops, but that's
> not the default, and we sort of deprecate intarray these days.

Hello,

Three weeks ago we've changed from

CREATE INDEX entity_attribute_name_ids_gin ON public.entity USING gin
(attribute_name_ids public.gin__int_ops);

into

CREATE INDEX entity_attribute_name_ids_gin ON public.entity USING gin
(attribute_name_ids);

Sadly - the problems are back.

I have:
* queries that return incorrect responses:
SET enable_seqscan = OFF;
SELECT entity_id FROM entity WHERE ( attribute_name_ids && '{1485}' )
AND NOT ( (attribute_name_ids||0) && '{1485}') LIMIT 10;
SELECT entity_id FROM entity WHERE ( attribute_name_ids && '{144}' )
AND NOT ( (attribute_name_ids||0) && '{144}') LIMIT 10;
SELECT entity_id FROM entity WHERE ( attribute_name_ids && '{2652}' )
AND NOT ( (attribute_name_ids||0) && '{2652}') LIMIT 10;

all three return:
 entity_id |      primname
-----------+--------------------
  99311962 | msci_674591ltepsgt
  99312880 | msci_674667mgu

data=# explain analyze SELECT entity_id FROM entity WHERE (
attribute_name_ids && '{2652}' ) AND NOT ( (attribute_name_ids||0) &&
'{2652}') LIMIT 10;
                                                                     QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=30.59..31.71 rows=10 width=8) (actual
time=149.988..200.749 rows=2 loops=1)
   ->  Bitmap Heap Scan on entity  (cost=30.59..3798.81 rows=33644
width=8) (actual time=149.985..200.744 rows=2 loops=1)
         Recheck Cond: (attribute_name_ids && '{2652}'::integer[])
         Rows Removed by Index Recheck: 79741
         Filter: (NOT ((attribute_name_ids || 0) && '{2652}'::integer[]))
         Rows Removed by Filter: 16723
         Heap Blocks: exact=8277 lossy=9989
         ->  Bitmap Index Scan on entity_attribute_name_ids_gin
(cost=0.00..30.59 rows=33813 width=0) (actual time=9.712..9.713
rows=37367 loops=1)
               Index Cond: (attribute_name_ids && '{2652}'::integer[])
 Planning Time: 0.472 ms
 Execution Time: 200.977 ms
(11 rows)


* first time when hourly consistency check reported the inconsistency
[ 2021-08-01 19:04:15 for entity_id=99312880, 2021-08-01 23:04:15 for
entity_id=99311962 ]
* timestamps when affected row has been updated [ entity_id=99312880:
2021-08-01 16:51:59, 2021-08-01 19:04:06 [ most likely that update led
to inconsistency ], 2021-08-01 23:04:14, 2021-08-02 01:08:18,
2021-08-02 05:12:15, entity_id=99311962 2021-08-01 16:51:59,
2021-08-01 19:04:05, 2021-08-01 23:04:13 [ likely culprit ],
2021-08-02 01:08:16 ]
* wal files from the **master server** from the time when the
inconsistency started to occur [ from 2021-08-01 14:50:37 till
2021-08-02 04:31:00 ; sadly i don't have enough to cover the whole
period between snapshots mentioned below  ]
* file-level snapshot of the database taken from streaming
**replication slave** before the problem started to occur [ 2021-08-01
12:00:01 ] and after it occured [ 2021-08-02 00:00:01 ]

I'll be happy to run any diagnostics needed, provide access to
particular wal file.

Greetings!

-- 
regards,
Pawel Kudzia



pgsql-bugs by date:

Previous
From: Wojciech Strzalka
Date:
Subject: psql doesn't show tables duplicated in multiple schemas
Next
From: Pavel Stehule
Date:
Subject: Re: psql doesn't show tables duplicated in multiple schemas