Re: BUG #17245: Index corruption involving deduplicated entries - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #17245: Index corruption involving deduplicated entries
Date
Msg-id 20211030023740.qbnsl2xaoh2grq3d@alap3.anarazel.de
Whole thread Raw
In response to Re: BUG #17245: Index corruption involving deduplicated entries  (Kamigishi Rei <iijima.yun@koumakan.jp>)
Responses Re: BUG #17245: Index corruption involving deduplicated entries  (Andres Freund <andres@anarazel.de>)
Re: BUG #17245: Index corruption involving deduplicated entries  (Kamigishi Rei <iijima.yun@koumakan.jp>)
List pgsql-bugs
Hi,

Amit, this looks to be a parallel vacuum related bug, see below.


Whew, we found the bug, I think. Or at least one that can create exactly this
situation.

The problem is that parallel vacuum doesn't have correct handling for small
indexes, when there are other indexes that are big enough to use parallelism.

CREATE TABLE multiple_indexes(id serial primary key, data text, other serial unique);
CREATE INDEX ON multiple_indexes(data );
INSERT INTO multiple_indexes(data) SELECT 1 FROM generate_series(1, 50000);

\di+ multiple_indexes_*
                                                        List of relations

┌────────┬────────────────────────────┬───────┬────────┬──────────────────┬─────────────┬───────────────┬─────────┬─────────────┐
│ Schema │            Name            │ Type  │ Owner  │      Table       │ Persistence │ Access method │  Size   │
Description│
 

├────────┼────────────────────────────┼───────┼────────┼──────────────────┼─────────────┼───────────────┼─────────┼─────────────┤
│ public │ multiple_indexes_data_idx  │ index │ andres │ multiple_indexes │ permanent   │ btree         │ 336 kB  │
       │
 
│ public │ multiple_indexes_other_key │ index │ andres │ multiple_indexes │ permanent   │ btree         │ 1112 kB │
       │
 
│ public │ multiple_indexes_pkey      │ index │ andres │ multiple_indexes │ permanent   │ btree         │ 1112 kB │
       │
 

└────────┴────────────────────────────┴───────┴────────┴──────────────────┴─────────────┴───────────────┴─────────┴─────────────┘
(3 rows)

DELETE FROM multiple_indexes;

postgres[1253065][1]=# SHOW min_parallel_index_scan_size;
┌──────────────────────────────┐
│ min_parallel_index_scan_size │
├──────────────────────────────┤
│ 512kB                        │
└──────────────────────────────┘
(1 row)

I.e. we have two indexes that are big enough to use parallelism, and one that
isn't.


first heap scan:

VACUUM (VERBOSE,PARALLEL 4) multiple_indexes;
INFO:  00000: vacuuming "public.multiple_indexes"
LOCATION:  lazy_scan_heap, vacuumlazy.c:937

index vacuum:
INFO:  00000: launched 1 parallel vacuum worker for index vacuuming (planned: 1)
LOCATION:  do_parallel_vacuum_or_cleanup, vacuumlazy.c:2768
INFO:  00000: scanned index "multiple_indexes_pkey" to remove 50000 row versions
DETAIL:  CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s
LOCATION:  lazy_vacuum_one_index, vacuumlazy.c:3057
INFO:  00000: scanned index "multiple_indexes_other_key" to remove 50000 row versions
DETAIL:  CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s
LOCATION:  lazy_vacuum_one_index, vacuumlazy.c:3057

second heap scan:
INFO:  00000: table "multiple_indexes": removed 50000 dead item identifiers in 271 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
LOCATION:  lazy_vacuum_heap_rel, vacuumlazy.c:2384
INFO:  00000: table "multiple_indexes": found 50000 removable, 0 nonremovable row versions in 271 out of 271 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 748
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1668
...


So the two bigger indexes were scanned, but the small one was not.


The reason for this is that

begin_parallel_vacuum() does:
    /*
     * Initialize variables for shared index statistics, set NULL bitmap and
     * the size of stats for each index.
     */
    memset(shared->bitmap, 0x00, BITMAPLEN(nindexes));
    for (int idx = 0; idx < nindexes; idx++)
    {
        if (!can_parallel_vacuum[idx])
            continue;

        /* Set NOT NULL as this index does support parallelism */
        shared->bitmap[idx >> 3] |= 1 << (idx & 0x07);
    }

which causes do_parallel_processing() to skip over the index, because
parallel_stats_for() returns NULL due to checking that bitmap.

And do_serial_processing_for_unsafe_indexes() skips over the index because:
        /*
         * We're only here for the unsafe indexes
         */
        if (parallel_processing_is_safe(indrel, lvshared))
            continue;

but the index is actually not unsafe.


Which leaves us with a severely corrupted index. Unfortunately one that
amcheck at this stage will not recognize as corrupted :(. It'll only get there
later, if new rows for the same tids will be inserted.

I haven't yet checked whether this is a bug introduced in 14, or whether it
was possible to hit before as well.


Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: "David M. Calascibetta"
Date:
Subject: RE: BUG #17258: Unexpected results in CHAR(1) data type
Next
From: Thomas Munro
Date:
Subject: Re: conchuela timeouts since 2021-10-09 system upgrade