Re: BUG #17268: Possible corruption in toast index after reindex index concurrently - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Date
Msg-id 20211108203641.whpia3pzjofellf4@alap3.anarazel.de
Whole thread Raw
In response to Re: BUG #17268: Possible corruption in toast index after reindex index concurrently  (Andres Freund <andres@anarazel.de>)
Responses Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
List pgsql-bugs
Hi,

On 2021-11-08 10:53:17 -0800, Andres Freund wrote:
> Oh, wow (or ugh). It narrows down quite a bit. A single pgbench session
> running
> INSERT INTO wide(wide) SELECT string_agg(g.i::text || random()::text, '') FROM generate_series(1, 500) g(i);
> 
> triggers the issue. Sometimes takes quite a few iterations of REINDEX INDEX
> CONCURRENTLY, but...

After several wrong turns, I found the cause of the problem:

The problem is that toast_save_datum() releases the lock on the toast relation
directly after inserting. That breaks the interlock between
WaitForLockersMultiple() in ReindexRelationConcurrently() and the "toast
inserter". The comments in ReindexRelationConcurrently() don't meaningfully
explain the need for that interlock, but the equivalent case in DefineIndex()
does:

    /*
     * Phase 2 of concurrent index build (see comments for validate_index()
     * for an overview of how this works)
     *
     * Now we must wait until no running transaction could have the table open
     * with the old list of indexes.  Use ShareLock to consider running
     * transactions that hold locks that permit writing to the table.  Note we
     * do not need to worry about xacts that open the table for writing after
     * this point; they will see the new index when they open it.
     *
     * Note: the reason we use actual lock acquisition here, rather than just
     * checking the ProcArray and sleeping, is that deadlock is possible if
     * one of the transactions in question is blocked trying to acquire an
     * exclusive lock on our table.  The lock code will detect deadlock and
     * error out properly.
     */
    WaitForLockers(heaplocktag, ShareLock, true);


The problem that this causes is that the toast-inserting backend will not
necessarily have seen the new index at this point (since there was no need to
process relcache invalidations), but the scan below validate_index() will not
necessarily find new tuples inserted after the start of the scan (particularly
if they're in a new portion of the table).

I verified that if I change the table_close() in toast_save_datum() to not
release the lock the problem doesn't happen anymore.


This problem doesn't exist for CREATE INDEX CONCURRENTLY afaict, because we
don't allow that on system tables, including toast tables. But we *do* reindex
toast indexes concurrently.

One possible way to fix this would be to make ReindexRelationConcurrently()
acquire a lock on the underlying table when reindexing a toast table. Another
to not release the lock in toast_save_datum().

Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: Andres Freund
Date:
Subject: Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Next
From: Noah Misch
Date:
Subject: Re: BUG #17268: Possible corruption in toast index after reindex index concurrently