BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica - Mailing list pgsql-bugs

From	PG Bug reporting form
Subject	BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
Date	February 8, 2022 20:41:54
Msg-id	17401-9df851bb16dde397@postgresql.org
Responses	Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
List	pgsql-bugs

The following bug has been logged on the website:

Bug reference:      17401
Logged by:          Ben Chobot
Email address:      bench@silentmedia.com
PostgreSQL version: 12.9
Operating system:   Linux (Ubuntu)
Description:

This bug is almost identical to BUG #17389, which I filed blaming
pg_repack; however, further testing shows the same symptoms using vanilla
REINDEX TABLE CONCURRENTLY.

1. Put some data in a table with a single btree index:
create table public.simple_test (id int primary key);
insert into public.simple_test(id) (select generate_series(1,1000));

2. Set up streaming replication to a secondary db.

3. In a loop on the primary, concurrently REINDEX that table:
while true; do
  psql -tAc "select now(), relfilenode from pg_class where relname='simple_test_pkey'" >> log
  psql -tAc "reindex table concurrently public.simple_test"
done

4. In a loop on the secondary, have psql query the secondary db for an
indexed value of that table:
while true; do
  psql -tAc "select count(*) from simple_test where id='3'; select relfilenode from pg_class where relname='simple_test_pkey'" || break
done; date

With those 4 steps, the client on the replica will reliably fail to open the
OID of the index within 30 minutes of looping. ("ERROR:  could not open
relation with OID 6715827") When we run the same client loop on the primary
instead of the replica, or if we reindex without the CONCURRENTLY clause,
then the client loop will run for hours without failing, but neither of those
workarounds is an option for us in production.
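
For anyone trying to reproduce this, the replica-side loop from step 4 can be wrapped in a small helper that reports which iteration failed. This is only a sketch: run_until_failure is a hypothetical name, not part of psql or PostgreSQL, and the iteration cap is an assumption added so the loop always terminates.

```shell
#!/usr/bin/env bash
# Sketch only: run_until_failure is a hypothetical helper, not a psql feature.
# Usage: run_until_failure MAX_ITERATIONS COMMAND [ARGS...]
# Repeats COMMAND until it exits non-zero or MAX_ITERATIONS is reached.
run_until_failure() {
  local max=$1 i=0
  shift
  while [ "$i" -lt "$max" ]; do
    # Discard the command's stdout; stderr is left alone so an ERROR
    # line printed by the command stays visible.
    if ! "$@" >/dev/null; then
      echo "failed on iteration $((i + 1))"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure in $max iterations"
}
```

On the replica it could be invoked with the same query as step 4, printing the time of the first failure so it can be lined up with the timestamps in the primary's log file:

run_until_failure 100000 psql -tAc "select count(*) from simple_test where id='3'" || date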

Like I said before, this isn't a new problem - we've seen it since at least
9.5 - but pre-12 we saw it using pg_repack, which is an easy (and
reasonable) scapegoat. But now that we've upgraded to 12 and are still
seeing it using vanilla concurrent reindexing, it seems more clear this is
an actual postgres bug?
