Bug reference: 17389
Logged by: Ben Chobot
Email address: bench@silentmedia.com
PostgreSQL version: 12.9
Operating system: Linux (Ubuntu)
Description:
We've noticed that, at least since 9.5, running pg_repack causes a race condition on our streaming replicas, but _not_ on the primary where pg_repack is running. This manifests itself as a client briefly being unable to open the relation getting repacked, but, in our testing and experience, only on the replica. I would blame pg_repack (its whole purpose is to transparently remake tables, and quite possibly it got some of the details wrong), except that if its behavior appears atomic to clients on the primary, then surely it should appear atomic on the replicas too?
Using the steps below, I can reliably get the client on the replica to hit an OID error within 30 minutes. The same steps fail to generate an error when I query in a loop on the primary.
We've very occasionally seen something similar with a script that did CREATE INDEX CONCURRENTLY, RENAME INDEX and DROP INDEX CONCURRENTLY on the primary, but not since we upgraded from 9.4 to 12 and switched to using REINDEX CONCURRENTLY. In our case the OID in the error belonged to the index that was dropped, not the table.
I think it'd be worth noting the OIDs of the table and its indexes before each run, so that you can tell whether the OID in the error belongs to an index in your case.
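A rough sketch of how that bookkeeping could look, in Python. Everything here is illustrative: the names OID_SNAPSHOT_SQL and classify_error are hypothetical, the query relies only on the standard pg_class and pg_index catalog columns, and the error text "could not open relation with OID N" is assumed to be the message seen on the replica. The snapshot would be taken on the primary before each pg_repack run (for example with any driver that accepts a named %(table)s parameter), then compared against whatever the client reports.

```python
import re

# Catalog query returning the table's own OID plus the OIDs of its indexes.
# The %(table)s placeholder is an assumption about the driver's parameter
# style; it would be bound to a name such as 'public.my_table'.
OID_SNAPSHOT_SQL = """
SELECT c.oid, c.relname, c.relkind
  FROM pg_class c
 WHERE c.oid = %(table)s::regclass
UNION ALL
SELECT i.indexrelid, ic.relname, ic.relkind
  FROM pg_index i
  JOIN pg_class ic ON ic.oid = i.indexrelid
 WHERE i.indrelid = %(table)s::regclass
"""

# Assumed shape of the error message reported by the client.
OID_ERROR_RE = re.compile(r"could not open relation with OID (\d+)")

# pg_class.relkind codes we care about here.
RELKIND_NAMES = {"r": "table", "i": "index"}


def classify_error(message, snapshot):
    """Match an OID error against a snapshot taken before the run.

    snapshot maps oid -> (relname, relkind), i.e. the rows returned by
    OID_SNAPSHOT_SQL. Returns (oid, relname, kind), or None when the
    message is not an OID error. An OID missing from the snapshot comes
    back with relname None and kind "unknown".
    """
    match = OID_ERROR_RE.search(message)
    if match is None:
        return None
    oid = int(match.group(1))
    relname, relkind = snapshot.get(oid, (None, None))
    return oid, relname, RELKIND_NAMES.get(relkind, "unknown")
```

If the classification comes back as "index", that would match what we saw with the dropped index; "table" would point at the repacked relation itself; "unknown" would suggest the OID belongs to something created during the run, such as a temporary object, which the pre-run snapshot cannot see.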