could not open relation with OID - Mailing list pgsql-general

From Ben Chobot
Subject could not open relation with OID
Msg-id 93f68bda-97d2-9944-3e09-2045339b71ab@silentmedia.com
List pgsql-general
We run a lot of queries per day, over a lot of hosts, all of which are on 12.9. We've recently started doing a better job of analyzing our db logs and have found that, a few times a day, every day, some of our queries fail with errors like:

could not open relation with OID 201940279
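
For anyone wanting to tally these errors from their own logs, here is a minimal sketch that counts how often each OID shows up. It assumes only that the error text appears verbatim somewhere in a log line; the function name and the sample lines are hypothetical, and the real log-line layout depends on log_line_prefix.

```python
import re
from collections import Counter

# Matches the error text quoted above. Everything else on the line
# (timestamp, pid, etc.) is ignored, since that layout is an assumption.
OID_ERROR = re.compile(r"could not open relation with OID (\d+)")

def count_missing_oids(lines):
    """Return a Counter mapping each OID to the number of errors naming it."""
    counts = Counter()
    for line in lines:
        match = OID_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

# Hypothetical sample lines, for illustration only.
sample = [
    "ERROR:  could not open relation with OID 201940279",
    "LOG:  checkpoint starting: time",
    "ERROR:  could not open relation with OID 201940279",
]
print(count_missing_oids(sample))
```

An OID that recurs could then be looked up in pg_class on the primary (for example with `SELECT oid::regclass FROM pg_class WHERE oid = 201940279;`), bearing in mind that the lookup returns no rows if the relation has since been dropped.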

In the cases we've examined so far, the failed query succeeds just fine when we run it manually. The failed query had also run on an async streaming replica, and the primary has completed at least one autovacuum since the failure. I don't know if either of those two facts is relevant, but I'm not sure what else to blame. The internet seems to want to blame issues like this on temp tables, which makes sense, but in our case, most of the queries that are failing this way are simple PK scans, which then fall back to the table to pull all the columns. The tables themselves are small in row count - although some values are likely TOASTed - so I would be surprised if anything is spilling to disk for sorting, which might have counted as enough of a temp table to produce such an error.

This is a minuscule failure percentage, so reproducing it is going to be hard, but it is still breaking for reasons I don't understand, and so I'd like to fix it. Has anybody else seen this, or have any ideas of what to look at?

Other things we've considered:
    - we run pg_repack, which certainly seems like it could cause an error like this, but we see this error in places and at times where pg_repack isn't running
    - although all our servers are currently on 12.9, I don't think this is a new error for us. I believe we might have seen it on previous minor versions of 12 and probably on 9.5 as well.
    - our filesystem is xfs and seems reliable. I would expect that if it were a filesystem-level error, it would not be so transient. We do, occasionally, expand our filesystems, but not at all the times we've seen this error.
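
One way to check the timing claims above (pg_repack runs, filesystem expansions) is to test whether each error timestamp falls inside a known maintenance window. This is only a sketch: the function name, the `slack` allowance, and the sample timestamps are all hypothetical, and it assumes the error times and window boundaries are already available as datetimes.

```python
from datetime import datetime, timedelta

def errors_during_windows(error_times, windows, slack=timedelta(0)):
    """Return the error timestamps that fall inside any (start, end) window.

    `slack` (an assumed allowance, not a measured value) widens each window
    on both sides, e.g. to tolerate clock skew between hosts.
    """
    return [
        t for t in error_times
        if any(start - slack <= t <= end + slack for start, end in windows)
    ]

# Hypothetical data, for illustration only.
repack_windows = [
    (datetime(2022, 1, 10, 2, 0), datetime(2022, 1, 10, 2, 30)),
]
error_times = [
    datetime(2022, 1, 10, 2, 15),   # inside the window
    datetime(2022, 1, 10, 14, 5),   # well outside it
]
print(errors_during_windows(error_times, repack_windows))
```

Errors that remain unexplained after this filter are the ones that cannot be attributed to the maintenance activity in question.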
