Re: ERROR: cache lookup failed for relation 17442 (repost) - Mailing list pgsql-bugs

From Tom Lane
Subject Re: ERROR: cache lookup failed for relation 17442 (repost)
Date
Msg-id 189.1107814067@sss.pgh.pa.us
Whole thread Raw
In response to ERROR: cache lookup failed for relation 17442 (repost)  (Michael Guerin <guerin@rentec.com>)
Responses Re: ERROR: cache lookup failed for relation 17442 (repost)  (Alvaro Herrera <alvherre@dcc.uchile.cl>)
List pgsql-bugs
Michael Guerin <guerin@rentec.com> writes:
>     I've been getting these errors ("ERROR:  cache lookup failed for
> relation 17442")  in my logs for a while now.

Turning on verbose error logging shows that the error invariably comes
from RelationIsVisible():

2005-02-07 16:44:35 EST 2994 ERROR:  XX000: cache lookup failed for relation 168051
2005-02-07 16:44:35 EST 2994 CONTEXT:  SQL statement "SELECT  EXISTS (SELECT relname FROM pg_class WHERE relname ~*
('^'||  $1  || '$') AND relkind = 'r' and pg_table_is_visible(oid))" 
    PL/pgSQL function "table_exists" line 7 at if
    PL/pgSQL function "fngetcompositeids2" line 13 at if
2005-02-07 16:44:35 EST 2994 LOCATION:  RelationIsVisible, namespace.c:320

What is happening, I believe, is that the SELECT finds a row in pg_class
that has already been deleted (because the temp table in question
belonged to another backend that already dropped it).  Under MVCC rules
the row is still visible to the SELECT.  However, pg_table_is_visible
depends on backend internal catalog lookup functions, and those lookups
always follow SnapshotNow rules ... so there is a window in which the
pg_table_is_visible call can fail because the other backend's DROP
already committed.

The only quick-fix way I can see for making this safe is to modify
pg_table_is_visible and friends to silently return FALSE instead of
erroring out when they are handed an OID that doesn't seem to match any
existing catalog row.  There was some talk of doing this once before:
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00067.php
but we never got around to it, mainly because it seemed too likely to
mask mistakes.

A more general issue is that we now have a pile of "catalog inquiry"
functions that all make use of backend internal lookups and therefore
reflect SnapshotNow behavior to the user.  This has bothered me for some
time, mainly because I'm worried that it could result in pg_dump failing
to emit a consistent snapshot --- eg, it could emit a CREATE VIEW
command that reflects a view change that occurred later than the start
of the dump.  Defending against this looks horridly messy though :-(
... it almost seems to require duplicate sets of catalog lookup code.

In the meantime, you might want to think about identifying your own temp
tables by matching their relnamespace to current_schemas(), instead of
relying on pg_table_is_visible.

            regards, tom lane

pgsql-bugs by date:

Previous
From: Michael Guerin
Date:
Subject: ERROR: cache lookup failed for relation 17442 (repost)
Next
From: Tom Lane
Date:
Subject: Re: ERROR: cache lookup failed for relation 17442 (repost)