Cache lookup failed for relation - Mailing list pgsql-general

From David Clymer
Subject Cache lookup failed for relation
Date
Msg-id CAKDVv36-F-aLwHkC-bzHsCMKujNa3-zLeK8jMnjk84r61UK6nA@mail.gmail.com
Whole thread Raw
Responses Re: Cache lookup failed for relation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I've been seeing the following error in one database of ours:

  "cache lookup failed for relation 7640518"

The SQL that apparently triggers this is:

   drop table if exists ns_e5461ae570429d0b7863cce9ef4d4ead;

Unfortunately, manual attempts to reproduce the issue have failed. In
normal operation, this statement is run as one of several parallel queries,
and the tables are by nature, short lived. That said, they are not
temporary tables.

This is one of two very similar databases, and we are running the same
software (same version) on top of each. The databases are in different
versions of postgresql. Db #1 is postgresql 9.2.3, and db #2 (the one
exhibiting the above behavior) is postgresql 9.0.11.

One other item of note: db #2 has recently had an OID wrap-around, which
makes me suspect that plays some part in this behavior. I've looked at the
caching code in postgresql, and though I cannot claim to have a thorough
understanding of how it works, I have a theory.

How the cache works (as I understand it):

    When a cache lookup is performed, tuples are looked up by OID. The
cache contents are hashed into buckets. If an item is found in the cache,
it is promoted to the top of the bucket so that subsequent searches are
faster. If an item is not in the cache, it is looked up in the system
catalog, and an entry is inserted into the cache. If a lookup in the
catalog fails, a negative entry is added to the cache for the tuple.
Multiple entries can exist for the same tuple. The latest one is just
promoted to the top of the bucket, and the other gets aged out of the
cache, since it is never again accessed.

Theory:

    Given that we have wrapped around our OID counter, it is possible to
have multiple entries in the cache for the same OID. If one relation is
deleted, and a negative entry inserted into the cache, attempts to look up
the other may erroneously produce a negative cache hit, yielding our "cache
lookup failed for relation" error.


Is this a possibility? Are there any other obvious explanation for this?
The results from google related to this error seem to point to catalog
corruption, or a postgres bug.

Any pointers/enlightenment would be appreciated.

-davidc

--
*David Clymer*
VistaShare
 866-828-4782, ext. 828
www.VistaShare.com <http://www.vistashare.com/>

[image: Facebook]   www.facebook.com/vistashare
[image: Twitter]   www.twitter.com/vistashare

pgsql-general by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: undefined symbol: SPI_plan_get_cached_plan
Next
From: Tom Lane
Date:
Subject: Re: Cache lookup failed for relation