Re: Large Database \d: ERROR: cache lookup failed for relation ... - Mailing list pgsql-general
From | Jim Nasby |
---|---|
Subject | Re: Large Database \d: ERROR: cache lookup failed for relation ... |
Date | |
Msg-id | EE2BA3FF-5847-42A7-BD51-857778B85019@decibel.org Whole thread Raw |
In response to | Re: Large Database \d: ERROR: cache lookup failed for relation ... (Erik Jones <erik@myemma.com>) |
List | pgsql-general |
I'm working with these guys to resolve the immediate issue, but I suspect there's a race condition somewhere in the code. What's happened is that OIDs have been changed in the system. There's not a lot of table DDL that happens, but there is a substantial amount of view DDL that can take place. In a nutshell, tables will sometimes have fields added to them, and when that happens a whole set of views needs to be re-created to take the new fields into account. The files for corrupted tables do exist; this seems to be mostly a catalog corruption issue. I'm seeing both what appear to be inconsistencies between relcache and the catalog tables as well as corruption between tables themselves: emma2=# select * from userdata_8464_campaigns; ERROR: could not open relation with OID 138807643 emma2=# \d userdata_8464_campaigns Table "public.userdata_8464_campaigns" Column | Type | Modifiers -------------------------------+----------------------------- +------------------------------------------------------------------ campaign_id | bigint | not null default nextval(('emma_campaigns_seq'::text)::regclass) account_id | bigint | not null cep_object_id | bigint | not null default nextval(('cep_object_seq'::text)::regclass) campaign_name | character varying(255) | not null campaign_subject | character varying(255) | not null layout_page_id | bigint | not null layout_content_id | bigint | not null campaign_create_date | timestamp without time zone | not null default now() campaign_last_mod_date | timestamp without time zone | not null default now() campaign_status | character varying(50) | not null campaign_parent_id | bigint | published_campaign_id | bigint | campaign_plaintext | text | campaign_plaintext_ds | timestamp without time zone | delivery_old_score | double precision | campaign_person_defaults | text | Inherits: emma_campaigns select oid from pg_class where relname='userdata_8464_campaigns'; oid -------- 533438 (1 row) And that file actually does exist on disk... select * from pg_index where indexrelid=138807643; indexrelid | indrelid | indnatts | indisunique | indisprimary | indisclustered | indisvalid | indkey | indclass | indexprs | indpred ------------+----------+----------+-------------+-------------- +----------------+------------+--------+----------+----------+--------- 138807643 | 533438 | 1 | t | t | f | t | 1 | 1980 | | (1 row) select * from pg_class where oid=138807643; relname | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | reltoastrelid | reltoastidxid | relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers | relukeys | relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl | reloptions ---------+--------------+---------+----------+-------+------------- +---------------+----------+-----------+--------------- +---------------+-------------+-------------+---------+---------- +-----------+-------------+----------+----------+--------- +------------+------------+-------------+---------------- +--------------+--------+------------ (0 rows) On Jun 5, 2007, at 11:27 AM, Erik Jones wrote: > I originally sent this message from my gmail account yesterday as > we were having issues with our work mail servers yesterday, but > seeing that it hasn't made it to the lists yet, I'm resending from > my registered address. You have my apologies if you receive this > twice. > > "Thomas F. O'Connell" <tf ( at ) o ( dot ) ptimized ( dot ) com> > writes: > > I'm dealing with a database where there are ~150,000 rows in > > > information_schema.tables. I just tried to do a \d, and it came back > > with this: > > > ERROR: cache lookup failed for relation [oid] > > > Is this indicative of corruption, or is it possibly a resource > issue? > > Greetings, > > This message is a follow-up to Thomas's message quoted above (we're > working together on the same database). He received one response > when he sent the above message which was from Tom Lane and can be > easily summarized as him having said that that could happen tables > were being created or dropped while running the \d in psql. > Unfortunately, that wasn't the case, we have now determined that > there is some corruption in our database and we are hoping some of > you back-end gurus might have some suggestions. > > How we verified that there is corruption was simply to reindex all > of our tables in addition to getting the same errors when running a > dump this past weekend. We so far have a list of five tables for > which reindex fails with the error: "ERROR: could not open relation > with OID xxxx" (sub xxxx with the five different #s) and one that > fails reindexing with "ERROR: xxxxx is an index" where is an index > on a completely different table. After dropping all of the indexes > on these tables (a couple didn't have any to begin with), we still > cannot run reindex on them. In addition, we can't drop the tables > either (we get the same errors). We can however run alter table > statements on them. So, we have scheduled a downtime for an evening > later this week wherein we plan on bringing the database down for a > REINDEX SYSTEM and before that we are going to run a dump excluding > those tables, restore that on a separate machine and see if these > errors crop up there anywhere. Is there anything else anyone can > think of that we can do to narrow down where the actual corruption > is or how to fix it? > > Erik Jones > > Software Developer | Emma® > erik@myemma.com > 800.595.4401 or 615.292.5888 > 615.292.0777 (fax) > > Emma helps organizations everywhere communicate & market in style. > Visit us online at http://www.myemma.com > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
pgsql-general by date: