Thread: Unable to open DB: 2 Errors (unexpected zero page + could not open relation with OID)
Unable to open DB: 2 Errors (unexpected zero page + could not open relation with OID)
From
dave sale
Date:
Postgres 8.3.3 on 2.6.21 Some sort of corruption happened overnight and borked things on a server. Normally I'd just restore and go, but the DB wasmoved about 6 weeks ago and it looks like the backup task was not, shame on me. I really just need 2 very small tablesout of this and then I can re-init. Any help is appreciated There have been other pg databases running on this hardweare for over a year now. There is no indication of any other issueand the other instances are still happily running. The first error greets me is: $ psql -d v psql: FATAL: index "pg_amproc_fam_proc_index" contains unexpected zero page at block 0 HINT: Please REINDEX it. so I do: template1=# reindex index pg_amproc_fam_proc_index; REINDEX and still get the same error. thinking it's a system table, i try it in single mode: $ postgres --single -P -D /postgres/v PostgreSQL stand-alone backend 8.3.3 backend> reindex index pg_amproc_fam_proc_index; backend> Followed by back into the normal server, and then I get: psql: FATAL: could not open relation with OID 3606 I'm pretty sure the file is there for 3606, but it's another system index from what I can tell: postgres=> select oid,* from pg_class where oid=3606; oid | relname | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages |reltuples | reltoastrelid | reltoastidxid | relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers | relukeys| relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl | reloptions ------+----------------------------+--------------+---------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+---------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------------+--------+------------ 3606 | pg_ts_parser_prsname_index | 11 | 0 | 10 | 403 | 33505478 | 0 | 2 | 1 | 0 | 0 | f | f | i | 2 | 0 | 0 | 0 | 0 | 0 | f | f | f | f | 0 | | (1 row) -rw------- 1 postgres postgres 16K Jan 26 09:19 ./base/11511/33505478 The logging is still hitting on the pg_amproc_fam_proc_index now, but psql gets stuck at "OID 3606"... Thanks in advance. dave
Re: Unable to open DB: 2 Errors (unexpected zero page + could not open relation with OID)
From
Tom Lane
Date:
dave sale <postgres@slipt.net> writes: > psql: FATAL: could not open relation with OID 3606 > I'm pretty sure the file is there for 3606, but it's another system index from what I can tell: > postgres=> select oid,* from pg_class where oid=3606; > oid | relname | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages| reltuples | reltoastrelid | reltoastidxid | relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers| relukeys | relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl| reloptions > ------+----------------------------+--------------+---------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+---------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------------+--------+------------ > 3606 | pg_ts_parser_prsname_index | 11 | 0 | 10 | 403 | 33505478 | 0 | 2| 1 | 0 | 0 | f | f | i | 2 | 0 | 0 | 0 | 0 | 0 | f | f | f | f | 0 | | > (1 row) That's bizarre. A look into the source code says you can only get that particular error message when the relcache code is unable to find a pg_class row for OID 3606 ... so how come there is one? My best guess at the moment is that the row looks valid to an MVCC snapshot but not under SnapshotNow rules. What do the xmin and xmax fields of that row contain? Also, you might be able to get out of this problem (at least far enough to dump the tables you need) by copying pg_internal.init from an undamaged database into the damaged one. Better keep the old copy of the file (if any) in case this makes things worse, though. regards, tom lane
Re: Unable to open DB: 2 Errors (unexpected zero page + could not open relation with OID)
From
dave sale
Date:
Thanks for your help; template1=# select * from pg_class where oid=3606; relname | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages | reltuples| reltoastrelid | reltoastidxid | relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers | relukeys| relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl | reloptions ----------------------------+--------------+---------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+---------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------------+--------+------------ pg_ts_parser_prsname_index | 11 | 0 | 10 | 403 | 3606 | 0 | 2 | 1 | 0 | 0 | f | f | i | 2 | 0 | 0 | 0| 0 | 0 | f | f | f | f | 0 | | (1 row) copying the pg_internal.init from various other databases did nothing to help. I'm in-process of the pg_internal.init method,but the first pass resulted in the same error. I'm not giving up yet... thanks dave Tom Lane wrote: > dave sale <postgres@slipt.net> writes: >> psql: FATAL: could not open relation with OID 3606 > >> I'm pretty sure the file is there for 3606, but it's another system index from what I can tell: > >> postgres=> select oid,* from pg_class where oid=3606; >> oid | relname | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages| reltuples | reltoastrelid | reltoastidxid | relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers| relukeys | relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl| reloptions >> ------+----------------------------+--------------+---------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+---------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------------+--------+------------ >> 3606 | pg_ts_parser_prsname_index | 11 | 0 | 10 | 403 | 33505478 | 0 | 2 | 1 | 0 | 0 | f | f | i | 2 | 0 | 0| 0 | 0 | 0 | f | f | f | f | 0 | | >> (1 row) > > That's bizarre. A look into the source code says you can only get that > particular error message when the relcache code is unable to find a > pg_class row for OID 3606 ... so how come there is one? My best guess > at the moment is that the row looks valid to an MVCC snapshot but not > under SnapshotNow rules. What do the xmin and xmax fields of that row > contain? > > Also, you might be able to get out of this problem (at least far enough > to dump the tables you need) by copying pg_internal.init from an > undamaged database into the damaged one. Better keep the old copy of > the file (if any) in case this makes things worse, though. > > regards, tom lane -- Dave Crane The Media Trust Company Voice: 202.558.6438 Fax: 703.349.5468 6861 Elm Street Suite 3C McLean, VA 22101 USA