Thread: FATAL 2: open of pg_clog error

FATAL 2: open of pg_clog error

From: "Bjoern Metzdorf"
Hi,

we are using PostgreSQL 7.2 on Linux 2.4.17 SMP.

since this morning we have been getting this error message while vacuuming:

2002-03-05 12:42:08 DEBUG:  --Relation pg_toast_16854--
2002-03-05 12:42:10 FATAL 2:  open of /raid/pgdata/pg_clog/0202 failed: No such file or directory
2002-03-05 12:42:10 DEBUG:  server process (pid 14201) exited with exit code 2
2002-03-05 12:42:10 DEBUG:  terminating any other active server processes

A quick search in the archives
(http://archives.postgresql.org/pgsql-bugs/2002-01/msg00066.php) showed that
Tom fixed a potential problem in src/backend/utils/time/tqual.c v 1.46, but
7.2 has v 1.49 already.

We are vacuuming several tables every 5 minutes. A vacuum analyze brings up
the same error.

Any hints besides doing an initdb?

Greetings,
Bjoern

PS: ls -la /raid/pgdata/pg_clog gives this:

drwx------    2 postgres postgres     4096 Mar  4 23:48 .
drwx------    6 postgres postgres     4096 Mar  5 12:18 ..
-rw-------    1 postgres postgres   262144 Feb 14 12:52 0006
-rw-------    1 postgres postgres   262144 Feb 14 17:57 0007
-rw-------    1 postgres postgres   262144 Feb 14 22:10 0008
-rw-------    1 postgres postgres   262144 Feb 15 11:57 0009
-rw-------    1 postgres postgres   262144 Feb 15 17:25 000A
-rw-------    1 postgres postgres   262144 Feb 15 22:33 000B
-rw-------    1 postgres postgres   262144 Feb 16 12:51 000C
-rw-------    1 postgres postgres   262144 Feb 16 17:40 000D
-rw-------    1 postgres postgres   262144 Feb 16 23:01 000E
-rw-------    1 postgres postgres   262144 Feb 17 13:13 000F
-rw-------    1 postgres postgres   262144 Feb 17 17:31 0010
-rw-------    1 postgres postgres   262144 Feb 17 21:27 0011
-rw-------    1 postgres postgres   262144 Feb 18 10:44 0012
-rw-------    1 postgres postgres   262144 Feb 18 16:44 0013
-rw-------    1 postgres postgres   262144 Feb 18 20:43 0014
-rw-------    1 postgres postgres   262144 Feb 19 06:50 0015
-rw-------    1 postgres postgres   262144 Feb 19 15:52 0016
-rw-------    1 postgres postgres   262144 Feb 19 19:59 0017
-rw-------    1 postgres postgres   262144 Feb 20 00:23 0018
-rw-------    1 postgres postgres   262144 Feb 20 14:33 0019
-rw-------    1 postgres postgres   262144 Feb 20 18:37 001A
-rw-------    1 postgres postgres   262144 Feb 20 22:33 001B
-rw-------    1 postgres postgres   262144 Feb 21 12:53 001C
-rw-------    1 postgres postgres   262144 Feb 21 17:16 001D
-rw-------    1 postgres postgres   262144 Feb 21 20:58 001E
-rw-------    1 postgres postgres   262144 Feb 22 05:36 001F
-rw-------    1 postgres postgres   262144 Feb 22 15:42 0020
-rw-------    1 postgres postgres   262144 Feb 22 19:40 0021
-rw-------    1 postgres postgres   262144 Feb 23 00:20 0022
-rw-------    1 postgres postgres   262144 Feb 23 13:28 0023
-rw-------    1 postgres postgres   262144 Feb 23 17:47 0024
-rw-------    1 postgres postgres   262144 Feb 23 23:35 0025
-rw-------    1 postgres postgres   262144 Feb 24 12:52 0026
-rw-------    1 postgres postgres   262144 Feb 24 16:45 0027
-rw-------    1 postgres postgres   262144 Feb 24 20:25 0028
-rw-------    1 postgres postgres   262144 Feb 25 00:20 0029
-rw-------    1 postgres postgres   262144 Feb 25 15:18 002A
-rw-------    1 postgres postgres   262144 Feb 25 19:02 002B
-rw-------    1 postgres postgres   262144 Feb 25 22:15 002C
-rw-------    1 postgres postgres   262144 Feb 26 11:35 002D
-rw-------    1 postgres postgres   262144 Feb 26 16:35 002E
-rw-------    1 postgres postgres   262144 Feb 26 19:45 002F
-rw-------    1 postgres postgres   262144 Feb 26 22:55 0030
-rw-------    1 postgres postgres   262144 Feb 27 13:40 0031
-rw-------    1 postgres postgres   262144 Feb 27 17:34 0032
-rw-------    1 postgres postgres   262144 Feb 27 20:55 0033
-rw-------    1 postgres postgres   262144 Feb 28 01:31 0034
-rw-------    1 postgres postgres   262144 Feb 28 15:33 0035
-rw-------    1 postgres postgres   262144 Feb 28 19:11 0036
-rw-------    1 postgres postgres   262144 Feb 28 22:36 0037
-rw-------    1 postgres postgres   262144 Mar  1 12:29 0038
-rw-------    1 postgres postgres   262144 Mar  1 16:59 0039
-rw-------    1 postgres postgres   262144 Mar  1 21:27 003A
-rw-------    1 postgres postgres   262144 Mar  2 09:23 003B
-rw-------    1 postgres postgres   262144 Mar  2 15:20 003C
-rw-------    1 postgres postgres   262144 Mar  2 20:03 003D
-rw-------    1 postgres postgres   262144 Mar  3 03:14 003E
-rw-------    1 postgres postgres   262144 Mar  3 14:39 003F
-rw-------    1 postgres postgres   262144 Mar  3 18:25 0040
-rw-------    1 postgres postgres   262144 Mar  3 21:56 0041
-rw-------    1 postgres postgres   262144 Mar  4 09:41 0042
-rw-------    1 postgres postgres   262144 Mar  4 16:22 0043
-rw-------    1 postgres postgres   262144 Mar  4 19:57 0044
-rw-------    1 postgres postgres   262144 Mar  4 23:48 0045
-rw-------    1 postgres postgres   155648 Mar  5 12:54 0046





Re: FATAL 2: open of pg_clog error

From: Tom Lane
"Bjoern Metzdorf" <bm@turtle-entertainment.de> writes:
> since this morning we are getting this error message while vacuuming:

> 2002-03-05 12:42:08 DEBUG:  --Relation pg_toast_16854--
> 2002-03-05 12:42:10 FATAL 2:  open of /raid/pgdata/pg_clog/0202 failed: No such file or directory

Given that you don't have any actual clog segments beyond 0046, it would
seem that pg_toast_16854 contains a trashed tuple --- specifically, one
having a bogus xmin or xmax that's far beyond the existing range of
transaction IDs.

> Any hints besides doing an initdb?

You shouldn't need to initdb to get out of a problem with just one
table.  I'd look in pg_class to see which table this is the toast table
for (look for reltoastrelid = (oid of pg_toast_16854)).  Then see if
you can pg_dump that one table.  If so, drop the table and reload from
the dump.  If not, consider dropping the table anyway --- it beats
initdb for your whole database.
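
For example, the lookup and the dump/reload cycle could look roughly like
this (the toast table name is taken from the log above; "mytable" and
"yourdb" are placeholders for whatever the lookup returns and your
database name):

    -- find the table that owns the damaged toast table
    SELECT relname
      FROM pg_class
     WHERE reltoastrelid = (SELECT oid
                              FROM pg_class
                             WHERE relname = 'pg_toast_16854');

    -- then, from the shell:
    --   pg_dump -t mytable yourdb > mytable.dump
    --   psql yourdb -c "DROP TABLE mytable"
    --   psql yourdb -f mytable.dump
    --   (check the reloaded data before deleting the dump file)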

Another interesting question is whether the problem stems from a
hardware fault (eg, disk dropped a few bytes) or software (did Postgres
screw up?).  Perhaps you could just rename the broken table out of the
way, instead of dropping it, so as to preserve it for future analysis.
I for one would be interested in looking at the broken data.
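
If you do rename it, something along these lines should do (again,
"mytable" stands for whatever table the lookup above turns up):

    -- move the suspect table aside for later inspection
    ALTER TABLE mytable RENAME TO mytable_broken;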

            regards, tom lane