Thread: FATAL 2: open of /usr/local/pgsql/data/pg_clog/0943 failed
I am getting this error in the postgres log:

FATAL 2: open of /usr/local/pgsql/data/pg_clog/0943 failed: No such file or directory

When it happened I was indexing a big table (~32M rows). Small selects, inserts and updates all seem to work fine, but when I try to do select count(*) on the big table or index it, the server crashes and automatically restarts (the file name in the error message changes from time to time).

There is about 3.5GB of free disk space.
Postgres version is 7.2.1.
There are no other unusual messages in the log.

What happened? How can I fix this problem?
Google didn't help much :(

Mark

P.S. Additional info: the last files in /usr/local/pgsql/data/pg_clog are

-rw-------  1 postgres  daemon  262144 Jan 31 16:17 0099
-rw-------  1 postgres  daemon  262144 Feb  3 09:11 009A
-rw-------  1 postgres  daemon  262144 Feb  3 16:26 009B
-rw-------  1 postgres  daemon  262144 Feb  4 13:54 009C
-rw-------  1 postgres  daemon  262144 Feb  5 11:11 009D
-rw-------  1 postgres  daemon  262144 Feb  5 21:32 009E
-rw-------  1 postgres  daemon  262144 Feb  7 10:52 009F
-rw-------  1 postgres  daemon  262144 Feb  7 14:35 00A0
-rw-------  1 postgres  daemon  262144 Feb  9 21:57 00A1
-rw-------  1 postgres  daemon  262144 Feb 10 16:29 00A2
-rw-------  1 postgres  daemon  262144 Feb 11 13:06 00A3
-rw-------  1 postgres  daemon  262144 Feb 11 23:22 00A4
-rw-------  1 postgres  daemon  262144 Feb 12 15:53 00A5
-rw-------  1 postgres  daemon  262144 Feb 13 13:52 00A6
-rw-------  1 postgres  daemon  262144 Feb 14 10:33 00A7
-rw-------  1 postgres  daemon  262144 Feb 15 13:27 00A8
-rw-------  1 postgres  daemon  262144 Feb 17 12:44 00A9
-rw-------  1 postgres  daemon  237568 Feb 17 20:44 00AA

That's all
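A note on the listing above: the segment names are hexadecimal, so the distance between the requested file and the existing ones can be put in numbers. The sketch below is only an illustration. It assumes the usual clog layout of two status bits per transaction (four transactions per byte), which is not stated in this thread, and takes the 262144-byte segment size from the listing. The function names are made up.

```python
# Assumptions (not stated in the thread): clog keeps 2 status bits per
# transaction, i.e. 4 transactions per byte. The segment size is taken
# from the 262144-byte files in the listing.
SEGMENT_BYTES = 262144
XACTS_PER_BYTE = 4
XACTS_PER_SEGMENT = SEGMENT_BYTES * XACTS_PER_BYTE


def first_xid_in_segment(name: str) -> int:
    """First transaction ID whose status would live in segment `name` (hex)."""
    return int(name, 16) * XACTS_PER_SEGMENT


def xids_covered(last_name: str, last_size: int) -> int:
    """Transaction IDs covered by all segments up to a partly filled last one."""
    return first_xid_in_segment(last_name) + last_size * XACTS_PER_BYTE


if __name__ == "__main__":
    # Last existing segment in the listing: 00AA, 237568 bytes.
    print(xids_covered("00AA", 237568))
    # Segment named in the error message.
    print(first_xid_in_segment("0943"))
```

Under these assumptions the existing files cover transaction IDs below roughly 179 million, while segment 0943 would begin near 2.49 billion, which suggests the lookup was for a transaction ID this database never issued.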
Martins Zarins <mark@vestnesis.lv> writes:
> FATAL 2: open of /usr/local/pgsql/data/pg_clog/0943 failed: No such file or
> directory

You evidently have a row with a corrupted transaction number in that table. The system is trying to look up the status of that transaction, and it's failing because the number is far beyond the actually valid range of transaction numbers in your database.

Frequently, this failure is just the first detectable symptom of a completely-corrupted page. But it might just be the one row that's bad.

If you want to try to narrow down where the corruption is, you can experiment with commands like

    select ctid,* from big_table offset N limit 1;

This will fail with the clog-open error for all N greater than some critical value, which you can home in on by trial and error. Once you know the largest safe N, the ctid reported for that N tells you a block number just before the broken tuple or page. Armed with that, you can look for trouble using a hex editor or pg_filedump (but I recommend pg_filedump --- see http://sources.redhat.com/rhdb/tools.html).

If you aren't interested in investigating, you could recover by just dropping the table and recreating it from backup. (I hope you have a backup, as you have certainly lost at least one row and possibly several pages' worth.)

In any case, it'd be a good idea to run some memory and disk diagnostics to try to determine what caused the data corruption.

regards, tom lane
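The trial-and-error search for the largest safe N can be done as a bisection, which needs about 25 probes for a 32M-row table. A minimal sketch follows. Here `probe` is a hypothetical callable, not part of PostgreSQL, that runs `select ctid,* from big_table offset N limit 1` through whatever client is at hand and returns True if the query succeeded. The sketch assumes, as described above, that every N past the critical value fails. Since a failing query takes the server down and restarts it, a real probe would have to reconnect after each failure.

```python
from typing import Callable, Optional


def largest_safe_offset(probe: Callable[[int], bool], row_count: int) -> Optional[int]:
    """Return the largest N in [0, row_count) for which probe(N) succeeds.

    Assumes probe succeeds for every N up to some critical value and fails
    for every N beyond it. Returns None if even N = 0 fails.
    """
    lo, hi = 0, row_count - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe(mid):
            best = mid       # mid is safe; look for a larger safe offset
            lo = mid + 1
        else:
            hi = mid - 1     # mid already hits the damage; look lower
    return best


if __name__ == "__main__":
    # Stand-in probe for demonstration only: pretend every offset above
    # 12345 runs into the bad tuple.
    def fake_probe(n: int) -> bool:
        return n <= 12345

    print(largest_safe_offset(fake_probe, 32_000_000))
```

The ctid returned by the real query at the resulting N is then the starting point for pg_filedump.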
You really need to upgrade to 7.2.3/7.2.4. This issue is fixed in 7.2.3.

To fix this problem: stop the postmaster, create a dummy clog file filled with zeros (from /dev/zero), then start the postmaster again. To be safe, you can then dump all the data (you may need to repeat the dummy clog step) and reload it from scratch.

Regards,
Bjoern

----- Original Message -----
From: "Martins Zarins" <mark@vestnesis.lv>
To: <pgsql-admin@postgresql.org>
Sent: Monday, February 17, 2003 8:03 PM
Subject: [ADMIN] FATAL 2: open of /usr/local/pgsql/data/pg_clog/0943 failed
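The dummy clog step can be sketched as follows. This is an illustration only: the 262144-byte size is taken from the full segments in the listing earlier in the thread, the helper name is invented, and the demo writes into a temporary directory rather than a real data directory. On a real system the postmaster would have to be stopped first, as described above, and the file would need to be owned by the postgres user.

```python
import os
import tempfile

# Assumption: size of a full segment, as seen in the pg_clog listing.
SEGMENT_BYTES = 262144


def make_dummy_clog(clog_dir: str, segment_name: str, size: int = SEGMENT_BYTES) -> str:
    """Create a zero-filled segment file and return its path.

    Refuses to overwrite an existing file, so a real segment is never clobbered.
    """
    path = os.path.join(clog_dir, segment_name)
    # Mode "xb" is exclusive binary create: FileExistsError if the file exists.
    with open(path, "xb") as f:
        f.write(b"\x00" * size)
    os.chmod(path, 0o600)  # same permissions as -rw------- in the listing
    return path


if __name__ == "__main__":
    # Demo in a scratch directory; "0943" is the name from the error message.
    with tempfile.TemporaryDirectory() as scratch:
        created = make_dummy_clog(scratch, "0943")
        print(created, os.path.getsize(created))
```

Because the file name in the error message changes from time to time, the step may have to be repeated with a different segment name each time a new missing file is reported.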