Eric Cholet <cholet@logilune.com> writes:
> I get this error when vacuuming a table:
> PANIC: open of /usr/local/pgsql/data/pg_clog/0005 failed: No such file
> or directory
> using 7.3.2.
>> Hm, interesting. You had any crashes recently?
> Yes, I've had many crashes. Always when vacuuming a largish (500 Mb) table.
> I suspected faulty hardware, so I dropped and recreated the tables
> several times.
Did that help? What were the crash symptoms exactly --- are you talking
about previous occurrences of this same error message, or other things?
Anything interesting in the postmaster's stderr log?
>> Could you show us an
>> "ls -l" listing of those clog files (I want to know their sizes and
>> mod dates...)
> -rw------- 1 postgres wheel 262144 Dec 30 03:49 0000
> -rw------- 1 postgres wheel 262144 Jan 2 19:12 0001
> -rw------- 1 postgres wheel 262144 Feb 12 12:30 0002
> -rw------- 1 postgres wheel 262144 Mar 10 06:51 0003
> -rw------- 1 postgres wheel 253952 Mar 12 17:53 0004
You seem to be still at least several tens of thousands of transactions
away from actually needing an 0005 clog segment. (It'd be worth your
time to run pg_controldata and verify that the next transaction ID
counter is still short of 5meg, ie 5242880.)
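(For reference, each clog segment stores two status bits per transaction,
i.e. four transactions per byte, so a 262144-byte segment covers 1048576
transactions and segment 0005 is first needed at transaction ID:

```shell
# 5 segments * 262144 bytes/segment * 4 xacts/byte
echo $((5 * 262144 * 4))
# 5242880
```

which is where the 5242880 figure comes from.)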
I'm guessing that the problem is data corruption in the table that you
are vacuuming when you get the error. If you're lucky it's just one row
broken with a bogus xmin (or xmax) transaction ID.
What you can do is manually create an 0005 segment file. Make sure it
contains exactly 262144 zero bytes (dd from /dev/zero may help here).
Give it the same ownership and permissions as the existing files. Then,
when you vacuum, the broken row will look like it came from a failed
transaction, and it should disappear automatically.
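Something along these lines should do it (shown here against a scratch
directory; in your case the target is the pg_clog directory under your
data directory, and the commands should be run as the postgres user):

```shell
# Demonstration in a scratch directory; substitute your real
# pg_clog directory (e.g. /usr/local/pgsql/data/pg_clog) in practice.
mkdir -p /tmp/pg_clog_demo

# 32 blocks of 8192 zero bytes = exactly 262144 bytes, matching
# the size of the existing full segments.
dd if=/dev/zero of=/tmp/pg_clog_demo/0005 bs=8192 count=32 2>/dev/null

# Match the ownership and permissions of the existing files
# (-rw------- postgres); chown requires root or the postgres user.
chmod 600 /tmp/pg_clog_demo/0005

ls -l /tmp/pg_clog_demo/0005
```

Double-check the byte count with "wc -c" before vacuuming --- a
wrong-sized segment will just produce more errors.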
But you'd better look into the root cause of the problem. Have you run
memory and disk diagnostics lately?
regards, tom lane