Re: vacuum error - Mailing list pgsql-general

From Tom Lane
Subject Re: vacuum error
Msg-id 28122.1047493941@sss.pgh.pa.us
In response to Re: vacuum error  (Eric Cholet <cholet@logilune.com>)
List pgsql-general
Eric Cholet <cholet@logilune.com> writes:
> I get this error when vacuuming a table:
> PANIC:  open of /usr/local/pgsql/data/pg_clog/0005 failed: No such file or directory
> using 7.3.2.

>> Hm, interesting.  You had any crashes recently?

> Yes, I've had many crashes. Always when vacuuming a largish (500 Mb) table.
> I suspected faulty hardware, so I dropped and recreated the tables
> several times.

Did that help?  What were the crash symptoms exactly --- are you talking
about previous occurrences of this same error message, or other things?
Anything interesting in the postmaster's stderr log?

>> Could you show us an
>> "ls -l" listing of those clog files (I want to know their sizes and
>> mod dates...)

> -rw-------  1 postgres  wheel  262144 Dec 30 03:49 0000
> -rw-------  1 postgres  wheel  262144 Jan  2 19:12 0001
> -rw-------  1 postgres  wheel  262144 Feb 12 12:30 0002
> -rw-------  1 postgres  wheel  262144 Mar 10 06:51 0003
> -rw-------  1 postgres  wheel  253952 Mar 12 17:53 0004

You still seem to be at least several tens of thousands of transactions
away from actually needing an 0005 clog segment.  (It'd be worth your
time to run pg_controldata and verify that the next transaction ID
counter is still short of 5 meg, i.e. 5242880.)
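
For anyone who wants to check that estimate against the listing, here is a
rough sketch of the arithmetic.  It assumes clog keeps two status bits per
transaction (four transactions per byte), which is consistent with a full
262144-byte segment covering 1 meg of transaction IDs and with the 5242880
figure above.  The variable names are invented for illustration only.

```shell
#!/bin/sh
# Assumption: 2 status bits per transaction, so 4 transactions per byte.
XACTS_PER_BYTE=4
SEGMENT_BYTES=262144      # size of a full segment, per the ls -l listing
LAST_SEGMENT_BYTES=253952 # current size of segment 0004, per the listing

# Transaction IDs covered by one full segment.
xacts_per_segment=$(( SEGMENT_BYTES * XACTS_PER_BYTE ))   # 1048576

# Segments 0000..0004 are five segments, so 0005 starts here.
first_xid_in_0005=$(( 5 * xacts_per_segment ))            # 5242880

# Upper bound on the XIDs the existing files can describe:
# four full segments plus the part of 0004 allocated so far.
covered=$(( 4 * xacts_per_segment + LAST_SEGMENT_BYTES * XACTS_PER_BYTE ))

# Minimum distance before segment 0005 would legitimately be needed.
gap=$(( first_xid_in_0005 - covered ))

echo "xacts per segment:  $xacts_per_segment"
echo "0005 starts at XID: $first_xid_in_0005"
echo "XIDs covered so far: $covered"
echo "minimum gap:        $gap"
```

The next transaction ID reported by pg_controldata should come out below
the "XIDs covered so far" number if this reasoning holds.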

I'm guessing that the problem is data corruption in the table that you
are vacuuming when you get the error.  If you're lucky it's just one row
broken with a bogus xmin (or xmax) transaction ID.

What you can do is manually create an 0005 segment file.  Make sure it
contains exactly 262144 zero bytes (dd from /dev/zero may help here).
Give it the same ownership and permissions as the existing files.  Then,
when you vacuum, the broken row will look like it came from a failed
transaction, and it should disappear automatically.
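
A minimal sketch of the dd approach follows.  make_clog_segment is a
made-up helper name, not a PostgreSQL tool, and the demonstration writes
into a scratch directory rather than a live pg_clog.  The chmod 600 mirrors
the -rw------- mode in the listing above; ownership is not changed here, so
on a real system the file would have to be created as (or chowned to) the
same user and group as the existing segments.

```shell
#!/bin/sh
# Hypothetical helper: create a zero-filled clog segment file.
#   $1 = directory to create the file in
#   $2 = segment file name, e.g. 0005
make_clog_segment() {
    dir=$1
    name=$2

    # Refuse to clobber an existing segment.
    if [ -e "$dir/$name" ]; then
        echo "refusing to overwrite $dir/$name" >&2
        return 1
    fi

    # 32 blocks of 8192 bytes = 262144 bytes, all zeros.
    dd if=/dev/zero of="$dir/$name" bs=8192 count=32 2>/dev/null || return 1

    # Same permissions as the existing files in the listing (-rw-------).
    chmod 600 "$dir/$name"
}

# Demonstration against a scratch directory, not a real data directory.
scratch=$(mktemp -d)
make_clog_segment "$scratch" 0005

# Verify the size and that the file holds nothing but zero bytes.
seg_size=$(( $(wc -c < "$scratch/0005") ))
nonzero=$(( $(tr -d '\000' < "$scratch/0005" | wc -c) ))
echo "size=$seg_size nonzero_bytes=$nonzero"
```

To use it for real, the first argument would be the pg_clog directory from
the error message, and an "ls -l" afterwards should show the new file
matching its neighbours in size, mode, and ownership.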

But you'd better look into the root cause of the problem.  Have you run
memory and disk diagnostics lately?

            regards, tom lane
