Re: Version 7.2.3 unrecoverable crash on missing pg_clog - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Version 7.2.3 unrecoverable crash on missing pg_clog
Date
Msg-id 29509.1042124271@sss.pgh.pa.us
In response to Version 7.2.3 unrecoverable crash on missing pg_clog  (Andy Osborne <andy@sift.co.uk>)
List pgsql-bugs
Andy Osborne <andy@sift.co.uk> writes:
> One of our databases crashed yesterday with a bug that looks
> a lot like the non superuser vacuum issue that 7.2.3 was
> intended to fix, although we do our vacuum with a user that
> has usesuper=t in pg_user so I guess it's not that simple.

> FATAL 2:  open of /u0/pgdata/pg_clog/0726 failed: No such file or directory

What range of file names do you actually see in pg_clog?

The fixes in 7.2.3 were for cases that would try to access
already-removed clog segments (file numbers less than what's present).
In this case the accessed file name is large enough that I'm thinking
the problem is due to a garbage transaction number being passed to the
transaction-status-check code.  So my bet is on physical data corruption
in the table that was causing the problem.  It turns out that the first
detectable symptom of a trashed tuple header is often a failure like
this :-(.
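
To make the file-name reasoning concrete, here is a rough sketch of how a
transaction number would map to a pg_clog segment file name. The constants
(8 kB pages, 2 status bits per transaction, 32 pages per segment, four-hex-digit
file names) are assumptions about a default build, not something stated in this
message, and the function names are made up for illustration.

```python
# Assumed layout of the commit log (not taken from this message):
BLCKSZ = 8192                  # assumed default page size in bytes
XACTS_PER_BYTE = 4             # assumed 2 status bits per transaction
XACTS_PER_PAGE = BLCKSZ * XACTS_PER_BYTE
PAGES_PER_SEGMENT = 32         # assumed pages per segment file
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT  # 0x100000


def clog_segment_name(xid):
    """Hypothetical helper: segment file name that would hold this xid's status."""
    segno = (xid & 0xFFFFFFFF) // XACTS_PER_SEGMENT
    return format(segno, "04X")


def xid_range_for_segment(name):
    """Hypothetical helper: first and last xid covered by a segment file name."""
    first = int(name, 16) * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


if __name__ == "__main__":
    lo, hi = xid_range_for_segment("0726")
    print("segment 0726 covers xids %d..%d" % (lo, hi))
```

Under those assumptions, segment 0726 would cover transaction numbers
0x72600000 through 0x726FFFFF, so a tuple header would have to carry an xid
of roughly 1.9 billion for that file to be requested. Comparing that against
the range of file names actually present in pg_clog shows how far out of line
the requested segment is.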

You didn't happen to make a physical copy of the news table before
dropping it, did you?  It'd be interesting to examine the remains.
So far, the cases I have seen like this all seem to be due to hardware
faults, but we've seen it just often enough to make me wonder if there
is a software issue too.

            regards, tom lane
