[Fwd: Re: Question about pgfsck] - Mailing list pgsql-admin
From | Dan Hrabarchuk |
---|---|
Subject | [Fwd: Re: Question about pgfsck] |
Date | |
Msg-id | 1097166324.6719.221.camel@dan.geotivity.net Whole thread Raw |
Responses |
Re: [Fwd: Re: Question about pgfsck]
|
List | pgsql-admin |
Is there a FAQ or a HOWTO on recovering from database corruption? If not, then I am willing to write it. I will need help getting the document together because as much as I google, I come up empty handed. What I need is something that tells you: A> You get errors in your server log like: PANIC: open of /psql/pg_clog/0C95 failed: No such file or directory - or - PANIC: ZeroFill failed to write /psql/pg_xlog/xlogtemp.27517: No space left on device :) - or - LOG: server process (pid 48146) was terminated by signal 6 - or - LOG: ReadRecord: record with zero length at 39D/A53F7800 - or - LOG: ReadRecord: unexpected pageaddr 39A/78D4000 in log file 922, segment 15, offset 9256960 - or - ERROR: Invalid page header in block 2879878 of channeldata etc. Which are serious? What needs to be done? B> How to find all corrupt tuples, whether they are in indexes or the tables themselves. C> How to eliminate the corrupt tuples. Delete them of course. But how when the server errors on deleting them?? pgfsck looks like an awesome tool. It allows you to search areas of your database that are rarely used. So you don't get a 1AM phone call from the sales guy needing some report for tomorrow at 7AM and he can't get it. I want to have something that can be easily digested and doesn't require advanced knowledge of postgres. What does the beginning PostgreSQL admin do? A database disaster guide. Thanks Dan Hrabarchuk -----Forwarded Message----- > From: Martijn van Oosterhout <kleptog@svana.org> > To: Dan Hrabarchuk <dan@kwasar.biz> > Subject: Re: Question about pgfsck > Date: Thu, 07 Oct 2004 10:38:33 +0200 > > The problem is obviously with the structure of the tuples. So you need > to identify it by something then does not require parsing the tuples. > Hence the suggestion of ctid or oid or something else fixed. > > Good luck, > > On Wed, Oct 06, 2004 at 02:57:31PM -0700, Dan Hrabarchuk wrote: > > > > > > 3. Finding the file with the error. > > > > > > And the tablename, ergo, filename > > > > > > > 4. Finally fixing the error. > > > > > > Delete the record is the usual way. > > > > Okay the following query: > > SELECT * FROM channeldata where cd_id=6268 and tstamp<???2004-09-20??? and > > tstamp>???2004-09-15???; > > > > Will cause a signal 6 and have the database backend restart. > > > > If I change it to: DELETE FROM channeldata where cd_id=6268 and > > tstamp<???2004-09-20??? and tstamp>???2004-09-15???; > > > > I get an error and the transaction does not commit. How do you delete > > the records? Or is this not the usual way. > > > > I can recover the data after it's deleted. I just need to delete it. > > > > Thanks again. > > > > Dan > >
pgsql-admin by date: