Re: A nightmare - Mailing list pgsql-admin
From | Mauri Sahlberg |
---|---|
Subject | Re: A nightmare |
Date | |
Msg-id | 1115200077.11341.84.camel@localhost.localdomain |
In response to | Re: A nightmare (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: A nightmare |
List | pgsql-admin |
On Mon, 2005-05-02 at 10:52 -0400, Tom Lane wrote:
> Mauri Sahlberg <Mauri.Sahlberg@claymountain.com> writes:
> > I'm starting to become desperate. On saturday I dumped all databases,
> > wiped whole postgresql installation. Installed newest rpms for Fedora 1,
> > restored databases. Recompiled client libraries and binaries. Restarted
> > and after five hours of operation:
> > May 1 21:34:19 claymountain postgres[6337]: [2-1] ERROR: could not
> > access status of transaction 4250811410
> > May 1 21:34:19 claymountain postgres[6337]: [2-2] DETAIL: could not
> > open file
> > "/var/lib/pgsql/data/pg_clog/0FD5": No such file or directory
>
> Which exactly are the "newest rpms for Fedora 1" ... what PG version
> and where did you get them from?
>

Name    : postgresql-server    Relocations: (not relocateable)
Version : 7.4.7                Vendor: (none)
Release : 2PGDG                Build Date: Fri 25 Feb 2005 01:42:54 PM EET

I got them from http://www.postgresql.org/ftp/binary/v7.4.7/rpms/fedora/fedora-core-1/

> It looks like a corrupt-data issue to me. You could follow the usual
> sorts of procedures to try to isolate and get rid of the bad data
> (see the list archives for details). But I think first you need to
> question what caused it. Could your disk drive be failing (or other
> hardware problem)? How much do you trust the specific kernel version
> you are currently running?

I have no control over the kernel version I am running. The server is hosted on a virtual machine, and the kernel reports itself as:

Linux claymountain.planeetta.com 2.4.20-021stab028.5.777-enterprise #1 SMP Tue Feb 22 17:44:46 MSK 2005 i686 i686 i386 GNU/Linux

I have no reason to trust or distrust it. I've tried to contact the virtual server provider, but so far the person who is supposed to know something about virtual servers has not been in and is not returning my calls. As far as I can tell, the hardware "looks" fine, at least as seen from inside a virtual server.
I moved the database that seemed to cause the corruption to another machine, and both servers have now been running happily for more than 24 hours without any sign of data corruption. I am happy but scared: I would still like to know what caused the corruption. My current guess is that it could be network related. The corruption occurred when the data was collected on a different machine from the one hosting the database. The collector is a C++ application using libpq++-4.0. The corruption could have something to do with locales and network errors.

Regards,
Mauri Sahlberg
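
For reference, the pg_clog file name in the error message can be derived from the transaction ID. The sketch below assumes a default PostgreSQL build (8 kB blocks, two status bits per transaction, 32 pages per pg_clog segment file). These constants and the helper name `clog_segment_name` are illustrative assumptions and are not stated anywhere in this thread.

```python
# Assumed defaults for a stock PostgreSQL build; not taken from the thread.
BLCKSZ = 8192                 # bytes per page
CLOG_XACTS_PER_BYTE = 4       # two status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32   # pages per pg_clog segment file

# Number of transaction IDs whose status fits in one segment file.
XACTS_PER_SEGMENT = BLCKSZ * CLOG_XACTS_PER_BYTE * SLRU_PAGES_PER_SEGMENT


def clog_segment_name(xid: int) -> str:
    """Return the pg_clog segment file name that would hold this xid's status."""
    segment = xid // XACTS_PER_SEGMENT
    return format(segment, "04X")  # four uppercase hex digits, zero padded


if __name__ == "__main__":
    # Transaction ID from the error message quoted in this thread.
    print(clog_segment_name(4250811410))
```

Under these assumptions, transaction 4250811410 maps to segment 0FD5, which matches the file the server could not open. A transaction ID that close to 2^32 looks implausible on a cluster restored only five hours earlier, so the lookup was probably driven by a garbled transaction ID in a data page, which would be consistent with the corrupt-data diagnosis rather than with a log file that was genuinely lost.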