Re: database errors - Mailing list pgsql-hackers
From | Michael Brusser |
---|---|
Subject | Re: database errors |
Date | |
Msg-id | DEEIJKLFNJGBEMBLBAHCKEHBEKAA.michael@synchronicity.com |
In response to | Re: database errors (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: database errors |
List | pgsql-hackers |
It looks like the "No such file or directory" error followed by the abort signal resulted from manually removing logs. pg_resetxlog took care of this, but other problems persisted.

I got a copy of the database and installed it on the local partition. It does seem badly corrupted; these are some hard errors.

pg_dump fails and dumps core:

pg_dump: ERROR: XLogFlush: request 0/A971020 is not satisfied --- flushed only to 0/5000050
... lost synchronization with server, resetting connection

Looking at the core file:

(dbx) where 15
=>[1] _libc_kill(0x0, 0x6, 0x0, 0xffffffff, 0x2eaf00, 0xff135888), at 0xff19f938
  [2] abort(0xff1bc004, 0xff1c3a4c, 0x0, 0x7efefeff, 0x21c08, 0x2404c4), at 0xff13596c
  [3] elog(0x14, 0x267818, 0x0, 0xa971020, 0x0, 0x5006260), at 0x2407dc
  [4] XLogFlush(0xffbee908, 0xffbee908, 0x827e0,0x0, 0x0, 0x0), at 0x78530
  [5] BufferSync(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x18df2c
  [6] FlushBufferPool(0x2, 0x1e554,0x0, 0x30000, 0x0, 0xffbeea79), at 0x18e5c4
  [7] CreateCheckPoint(0x0, 0x0, 0x82c00, 0xff1bc004, 0x2212c, 0x83534), at 0x7d93c
  [8] BootstrapMain(0x5, 0xffbeec50, 0x10, 0xffbeec50, 0xffbeebc8, 0xffbeebc8), at 0x836bc
  [9] SSDataBase(0x3, 0x40a24a8a, 0x2e3800, 0x4, 0x2212c, 0x16f504), at 0x172590
  [10] ServerLoop(0x5091, 0x2e398c, 0x2e3800, 0xff1c2940, 0xff1bc004, 0xff1c2940), at 0x16f3a0
  [11] PostmasterMain(0x1, 0x323ad0, 0x2af000, 0x0, 0x65720000, 0x65720000), at 0x16ef88
  [12] main(0x1, 0xffbef68c, 0xffbef694, 0x2eaf08, 0x0, 0x0), at 0x12864c
======================
(I don't have the debug build at the moment to get more details)

This query fails:

LOG: query: select count (1) from note_links_aux;
ERROR: XLogFlush: request 0/A971020 is not satisfied --- flushed only to 0/5006260

drop table fails:

drop table note_links_aux;
ERROR: getObjectDescription: Rule 17019 does not exist

Are there any pointers as to why this could happen, aside from potential memory and disk problems?

As for NFS... I know how strongly the PostgreSQL community advises against it, but we have to face it: our customers ARE running on NFS and they WILL be running on NFS. Is there such a thing as "better" and "worse" NFS versions? (I made a note of what was said about hard mount vs. soft mount, etc.)

Tom, you recommended an upgrade from 7.3.2 to 7.3.6. Our next release is using v. 7.3.4 (maybe it's not too late to upgrade). Would v. 7.3.6 provide more protection against problems like this?

Thank you,
Mike

> -----Original Message-----
... ...
> The messages you quote certainly read like a badly corrupted database to
> me. In the case of a local filesystem I'd be counseling you to start
> running memory and disk diagnostics. That may still be appropriate
> here, but you had better also reconsider the decision to use NFS.
>
> If you're absolutely set on using NFS, one possibly useful tip is to
> make sure it's a hard mount not a soft mount. If your systems support
> NFS-over-TCP instead of UDP, that might be worth trying too.
>
> Also I would strongly advise an update to PG 7.3.6. 7.3.2 has serious
> known bugs.
>
> regards, tom lane
>
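
For readers trying to interpret the XLogFlush errors quoted in the message, here is a minimal Python sketch of how the two WAL positions in the error text compare. It assumes each position is printed as two hexadecimal numbers separated by a slash (log file id, then byte offset) and that positions are ordered by the first number and then the second. The function names `parse_wal_position` and `is_satisfied` are hypothetical helpers for illustration, not part of PostgreSQL.

```python
from typing import Tuple


def parse_wal_position(text: str) -> Tuple[int, int]:
    """Split a position such as '0/A971020' into two integers.

    Assumption: both halves are hexadecimal, as in the error text.
    """
    high, low = text.strip().split("/")
    return int(high, 16), int(low, 16)


def is_satisfied(request: str, flushed: str) -> bool:
    """Return True when the flushed position has reached the requested one.

    Tuples compare element by element, which matches the assumed ordering
    (log file id first, byte offset second).
    """
    return parse_wal_position(request) <= parse_wal_position(flushed)


if __name__ == "__main__":
    # Values copied from the two error messages in the post.
    for flushed in ("0/5000050", "0/5006260"):
        print("0/A971020", "vs", flushed, "satisfied:",
              is_satisfied("0/A971020", flushed))
```

In both reported errors the requested position lies well past the flushed position, which is at least consistent with data pages referring to WAL that no longer exists after the logs were removed and pg_resetxlog was run. The sketch only compares the numbers; it does not establish the cause.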