Thread: CRITICAL HELP NEEDED! DEAD DB!
Sep 24 10:22:37 snafu postgres[18306]: [2-1] LOG: database system was interrupted while in recovery at 2004-09-24 10:21:41 MST
Sep 24 10:22:37 snafu postgres[18306]: [2-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
Sep 24 10:22:37 snafu postgres[18306]: [3-1] LOG: checkpoint record is at 9A/C2022368
Sep 24 10:22:37 snafu postgres[18306]: [4-1] LOG: redo record is at 9A/C2022368; undo record is at 0/0; shutdown FALSE
Sep 24 10:22:37 snafu postgres[18306]: [5-1] LOG: next transaction ID: 197841225; next OID: 715436086
Sep 24 10:22:37 snafu postgres[18306]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress
Sep 24 10:22:37 snafu postgres[18306]: [7-1] LOG: redo starts at 9A/C20223B0
Sep 24 10:22:37 snafu postgres[18306]: [8-1] PANIC: btree_insert_redo: failed to add item
Sep 24 10:22:37 snafu postgres[18299]: [2-1] LOG: startup process (PID 18306) was terminated by signal 6
Sep 24 10:22:37 snafu postgres[18299]: [3-1] LOG: aborting startup due to startup process failure

Any suggestions to recover?! I'm dead in the water! Please!!!
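When PostgreSQL logs to syslog, each server message is tagged with a sequence number and a part number (the `[2-1]`, `[2-2]` markers above), and long messages are split across several syslog lines. The following is a minimal sketch of how such output could be stitched back into whole messages so that the PANIC and ERROR entries stand out. The function name `stitch` and the regular expressions are illustrative assumptions based only on the line shape seen in this thread, not on any PostgreSQL tooling.

```python
import re
from collections import OrderedDict

# Matches the shape seen in this thread: "... postgres[PID]: [SEQ-PART] text"
LINE_RE = re.compile(r"postgres\[(\d+)\]: \[(\d+)-(\d+)\] (.*)")
# A leading severity tag such as "LOG:", "ERROR:" or "PANIC:"
SEVERITY_RE = re.compile(r"^([A-Z]+):\s+(.*)$")


def stitch(lines):
    """Join multi-part syslog entries; return (pid, severity, message) tuples."""
    grouped = OrderedDict()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a postgres syslog line in the expected shape
        pid, seq, part, text = match.groups()
        grouped.setdefault((int(pid), int(seq)), []).append((int(part), text.strip()))

    messages = []
    for (pid, _seq), parts in grouped.items():
        # Parts are ordered by their part number, then joined with spaces.
        text = " ".join(chunk for _, chunk in sorted(parts))
        severity = SEVERITY_RE.match(text)
        if severity:
            messages.append((pid, severity.group(1), severity.group(2)))
        else:
            messages.append((pid, None, text))
    return messages


if __name__ == "__main__":
    sample = [
        "Sep 24 10:22:37 snafu postgres[18306]: [7-1] LOG: redo starts at 9A/C20223B0",
        "Sep 24 10:22:37 snafu postgres[18306]: [8-1] PANIC: btree_insert_redo: failed to add item",
    ]
    for pid, severity, message in stitch(sample):
        if severity in ("ERROR", "FATAL", "PANIC"):
            print(pid, severity, message)
```

Grouping by process ID and sequence number keeps entries from the startup process (18306) and the postmaster (18299) apart even though both reuse low sequence numbers.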
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Cott Lang
> Sent: Friday, September 24, 2004 10:21 AM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] CRITICAL HELP NEEDED! DEAD DB!
>
> Sep 24 10:22:37 snafu postgres[18306]: [2-1] LOG: database system was interrupted while in recovery at 2004-09-24 10:21:41 MST
> Sep 24 10:22:37 snafu postgres[18306]: [2-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
> Sep 24 10:22:37 snafu postgres[18306]: [3-1] LOG: checkpoint record is at 9A/C2022368
> Sep 24 10:22:37 snafu postgres[18306]: [4-1] LOG: redo record is at 9A/C2022368; undo record is at 0/0; shutdown FALSE
> Sep 24 10:22:37 snafu postgres[18306]: [5-1] LOG: next transaction ID: 197841225; next OID: 715436086
> Sep 24 10:22:37 snafu postgres[18306]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress
> Sep 24 10:22:37 snafu postgres[18306]: [7-1] LOG: redo starts at 9A/C20223B0
> Sep 24 10:22:37 snafu postgres[18306]: [8-1] PANIC: btree_insert_redo: failed to add item
> Sep 24 10:22:37 snafu postgres[18299]: [2-1] LOG: startup process (PID 18306) was terminated by signal 6
> Sep 24 10:22:37 snafu postgres[18299]: [3-1] LOG: aborting startup due to startup process failure
>
> Any suggestions to recover?! I'm dead in the water! Please!!!

When did you do your last backup?

This message is a clue:
"HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery."

If you do a restore from your last backup, you will lose the data between that time and the time of the problem. Any other solution will be fraught with peril, I think.

Otherwise, maybe something here will help:
http://svana.org/kleptog/pgsql/pgfsck.html
Cott Lang <cott@internetstaff.com> writes:
> Sep 24 10:22:37 snafu postgres[18306]: [2-1] LOG: database system was interrupted while in recovery at 2004-09-24 10:21:41 MST
> Sep 24 10:22:37 snafu postgres[18306]: [2-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
> Sep 24 10:22:37 snafu postgres[18306]: [3-1] LOG: checkpoint record is at 9A/C2022368
> Sep 24 10:22:37 snafu postgres[18306]: [4-1] LOG: redo record is at 9A/C2022368; undo record is at 0/0; shutdown FALSE
> Sep 24 10:22:37 snafu postgres[18306]: [5-1] LOG: next transaction ID: 197841225; next OID: 715436086
> Sep 24 10:22:37 snafu postgres[18306]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress
> Sep 24 10:22:37 snafu postgres[18306]: [7-1] LOG: redo starts at 9A/C20223B0
> Sep 24 10:22:37 snafu postgres[18306]: [8-1] PANIC: btree_insert_redo: failed to add item
> Sep 24 10:22:37 snafu postgres[18299]: [2-1] LOG: startup process (PID 18306) was terminated by signal 6
> Sep 24 10:22:37 snafu postgres[18299]: [3-1] LOG: aborting startup due to startup process failure

> Any suggestions to recover?! I'm dead in the water! Please!!!

I think your only chance is pg_resetxlog. Be aware that you won't necessarily have a consistent database afterwards --- in particular, whichever index that failure is about is certainly broken. I'd recommend a dump and reload, plus as much manual verification of data consistency as you can manage.

How did you get into this state, anyway?

regards, tom lane
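For readers following along, the sequence described above (reset the WAL with pg_resetxlog, then dump and reload) can be laid out as a dry-run plan. This is a minimal sketch, not a vetted procedure. The function name `recovery_plan` and the paths `/var/lib/pgsql/data`, `/var/lib/pgsql/data.new`, and `/backup/dump.sql` are hypothetical, the file-level copy before the reset is an added precaution that nobody in the thread mentions, and the script only prints commands rather than running them. Check every command against the documentation for your PostgreSQL version before using it.

```python
import shlex


def recovery_plan(datadir, newdir, dumpfile):
    """Return shell command lines, in order, for a reset followed by dump and reload.

    Nothing is executed here; the caller decides what to run and when.
    """
    q = shlex.quote
    return [
        # Added precaution (not from the thread): with the server stopped,
        # keep a file-level copy of the damaged cluster before touching it.
        f"cp -a {q(datadir)} {q(datadir + '.broken')}",
        # -n prints the control values pg_resetxlog would use and changes nothing.
        f"pg_resetxlog -n {q(datadir)}",
        # The actual reset; the cluster may be inconsistent afterwards.
        f"pg_resetxlog {q(datadir)}",
        f"pg_ctl -D {q(datadir)} start",
        # Dump everything while the damaged cluster is running.
        f"pg_dumpall > {q(dumpfile)}",
        f"pg_ctl -D {q(datadir)} stop",
        # Reload into a freshly initialized cluster (hypothetical location).
        f"initdb -D {q(newdir)}",
        f"pg_ctl -D {q(newdir)} start",
        f"psql -f {q(dumpfile)} template1",
    ]


if __name__ == "__main__":
    for step in recovery_plan("/var/lib/pgsql/data",
                              "/var/lib/pgsql/data.new",
                              "/backup/dump.sql"):
        print(step)
```

Reloading into a fresh cluster rebuilds every index from the table data, which covers the broken index mentioned above. It does not verify the table data itself, so the manual consistency checks are still needed.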
On Fri, 2004-09-24 at 11:43, Tom Lane wrote:
>
> I think your only chance is pg_resetxlog. Be aware that you won't
> necessarily have a consistent database afterwards --- in particular,
> whichever index that failure is about is certainly broken. I'd
> recommend a dump and reload, plus as much manual verification of data
> consistency as you can manage.

That's what I've done, so far so good, although we are still checking consistency against the last backup. Thanks for the info. Luckily this was one of our smaller databases ...

> How did you get into this state, anyway?

I wish I knew - this is what appeared to start it:

Sep 24 10:19:41 snafu postgres[18176]: [464-1] ERROR: could not open segment 1 of relation "idx_ordl_id" (target block 1719234412): No such file or
Sep 24 10:19:41 snafu postgres[18176]: [464-2] directory

I can't figure out what the exact problem is; there were no I/O errors or any other relevant messages at the time, the box was empty, and nothing remarkable was going on. :(

thanks,
Cott

PS: No, I don't think it's a PG problem. :)
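The target block number in that error is worth a quick sanity check. The sketch below works out where block 1719234412 would sit on disk, assuming the default build settings of 8 KB blocks and 1 GB segment files. Both values are assumptions, since the thread does not say how this server was built, and `locate_block` is just an illustrative helper name.

```python
BLOCK_SIZE = 8192                 # assumed default block size, in bytes
SEGMENT_BYTES = 1024 ** 3         # assumed default segment file size (1 GB)
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE


def locate_block(block_number):
    """Return (segment file number, block offset within that segment)."""
    return divmod(block_number, BLOCKS_PER_SEGMENT)


if __name__ == "__main__":
    target = 1719234412           # from the "could not open segment" error
    segment, offset = locate_block(target)
    size_tib = target * BLOCK_SIZE / 1024 ** 4
    print(f"segment {segment}, block {offset} within it, "
          f"about {size_tib:.1f} TiB into the relation")
```

Under those assumptions the block would fall in segment 13116, roughly 12.8 TiB into the index, which is implausible for an index on one of the smaller databases. That is consistent with a corrupted block pointer being followed rather than a segment file having genuinely gone missing, though the arithmetic alone does not say where the corruption came from.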
Does pgfsck work on 7.4.x?

> Otherwise, maybe something here will help:
> http://svana.org/kleptog/pgsql/pgfsck.html
Cott Lang wrote:
> I wish I knew - this is what appeared to start it:
>
> Sep 24 10:19:41 snafu postgres[18176]: [464-1] ERROR: could not open segment 1 of relation "idx_ordl_id" (target block 1719234412): No such file or
> Sep 24 10:19:41 snafu postgres[18176]: [464-2] directory
>
> I can't figure out what the exact problem is; there were no I/O errors
> or any other relevant messages at the time, the box was empty, and
> nothing remarkable was going on. :(

I saw that exact error message, with no logged I/O system errors, when using SAN-attached storage a month or so ago. It turned out to be the SAN silently corrupting files. We did eventually start to see SCSI errors, but not at the beginning.

Joe
For starters, a little more detail would be helpful, for example:

What version of PostgreSQL?
What OS?
What compiler?
What happened that caused this? Server crash?

Matthew

Cott Lang wrote:
>Sep 24 10:22:37 snafu postgres[18306]: [2-1] LOG: database system was interrupted while in recovery at 2004-09-24 10:21:41 MST
>Sep 24 10:22:37 snafu postgres[18306]: [2-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
>Sep 24 10:22:37 snafu postgres[18306]: [3-1] LOG: checkpoint record is at 9A/C2022368
>Sep 24 10:22:37 snafu postgres[18306]: [4-1] LOG: redo record is at 9A/C2022368; undo record is at 0/0; shutdown FALSE
>Sep 24 10:22:37 snafu postgres[18306]: [5-1] LOG: next transaction ID: 197841225; next OID: 715436086
>Sep 24 10:22:37 snafu postgres[18306]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress
>Sep 24 10:22:37 snafu postgres[18306]: [7-1] LOG: redo starts at 9A/C20223B0
>Sep 24 10:22:37 snafu postgres[18306]: [8-1] PANIC: btree_insert_redo: failed to add item
>Sep 24 10:22:37 snafu postgres[18299]: [2-1] LOG: startup process (PID 18306) was terminated by signal 6
>Sep 24 10:22:37 snafu postgres[18299]: [3-1] LOG: aborting startup due to startup process failure
>
>Any suggestions to recover?! I'm dead in the water! Please!!!