Re: Database corruption? - Mailing list pgsql-general

From Tom Lane
Subject Re: Database corruption?
Msg-id 28353.1003796089@sss.pgh.pa.us
In response to Database corruption?  (Alvaro Herrera <alvherre@atentus.com>)
Responses Re: Database corruption?  (Alvaro Herrera <alvherre@atentus.com>)
Re: Database corruption?  (Alvaro Herrera <alvherre@atentus.com>)
List pgsql-general

Alvaro Herrera <alvherre@atentus.com> writes:
> FATAL 2:  XLogFlush: request is not satisfied

We had a previous report of this same failure message --- see
the thread starting at
http://fts.postgresql.org/db/mw/msg.html?mid=1033586

> And here is a backtrace taken from a core file I found lying around,
> which has a timestamp that makes me think it has something to say:

> (gdb) bt
> #0  0x4018cbf4 in memmove () from /lib/libc.so.6
> #1  0x08100f85 in PageRepairFragmentation ()
> #2  0x080ae9a7 in scan_heap ()
> #3  0x080adfb4 in vacuum_rel ()
> #4  0x080adbee in vac_vacuum ()
> #5  0x080adb68 in vacuum ()

It would be useful to look into that too, for sure, but I think it is
probably not related to your XLog problem.

> The database has been running for months without trouble. I'm now trying
> desperate measures, but I fear I will have to restore from backup (a week
> old). I have taken a tarball of the complete location (pg_xlog included and
> all that stuff) if anyone wants to see it (but it's 2 GB).

As I said to Denis in the earlier thread, it would be good to try to
track down which page is corrupted and maybe then we'd understand how
it got that way.  Since you have the database tarball, you have the
raw material to look into it --- you'd need to rebuild Postgres with
debug symbols enabled and trace back from the failure points to learn
more.  Are you up to that, or could you grant access to your machine to
someone who is?

As for your immediate problem, I'd counsel reducing that elog(STOP) to
elog(DEBUG) so that you can bring the database up, and then you can
try to pg_dump your current data.  You'll probably still want to
re-initdb and restore once you get a consistent dump.
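
To illustrate what downgrading the severity buys you, here is a minimal
stand-alone sketch. It is a toy model, not the PostgreSQL source: RecPtr,
rec_lt(), report() and flush_check() are hypothetical names made up for
this example, and it assumes the failing check compares the flushed log
position against the requested one. Only the message text and the
STOP/DEBUG level names come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy severity levels; only the names STOP and DEBUG come from the thread. */
typedef enum { LVL_DEBUG, LVL_STOP } Level;

/* Hypothetical stand-in for a log position: file id plus byte offset. */
typedef struct {
    uint32_t xlogid;
    uint32_t xrecoff;
} RecPtr;

/* True when position a lies strictly before position b. */
static bool rec_lt(RecPtr a, RecPtr b)
{
    return a.xlogid < b.xlogid ||
           (a.xlogid == b.xlogid && a.xrecoff < b.xrecoff);
}

/*
 * Stub reporter. Returns false when the simulated server would halt
 * (STOP) and true when it would only log and carry on (DEBUG).
 */
static bool report(Level lvl, const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "%s: %s\n", lvl == LVL_STOP ? "FATAL 2" : "DEBUG", msg);
    return lvl != LVL_STOP;
}

/*
 * Model of the check (an assumption about its shape): complain if the
 * flushed position is still behind the requested one. The on_failure
 * level decides whether that complaint stops the server.
 */
static bool flush_check(RecPtr flushed, RecPtr requested, Level on_failure)
{
    if (rec_lt(flushed, requested))
        return report(on_failure, "XLogFlush: request is not satisfied");
    return true;
}
```

With a flushed position behind the requested one, flush_check(..., LVL_STOP)
returns false (the server would go down), while flush_check(..., LVL_DEBUG)
logs the same message and returns true, which is what lets the postmaster
come up long enough to run pg_dump. The underlying inconsistency is still
there in both cases; the change only alters how loudly it is reported.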

Um, Vadim?  Still of the opinion that elog(STOP) is a good idea here?
That's two people now for whom that decision has turned localized
corruption into complete database failure.  I don't think it's a good
tradeoff.

            regards, tom lane
