Re: 8.3.5 broken after power fail SOLVED - Mailing list pgsql-admin

From Michael Monnerie
Subject Re: 8.3.5 broken after power fail SOLVED
Date
Msg-id 200902211358.59500@zmi.at
In response to Re: 8.3.5 broken after power fail SOLVED  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-admin
On Samstag 21 Februar 2009 Scott Marlowe wrote:
> We preach this again and again.  PostgreSQL can only survive a power
> outage type failure ONLY if the hardware / OS / filesystem don't lie
> about fsync.  If they do, all bets are off, and this kind of failure
> means you should really failover to another machine or restore a
> backup.

The shit thing is, I just discussed with the XFS devs last week whether
it is safe to run under a virtualization like VMware or XEN, and the
answer was "depends on the hypervisor". I had such an issue with VMware
2 years ago, and now with XEN, so I would say they are not safe. But
there must be something you can configure in order not to have such
drastic errors on power fail. It's just that nobody seems to know (or
want to tell) how to do that. At least, not to me ;-)
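
One rough way to get a hint whether a guest's storage stack is lying
about fsync is to time how many fsync'd rewrites of the same block it
completes per second. This is only a sketch under assumptions: the
function names, the 8 kB payload and the 7200 rpm default are made up
for illustration, and the "one commit per rotation" bound only applies
to a single rotating disk without a battery-backed cache. A high rate
is a reason to dig deeper, not proof, and a low rate does not prove
the setup is safe.

```python
import os
import tempfile
import time


def fsync_rate(directory, iterations=200, payload=b"x" * 8192):
    """Return the number of fsync'd rewrites per second in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            # Rewrite the same block each time, then force it to disk.
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, payload)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed if elapsed > 0 else float("inf")


def looks_suspicious(rate, rpm=7200):
    """True if `rate` exceeds one commit per platter rotation.

    Assumption: a single rotating disk with no battery-backed write
    cache, where rewriting one block cannot complete faster than the
    platter spins (rpm / 60 times per second).
    """
    return rate > rpm / 60.0
```

Run fsync_rate() against the directory that holds the Postgres data
(not /tmp, which may be on a different device) and pass the result to
looks_suspicious(). The only real proof is still a pull-the-plug test.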

> It's why you have to do possibly destructive tests to see if your
> server stands at least some chance of surviving this kind of failure,
> log shipping for recovery, and / or replication of another form
> (slony etc...) to have a reliable server.

As I need another Postgres setup anyway, with one server syncing dbmail
to another, I guess I'll do that with WAL shipping, so at least then I
can recover up to the latest entry.

> The recommendations for recovery of data are just that, recovery
> oriented.  They can't fix a broken database at that point.  You need
> to take it offline after this kind of failure if you can't trust your
> hardware.
>
> Usually when it finds something wrong it just won't start up.

The problem was I wasn't working this week, and did just a basic check
whether everything was up again. There were e-mails arriving, so I
thought it was OK. I was very pissed when some days later I found
strange things happening, and then saw that a table was broken and had
eaten nearly all e-mails. If only Postgres had whined and stopped
working...

I know it's not Postgres' fault that fsync is messed up, but error
recovery should at least have found the problem, at the latest when the
first transaction touched the problematic table, instead of effectively
throwing the data to /dev/null :-(

mfg zmi
--
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4

