Re: beta testing version - Mailing list pgsql-hackers

From ncm@zembu.com (Nathan Myers)
Subject Re: beta testing version
Msg-id 20001130171529.P22345@store.zembu.com
In response to Re: beta testing version  ("Mitch Vincent" <mitch@venux.net>)
Responses Re: beta testing version  (Don Baccus <dhogaza@pacifier.com>)
List pgsql-hackers
On Thu, Nov 30, 2000 at 05:37:58PM -0800, Mitch Vincent wrote:
> > > No, WAL does help, cause you can then pull in your last dump and recover
> > > up to the moment that power cable was pulled out of the wall ...
> >
> > False, on so many counts I can't list them all.
> 
> Why? If we're not talking hardware damage and you have a dump made
> sometime previous to the crash, why wouldn't that work to restore the
> database? I've had to restore a corrupted database from a dump before,
> there wasn't any hardware damage, the database (more specifically the
> indexes) were corrupted. Of course WAL wasn't around but I don't see
> why this wouldn't work...

I posted a more detailed explanation a few minutes ago, but
it appears to have been eaten by the mailing list server.

I won't re-post the explanations that you all have seen over the 
last two days, about disk behavior during a power outage; they're 
in the archives (I assume -- when last I checked, web access to them 
didn't work).  Suffice it to say that if you pull the plug, there is 
just too much about the state of the disks that is unknown.

As for replaying logs against a restored snapshot dump... AIUI, a 
dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
the WAL won't work as a re-do log to recover your transactions 
because the TIDs of the restored tables are all different.   
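
A toy sketch of the OID/TID mismatch described above, in Python.  This
is not PostgreSQL code: ToyTable, replay, and the two-slots-per-block
layout are hypothetical names and assumptions made up for illustration.
The only point it tries to show is that a log keyed by physical position
can land on the wrong row once a dump and restore has repacked the table.

```python
# Toy model only: hypothetical names, not PostgreSQL internals.

class ToyTable:
    """Heap that stores each row at a physical slot (block, offset), i.e. a TID."""

    def __init__(self, slots_per_block=2):
        self.slots_per_block = slots_per_block
        self.slots = {}       # TID -> (oid, value)
        self.next_slot = 0    # next free physical position

    def insert(self, oid, value):
        tid = divmod(self.next_slot, self.slots_per_block)
        self.next_slot += 1
        self.slots[tid] = (oid, value)
        return tid

    def delete(self, tid):
        # Leaves a hole; the remaining rows keep their TIDs.
        del self.slots[tid]

    def dump(self):
        # Logical dump: rows identified by OID, physical positions discarded.
        return sorted(self.slots.values())

    @classmethod
    def restore(cls, rows, slots_per_block=2):
        # Reloading packs the rows densely, so TIDs are assigned afresh.
        table = cls(slots_per_block)
        for oid, value in rows:
            table.insert(oid, value)
        return table


def replay(table, log):
    """Apply TID-addressed log records: (tid, new_value)."""
    for tid, new_value in log:
        oid, _ = table.slots[tid]
        table.slots[tid] = (oid, new_value)


original = ToyTable()
tid_100 = original.insert(100, "a")   # lands at (0, 0)
tid_101 = original.insert(101, "b")   # lands at (0, 1)
tid_102 = original.insert(102, "c")   # lands at (1, 0)
original.delete(tid_100)              # hole at (0, 0)

snapshot = original.dump()            # the "last dump"

# After the dump, a transaction updates OID 101; the log records its TID.
log = [(tid_101, "B")]
replay(original, log)                 # correct on the original table

restored = ToyTable.restore(snapshot)
replay(restored, log)                 # same log, repacked table
```

In this toy run the restored table has OID 101 at (0, 0) and OID 102 at
(0, 1), so the replayed record overwrites OID 102 and leaves OID 101
untouched.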

To replay logged changes against a restored dump we would need an 
"update log", something that might be in 7.2 if somebody does a lot 
of work.

> Note I'm not saying you're wrong, just asking that you explain your
> comment a little more. If WAL can't be used to help recover from
> crashes where database corruption occurs, what good is it?

The WAL is a performance optimization for the current recovery
capabilities, which assume uncorrupted table files.  It protects
against those database server crashes that happen not to corrupt 
the table files (i.e. most).  It doesn't protect against corruption 
of the tables, whether by bugs in PG or the OS or by "hardware events".  
It also doesn't protect against OS crashes that result in 
write-buffered sectors not having been written before the crash.  
Practically, this means that WAL file entries older than a few 
seconds are not useful for much.
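
The write-buffering point can be sketched with another toy model in
Python.  ToyBufferedDisk and the sector names are hypothetical, and the
model assumes a cache that acknowledges writes before they reach stable
storage; real OS and drive caches are far messier than this.

```python
# Toy model only: hypothetical names, not a real disk or OS interface.

class ToyBufferedDisk:
    def __init__(self):
        self.platter = {}   # sector -> data that is durably stored
        self.cache = {}     # sector -> data accepted but not yet written

    def write(self, sector, data):
        # Reports success immediately; the data is only in the cache.
        self.cache[sector] = data

    def flush(self):
        # Everything buffered so far reaches stable storage.
        self.platter.update(self.cache)
        self.cache.clear()

    def crash(self):
        # Power loss or OS crash: buffered writes vanish.
        self.cache.clear()


disk = ToyBufferedDisk()
disk.write("table:7", "old row")
disk.write("wal:1", "update row in table:7")
disk.flush()

disk.write("table:7", "new row")   # acknowledged, still only in cache
disk.write("wal:2", "commit")      # acknowledged, still only in cache
disk.crash()

survivors = dict(disk.platter)
```

After the crash, survivors holds only the two sectors written before
the flush; both the table write and the log record that followed it
are gone, although each write was reported as successful.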

In general, it's foolish to expect a single system to store very
valuable data with much confidence.  To get full recoverability, 
you need a "hot failover" system duplicating your transactions in 
real time.  (Even then, you're vulnerable to application-level 
mistakes.)

Nathan Myers
ncm@zembu.com


