Re: disaster recovery - Mailing list pgsql-general
From | Alex Satrapa |
---|---|
Subject | Re: disaster recovery |
Date | |
Msg-id | 3FC55833.8080806@lintelsys.com.au Whole thread Raw |
In response to | disaster recovery ("Jason Tesser" <JTesser@nbbc.edu>) |
Responses |
Re: disaster recovery
Re: disaster recovery |
List | pgsql-general |
Jason Tesser wrote: > We are evaluating Postgres and would like some input about disaster recovery. I'm going to try to communicate what I understand, and other list members can correct me at their selected level of vehemence :) Please send corrections to the list - I may take days to post follow-ups. > I know in MsSQL they have a feature called transactional > logs that would enable a database to be put back together based off those logs. A roughly parallel concept in PostgreSQL (what's the correct capitalisation and spelling?) is the "Write Ahead Log" (WAL). There is also a quite dissimilar concept called the query log - which is good to inspect for common queries to allow database tuning, but is not replay-able. The theory is that given a PostgreSQL database and the respective WAL, you can recreate the database to the time that the last entry of the WAL was written to disk. Some caveats though: 1) Under Linux, if you have the file system containing the WAL mounted with asynchronous writes, "all bets are off". The *BSD crowd (that I know of) take great pleasure in constantly reminding me that if the power fails, my file system will be in an indeterminate state - things could be half-written all over the file system. 2) If you're using IDE drives, under any operating system, and have write-caching turned on in the IDE drives themselves, again "all bets are off" 3) If you're using IDE drives behind a RAID controller, YMMV. So to play things safe, one recommendation to ensure database robustness is to: 1) Store the WAL on a separate physical drive 2) Under Linux, mount that file system with synchronous writes (ie: fsync won't return until the data is actually, really, written to the interface) 3) If using IDE drives, turn off write caching on the WAL volume so that you know data is actually written to disk when the drive claims it is. Note that disabling write caching will impact write performance significantly. Most people *want* write caching turned on for throughput-critical file systems, and turned off for mission-critical file systems. Note too that SCSI systems tend to have no "write cache" as such, since they use "tagged command queues". The OS can say to the SCSI drive something that is effectively, "here are 15 blocks of data to write to disk, get back to me when the last one is actually written to the media", and continue on its way. On IDE, the OS can only have one command outstanding - the purpose of the write cache is to allow multiple commands to be received and "acknowledged" before any data is actually written to the media. When the host is correctly configured, you can recover a PostgreSQL database from a hardware failure by recovering the database file itself and "replaying" the WAL to that database. Read more about WAL here: http://www.postgresql.org/docs/current/static/wal.html Regards Alex PS: Please send corrections to the list PPS: Don't forget to include "fire drills" as part of your disaster recovery plan - get plenty of practice at recovering a database from a crashed machine so that you don't make mistakes when the time comes that you actually need to do it! PPPS: And follow your own advice ;)
pgsql-general by date: