Thread: What happens when WAL fails?
If I put the pg_xlog directory on its own disk and that disk then fails, does that mean postgres is hosed, or does it just mean that postgres is no longer safe from a power outage? Does pg detect a problem with the WAL and then call fsync() on the database files if the WAL isn't working?
Joseph Shraibman wrote:
> If I put the pg_xlog directory on its own disk, then that disk fails,
> does that mean the postgres is hosed or does it just mean that postgres
> no longer safe from a power outage? Does pg detect a problem with the
> wal and then call fsync() on the database files if wal isn't working?

I'm guessing hosed, or at least potentially so. You'd fit a new disk, restart PG and it would complain that it couldn't re-run the WAL files. That implies that at least some of your transactions might be lost. Of course PITR would reduce the danger of this, even if you just copied the WAL to another disk on the same machine.

I don't know about fsync-ing database files in their absence I'm afraid.

--
Richard Huxton
Archonet Ltd
Joseph Shraibman <jks@selectacast.net> writes:
> If I put the pg_xlog directory on its own disk, then that disk fails,
> does that mean the postgres is hosed or does it just mean that postgres
> no longer safe from a power outage?

The latter. The WAL is actually write-only during normal operation.

However you need to define "fail". If it fails in such a way that the OS notices (which is likely) then the database is going to lock up because it can't write to WAL.

regards, tom lane
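To make the "write-only" point concrete, here is a minimal toy sketch in Python. It is not PostgreSQL's code or WAL format; `ToyWAL` and `commit` are hypothetical names invented for illustration. It only shows the general write-ahead pattern: the log is opened for appending, each change is written and fsync'ed before it is applied, and a write error reported by the OS stops the commit instead of being silently worked around.

```python
import os


class ToyWAL:
    """Toy append-only log (hypothetical; not PostgreSQL's WAL format)."""

    def __init__(self, path):
        # Write-only and append-only: normal operation never reads the log back.
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)

    def append(self, record: bytes) -> None:
        # If the OS reports a failure (for example EIO from a dead disk),
        # os.write or os.fsync raises OSError and the caller sees it.
        os.write(self.fd, record + b"\n")
        os.fsync(self.fd)

    def close(self) -> None:
        os.close(self.fd)


def commit(wal, table, key, value):
    """Log the change durably first; only then apply it to the in-memory table."""
    wal.append(f"{key}={value}".encode("utf-8"))
    table[key] = value
```

In this sketch a failed `append` leaves `table` untouched, which is the toy analogue of the database stalling once it can no longer write to the WAL.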
Tom Lane wrote:
> Joseph Shraibman <jks@selectacast.net> writes:
>>If I put the pg_xlog directory on its own disk, then that disk fails,
>>does that mean the postgres is hosed or does it just mean that postgres
>>no longer safe from a power outage?
>
> The latter. The WAL is actually write-only during normal operation.

Well, data loss is certainly possible. Suppose a power failure caused the machine to go down and (for whatever reason) also resulted in losing the disk on which the WAL is stored. Since recovery will not be possible, there will probably be data corruption.

-Neil
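The scenario Neil describes can be sketched with a toy model in Python. Everything here is an assumption made for illustration: `recover`, `demo`, and the `key=value` record format are hypothetical and have nothing to do with PostgreSQL's real recovery code. The snapshot stands for whatever had reached the data files before the crash, and the log holds changes that were committed but not yet written there.

```python
import os
import tempfile


def recover(snapshot: dict, wal_path: str) -> dict:
    """Replay 'key=value' records from a toy log onto the last on-disk snapshot."""
    state = dict(snapshot)
    if not os.path.exists(wal_path):
        # The log is gone: nothing can be replayed, so every change made
        # after the snapshot is lost.
        return state
    with open(wal_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            key, _, value = line.partition("=")
            state[key] = value
    return state


def demo():
    with tempfile.TemporaryDirectory() as d:
        wal_path = os.path.join(d, "wal.log")
        snapshot = {"a": "1"}  # what had reached the data files
        with open(wal_path, "w", encoding="utf-8") as f:
            f.write("b=2\n")  # committed and logged, not yet in the data files
        with_wal = recover(snapshot, wal_path)
        os.remove(wal_path)  # simulate losing the WAL disk in the same crash
        without_wal = recover(snapshot, wal_path)
        return with_wal, without_wal
```

The toy only shows committed changes going missing. In a real database the data files may also contain partially written pages after a crash, which is why losing the WAL at the same moment can mean corruption rather than just lost transactions.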