Josh Berkus wrote:
> Thing is, if archive_command is failing, then the backup is useless
> regardless until it's fixed. And sending the archives to /dev/null (the
> fix you're essentially recommending above) doesn't make the backup any
> more useful.
That's not what I said to do first. If it's possible to fix your
archive_command, and it never returned bad "I'm saying success but I
didn't really do the right thing" information to the server--it just
failed--this situation is completely recoverable with no damage to the
backup. Just fix the archive_command, reload the configuration, and the
queue of archived files will flow and eventually your consistent backup
completes. This is the only behavior someone who is trying to recover
from a mistake in production is likely to find acceptable, and as Simon
has pointed out, that is what the current behavior is optimized for.
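To make that concrete, here's a minimal sketch of that recovery path,
assuming a simple local archive directory (the path below is just an
example, not anyone's actual setup):

    # postgresql.conf: corrected archive_command (example path only)
    archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

    # then tell the server to re-read its configuration; the archiver
    # retries the queued segments and the pending backup can finish
    pg_ctl reload -D $PGDATA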
Only in the situation where the archive_command was so bad that it
returned the wrong data to the server--saying the segment was saved but
it really wasn't--did I suggest that you might as well change
archive_command to go nowhere. In that case your backup is already
screwed; you've lost an essential piece of it.
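For what it's worth, the "go nowhere" setting I'm talking about is
nothing fancier than a command that always reports success without
storing anything:

    # discard segments: claims success, archives nothing
    # (only reasonable once the backup is already known to be broken)
    archive_command = '/bin/true'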
As far as your comment about treating this like it's a problem specific to
you, did you miss the part where I pointed out I was just expressing
concerns about poor visibility into this area ("what is the archiver
doing?") recently? I'm well aware this path is full of
difficult-to-escape-from holes. We just need to be careful not to do
something that screws over production users in the name of reducing the
learning curve.
--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us