On Mon, May 13, 2013 at 3:02 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Has anyone else thought about approaches to mitigating the problems
> that arise when an archive_command continually fails, and the DBA must
> manually clean up the mess?
Notably, the most common problem in this vein suffered at Heroku has
nothing to do with archive_command failing, and everything to do with
the ratio of block device write performance (hence, backlog) to
archiving performance. When the CPU is uncontended the deficit is not
huge, but it is there and it causes quite a bit of stress.
A failing archive_command is definitely a special case there, one where
it might be nice to bring write traffic to exactly zero for a time.
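
The arithmetic behind that ratio can be sketched roughly as follows. This
is only an illustration: the function names and the rates in the usage
note are hypothetical, not measurements from any real system, and it
assumes both rates stay constant, which real workloads will not.

```python
def wal_backlog_bytes(write_rate, archive_rate, seconds, initial_backlog=0):
    """Unarchived WAL (bytes) after `seconds`, given rates in bytes/second.

    The backlog grows by the difference between how fast WAL is written
    and how fast it is archived; it can shrink but never goes below zero.
    """
    growth = (write_rate - archive_rate) * seconds
    return max(0, initial_backlog + growth)


def seconds_until_full(write_rate, archive_rate, free_bytes):
    """Seconds until `free_bytes` of disk is consumed by the backlog.

    Returns None when archiving keeps up with writes (no deficit).
    """
    deficit = write_rate - archive_rate
    if deficit <= 0:
        return None
    return free_bytes / deficit
```

For example, with hypothetical rates of 50 MiB/s written and 45 MiB/s
archived, seconds_until_full(50 * 2**20, 45 * 2**20, 100 * 2**30) gives
20480 seconds, so 100 GiB of headroom lasts a bit under six hours. A
failing archive_command is the case where archive_rate is zero.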