On Mon, 2012-04-30 at 17:23 -0400, Andrew Hannon wrote:
> 1. Is our data intact? PG eventually starts up, and it seems like once
> the streaming suffers the FATAL error, it falls back to performing log
> restores.
I don't see anything alarming there. Postgres will not start up if it
thinks it's really missing data.
I'd advise using a restore_command that does not output anything unless
it's something you really need to know. A log file missing from the
archive is normal operation for recovery mode, so notices telling you
that are just cluttering the log.
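As a rough illustration of a quiet command for fetching files from the archive during recovery (restore_command), here is a minimal Python sketch. The archive path, the script name, and the restore() helper are assumptions made up for this example, not anything from your setup. It relies only on the documented behaviour that %f is the requested file name, %p is the destination path, and a nonzero exit status means the file is not available.

```python
#!/usr/bin/env python3
"""Quiet restore script: copy a WAL file from the archive, print nothing if absent."""
import shutil
import sys
from pathlib import Path

ARCHIVE_DIR = Path("/mnt/archive")  # hypothetical archive location


def restore(archive_dir, wal_name, dest):
    """Return 0 if the file was copied, 1 if it is not in the archive."""
    src = Path(archive_dir) / wal_name
    if not src.is_file():
        # A missing file is normal during recovery, so stay silent
        # and just report failure through the exit status.
        return 1
    shutil.copy2(str(src), str(dest))
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # PostgreSQL substitutes %f (file name) and %p (destination path).
    sys.exit(restore(ARCHIVE_DIR, sys.argv[1], sys.argv[2]))
```

Assuming it were saved as /usr/local/bin/quiet_restore.py (a made-up path), the setting would look something like restore_command = '/usr/local/bin/quiet_restore.py %f "%p"'. Real errors, such as a permission problem during the copy, still surface as a Python traceback, which is the kind of output you do want in the log.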
> 2. What triggers this error? Too much time between log recovery,
> streaming startup and a low wal_keep_segments value (currently 128)?
128 sounds high enough; once the standby has caught up fully, that
should be plenty.
It looks like, while trying to catch up, the standby falls within the
128 segments and begins streaming, and then momentarily falls back out
and needs to restore from the archive.
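To make that guess concrete, here is a deliberately simplified Python sketch. It treats WAL segments as plain integers and assumes the primary keeps exactly wal_keep_segments old segments; in reality that setting is only a minimum and the primary may hold on to more. The can_stream() function and the segment numbers are hypothetical, chosen just to show the in-and-out behaviour.

```python
def can_stream(primary_segment, standby_segment, wal_keep_segments=128):
    """True if the segment the standby needs next is still kept on the primary.

    Simplified model: the primary retains exactly the last
    wal_keep_segments segments behind its current position.
    """
    oldest_kept = primary_segment - wal_keep_segments
    return standby_segment >= oldest_kept


# Standby is 120 segments behind: inside the window, streaming can start.
print(can_stream(primary_segment=1000, standby_segment=880))

# Primary advanced 20 segments while the standby replayed only 5:
# now 135 behind, outside the window, so it must go back to the archive.
print(can_stream(primary_segment=1020, standby_segment=885))
```

Under this model the standby only stays on streaming once it replays faster than the primary generates WAL for long enough to remain inside the window.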
Unless you have steady-state replication lag, it should catch up fully
and then just be able to use streaming all the time. Do you see it
resume streaming later on in the logfile?
Disclaimer: I'm not 100% confident in my response, so please take it
with a grain of salt, but I hope it is helpful anyway.
Regards,
Jeff Davis