Robert Haas <robertmhaas@gmail.com> wrote:
> ...with this patch, following the above, you get:
>
> FATAL: invalid record in WAL stream
> HINT: Take a new base backup, or remove recovery.conf and restart
> in read-write mode.
> LOG: startup process (PID 6126) exited with exit code 1
> LOG: terminating any other active server processes

If someone is sloppy about how they copy the WAL files around, they
could temporarily have a truncated file. If we want to be tolerant
of straight file copies, without a temporary name or location with a
move on completion, we would need some kind of retry or timeout. It
appears that you have this hard-coded to five retries. I'm not
saying this is a bad setting, but I always wonder about hard-coded
magic numbers like this. What's the delay between retries? How did
you arrive at five as the magic number?
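
For comparison, the "temporary name with a move on completion" approach
can be sketched roughly as below. This is only an illustration in
Python, not anything from the patch: the function name
copy_wal_segment and the ".tmp-" prefix are made up, and it assumes the
temporary file lives in the same directory as the final file so that
the rename is atomic on the filesystem in use.

```python
import os
import shutil
import tempfile


def copy_wal_segment(src, dest_dir):
    """Copy src into dest_dir so the final name never refers to a partial file.

    Hypothetical helper for illustration only.
    """
    final_path = os.path.join(dest_dir, os.path.basename(src))

    # Create the temporary file in the destination directory so that the
    # final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())  # push the data to disk before renaming
        # A reader looking for final_path sees either no file or the whole file.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

With a copy done this way, a reader never sees a short file under the
final name, so a retry loop would only matter for people doing
straight copies.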
-Kevin