Simon Riggs wrote:
> If you were running pg_standby as the restore_command then this error
> wouldn't happen. So you need to explain why running pg_standby cannot
> solve your problem and why we must fix it by replicating code that has
> previously existed elsewhere.
pg_standby cannot be used with streaming replication.
I guess you're next question is: why not?
The startup process alternates between streaming, and restoring files
from archive using restore_command. It will progress using streaming as
long as it can, but if the connection is lost, it will try to poll the
archive until the connection is established again. The startup process
expects the restore_command to try to restore the file and fail if it's
not found. If the restore_command goes into sleep, waiting for the file
to arrive, that will defeat the retry logic in the server because the
startup process won't get control again to retry establishing the
connection.
That's the the essence of my proposal here:
http://archives.postgresql.org/message-id/4B50AFB4.4060902@enterprisedb.com
which is what has now been implemented.
To suppport a restore_command that does the sleeping itself, like
pg_standby, would require a major rearchitecting of the retry logic. And
I don't see why that'd desirable anyway. It's easier for the admin to
set up using simple commands like 'cp' or 'scp', than require him/her to
write scripts that handle the sleeping and retry logic.
The real problem we have right now is missing documentation. It's
starting to hurt us more and more every day, as more people start to
test this. As shown by this thread and some other recent posts.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com