Thanks for confirming what I thought might be happening. Now to try to convince the QA folks that this is
what's going on.
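
For the QA discussion, something like the following might help sort the "cannot stat" lines in the standby log by the kind of file restore_command was asked for. This is only a rough sketch: the function name classify_restore_failure and the labels it returns are made up for illustration, and it assumes timeline history files are named with 8 hex digits plus ".history" and WAL segments with 24 hex digits. The sample lines are the two from the log quoted below.

```python
import re

# Assumed naming conventions (see the note above):
#   timeline history file: 8 hex digits + ".history"
#   WAL segment file:      24 hex digits
HISTORY_RE = re.compile(r"^[0-9A-F]{8}\.history$")
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")

# Matches the error that cp prints when the requested file is not in the archive.
# Older coreutils open the quote with a backtick, so both styles are accepted.
CP_ERROR_RE = re.compile(
    r"cp: cannot stat [`'](?P<path>[^']+)': No such file or directory"
)


def classify_restore_failure(line):
    """Return (file_name, kind) for a cp 'cannot stat' line, else None.

    kind is one of "history", "segment" or "other" (hypothetical labels).
    """
    match = CP_ERROR_RE.search(line)
    if match is None:
        return None
    # Keep only the file name, dropping the archive directory.
    name = match.group("path").rsplit("/", 1)[-1]
    if HISTORY_RE.match(name):
        return name, "history"
    if SEGMENT_RE.match(name):
        return name, "segment"
    return name, "other"


if __name__ == "__main__":
    sample = [
        "cp: cannot stat '/mnt/wallogs/archive/0000000C.history': No such file or directory",
        "cp: cannot stat '/mnt/wallogs/archive/0000000B0000001900000077': No such file or directory",
    ]
    for log_line in sample:
        print(classify_restore_failure(log_line))
```

Lines that come back as "history" are the case Jerry describes below: the standby asking whether a timeline history file exists, and cp complaining when it does not.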
> On Oct 7, 2014, at 1:56 PM, Jerry Sievers <gsievers19@comcast.net> wrote:
>
> Sorry, meant 'restore_command' not archive_command. See below...
>
> John Scalia <jayknowsunix@gmail.com> writes:
>
>> Hi all,
>>
>> My setup is: PostgreSQL 9.3.3 running on CentOS 6.5 (kernel
>> 2.6.32-358.18.1.el6.x86_64) and I have 3 servers, one primary and two
>> hot standbys. In our failover and loss of communications testing, I
>> have seen a couple of issues that I find hard to explain. For instance,
>> we took one hot standby out of service by shutting down PostgreSQL on
>> it. Now, we're using hot standby with log shipping as an insurance policy,
>> so the WAL segments continued to be copied onto that out-of-service
>> standby for a few minutes. On restart, I see:
>>
>> cp: cannot stat '/mnt/wallogs/archive/0000000C.history': No such file or directory
>> cp: cannot stat '/mnt/wallogs/archive/0000000B0000001900000077': No such file or directory
>>
>> Later in the logfile, I see another failure for 0000000B.history.
>
> These .history files may or may not exist depending on whether timeline
> branching has been done.
>
> The only way Pg knows how to check for them is by invoking restore_command
> (not archive_command, as corrected above), which in your case is cp, hence
> the message.
>
> HTH
>
>>
>> In looking at the /mnt/wallogs/archive directory, those files aren't
>> there, but as the primary never had an issue and continued to copy WAL
>> segments to this directory, why was the standby looking for them? What
>> triggered this? Also, in that directory, I often see files generated
>> by the pg_basebackup command used to build the standby, files like
>> "0000000B0000001900000000.00000028.backup" or generally files ending
>> with .backup in their names. These never get removed automatically by
>> the standby server. We have to manually remove them. So, I'm guessing
>> they weren't necessary, so why did the primary copy them here using
>> its archive_command? Why aren't they removed by some mechanism on the
>> standby?
>>
>> --
>> Jay
>
> --
> Jerry Sievers
> Postgres DBA/Development Consulting
> e: postgres.consulting@comcast.net
> p: 312.241.7800