On 01/08/2014 02:32 PM, Matheus de Oliveira wrote:
> On Tue, Jan 7, 2014 at 10:42 PM, Matheus de Oliveira <
> matioli.matheus@gmail.com> wrote:
>
>>> How did you set up the standby? Did you initialize it from an offline
>>> backup of the master's data directory, perhaps? The log shows that the
>>> startup took the "crash recovery first, then start archive recovery"
>>> path, because there was no backup label file. In that mode, the standby
>>> assumes that the system is consistent after replaying all the WAL in
>>> pg_xlog, which is correct if you initialize from an offline backup or
>>> atomic filesystem snapshot, for example. But "WAL contains references to
>>> invalid pages" could also be a symptom of an inconsistent base backup,
>>> caused by an incorrect backup procedure. In particular, I have to ask because
>>> I've seen it before: you didn't delete backup_label from the backup, did
>>> you?
>>
>> Well, I cannot answer this right now, but it makes perfect sense and is possible.
>
> I've just confirmed it. That was indeed the case: the script was removing the
> backup_label. I've removed that line and synced again, and it has been
> running nicely (for the past hour at least).
A-ha! ;-)
> Thank you guys for all your help, and sorry for all the confusion I caused.
That seems to be a very common mistake to make. I wish we could do
something about it. Do you think it would've helped in your case if
there was a big fat warning at the beginning of backup_label, along the
lines of: "# DO NOT REMOVE THIS FILE FROM A BACKUP" ? Any other ideas
how we could've made it more obvious to the script author to not remove it?
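
On the script side, one possible safeguard is to refuse to bring up a standby
from an online base backup that has no backup_label in it. The following is
only a rough sketch, not something that exists in PostgreSQL: the function
names has_backup_label and verify_base_backup are made up for illustration,
and the check assumes the backup was taken online (an offline backup or an
atomic filesystem snapshot legitimately has no backup_label, so it must not
be applied there).

```python
import os


def has_backup_label(backup_dir):
    """Return True if backup_dir contains a regular file named backup_label."""
    return os.path.isfile(os.path.join(backup_dir, "backup_label"))


def verify_base_backup(backup_dir):
    """Hypothetical guard for a sync script handling ONLINE base backups.

    Raises RuntimeError if backup_label is missing, so the script stops
    before the standby is started from a possibly inconsistent copy.
    """
    if not has_backup_label(backup_dir):
        raise RuntimeError(
            "backup_label not found in %s: refusing to start a standby "
            "from this online base backup" % backup_dir
        )
    return backup_dir
```

A script would call verify_base_backup() on the restored data directory right
before starting the standby server, and abort on the exception.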
- Heikki