Thread: Postgres: PANIC: WAL contains references to invalid pages

From: Henley Wing
Date:
Hi guys,

I have a slave database that is being replicated from a Master DB (through
rsync).

All of a sudden, my slave DB just exited and wasn't able to restart. Here
are the error messages when it exits. Similar ones appear when I restart:

WARNING:  page 21561636 of relation base/16384/54391 is uninitialized
CONTEXT:  xlog redo visible: rel 1663/16384/54391; blk 21561636
PANIC:  WAL contains references to invalid pages
CONTEXT:  xlog redo visible: rel 1663/16384/54391; blk 21561636
LOG:  startup process (PID 1573) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes

Both slave and master run *Postgres 9.2.3* on *CentOS 5* quad-core
servers with 32 GB of RAM each.

When I try to restart the slave with "zero_damaged_pages=true", the server
still refuses to start and logs lines like the ones above. The master is
still running and isn't corrupt at all.

At this point, I don't care if the slave ever starts, as I can rebuild it
from scratch. But is this a bug in Postgres? Has anyone else encountered
it, and is it fixed in later versions?

I've searched the mailing list and forums. I found vague references to this
problem, but I get the impression these issues were fixed in earlier
versions of Postgres.

I have the files of the corrupt slave all intact, so if there's anything
you guys need to see in order to dig into this issue, let me know.
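
For anyone who wants to look at the page in question on disk, the block
number in the WARNING line can be mapped to a segment file and a byte
offset. This is only a minimal sketch: it assumes the default 8 kB block
size and 1 GB segment size (neither is stated in this thread), and
`locate_block` is a hypothetical helper name, not a Postgres tool.

```python
import re

BLOCK_SIZE = 8192            # assumption: default BLCKSZ (8 kB)
BLOCKS_PER_SEGMENT = 131072  # assumption: default 1 GB segment / 8 kB blocks

WARNING_RE = re.compile(r"page (\d+) of relation (\S+) is uninitialized")


def locate_block(log_line):
    """Return (segment file path, byte offset within that file)."""
    match = WARNING_RE.search(log_line)
    if match is None:
        raise ValueError("not an 'uninitialized page' warning")
    block = int(match.group(1))
    rel_path = match.group(2)
    # Relations are split into fixed-size segment files; the first one has
    # no suffix, later ones are named <relfilenode>.1, <relfilenode>.2, ...
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    seg_file = rel_path if segment == 0 else f"{rel_path}.{segment}"
    return seg_file, block_in_segment * BLOCK_SIZE


line = "WARNING:  page 21561636 of relation base/16384/54391 is uninitialized"
print(locate_block(line))
```

Under those assumptions the page from the log above would sit in
base/16384/54391.164 at byte offset 539262976, relative to the data
directory.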

Re: Postgres: PANIC: WAL contains references to invalid pages

From: Andres Freund
Date:
Hello,

On 2013-11-10 17:43:50 -0500, Henley Wing wrote:
> I have a slave database that is being replicated from a Master DB (through
> rsync).
>
> All of a sudden, my slave DB just exited and wasn't able to restart. Here
> are the error messages when it exits. Similar ones appear when I restart:
>
> WARNING:  page 21561636 of relation base/16384/54391 is uninitialized
> CONTEXT:  xlog redo visible: rel 1663/16384/54391; blk 21561636
> PANIC:  WAL contains references to invalid pages
> CONTEXT:  xlog redo visible: rel 1663/16384/54391; blk 21561636
> LOG:  startup process (PID 1573) was terminated by signal 6: Aborted
> LOG:  terminating any other active server processes
>
> Both slave and master are using *Postgres 9.2.3* in a *Centos 5* quad-core
> server with 32 GB of RAM each.

> When I try to restart the slave, with "zero_damaged_pages=true", the server
> still refuses to start with lines like the ones above. The master is still
> running and isn't corrupt at all.

You unfortunately have to reclone the standby :(
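
One way to reclone is pg_basebackup, which ships with 9.2. Note that this
is a different mechanism from the rsync-based setup described above, so
treat it as a rough sketch rather than the procedure used here. The host
name, replication user and data directory below are placeholders, and
`basebackup_command` is a hypothetical helper that only builds the command
line.

```python
import shlex


def basebackup_command(master_host, repl_user, data_dir):
    """Build a pg_basebackup invocation that clones the master into data_dir."""
    return [
        "pg_basebackup",
        "-h", master_host,  # master to copy from
        "-U", repl_user,    # role with the REPLICATION privilege
        "-D", data_dir,     # target directory; must be empty or not exist
        "-X", "stream",     # stream WAL while the base backup is running
        "-P",               # show progress
    ]


# Placeholder values, not taken from this thread.
cmd = basebackup_command(
    "master.example.com", "replicator", "/var/lib/pgsql/9.2/data"
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

On 9.2 the standby's recovery.conf still has to be written by hand after
the copy finishes, and the master needs a replication entry in pg_hba.conf
and enough max_wal_senders for the backup connections.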

> At this point, I don't care if the slave ever starts as I can start from
> scratch. But is this an issue with Postgres? Has anyone encountered this
> and is it fixed in future versions?

This is hopefully fixed by commit 17fa4c321ccf9693de406faffe6b235e949aa25f,
which is included in 9.2.5.
For some reason it's not mentioned in the release notes, sorry about that.
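
To check whether a given 9.2 server already carries that fix, the version
numbers can be compared as integer tuples. A small sketch; `includes_fix`
is a hypothetical helper, and it deliberately covers only the 9.2 branch
because that is the only branch mentioned here.

```python
def parse_version(text):
    """Turn '9.2.3' into (9, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


FIXED_IN = parse_version("9.2.5")  # minor release named above


def includes_fix(server_version):
    """True if a 9.2.x server is at or past the release with the fix."""
    version = parse_version(server_version)
    if version[:2] != FIXED_IN[:2]:
        raise ValueError("only the 9.2 branch is covered here")
    return version >= FIXED_IN


print(includes_fix("9.2.3"))  # the version both servers run in this thread
```

Comparing tuples rather than strings matters once minor versions reach
two digits: as strings, "9.2.10" would sort before "9.2.5".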

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services