Re: WAL contains references to invalid pages - Mailing list pgsql-general

From Fabrízio de Royes Mello
Subject Re: WAL contains references to invalid pages
Date
Msg-id CAFcNs+rAs9M=2k+=C2qTDvqpd5GasBGA5oi1nVa_fA4hsQd=6A@mail.gmail.com
Whole thread Raw
In response to WAL contains references to invalid pages  (JotaComm <jota.comm@gmail.com>)
Responses Re: WAL contains references to invalid pages  (JotaComm <jota.comm@gmail.com>)
List pgsql-general

On Thu, May 16, 2013 at 11:12 AM, JotaComm <jota.comm@gmail.com> wrote:

[...]
Yesterday I identified the following messages in my log file (slave):

user=,db= WARNING:  page 6629 of relation base/20449/24818 is uninitialized
user=,db= CONTEXT:  xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= PANIC:  WAL contains references to invalid pages
user=,db= CONTEXT:  xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= LOG:  startup process (PID 26293) was terminated by signal 6: Aborted
user=,db= LOG:  terminating any other active server processes

Information:

PostgreSQL 9.2.3 (master and slave)
Operational System: CentOS release 6.3 (Final)
The parameter full_page_writes is enabled in both servers.

Analyzing the objects in my cluster (master) I identified the database [20449] and the relation [24818]. The relation 24818 is an index, so I ran the command REINDEX to try solving the problem. Immediately after, I tried to up the slave but I received the same errors.

user=,db= WARNING:  page 6629 of relation base/20449/24818 is uninitialized
user=,db= CONTEXT:  xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= PANIC:  WAL contains references to invalid pages
user=,db= CONTEXT:  xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= LOG:  startup process (PID 26293) was terminated by signal 6: Aborted
user=,db= LOG:  terminating any other active server processes

As the problem is in the wal file, so the process (above) doesn't work according my wish.

Any idea?


Hi JotaComm,

IMHO as it is your slave you could just rebuild it.

However if you want to make an attempt to recover you can do:

1) make a physical backup of this cluster
2) in your postgresql.conf set 'zero_damaged_pages = on' [1] 
3) start your cluster

I really don't know if it will work, but you can try... :-)


Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello

pgsql-general by date:

Previous
From: "David M. Kaplan"
Date:
Subject: Re: problem with lost connection while running long PL/R query
Next
From: Matt Brock
Date:
Subject: Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID