Thread: Replica (v 9.3.2) crashed with "PANIC: WAL contains references to invalid pages"
Replica (v 9.3.2) crashed with "PANIC: WAL contains references to invalid pages"
Hello everybody,
This weekend both replicas of our main db crashed at the same time with this error:
2014-02-09 11:42:51 GMT 0 52c671da.14da - PANIC: WAL contains references to invalid pages
2014-02-09 11:42:51 GMT 0 52c671da.14da - CONTEXT: xlog redo vacuum: rel 1663/16433/29449; blk 181466, lastBlockVacuumed 181463
2014-02-09 11:42:52 GMT 0 52c671d9.14d1 - LOG: startup process (PID 5338) was terminated by signal 6: Aborted
2014-02-09 11:42:52 GMT 0 52c671d9.14d1 - LOG: terminating any other active server processes
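For readers following along: the `rel 1663/16433/29449` triple in the CONTEXT line is tablespace OID / database OID / relfilenode (1663 being the default `pg_default` tablespace), so it identifies which relation the replayed vacuum record touched. Below is a minimal Python sketch that pulls those numbers out of such a line and prints catalog queries to look the relation up. The helper names are hypothetical, and the second query has to be run on the primary while connected to the database returned by the first one.

```python
import re
from typing import List, NamedTuple, Optional


class VacuumRedoContext(NamedTuple):
    tablespace_oid: int
    database_oid: int
    relfilenode: int
    block: int
    last_block_vacuumed: int


# Matches the CONTEXT text shown in the log above.
_PATTERN = re.compile(
    r"xlog redo vacuum: rel (\d+)/(\d+)/(\d+); "
    r"blk (\d+), lastBlockVacuumed (\d+)"
)


def parse_vacuum_context(line: str) -> Optional[VacuumRedoContext]:
    """Return the numbers found in a vacuum redo CONTEXT line, or None."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return VacuumRedoContext(*(int(group) for group in match.groups()))


def lookup_queries(ctx: VacuumRedoContext) -> List[str]:
    """Build catalog queries that map the OIDs back to names."""
    return [
        f"SELECT datname FROM pg_database WHERE oid = {ctx.database_oid};",
        # Run this one while connected to the database found above.
        "SELECT relname, relkind FROM pg_class "
        f"WHERE relfilenode = {ctx.relfilenode};",
    ]


if __name__ == "__main__":
    line = (
        "2014-02-09 11:42:51 GMT 0 52c671da.14da - CONTEXT: xlog redo vacuum: "
        "rel 1663/16433/29449; blk 181466, lastBlockVacuumed 181463"
    )
    ctx = parse_vacuum_context(line)
    if ctx is not None:
        for query in lookup_queries(ctx):
            print(query)
```

Knowing whether the relation is a table or an index (`relkind`) helps narrow down which object to inspect on the primary.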
All three servers (main + two replicas) are on v. 9.3.2 running on CentOS 6.4.
One month ago we upgraded the main db from v 9.2.6 to 9.3.2 with pg_upgrade and had the replicas rebuilt on 9.3.2.
I searched the mailing lists and found someone who had the same problem in the past, but it seems that their problem was fixed by patches that have already been released.
(see thread http://www.postgresql.org/message-id/675b7cee-b7f0-4e32-8e34-1efaf3ca5fe9@email.android.com)
So it seems that our problem is a new one, since we are running the latest version.
Thank you for your help
Marco Cassiano
Manifatture del Nord srl unipersonale
Gruppo MaxMara
via Mazzacurati 6
C.P. n° 20 - San Maurizio
42122 Reggio Emilia RE
ITALY
Tel. +39 0522 358215
Fax +39 0522 268715
email : mcassiano@manord.com
---------------------------------------------------------------------------------------------
The content of the above communication is strictly confidential and reserved solely for the referred addressees. In the event of receipt by persons different from the addressee, copying, alteration and distribution are forbidden. If received by mistake we ask you to inform us and to destroy and/or delete from your computer without using the data herein contained. The present message (eventual annexes inclusive) shall not be considered a contractual proposal and/or acceptance of offer from the addressee, nor waiver recognizance of rights, debts and/or credits, nor shall it be binding when not executed as a subsequent agreement by persons who could lawfully represent us. No pre-contractual liability shall apply to us when the present communication is not followed by any binding agreement between the parties.
---------------------------------------------------------------------------------------------
Re: Replica (v 9.3.2) crashed with "PANIC: WAL contains references to invalid pages"
I am resending this mail with a gzipped attachment because of the mailing list's message size limits.
--------------------------
Thank you Mat,
here is the additional information:
1) All three servers (main + 2 replicas) are virtual machines on VMware ESXi 5.0
2) Each server is on a different storage and on a different VMware host
3) The logs of the primary show no errors
4) Attached: pg_controldata output, postgres log, and /var/log/messages
5) fsck on the volume containing the database folders reports no errors:
[root@pg64prod_rep /]# umount /dev/sdb1
[root@pg64prod_rep /]# fsck -n /dev/sdb1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/sdb1 has gone 193 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: 7886/13107200 files (8.3% non-contiguous), 38818825/52428119 blocks
Marco
From: desmodemone [mailto:desmodemone@gmail.com]
Sent: Monday, 10 February 2014 10:30
To: Cassiano, Marco
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Replica (v 9.3.2) crashed with "PANIC: WAL contains references to invalid pages"
Hello,
could you please post some more details? Are your replicas on the same storage as the primary? Are they virtual or physical? Which hypervisor type?
Could you attach / post /var/log/messages and the postgres log?
Could you attach / post the pg_controldata output of the replica?
Did you verify the filesystem integrity of the replica?
Are the logs of the primary free of errors?
Thank you very much
Mat
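As a side note on the pg_controldata request above, here is a minimal Python sketch for collecting that output in a form that is easy to compare between the primary and the replicas. It assumes `pg_controldata` is on the PATH and only relies on the tool printing one "label: value" pair per line. The function names and the data directory path are hypothetical.

```python
import subprocess
from typing import Dict


def parse_controldata(text: str) -> Dict[str, str]:
    """Split pg_controldata output into a {label: value} dictionary."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as
        # timestamps contain colons themselves.
        label, sep, value = line.partition(":")
        if sep and label.strip():
            fields[label.strip()] = value.strip()
    return fields


def read_controldata(data_dir: str) -> Dict[str, str]:
    """Run pg_controldata against a data directory and parse the result."""
    result = subprocess.run(
        ["pg_controldata", data_dir],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_controldata(result.stdout)


if __name__ == "__main__":
    # Hypothetical path: replace with the replica's actual data directory.
    info = read_controldata("/var/lib/pgsql/9.3/data")
    for label, value in sorted(info.items()):
        print(f"{label}: {value}")
```

Running the same script on each server and diffing the printed output is one simple way to spot differences in cluster state or checkpoint locations.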
If log_min_messages <= DEBUG2 || client_min_messages <= DEBUG2,
details will be reported about the invalid page of the relation that needs to be investigated.
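The condition quoted above compares message levels, where the more verbose levels rank lower. Below is a small Python sketch of that comparison, assuming the ordering used in PostgreSQL's `elog.h` (only the relative order matters here, and internal-only levels are left out). The function name is hypothetical, and the DEBUG2 threshold is taken from the condition as quoted, not checked against the server source.

```python
# Message levels from most verbose to most severe, following the
# ordering in PostgreSQL's elog.h (assumption: only the order matters).
LEVELS = [
    "DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1",
    "LOG", "INFO", "NOTICE", "WARNING", "ERROR", "FATAL", "PANIC",
]


def reports_invalid_pages(
    log_min_messages: str,
    client_min_messages: str,
    threshold: str = "DEBUG2",
) -> bool:
    """Evaluate: log_min_messages <= T || client_min_messages <= T."""
    limit = LEVELS.index(threshold.upper())
    return (
        LEVELS.index(log_min_messages.upper()) <= limit
        or LEVELS.index(client_min_messages.upper()) <= limit
    )


if __name__ == "__main__":
    # Default settings (warning / notice) do not satisfy the condition.
    print(reports_invalid_pages("warning", "notice"))
    # Raising server log verbosity to debug2 does.
    print(reports_invalid_pages("debug2", "notice"))
```

In practice this means setting `log_min_messages = debug2` (or a more verbose level) in postgresql.conf on the replica before replaying the WAL again, so that the extra detail ends up in the server log.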