Index corruption - Mailing list pgsql-hackers

From	Marc Munro
Subject	Index corruption
Date	June 28, 2006 16:28:27
Msg-id	1151512094.26442.34.camel@bloodnok.com
Responses	Re: Index corruption (Tom Lane <tgl@sss.pgh.pa.us>) Re: Index corruption (Marc Munro <marc@bloodnok.com>) Re: Index corruption (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

We have now experienced index corruption on two separate but identical
Slony clusters.  In each case the Slony subscriber failed after
attempting to insert a duplicate record, and in each case reindexing the
sl_log_1 table on the provider fixed the problem.

The latest occurrence was on our production cluster yesterday.  This has
only happened since we performed kernel upgrades, and we are uncertain
whether it represents a kernel bug or a postgres bug exposed by
different timings in the new kernel.

Our systems are:

Sun v40z 4 x Dual Core AMD Opteron(tm) Processor 875
Kernel 2.6.16.14 #8 SMP x86_64 x86_64 x86_64 GNU/Linux
kernel boot option: elevator=deadline
16 Gigs of RAM
postgresql-8.0.3-1PGDG
Bonded e1000/tg3 NICs with 8192 MTU.
Slony 1.1.0

NetApp FAS270 OnTap 7.0.3
Mounted with the NFS options
rw,nfsvers=3,hard,rsize=32768,wsize=32768,timeo=600,tcp,noac
Jumbo frames 8192 MTU.

All postgres data and logs are stored on the netapp.

In the latest episode, the index corruption coincided with a
Slony-induced vacuum.  I don't know whether this was also the case with
our test system failures.

What can we do to help identify the cause of this?  I believe we will be
able to reproduce this on a test system if there is some useful
investigation we can perform.

__
Marc
