Thread: BUG #10095: primary key corruption

BUG #10095: primary key corruption

From
lukecoldiron@hotmail.com
Date:
The following bug has been logged on the website:

Bug reference:      10095
Logged by:          Luke Coldiron
Email address:      lukecoldiron@hotmail.com
PostgreSQL version: 9.3.3
Operating system:   Ubuntu Linux 12.04 "Precise Pangolin" 32bit
Description:

I am seeing a problem where different primary keys in my database are being
corrupted.

ERROR:  could not read block 0 in file "base/16407/41243": read only 0 of
8192 bytes

When I look on the filesystem, the "base/16407/41243" file is zero bytes.
When I look up the name of the corrupt object via select relname from
pg_class where relfilenode = 41243; it is always a primary key, and not
always on the same table.
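
To check several units for the same symptom, the manual inspection described
above could be scripted. What follows is a minimal Python sketch, not part of
the original report: it lists the zero-length main-fork files in one database
directory and builds the matching pg_class query. The function names and the
directory argument are hypothetical. An empty table normally has a zero-length
heap file, so a hit is only suspicious when relkind is 'i' (index); a B-tree
index is expected to contain at least its metapage.

```python
import os
import re

# Main-fork relation files are named with digits only, e.g. "41243".
# Names such as "41243_fsm", "41243.1" or "PG_VERSION" are skipped.
_RELFILE_NAME = re.compile(r"[0-9]+")


def find_zero_byte_relfiles(db_dir):
    """Return the sorted relfilenode numbers whose file in db_dir is empty.

    db_dir is assumed to be a per-database directory such as
    <data directory>/base/16407.
    """
    hits = []
    for entry in os.scandir(db_dir):
        if not entry.is_file() or not _RELFILE_NAME.fullmatch(entry.name):
            continue
        if entry.stat().st_size == 0:
            hits.append(int(entry.name))
    return sorted(hits)


def lookup_query(relfilenodes):
    """Build the pg_class lookup for the given relfilenodes, or None if empty."""
    if not relfilenodes:
        return None
    # The values are integers produced above, so plain formatting is safe here.
    id_list = ", ".join(str(int(n)) for n in relfilenodes)
    return (
        "SELECT relname, relkind FROM pg_class "
        f"WHERE relfilenode IN ({id_list});"
    )
```

Calling find_zero_byte_relfiles() on the directory named in the error message
and running the query from lookup_query() in that database should show which
of the empty files belong to indexes.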

The system was previously upgraded from pg 8.3.7 and these issues did not
occur.

I haven't tried upgrading to 9.3.4 since it didn't look like any of the bug
fixes were targeted at the issue I am seeing.

Unfortunately, I have not yet been able to create a reproducible test case or
find a log where the issue first appeared. Any ideas would be much
appreciated.

Re: BUG #10095: primary key corruption

From
Matheus de Oliveira
Date:
On Mon, Apr 21, 2014 at 5:08 PM, <lukecoldiron@hotmail.com> wrote:

> ERROR:  could not read block 0 in file "base/16407/41243": read only 0 of
> 8192 bytes
>
>
Is this server a slave? Or has it been at some point (and now promoted to
master)?


> When I look on the filesystem, the "base/16407/41243" file is zero bytes.
> When I look up the name of the corrupt object via select relname from
> pg_class where relfilenode = 41243; it is always a primary key, and not
> always on the same table.
>
>
For now, you can fix the corrupted indexes by simply issuing REINDEX.
However, I strongly recommend that you dump all your databases, remove
everything, run initdb again, and then restore the dumps.


> The system was previously upgraded from pg 8.3.7 and these issues did not
> occur.
>

How did you perform the upgrade? Also, has there been any hardware issue
recently? I also recommend checking for disk and memory corruption.

Best regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres

Re: BUG #10095: primary key corruption

From
Luke Coldiron
Date:
> On Mon, Apr 21, 2014 at 5:08 PM, <lukecoldiron@hotmail.com> wrote:
>
>> ERROR:  could not read block 0 in file "base/16407/41243": read only 0 of
>> 8192 bytes
>
> Is this server a slave? Or has it been at some point (and now promoted to
> master)?

It is not a slave server, nor has it been at any point in time.

>> When I look on the filesystem, the "base/16407/41243" file is zero bytes.
>> When I look up the name of the corrupt object via select relname from
>> pg_class where relfilenode = 41243; it is always a primary key, and not
>> always on the same table.
>
> For now, you can fix the corrupted indexes by simply issuing REINDEX.
> However, I strongly recommend that you dump all your databases, remove
> everything, run initdb again, and then restore the dumps.

>> The system was previously upgraded from pg 8.3.7 and these issues did not
>> occur.
>
> How did you perform the upgrade? Also, has there been any hardware issue
> recently? I also recommend checking for disk and memory corruption.

I need to give a little more background on this. The database is installed
standalone on many identical hardware instances. It is used for
configuration in a closed software appliance, much like a consumer router.
Acceptance testing of a fresh (no upgrade) pg 9.3.3 database instance has
yielded a number of units with the primary key corruption issue after
running for a short period of time (within a week of testing operation). As
the error message above shows, the file that should hold the primary key is
truncated. The corresponding table also contains zero rows, but it is not
corrupt and is expected to have zero rows. I suspect a change in behavior
between pg 8.3.7 and 9.3.3 as the cause, everything else being equal. At
the moment I don't have much to go on, as I have not been able to reproduce
the issue on demand; however, I am still working on a reproducible test
case.