Thread: corruption since 7.4.13 update ?

corruption since 7.4.13 update ?

From
Franz.Rasper@izb.de
Date:
Hello,

I have a "new" machine with the following problem:
postgres-7.4.13[7874]: [457-1] ERROR:  could not open segment 1 of relation
"data_server_id_idx" (target block 791807): No such file or directory

From my point of view it seems to be the same error as:
http://archives.postgresql.org/pgsql-admin/2006-07/msg00028.php

http://archives.postgresql.org/pgsql-admin/2006-07/msg00103.php
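
A rough sanity check on the numbers in the error message above. This is only a sketch: it assumes the default build settings of 8 kB blocks and 1 GB segment files, which the thread does not confirm for this server, and the names `BLOCK_SIZE`, `SEGMENT_BYTES` and `locate_block` are made up for illustration.

```python
# Assumed defaults (not stated in the thread): 8 kB blocks, 1 GB segment files.
BLOCK_SIZE = 8192
SEGMENT_BYTES = 1024 ** 3
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE


def locate_block(block_number):
    """Return (segment number, block offset inside that segment)."""
    return divmod(block_number, BLOCKS_PER_SEGMENT)


# Target block taken from the log line in the original report.
segment, offset = locate_block(791807)
print("segment:", segment, "offset in segment:", offset)
print("approx. GiB into the relation:", 791807 * BLOCK_SIZE / SEGMENT_BYTES)
```

Under those assumptions the target block would sit several segments into the index, yet the server already fails to open segment 1. That would suggest the index is smaller than 1 GB and that something is pointing at a block far beyond its end, which looks more like corrupted data than a file that merely went missing.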

The server is an HP DL 380 G4 with Battery Backup write cache.

I am doing a lot of inserts (Linux + PostgreSQL 7.4.13).
It works correctly on another server (Linux + PostgreSQL 7.4.10, HP DL 380
G3 without Battery Backup Write Cache).

Linux + PostgreSQL 7.4.10 + HP DL 380 G3 without Battery Backup Write Cache:
it takes about 4 hours
Linux + PostgreSQL 7.4.13 + HP DL 380 G4 with Battery Backup Write Cache:
it takes about 15 minutes

The amount of RAM is the same (4 GB); both have ext3 filesystems.

What could be the reason? PostgreSQL 7.4.13? Configuration? Linux kernel
(2.4.28 vs. 2.4.32)? Battery Backup Write Cache? Hardware?

The first time it works correctly. The next day it doesn't work.

Any help is appreciated.

Regards,

-Franz

Re: corruption since 7.4.13 update ?

From
Franz.Rasper@izb.de
Date:
Just for your information.

Our problem was caused by bad firmware on one of the hard disks in the
mirror.

"An HP ProLiant server configured with any of the Ultra320 hard drives
listed in Table 1 may experience an unsuccessful write operation when
transferring data to the hard drive during periods of extreme disk intensive
I/O operations. This event occurs only under rare conditions and ALL of the
following criteria must be met in order for this event to occur:
Disk intensive I/O operations are occurring that generate a queue of 112
commands on the affected disk drive.
The 113th command queued on the drive is a write command that requires 256
blocks (or more) of data to be written to the drive.

Note: While the depth of the command queue and the number of commands in the
queue are criteria for the occurrence of this event, obtaining this
information is not necessary in order to determine affected hard drives.
Affected hard drives are listed in Table 1 below and are identified when the
HDDETECT Utility is run."
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=397634&prodTypeId=15351&prodSeriesId=397634&objectID=PSD_EX050119_CW01
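
To make the two quoted criteria concrete, here is a minimal sketch of them as a predicate. It is not HP's logic or any real drive interface: `Command` and `matches_advisory` are hypothetical names, only the numbers 112 and 256 come from the advisory text, and the sketch ignores the further condition that the drive must be one of the affected models in Table 1.

```python
from dataclasses import dataclass

QUEUE_DEPTH_TRIGGER = 112  # commands already queued, per the quoted advisory
MIN_WRITE_BLOCKS = 256     # minimum size of the next write, per the advisory


@dataclass
class Command:
    """Hypothetical stand-in for one queued drive command."""
    is_write: bool
    blocks: int


def matches_advisory(queued_commands, next_command):
    """True only when both quoted criteria hold at the same time."""
    return (
        queued_commands == QUEUE_DEPTH_TRIGGER
        and next_command.is_write
        and next_command.blocks >= MIN_WRITE_BLOCKS
    )


# A large write arriving as the 113th command meets both criteria.
print(matches_advisory(112, Command(is_write=True, blocks=256)))
# A read of the same size does not.
print(matches_advisory(112, Command(is_write=False, blocks=256)))
```

A bulk insert load like the one described in the original report plausibly produces deep queues and large writes, which would fit the "extreme disk intensive I/O" wording, though the thread does not include measurements of either.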

regards,

-Franz

-----Original Message-----
From: Rasper, Franz
Sent: Wednesday, 9 August 2006 10:10
To: pgsql-bugs@postgresql.org
Subject: corruption since 7.4.13 update ?

