pg_dump or hardware? - Mailing list pgsql-bugs

From andrea suisani
Subject pg_dump or hardware?
Msg-id 44461CF6.4060302@opinioni.net
Responses Re: pg_dump or hardware?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi all,

short summary:

platform: i386 SMP (dual PIII)
os: linux 2.6.8.1
vendor: debian (3.1, stable)
pgsql ver: 7.4.7 (deb)
disk: SCSI, vendor IBM, model DDYS-T36950N, rev. S96H
controller: adaptec aic-7892a

description:

We're experiencing a weird problem while trying
to dump our db for backup purposes.
The command we run is:

/usr/bin/pg_dump -U postgres -h 6pali elenco | /usr/bin/bzip2 > elenco_test.bz2

the output:

pg_dump: ERROR:  could not open relation with OID 201327173
pg_dump: SQL command to dump the contents of table "nominativi" failed: PQendcopy() failed.
pg_dump: Error message from server: ERROR:  could not open relation with OID 201327173
pg_dump: The command was: COPY public.nominativi (nome_cogno, indirizzo, cap, citta, prov,
prefisso, telefono1, telefono2, note, idpersona, estrazione, num_estra, occupato,
cod_prov, cod_com, cod_reg, capoluo, rand) TO stdout;


So the problem seems to be with the "nominativi" table
(a 20 million-row table); in fact, the following command also fails:

pg_dump -t nominativi -U postgres -h 6pali elenco | /usr/bin/bzip2 > nominativi.bz2

with the same error message as before. Before the error occurs we are able to
get a partial backup:

#> ls -l nominativi.bz2
-rw-r--r--    1 sickpig  users        2.5M apr 19 12:35 nominativi.bz2

#> wc -l nominativi
  145904 nominativi
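
A side note on the pipeline above: it leaves a partial archive behind, and the shell reports only the exit status of the last command (bzip2), so a failed pg_dump can go unnoticed in a backup script. Below is a minimal sketch, assuming a POSIX shell, of a wrapper that keeps the archive only when the dump command itself succeeds. The names dump_compressed and COMPRESSOR are hypothetical, chosen for illustration; they are not part of pg_dump or bzip2.

```shell
#!/bin/sh
# Sketch: run a dump command through a compressor and keep the archive only
# if the dump command itself succeeded. dump_compressed and COMPRESSOR are
# hypothetical names, not part of pg_dump or bzip2.

dump_compressed() {
    out=$1
    shift
    status_file="$out.status"

    # Record the producer's exit status in a side file, because the shell
    # reports only the last command's status for a pipeline.
    { "$@"; echo "$?" > "$status_file"; } | ${COMPRESSOR:-bzip2} > "$out.tmp"
    compress_status=$?

    dump_status=$(cat "$status_file")
    rm -f "$status_file"

    if [ "$dump_status" -eq 0 ] && [ "$compress_status" -eq 0 ]; then
        mv "$out.tmp" "$out"
        return 0
    fi

    # Leave no partial archive that could be mistaken for a good backup.
    rm -f "$out.tmp"
    echo "dump failed: command status $dump_status, compressor status $compress_status" >&2
    return 1
}

# Usage, with the command from this report:
#   dump_compressed elenco_test.bz2 /usr/bin/pg_dump -U postgres -h 6pali elenco
```

With this wrapper the failing run above would return a non-zero status and leave no elenco_test.bz2, instead of a 2.5M file that looks like a backup.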

We're trying to understand whether this is due to data corruption or
hardware failure. We run long self-tests on our SCSI disk through
smartmontools on a regular basis. See the attached file for the
"smartctl -a /dev/sda" output. All suggestions are welcome.


Regards,


Andrea






smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: IBM      DDYS-T36950N     Version: S96H
Serial number:         5FFL3272
Device type: disk
Transport protocol: Fibre channel (FCP-2)
Local Time is: Wed Apr 19 13:14:01 2006 CEST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature:     41 C
Drive Trip Temperature:        85 C
Manufactured in week 06 of year 2001
Current start stop count:      147 times
Recommended maximum start stop count:  10000 times

Error counter log:
          Errors Corrected    Total      Total   Correction     Gigabytes    Total
              delay:       [rereads/    errors   algorithm      processed    uncorrected
            minor | major  rewrites]  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         5          5       6628.657           0
write:         0        0         0         0          0       4231.306           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   - 22800                   - [-   -    -]
# 2  Background long   Completed                   - 22631                   - [-   -    -]
# 3  Background long   Completed                   - 22463                   - [-   -    -]
# 4  Background long   Completed                   - 22294                   - [-   -    -]
# 5  Background long   Completed                   - 22126                   - [-   -    -]
# 6  Background long   Completed                   - 21958                   - [-   -    -]
# 7  Background long   Completed                   - 21789                   - [-   -    -]
# 8  Background long   Completed                   - 21621                   - [-   -    -]
# 9  Background long   Completed                   - 21452                   - [-   -    -]
#10  Background long   Completed                   - 21284                   - [-   -    -]
#11  Background long   Completed                   - 21115                   - [-   -    -]
#12  Background long   Completed                   - 20947                   - [-   -    -]
#13  Background long   Completed                   - 20801                   - [-   -    -]
#14  Background long   Completed                   - 20633                   - [-   -    -]
#15  Background long   Completed                   - 20464                   - [-   -    -]
#16  Background long   Completed                   - 20296                   - [-   -    -]
#17  Background long   Completed                   - 20127                   - [-   -    -]
#18  Background long   Completed                   - 19959                   - [-   -    -]
#19  Background long   Completed                   - 19790                   - [-   -    -]
#20  Background long   Completed                   - 19622                   - [-   -    -]

Long (extended) Self Test duration: 1340 seconds [22.3 minutes]
