Re: I/O error on data file, can't run backup - Mailing list pgsql-general

From Craig Ringer
Subject Re: I/O error on data file, can't run backup
Msg-id 4E8D377F.3070206@ringerc.id.au
In response to Re: I/O error on data file, can't run backup  (Leif Biberg Kristensen <leif@solumslekt.org>)
Responses Re: I/O error on data file, can't run backup  (Leif Biberg Kristensen <leif@solumslekt.org>)
List pgsql-general
On 10/06/2011 03:06 AM, Leif Biberg Kristensen wrote:
> I seemingly fixed the problem by stopping postgres and doing:
>
> balapapa 612249 # mv 11658 11658.old
> balapapa 612249 # mv 11658.old 11658
>
> And the backup magically works.

Woooooo! That's ... "interesting".

I'd be inclined to suspect filesystem corruption, a file system bug /
kernel bug (not very likely if you're on ext3), flaky RAM, etc., rather
than a failing disk ... though a failing disk _could_ still be the culprit.

Use smartmontools to run a long self-test with 'smartctl -d ata -t long
/dev/sdx' (where 'x' is the drive node). Once it has finished, check the
output of 'smartctl -d ata -a /dev/sdx': if the self-test is reported as
having passed, there are no pending or uncorrectable sectors, and the
disk status is reported as 'HEALTHY', your disk is quite likely OK. Note
that a 'PASSED' or 'HEALTHY' report by itself doesn't mean much; disk
firmware often returns HEALTHY even when the disk can't even read
sector 0.
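
For what it's worth, the check described above could be scripted roughly
as follows. This is only a sketch: check_smart_report is a made-up helper
name, and it assumes the ATA-style report layout printed by 'smartctl -a'
(an overall-health line, attribute rows for Current_Pending_Sector and
Offline_Uncorrectable with the raw value in the tenth column, and a
self-test log whose newest entry is numbered '# 1'). Other drive types
print a different report, so treat a failure here as "look at it by hand".

```shell
#!/bin/sh
# check_smart_report: hypothetical helper, not part of smartmontools.
# Reads a 'smartctl -a' report on stdin and returns 0 only if ALL hold:
#   - the overall-health line ends in PASSED,
#   - the newest self-test log entry says "Completed without error",
#   - Current_Pending_Sector and Offline_Uncorrectable are present
#     and both have a raw value of 0.
check_smart_report() {
    awk '
        /overall-health self-assessment test result/ { health = $NF }
        /^# *1 / { latest = $0 }    # newest entry in the self-test log
        $2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" {
            seen++
            if ($10 + 0 != 0) bad++ # column 10 is RAW_VALUE (assumed layout)
        }
        END {
            ok = (health == "PASSED")
            ok = ok && (latest ~ /Completed without error/)
            ok = ok && (seen > 0) && (bad == 0)
            exit (ok ? 0 : 1)
        }
    '
}

# Typical use (as root; replace sdx with the real drive node):
#   smartctl -d ata -t long /dev/sdx      # start the long self-test
#   ... wait for the estimated completion time smartctl prints ...
#   smartctl -d ata -a /dev/sdx | check_smart_report \
#       && echo "disk is quite likely OK" \
#       || echo "inspect the full report by hand"
```

The helper deliberately refuses to trust the health line alone and wants
the self-test result and the sector counters to agree as well.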

I strongly recommend making a full backup, both a pg_dump *and* a
file-system level copy of the datadir. Personally I'd then do a test
restore of the pg_dump backup on a separate Pg instance and, if it looked
OK, I'd re-initdb and then reload from the dump.
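
A rough sketch of that sequence, in case it helps. Everything specific
here is an assumption to adjust: the datadir path, the backup
destination, the port of the separate test instance, and the function
names, which are invented for illustration. It uses pg_dumpall rather
than plain pg_dump so that all databases and roles end up in one file,
and it only prints the commands unless DRY_RUN=0 is set. Run it as the
postgres OS user.

```shell
#!/bin/sh
PGDATA=${PGDATA:-/var/lib/postgresql/data}  # assumed datadir location
BACKUP=${BACKUP:-/mnt/backup}               # assumed: a different physical disk
TEST_PORT=${TEST_PORT:-5433}                # assumed port of the test instance

# run: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

# Step 1: logical dump while the server is up, then a file-level copy
# of the datadir with the server stopped.
backup_both() {
    run pg_dumpall -f "$BACKUP/all.sql" || return 1
    run pg_ctl -D "$PGDATA" stop -m fast || return 1
    run cp -a "$PGDATA" "$BACKUP/datadir-copy"
}

# Step 2: load the dump into a separate, already running instance and
# inspect the result by hand before going any further.
test_restore() {
    run psql -p "$TEST_PORT" -f "$BACKUP/all.sql" postgres
}

# Step 3: only after step 2 looked OK, rebuild the cluster from the dump.
# The old datadir is moved aside rather than deleted.
rebuild() {
    run mv "$PGDATA" "$PGDATA.old" || return 1
    run initdb -D "$PGDATA" || return 1
    run pg_ctl -D "$PGDATA" -w start || return 1
    run psql -f "$BACKUP/all.sql" postgres
}
```

Calling backup_both, test_restore and rebuild in that order with the
default dry-run setting just lists the commands, which is a cheap way to
review the paths before anything touches the real datadir.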

--
Craig Ringer
