Re: Weird disk/table space consumption problem - Mailing list pgsql-general

From Craig Ringer
Subject Re: Weird disk/table space consumption problem
Date
Msg-id 1247387952.18105.19.camel@ayaki
In response to Re: Weird disk/table space consumption problem  (Dirk Riehle <dirk@riehle.org>)
List pgsql-general
On Sat, 2009-07-11 at 18:19 -0700, Dirk Riehle wrote:

> I do have some weird every few days error where the soft raid blocks for
> a couple of seconds and I get this kernel log output:
>
> Jul  7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY }
> Jul  7 19:58:55 server kernel: [40336.000244] ata1.00: cmd
> 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out
> Jul  7 19:58:55 server kernel: [40336.000245]          res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Have you used smartctl (from the smartmontools package - on
Debian/Ubuntu at least) to examine the drive?

In particular, you should ask the drive to do a self-test and media
scan. This will not take it out of the RAID or prevent it from
servicing normal operations, though it may slow it down a bit. Run:

smartctl -d ata -t long /dev/sda

then "sleep" for however long it says the test will take, e.g. "sleep 2h".

When the sleep command exits, run:

smartctl -d ata -a /dev/sda

to see general info on the drive, its error logs, and its test logs. If
you see errors logged on the drive, if the test shows as failed, if you
see a non-zero "reallocated sector" count, or if "pending sector" is
non-zero, then it's time to replace the drive.
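
If you'd rather not eyeball the attribute table, something like the
following rough sketch could flag the two sector counters for you. It
assumes the usual smartctl attribute table layout, where the attribute
name is the second column and the raw value is the tenth, and that the
attributes are reported as Reallocated_Sector_Ct and
Current_Pending_Sector. The function name check_smart_sectors is made
up for illustration. It does not look at the error log or the
self-test results, so still read those by hand.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -a` output on stdin.
# Assumes the standard attribute table: name in column 2, raw value
# in column 10. Prints each sector counter whose raw value is
# non-zero and returns 1 if any was found, 0 otherwise.
check_smart_sectors() {
    awk '
        $2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" {
            # "+ 0" forces a numeric comparison on the raw value
            if ($10 + 0 != 0) {
                print $2 " raw value: " $10
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Example use (needs root and a real drive, so left commented out):
#   smartctl -d ata -a /dev/sda | check_smart_sectors \
#       || echo "sector counters are non-zero, consider replacing the drive"
```

A non-zero exit status here only means one of those two counters is
non-zero; a zero status does not prove the drive is healthy.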

--
Craig Ringer

