Re: the un-vacuumable table - Mailing list pgsql-hackers

From Andrew Hammond
Subject Re: the un-vacuumable table
Date
Msg-id 5a0a9d6f0807071200h4895ecd5m6eee060ab4ea2953@mail.gmail.com
Whole thread Raw
In response to Re: the un-vacuumable table  ("Andrew Hammond" <andrew.george.hammond@gmail.com>)
Responses Re: the un-vacuumable table  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Jul 3, 2008 at 10:57 PM, Andrew Hammond
<andrew.george.hammond@gmail.com> wrote:
> On Thu, Jul 3, 2008 at 3:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Have you looked into the machine's kernel log to see if there is any
>> evidence of low-level distress (hardware or filesystem level)?  I'm
>> wondering if ENOSPC is being reported because it is the closest
>> available errno code, but the real problem is something different than
>> the error message text suggests.  Other than the errno the symptoms
>> all look quite a bit like a bad-sector problem ...

da1 is the storage device where the PGDATA lives.

Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929ba560:6810
timed out for ccb 0xffffff0000e20000 (req->ccb 0xffffff0000e20000)
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929b90c0:6811
timed out for ccb 0xffffff0001081000 (req->ccb 0xffffff0001081000)
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929b9f88:6812
timed out for ccb 0xffffff0000d93800 (req->ccb 0xffffff0000d93800)
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929ba560:6810 function 0
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929bcc90:6813
timed out for ccb 0xffffff03e132dc00 (req->ccb 0xffffff03e132dc00)
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929ba560:6810
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929ba560:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929b90c0:6811 function 0
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929b90c0:6811
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929b90c0:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929b9f88:6812 function 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): WRITE(16). CDB: 8a 0 0 0
0 1 6c 99 9 c0 0 0 0 20 0 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): CAM Status: SCSI Status Error
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): SCSI Status: Check Condition
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): UNIT ATTENTION asc:29,0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Power on, reset, or bus
device reset occurred
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Retrying Command (per Sense Data)
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929b9f88:6812
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929b9f88:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929bcc90:6813 function 0
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929bcc90:6813
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929bcc90:0 completed
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): WRITE(16). CDB: 8a 0 0 0
0 1 65 1b 71 a0 0 0 0 20 0 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): CAM Status: SCSI Status Error
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): SCSI Status: Check Condition
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): UNIT ATTENTION asc:29,0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Power on, reset, or bus
device reset occurred
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Retrying Command (per Sense Data)
Jun 19 03:18:16 db1 kernel: mpt3: request 0xffffffff929d5900:56299
timed out for ccb 0xffffff03df7f5000 (req->ccb 0xffffff03df7f5000)

I think this is a smoking gun.

Andrew


pgsql-hackers by date:

Previous
From: Adriano Lange
Date:
Subject: GEQO Publications
Next
From: Zdenek Kotala
Date:
Subject: Re: PATCH: CITEXT 2.0 v2