Re: IO Timeout - Mailing list pgsql-admin

From Alex Turner
Subject Re: IO Timeout
Msg-id 33c6269f05031018197ae6206b@mail.gmail.com
In response to Re: IO Timeout  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: IO Timeout
List pgsql-admin
Well - I am sort of trying to piece together exactly what happened.

Here's what I know.

Around 02:52 I get messages in my syslog stating that there were
problems writing to a controller channel:
Mar 10 02:52:29 tsunami kernel: 3w-9xxx: scsi1: WARNING: (0x06:0x002C): Unit #1: Command (0x28) timed out, resetting card.
Mar 10 02:52:29 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x06:0x001F): Microcontroller not ready during reset sequence.
Mar 10 02:52:29 tsunami kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): <NULL>:unit=0.
...
Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd, sector 282528903
Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd, sector 282528903
Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdc, sector 28837057
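
To see at a glance which devices and sectors the kernel complained about, an excerpt like the one above can be summarised with a short script. This is only a rough sketch: it assumes the "end_request: I/O error, dev X, sector N" wording shown in these messages, and the function name io_errors_by_device is made up for illustration.

```python
import re
from collections import defaultdict

# Matches kernel "end_request" I/O errors. The \s+ before "sector"
# tolerates log lines that a mail client wrapped onto two lines.
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+),\s+sector (\d+)")


def io_errors_by_device(log_text):
    """Return {device: sorted list of distinct failing sectors}."""
    sectors = defaultdict(set)
    for dev, sector in IO_ERROR.findall(log_text):
        sectors[dev].add(int(sector))
    return {dev: sorted(found) for dev, found in sectors.items()}


# Sample taken from the syslog excerpt in this message.
sample = """\
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd,
sector 282528903
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd,
sector 282528903
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdc, sector 28837057
"""

print(io_errors_by_device(sample))
```

For the sample this reports one failing sector on sdd and one on sdc; repeated reports of the same sector are collapsed.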


At around 07:30 all connections were failing with the error:
InternalError: FATAL:  the database system is in recovery mode
(pygresql - similar error in PHP also)

I reboot the server, and one of the discs comes up as inaccessible
(it's part of a RAID 10), but other than that, everything restarts as
normal.

Nothing significant in /var/log/pg_log, which is where I have it
logging to (my log level is pretty low, though).

Alex Turner
netEconomist


On Thu, 10 Mar 2005 19:04:29 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alex Turner <armtuk@gmail.com> writes:
> > I have a question about IO timeouts:
> > We are using the 3ware Escalade 9500S series of cards, and we had a
> > drive failure this morning.  Apparently the card waits 30 seconds for
> > the drive to respond, and if it doesn't, it puts the drive in a fail
> > state.  Postgres, it seems, didn't wait 30 seconds before it decided
> > that the system was upset, and put the database in maintenance mode.
>
> > Is there a way to increase the IO wait timeout so this doesn't happen?
>
> Postgres hasn't got any "IO timeouts".  Your concern would be better
> directed to whatever kernel you're using; any sort of timeout on a disk
> operation would be happening at the kernel level.
>
> For that matter, Postgres hasn't got any concept of "putting the
> database in maintenance mode", so you haven't described what happened
> very accurately at all.
>
>                         regards, tom lane
>
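
As an illustration of the kernel-level timeout Tom describes: on Linux 2.6 and later, the SCSI layer exposes a per-device command timeout, in seconds, at /sys/block/<dev>/device/timeout. The sketch below only reads that value. The function name scsi_timeout_seconds and its sysfs_root parameter are made up for illustration, and whether a given driver such as 3w-9xxx honours this setting is an assumption that has not been verified here.

```python
from pathlib import Path


def scsi_timeout_seconds(device, sysfs_root="/sys/block"):
    """Return the SCSI command timeout (seconds) for a device such as 'sdd'.

    Returns None if the sysfs entry is missing or unreadable, for example
    on a non-Linux system or for a device that does not exist.
    """
    path = Path(sysfs_root) / device / "device" / "timeout"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


# Device names taken from the syslog excerpt in this thread.
for dev in ("sdc", "sdd"):
    print(dev, scsi_timeout_seconds(dev))
```

The sysfs_root parameter exists only so the function can be pointed at a test directory; on a real system the default should be left alone.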
