Re: IO Timeout - Mailing list pgsql-admin
From | Alex Turner |
---|---|
Subject | Re: IO Timeout |
Date | |
Msg-id | 33c6269f05031018197ae6206b@mail.gmail.com |
In response to | Re: IO Timeout (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: IO Timeout |
List | pgsql-admin |
Well - I am sort of trying to piece together exactly what happened. Here's what I know.

Around 02:52 I get messages in my syslog stating that there were problems writing to a controller channel:

Mar 10 02:52:29 tsunami kernel: 3w-9xxx: scsi1: WARNING: (0x06:0x002C): Unit #1: Command (0x28) timed out, resetting card.
Mar 10 02:52:29 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x06:0x001F): Microcontroller not ready during reset sequence.
Mar 10 02:52:29 tsunami kernel: 3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): <NULL>:unit=0.
...
Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd, sector 282528903
Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd, sector 282528903
Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.
Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdc, sector 28837057

At around 07:30 all connections were failing, giving the error:

InternalError: FATAL: the database system is in recovery mode

(pygresql - similar error in PHP also)

I rebooted the server, and one of the discs came up as inaccessible (it's part of a RAID 10), but other than that, everything restarted as normal.

Nothing significant in /var/log/pg_log, which is where I have it logging to (my log level is pretty low though).

Alex Turner
netEconomist

On Thu, 10 Mar 2005 19:04:29 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alex Turner <armtuk@gmail.com> writes:
> > I have a question about IO timeouts:
> > We are using the 3ware escalade 9500S series of cards, and we had a
> > drive failure this morning. Apparently the card waits 30 seconds for
> > the drive to respond, and if it doesn't, it puts the drive in a fail
> > state. Postgres it seems didn't wait 30 seconds before it decided
> > that the system was upset, and put the database in maintenance mode.
> >
> > Is there a way to increase the IO wait timeout so this doesn't happen?
>
> Postgres hasn't got any "IO timeouts". Your concern would be better
> directed to whatever kernel you're using; any sort of timeout on a disk
> operation would be happening at the kernel level.
>
> For that matter, Postgres hasn't got any concept of "putting the
> database in maintenance mode", so you haven't described what happened
> very accurately at all.
>
> regards, tom lane
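
For piecing together a timeline like the one in this message, here is a minimal sketch (not part of the original post) that tallies the kernel's `end_request: I/O error` lines per device. The regular expression and the function name `count_io_errors` are assumptions based only on the log format quoted above; other kernels or drivers may word these lines differently.

```python
import re
from collections import Counter

# Assumed pattern, derived from the syslog lines quoted in this message, e.g.
#   "... kernel: end_request: I/O error, dev sdd, sector 282528903"
IO_ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def count_io_errors(lines):
    """Return a Counter mapping device name -> number of I/O error lines."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input copied from the syslog excerpt in this message.
sample = [
    "Mar 10 02:58:41 tsunami kernel: 3w-9xxx: scsi1: ERROR: (0x03:0x0107): Duplicate request ID:RequestID=23.",
    "Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd, sector 282528903",
    "Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdd, sector 282528903",
    "Mar 10 02:58:41 tsunami kernel: end_request: I/O error, dev sdc, sector 28837057",
]

for device, n in sorted(count_io_errors(sample).items()):
    print(device, n)
```

Run against the full syslog, a per-device count like this can help show which members of the array were reporting errors before the database went into recovery mode.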