Re: IO Timeout - Mailing list pgsql-admin

From Tom Lane
Subject Re: IO Timeout
Date
Msg-id 28300.1110514147@sss.pgh.pa.us
In response to Re: IO Timeout  (Alex Turner <armtuk@gmail.com>)
Responses Re: IO Timeout  (Alex Turner <armtuk@gmail.com>)
List pgsql-admin
Alex Turner <armtuk@gmail.com> writes:
> Well - I am sort of trying to piece together exactly what happened.
> Here's what I know.

> Around 02:52 I get messages in my syslog stating that there were
> problems writing to a controller channel:
> [ various hardware errors snipped ]

> At around 07:30 all connections were failing giving the error:
> InternalError: FATAL:  the database system is in recovery mode

I think what happened here is that Postgres got a write error on WAL,
which would probably cause a PANIC, and then the ensuing database reboot
got hung up trying to re-read WAL.  Client connection requests would be
refused with messages like the above until the recovery process
completed.  The fact that this was still going on 4+ hours later shows
that Postgres is *not* timing out on stuck disk operations ... very much
the reverse in fact.
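
Since the server will not give up on its own, any bound on the wait has
to come from the client side.  A minimal sketch in Python of one way to
do that: retry only while the failure is the recovery-mode refusal quoted
above, and stop after a deadline.  The names connect_with_deadline,
connect, deadline_s and delay_s are hypothetical, and connect stands in
for whatever your driver uses to open a connection.  Matching on the
message text is an assumption made for brevity; if your driver exposes
the SQLSTATE, checking for 57P03 (cannot_connect_now) is likely more
robust.

```python
import time

# Text of the refusal quoted earlier in this message.
RECOVERY_MARKER = "the database system is in recovery mode"


def connect_with_deadline(connect, deadline_s=60.0, delay_s=1.0,
                          max_delay_s=30.0, sleep=time.sleep,
                          clock=time.monotonic):
    """Call connect() until it succeeds or the deadline would be exceeded.

    Only the recovery-mode refusal is retried; any other exception is
    re-raised immediately so real failures are not masked.
    """
    start = clock()
    while True:
        try:
            return connect()
        except Exception as exc:
            if RECOVERY_MARKER not in str(exc):
                raise
            # Give up if the next sleep would take us past the deadline.
            if (clock() - start) + delay_s > deadline_s:
                raise TimeoutError(
                    "server still in recovery after %.0f s" % deadline_s
                ) from exc
            sleep(delay_s)
            # Exponential backoff, capped at max_delay_s.
            delay_s = min(delay_s * 2, max_delay_s)
```

This only keeps clients from waiting forever; it does nothing about the
stuck I/O underneath.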

You'd be best off to take the matter up with some kernel hackers.
If there's anything to be done to improve the behavior, it's at
the kernel device driver level.

            regards, tom lane
