Re: "recovery mode" - Mailing list pgsql-general

From Steve Wolfe
Subject Re: "recovery mode"
Msg-id 004b01c0855f$856a56e0$50824e40@iboats.com
In response to "recovery mode"  ("Steve Wolfe" <steve@iboats.com>)
List pgsql-general
> I don't think recovery mode actually does much in 7.0.* --- I think it's
> just a stub (Vadim might know better though).  In 7.1 it means the thing
> is replaying the WAL log after a crash.  In any case it shouldn't
> create a lockup condition like that.
>
> The only cases I've ever heard of where a user process couldn't be
> killed with kill -9 are where it's stuck in a kernel call (and the
> kill response is being held off till the end of the kernel call).
> Any such situation is arguably a kernel bug, of course, but that's
> not a lot of comfort.
>
> Exactly which process were you sending kill -9 to, anyway?  There should
> have been a postmaster and one backend running the recovery-mode code.
> If the postmaster was responding to connection requests with an error
> message, then I would not say that it was locked up.

  I believe that it was a backend that I tried -9'ing.  I knew it wasn't
a good thing to do, but I had to get it running again.  It's amazing
how bold you get when you hear an entire department mumbling about "Why
isn't the site working?" : )

  Anyway, I think the problem wasn't in postgres.  I rebooted the machine,
and it worked - for about ten minutes.  Then it froze, with the kernel
crapping out.  I rebooted it, and it lasted about three minutes before the
same thing happened.  On the next reboot, it didn't even get through the
fsck before it did it again.

  I looked at the CPU temps; one of the four was warmer than it should be,
but still within acceptable limits (40 C).  So, I shut it down and reseated
the RAM chassis, the DIMMs, the CPUs, and the expansion cards.  When it came
up, I compiled and installed a newer kernel (I guess there was some good in
the crashes), and then it worked fine.  Because of the symptoms, I imagine
that it was a flaky connection.  Odd, considering that everything except the
DIMMs (including the CPUs) is literally screwed to the motherboard!

steve



