Re: Further info : Very high load average but no cpu utilization ? - Mailing list pgsql-sql

From Rajesh Kumar Mallah.
Subject Re: Further info : Very high load average but no cpu utilization ?
Date
Msg-id 200205121116.30681.mallah@trade-india.com
In response to Re: Further info : Very high load average but no cpu utilization ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Further info : Very high load average but no cpu utilization ?  ("D'Arcy J.M. Cain" <darcy@druid.net>)
List pgsql-sql
Hi there,

I have observed that it is nearly impossible to
get rid of the postmaster or its backends with any
signal once they decide not to quit.

Even the OS (Linux rh62) refuses to reboot in such a situation,
and my system admin had to power off the system,
then run fsck and so on.

But this only happens when the postmaster is stuck for
some reason. I suspect that the postmaster's log file
filling up was the reason my postmaster got stuck.
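
For context on why signals appear to be ignored: on Linux, a process in
state D (uninterruptible sleep) does not act on any signal, SIGKILL
included, until the kernel call it is blocked in returns, and such
processes count toward the load average even though they use no CPU.
A process in state Z (defunct) has already exited and only disappears
once its parent reaps it. Below is a minimal sketch, assuming the
standard `ps aux` column layout with STAT as the eighth field, that
tallies process states from captured output. The function name
`count_states` and the sample rows are illustrative only.

```python
from collections import Counter


def count_states(ps_lines):
    """Tally the first letter of the STAT column in `ps aux` output.

    Assumes the usual layout:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    counts = Counter()
    for line in ps_lines:
        fields = line.split()
        # Skip blank or truncated lines and the header row.
        if len(fields) < 8 or fields[0] == "USER":
            continue
        # Keep only the main state letter (D, S, Z, R, ...).
        counts[fields[7][0]] += 1
    return counts


# Sample rows in the shape of the listing quoted further down.
sample = [
    "postgres  1132  0.0  0.0 140412   4 ?        D    May10   0:13 postgres: stats buffer process",
    "postgres  1133  0.0  0.0 139576   4 ?        S    May10   0:18 postgres: stats collector process",
    "postgres  8089  0.0  0.0 139812   4 ?        D    00:26   0:00 postgres: checkpoint subprocess",
    "postgres 15453  0.1  0.0     0    0 ?        Z    08:17   0:09 [postmaster <defunct>]",
]
states = count_states(sample)
print(dict(states))
```

Several D entries alongside a low CPU figure would be consistent with
processes blocked on I/O rather than spinning.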

regds
mallah.




On Saturday 11 May 2002 09:29 pm, Tom Lane wrote:
> "Rajesh Kumar Mallah." <mallah@trade-india.com> writes:
> > [root@linux10320 root2]# ps auxwww| grep post
> > postgres  1131  0.0  0.0 139424   4 ?        D    May1004/usr/local/pgsql/bin/postmaster
> > postgres  1132  0.0  0.0 140412   4 ?        D    May10   0:13 postgres: stats buffer process
> > postgres  1133  0.0  0.0 139576   4 ?        S    May10   0:18 postgres: stats collector process
> > postgres  8046  0.0  0.0 238712   4 ?        D    00:25   0:13 postgres: tradein tradein_clients 130.94.20.27 SELECT
> > postgres  8089  0.0  0.0 139812   4 ?        D    00:26   0:00 postgres: checkpoint subprocess
> > postgres 11442  0.0  0.0 218152   4 ?        D    04:25   0:03 postgres: tradein tradein_clients 130.94.20.27 SELECT
> > postgres 15453  0.1  0.0     0    0 ?        Z    08:17   0:09 [postmaster <defunct>]
> > postgres 15455  0.0  0.0     0    0 ?        Z    08:17   0:00 [postmaster <defunct>]
> > postgres 15456  0.0  0.0     0    0 ?        Z    08:18   0:00 [postmaster <defunct>]
> > postgres 15457  0.0  0.0     0    0 ?        Z    08:19   0:00 [postmaster <defunct>]
> > postgres 15462  0.0  0.0     0    0 ?        Z    08:20   0:01 [postmaster <defunct>]
>
> I think your postmaster is stuck; it should have reaped those defunct
> subprocesses instantly.  Given that you also seem to have a stuck
> checkpoint process (8 hours to run a checkpoint?) there is probably
> something hosed in the interprocess communication logic, but it's hard
> to guess what from this amount of info.
>
> At this point probably your best bet is to kill all the running postgres
> processes (try SIGTERM first, then SIGKILL if that doesn't work) and
> launch a postmaster from a fresh start.  Don't forget the ulimit this
> time.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
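
The escalation suggested in the quoted reply (SIGTERM first, then
SIGKILL if that does not work) could be sketched roughly as follows.
This is an illustration only, not part of PostgreSQL: the helper name
`stop_process`, the grace period, and the polling interval are all
assumptions chosen for the example.

```python
import os
import signal
import time


def stop_process(pid, grace_seconds=10.0, poll_interval=0.5):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns "gone" if the PID did not exist to begin with, "terminated"
    if it disappeared after SIGTERM, or "killed" if SIGKILL was sent.
    May raise PermissionError when run by an unprivileged user.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "gone"

    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the PID exists
        except ProcessLookupError:
            return "terminated"
        time.sleep(poll_interval)

    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return "terminated"
    # SIGKILL was delivered; a process in state D will still not exit
    # until the kernel call it is blocked in returns.
    return "killed"
```

Two caveats apply. A process stuck in state D stays in the process
table even after "killed" is returned, which matches the behaviour
described at the top of this message. A defunct (Z) child that its
parent has not yet reaped still answers the signal-0 check, so the
helper reports it as present until it is reaped.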

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.



