postmaster dead now !! - Mailing list pgsql-sql
From | Rajesh Kumar Mallah. |
---|---|
Subject | postmaster dead now !! |
Date | |
Msg-id | 200205101713.14954.mallah@trade-india.com |
In response to | Re: pg_stat_get_backend_pid seems to be listing non existant (Jan Wieck <janwieck@yahoo.com>) |
Responses |
Re: postmaster dead now !!
pg_ctl stop does not work. core file found... |
List | pgsql-sql |
Hi my postmaster died just now, only a bunch of backends running

[rmallah@server rmallah]$ psql -h 130.94.22.209 -U tradein tradein_clients
psql: could not connect to server: Connection refused
    Is the server running on host 130.94.22.209 and accepting
    TCP/IP connections on port 5432?
[rmallah@server rmallah]$

output of ps
============
[root@linux10320 root2]# ps auxwww| grep post
postgres  5598  0.0  0.0 140412    4 ?  D  May07   2:31 postgres: stats buffer process
postgres  5599  1.1  0.0 142396   20 ?  R  May07  48:48 postgres: stats collector process
postgres 12262  0.0  0.0 238712    4 ?  D  May09   0:19 postgres: tradein tradein_clients 130.94.20.27 SELECT
postgres 13039  0.0  0.0 139812    4 ?  D  May09   0:00 postgres: checkpoint subprocess
postgres 29440  0.0  0.8 140664 9256 ?  S  14:35   0:01 postgres: tradein tradein_clients 203.196.129.235 idle
postgres  6805  0.0  0.0 140196    4 ?  S  16:08   0:00 postgres: tradein tradein_clients 203.196.129.235 idle
postgres 10154  0.0  0.0 140196    4 ?  S  16:38   0:00 postgres: tradein tradein_clients 203.196.129.235 idle
postgres 10446  0.0  0.0 140164    4 ?  S  16:43   0:00 postgres: postgres tradein_clients 203.196.129.235 idle
[root@linux10320 root2]#
=============

output of top
=============
5:29pm up 3 days, 24 min, 4 users, load average: 5.68, 5.61, 5.70
54 processes: 52 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: 5.8% user, 44.3% system, 0.0% nice, 49.8% idle
Mem: 1028484K av, 900084K used, 128400K free, 0K shrd, 2968K buff
Swap: 971004K av, 99288K used, 871716K free 857220K cached

  PID USER     PRI NI  SIZE  RSS SHARE STAT LIB %CPU %MEM  TIME COMMAND
 5599 postgres  17  0  2064   20    20 R      0 99.9  0.0 53:25 postmaster
 5598 postgres   9  0  1440    4     4 D      0  0.0  0.0  2:31 postmaster
12262 postgres   9  0 88564    4     4 D      0  0.0  0.0  0:19 postmaster
13039 postgres   9  0   656    4     4 D      0  0.0  0.0  0:00 postmaster
29440 postgres   9  0 10512 9256  9256 S      0  0.0  0.8  0:01 postmaster
 6805 postgres   9  0   972    4     4 S      0  0.0  0.0  0:00 postmaster
10154 postgres   9  0   968    4     4 S      0  0.0  0.0  0:00 postmaster
10446 postgres   9  0   964    4     4 S      0  0.0  0.0  0:00 postmaster
==========================================================================

On Friday 10 May 2002 04:18 pm, Jan Wieck wrote:
> Rajesh Kumar Mallah. wrote:
> > Hi Folks,
> > please help ,
> >
> > therse seems to be too much lag between the access collector
> > and system status. even the pids of backend does not seems to be
> > matching.
>
> The delay is on average 250 milliseconds for a busy database
> (1/4 second). The controlling definition is in
>
> src/include/pgstat.h:
> #define PGSTAT_STAT_INTERVAL 500
>
> This means, from the moment ANY statistic packet has arrived
> in the collector, it waits 500 milliseconds before writing
> out all information. Thus, the above 250 milliseconds
> average is only true assuming a constant flow of packets.
>
> And, before you discover this one: The backends send their
> statistic collection information via UDP packets. In the case
> of heavy database load, some of these packets can get lost so
> that the statistics will not be 100% accurate.
> This is a wanted feature and implemented on purpose! It is because
> counting the number of scans isn't considered as much
> important as responding to the client as fast as possible
> during the rushhour.
>
>
> Jan
>
> > tradein_clients=# SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
> > pg_stat_get_backend_activity(s.backendid) AS current_query FROM (SELECT
> > pg_stat_get_backend_idset() AS backendid) s;
> >
> >  procpid | current_query
> > ---------+-------------------------------
> >    27134 | <IDLE> in transaction
> >    26958 | <IDLE> in transaction
> >    26953 | <IDLE> in transaction
> >    26960 | <IDLE> in transaction
> >    27008 | <IDLE> in transaction
> >    12839 | <IDLE>
> >    26977 | <IDLE> in transaction
> >    27012 | <IDLE> in transaction
> >    31354 | <IDLE>
> >    27014 | <IDLE> in transaction
> >    27015 | <IDLE> in transaction
> >    26978 | <IDLE> in transaction
> >    26985 | <IDLE> in transaction
> >    27135 | select count(*) from ( select distinct on (email_id)
> > email_id,email,contact from email_bank a join (select email_id from
> > email_export_category where category_id in (1, 2, 3, 4, 5, 6, 7, 8, 9,
> > 10, 12, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 1
> >    12262 | SELECT source_id , cnt from (SELECT
> > source_id,count(source_id) as cnt from email_source group by source_id )
> > subsel join sources
> > using(source_id) order by source_id
> >    27136 | <IDLE> in transaction
> > (16 rows)
> >
> > tradein_clients=#
> >
> > why does the above not match with the "top" output at
> > the same time:
> >
> > ==============================================================================
> > 4:10pm up 2 days, 23:06, 2 users, load average: 6.21, 6.06, 5.60
> > 69 processes: 66 sleeping, 3 running, 0 zombie, 0 stopped
> > CPU states: 55.6% user, 2.0% system, 0.0% nice, 42.3% idle
> > Mem: 1028484K av, 980320K used, 48164K free, 0K shrd, 3744K buff
> > Swap: 971004K av, 102532K used, 868472K free 912724K cached
> >
> >   PID USER     PRI NI  SIZE  RSS SHARE STAT LIB %CPU %MEM  TIME COMMAND
> >  5456 postgres  17  0 59456  57M 57156 R      0 99.1  5.7  3:12 postmaster
> >  6601 postgres   9  0 79964  77M 78328 S      0  3.3  7.7  0:01 postmaster
> >  6779 postgres   9  0 88412  86M 86752 S      0  1.9  8.5  0:00 postmaster
> >  6703 postgres   9  0 81668  79M 80276 S      0  1.7  7.9  0:01 postmaster
> >  6943 postgres   9  0 78732  76M 77520 S      0  1.7  7.6  0:01 postmaster
> >  6940 postgres   9  0 44180  42M 42668 S      0  0.5  4.2  0:00 postmaster
> >  6776 postgres   9  0  121M 121M  119M S      0  0.3 12.0  0:01 postmaster
> >  5597 postgres   8  0   624  248   216 S      0  0.0  0.0  0:24 postmaster
> >  5598 postgres   9  0  1440    4     4 D      0  0.0  0.0  2:31 postmaster
> >  5599 postgres   9  0  2052    4     4 S      0  0.0  0.0 28:05 postmaster
> > 12262 postgres   9  0 88564    4     4 D      0  0.0  0.0  0:19 postmaster
> > 13039 postgres   9  0   656    4     4 D      0  0.0  0.0  0:00 postmaster
> > 29440 postgres   9  0 20928  19M 20332 S      0  0.0  1.9  0:01 postmaster
> >  1652 postgres   9  0  3356 2324  2144 S      0  0.0  0.2  3:21 postmaster
> >  2219 postgres   9  0  2744 2120  2068 S      0  0.0  0.2  0:00 postmaster
> >  6772 postgres   9  0  100M 100M 99.3M S      0  0.0  9.9  0:00 postmaster
> >  6805 postgres   9  0  4440 4168  3532 S      0  0.0  0.4  0:00 postmaster
> >  6809 postgres   9  0 35280  34M 33948 S      0  0.0  3.4  0:00 postmaster
> >  6846 postgres   9  0 98.9M  98M 99804 S      0  0.0  9.8  0:01 postmaster
> >  6931 postgres   9  0 21744  20M 20428 S      0  0.0  2.0  0:02 postmaster
> >  6934 postgres   9  0 19020  18M 17868 S      0  0.0  1.8  0:00 postmaster
> >  6941 postgres   9  0 63280  61M 61756 S      0  0.0  6.1  0:01 postmaster
> > ================================================================================
> >
> > [root@linux10320 root2]# kill -INT 27135
> > bash: kill: (27135) - No such pid
> > [root@linux10320 root2]#
> >
> >
> > and # kill -INT 12262 does not actually kills it ??
> >
> > regds
> > mallah.
> >
> > --
> > Rajesh Kumar Mallah,
> > Project Manager (Development)
> > Infocom Network Limited, New Delhi
> > phone: +91(11)6152172 (221) (L) ,9811255597 (M)
> >
> > Visit http://www.trade-india.com ,
> > India's Leading B2B eMarketplace.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.
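
The 250 millisecond average quoted from Jan Wieck above can be checked with a small model. The sketch below is not the collector's code: it only assumes, as the quoted explanation says, that the first packet to arrive starts a wait of PGSTAT_STAT_INTERVAL (500 ms) before everything is written out. The function name and the arrival patterns are made up for illustration.

```python
def average_visibility_delay_ms(arrivals_ms, interval_ms=500):
    """Mean time between a packet's arrival and the next stats write.

    Simplified model (an assumption, not the real collector): when no
    write is pending, an arriving packet schedules one interval_ms later,
    and every packet arriving before that write becomes visible with it.
    """
    delays = []
    next_write = None
    for t in sorted(arrivals_ms):
        if next_write is None or t >= next_write:
            # No write pending: this packet starts a new wait.
            next_write = t + interval_ms
        delays.append(next_write - t)
    return sum(delays) / len(delays)


# Constant flow: one packet per millisecond for 10 seconds.
print(average_visibility_delay_ms(range(0, 10_000)))  # 250.5

# A single isolated packet waits the full interval.
print(average_visibility_delay_ms([0]))  # 500.0
```

With a steady stream of packets the mean comes out at about half the interval, and with sparse traffic it approaches the full 500 ms, which is why the 250 ms figure only holds for a busy database.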