rebellious pg stats collector (reopened case) - Mailing list pgsql-admin

From Laszlo Nagy
Subject rebellious pg stats collector (reopened case)
Date
Msg-id 49479876.8040607@shopzeus.com
Responses Re: rebellious pg stats collector (reopened case)  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-admin
PostgreSQL 8.3.5. The system is now stable (uptime > 10 days), but the
PostgreSQL stats collector still uses 100% CPU continuously:

On Thursday:

last pid: 29509;  load averages:  2.36,  2.01,  2.03
up 5+17:28:56  04:02:53
196 processes: 3 running, 184 sleeping, 9 zombie
CPU states:  5.3% user,  0.0% nice, 15.3% system,  0.0% interrupt, 79.4% idle
Mem: 1009M Active, 5995M Inact, 528M Wired, 354M Cache, 214M Buf, 12M Free
Swap: 16G Total, 500K Used, 16G Free

  PID USERNAME       THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
26709 pgsql            1 106    0 22400K  6832K CPU6   6 973:52 99.02% postgres

Today:

last pid: 71326;  load averages:  2.69,  3.15,  2.92
up 10+20:13:44  06:47:41
176 processes: 3 running, 166 sleeping, 7 zombie
CPU states:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 928M Active, 5868M Inact, 557M Wired, 380M Cache, 214M Buf, 172M Free
Swap: 16G Total, 620K Used, 16G Free

  PID USERNAME       THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
44689 pgsql            1 107    0 22400K  7060K CPU6   6 748:07 99.02% postgres
64221 www              1  96    0   158M 27628K select 5   0:18  4.20% httpd
68567 www              1  20    0   151M 23092K lockf  0   0:03  2.39% httpd

...

shopzeus# uptime
 6:48AM  up 10 days, 20:14, 1 user, load averages: 2.21, 3.01, 2.87

More than 10 hours of CPU time on a dual quad-core Xeon 5420? We have two
databases, and the total database size is about 15GB.
(The stats collector also generates significant disk I/O.)
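
For reference, the "more than 10 hours" figure can be checked from the TIME column of the top output. The sketch below assumes that TIME is printed as minutes:seconds of accumulated CPU time; the function name is made up for illustration.

```python
def top_time_to_hours(value):
    """Convert a top TIME value such as '748:07' to hours.

    Assumption: the value is formatted as minutes:seconds.
    """
    minutes, seconds = value.split(":")
    return (int(minutes) * 60 + int(seconds)) / 3600.0


# TIME values of the two stats collector processes shown above.
for label, value in (("Thursday", "973:52"), ("Today", "748:07")):
    print("%s: %.1f hours of CPU time" % (label, top_time_to_hours(value)))
```

Under that assumption, both collector processes had accumulated well over 10 hours of CPU time when the snapshots were taken.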

Thursday:

# date
Thu Dec 11 04:05:00 EST 2008
# ls -l ~pgsql/data/
# ls -l ~pgsql/data/global/pgstat.stat
-rw-------  1 pgsql  pgsql  231673 Dec 10 12:01 /usr/local/pgsql/data/global/pgstat.stat

Today:

# date
Tue Dec 16 06:48:27 EST 2008
# cd ~pgsql/data
# ls -l global/pgstat.stat
-rw-------  1 pgsql  pgsql  232358 Dec 15 18:45 global/pgstat.stat

It looks like pgstat.stat has not been updated since the pg stats
collector (re)started.
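
The manual comparison of `date` against `ls -l` above could be automated along the following lines. This is only a sketch: the helper names and the 60-second threshold in the usage note are assumptions for illustration, not values taken from PostgreSQL.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Return how many seconds ago the file at `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime


def is_stale(path, max_age_seconds, now=None):
    """Return True if the file is older than `max_age_seconds`.

    The threshold is chosen by the caller; nothing here knows how often
    the stats collector is supposed to rewrite the file.
    """
    return seconds_since_update(path, now) > max_age_seconds
```

For example, `is_stale("/usr/local/pgsql/data/global/pgstat.stat", 60)` would report whether the file from the listing above is more than a minute old. The script has to run as a user that is allowed to stat the file.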

# uname -a
FreeBSD shopzeus.com 7.0-RELEASE-p5 FreeBSD 7.0-RELEASE-p5 #0: Mon Nov 17 21:37:25 EST 2008 root@shopzeus.chello.hu:/usr/obj/usr/src/sys/SHOPZEUS  amd64

After restarting the postmaster, the process disappears for a while
(some hours, sometimes for one day), then it starts updating the stat
file correctly.

Please advise.

Thanks,

  Laszlo

