"Guillaume Smet" <guillaume.smet@gmail.com> writes:
> Here is the top output I captured on November 17, when the server hung
> completely (several minutes for each page of the website). It is typical
> of this server's behaviour:
> 17:08:41 up 19 days, 15:16, 1 user, load average: 4.03, 4.26, 4.36
> 288 processes: 285 sleeping, 3 running, 0 zombie, 0 stopped
> CPU states:    cpu    user   nice  system    irq  softirq  iowait    idle
>              total   59.0%   0.0%    8.8%   0.2%     0.0%    0.0%   31.9%
>              cpu00   52.3%   0.0%   13.3%   0.9%     0.0%    0.0%   33.3%
>              cpu01   65.7%   0.0%    7.6%   0.0%     0.0%    0.0%   26.6%
>              cpu02   58.0%   0.0%    7.6%   0.0%     0.0%    0.0%   34.2%
>              cpu03   60.0%   0.0%    6.6%   0.0%     0.0%    0.0%   33.3%
> Mem:  3857224k av, 3495880k used,  361344k free,       0k shrd,   92160k buff
>                    2374048k actv,  463576k in_d,   37708k in_c
> Swap: 4281272k av,   25412k used, 4255860k free                 2173392k cached
> As you can see, the load is stuck at about 4, with no iowait and about 30% CPU idle.
Can you try strace'ing some of the backend processes while the system is
behaving like this? I suspect what you'll find is a whole lot of
delaying select() calls due to high contention for spinlocks ...
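
One way to tally such a trace is sketched below in Python. It assumes the
trace was taken with strace's -T option, which appends the time spent in
each call as <seconds>, and saved to a file. The function name
summarize_syscalls and the sample line in the comment are illustrative
only, not taken from this server. If the spinlock theory holds, one would
expect select() calls with NULL descriptor sets and short timeouts to
dominate the counts.

```python
import re
from collections import Counter

# Matches one strace line such as (hypothetical sample, needs strace -T):
#   select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout) <0.010234>
# An optional "[pid N]" prefix is tolerated for traces taken with -f.
LINE_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(.*?<(\d+\.\d+)>\s*$")


def summarize_syscalls(lines):
    """Return (call counts, total seconds per call) for strace -T lines."""
    counts = Counter()
    seconds = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip signal notices and lines without a <time> field
        name = match.group(1)
        counts[name] += 1
        seconds[name] += float(match.group(2))
    return counts, seconds
```

For example, attach to one backend for a few seconds with
strace -T -p <backend pid> -o trace.txt, then pass the lines of trace.txt
to summarize_syscalls(). A select count that dwarfs every other call would
point toward the contention suspected above.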
regards, tom lane