Re: [GENERAL] core system is getting unresponsive because over 300cpu load - Mailing list pgsql-general

From Andres Freund
Subject Re: [GENERAL] core system is getting unresponsive because over 300cpu load
Date
Msg-id 20171011001806.n7biw2lps2iq3yt7@alap3.anarazel.de
In response to [GENERAL] core system is getting unresponsive because over 300 cpu load  (pinker <pinker@onet.eu>)
Responses Re: [GENERAL] core system is getting unresponsive because over 300 cpu load
List pgsql-general
Hi,

On 2017-10-10 13:40:07 -0700, pinker wrote:
> and the total number of connections are increasing very fast (but I suppose
> it's the symptom not the root cause of cpu load) and exceed max_connections
> (1000).

Others have already mentioned that that's worth improving.

> System:
> * CentOS Linux release 7.2.1511 (Core) 
> * Linux 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux

Some versions of this kernel have had serious problems with transparent
hugepages. I'd try turning them off. Also make sure zone_reclaim_mode is
disabled; I think it defaults to off even in that version.
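
A minimal sketch of how both settings could be checked, assuming the
usual locations /sys/kernel/mm/transparent_hugepage/enabled and
/proc/sys/vm/zone_reclaim_mode (the paths may differ on some kernels).
The helper names are made up for illustration. The kernel marks the
active THP mode with square brackets, e.g. "always madvise [never]".

```python
import re
from pathlib import Path

# Assumed locations; adjust if your kernel exposes them elsewhere.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
ZRM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")


def active_thp_mode(text):
    """Return the bracketed (active) mode, e.g. 'never' from
    'always madvise [never]', or None if nothing is bracketed."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_or_none(path):
    """Read a small sysfs/procfs file, returning None if it is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def report():
    """Print the current THP mode and zone_reclaim_mode value."""
    thp = read_or_none(THP_PATH)
    zrm = read_or_none(ZRM_PATH)
    print("transparent hugepages:",
          active_thp_mode(thp) if thp is not None else "unknown")
    print("zone_reclaim_mode:", zrm if zrm is not None else "unknown")
```

Calling report() prints the current values. You'd want to see "never"
for transparent hugepages and 0 for zone_reclaim_mode.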


> * postgresql95-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-contrib-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-docs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-libs-9.5.5-1PGDG.rhel7.x86_64
> * postgresql95-server-9.5.5-1PGDG.rhel7.x86_64
> 
> * 4 sockets/80 cores

9.6 has quite a few scalability improvements over 9.5. I don't know
whether it's feasible for you to upgrade, but if so, it's worth trying.

How about taking a perf profile to investigate?


> * vm.dirty_background_bytes = 0
> * vm.dirty_background_ratio = 2
> * vm.dirty_bytes = 0
> * vm.dirty_expire_centisecs = 3000
> * vm.dirty_ratio = 20
> * vm.dirty_writeback_centisecs = 500

I'd suggest monitoring /proc/meminfo for the amount of Dirty and
Writeback memory, and seeing whether rapid changes therein coincide with
periods of slowdown.
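
A rough sketch of such monitoring, assuming the standard "Key: value kB"
layout of /proc/meminfo. The function names and the one-second interval
are arbitrary choices for illustration. Each sample is timestamped so it
can be lined up against the periods of slowdown.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integer values
    (kB for most keys; a few keys, like HugePages_Total, are counts)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def sample(interval=1.0, count=None, path="/proc/meminfo"):
    """Print Dirty and Writeback, plus the change since the previous
    sample, every `interval` seconds. Runs until interrupted if
    `count` is None."""
    prev = None
    taken = 0
    while count is None or taken < count:
        with open(path) as f:
            info = parse_meminfo(f.read())
        dirty = info.get("Dirty", 0)
        writeback = info.get("Writeback", 0)
        # Deltas are zero for the very first sample.
        d_dirty = dirty - prev[0] if prev else 0
        d_wb = writeback - prev[1] if prev else 0
        print(f"{time.strftime('%H:%M:%S')} "
              f"Dirty={dirty} kB ({d_dirty:+d}) "
              f"Writeback={writeback} kB ({d_wb:+d})")
        prev = (dirty, writeback)
        taken += 1
        time.sleep(interval)
```

Running sample() in a terminal (or redirecting its output to a file)
while the problem occurs should show whether large jumps in either
number line up with the stalls.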


Greetings,

Andres Freund


