Re: CPU load spikes when CentOS tries to reclaim 'cached' memory - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: CPU load spikes when CentOS tries to reclaim 'cached' memory
Date
Msg-id CAHyXU0xe3apSAg-UGuZ_qQU3feEodtEXntT2uNeP04fN-VQXvg@mail.gmail.com
In response to Re: CPU load spikes when CentOS tries to reclaim 'cached' memory  (Deron <fecastle@gmail.com>)
Responses Re: CPU load spikes when CentOS tries to reclaim 'cached' memory  (Vincent Lasmarias <vlasmarias@contigo.com>)
List pgsql-performance
On Thu, Jun 5, 2014 at 2:47 PM, Deron <fecastle@gmail.com> wrote:
> We saw very similar issues with a CentOS server with 40 cores (32
> virtualized) when moving from a physical server to a virtual server (I think
> it had 128GB RAM).   Never had the problem on a physical server.  We checked
> the same things as noted here, but never found a bug.   We really thought it
> had something to do with NUMA zone reclaim, but could never prove that.
> In our case it was all kernel time in the guest, all CPUs at 100%.
> Sometimes it would last for a few seconds or minutes.  Sometimes we would go
> days without a problem, and then it would completely tank.
>
> If you figure out what is going on, I would like to know  (especially if it
> is virtualized).

There is a class of problems in virtualized environments that comes from
over-aggressive reclaiming of memory from the guest to the host.  When
the guest tries to access the 'unpinned' memory it will manifest as
high-latency memory reads and show up as high user time.  That may or
may not be the case here.

What we'd need from the OP to get a better diagnosis is:
*) top/sar output showing whether the load average is due to high user,
sys, or iowait
*) whether the system is or is not virtualized, as noted above
*) captured 'perf' snapshot during slowdown, particularly if we are
seeing high user space loads.  For example, we could be looking at
high spinlock activity (which seems unlikely given how the problem is
described but is something to rule out for sure).
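
For the first item, if top/sar output is hard to capture at the right
moment, the same user/sys/iowait split can be approximated from two
readings of the aggregate "cpu" line in /proc/stat.  The following is
only a rough sketch, assuming a Linux guest; the function names are
made up for illustration and are not from any tool mentioned in this
thread.  It also reports steal time, which is worth a glance on a
virtualized box.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of
# /proc/stat (see proc(5)); values are cumulative ticks since boot.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Very old kernels report fewer columns; treat missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample(interval=1.0):
    """Take two readings 'interval' seconds apart and return the breakdown."""
    before = parse_cpu_line(read_cpu_line())
    time.sleep(interval)
    after = parse_cpu_line(read_cpu_line())
    return cpu_breakdown(before, after)
```

Calling sample() in a loop during a slowdown and logging the result
would show whether the spike is mostly "user", "system", or "iowait".
This does not replace a 'perf' capture; it only narrows down which of
the three cases we are looking at.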

merlin

