Thread: kswapd 100%, swap full, vm.swappiness=0

kswapd 100%, swap full, vm.swappiness=0

From: Scott Marlowe
Date: 07 October 2010, 15:11:20

Hardware:
48 core AMD Magny Cours (4x12)
128G 1333MHz memory
34 15k6 drives, 2 hot spares, rest in RAID-1 pairs, 1 set for OS, 4
for pg_xlog, rest for /data/base
LSI 8888 RAID controller
OS:
Ubuntu 10.04

uname -a
Linux bigassdbserver 2.6.32-24-generic #38-Ubuntu SMP Mon Jul 5
09:20:59 UTC 2010 x86_64 GNU/Linux

scheduler = noop for all drive sets.
Settings for sysctl.conf:
 vm.zone_reclaim_mode = 0
 kernel.shmmax = 33554432000
 kernel.shmall = 2097152000
 kernel.shmmni = 4096
 vm.swappiness = 0
 vm.dirty_ratio = 2
 vm.dirty_background_ratio = 1
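
For scale, the shared memory and dirty-page settings above can be converted into sizes. This is only a rough sketch: it assumes a 4 KiB page size (typical for x86_64, but not stated in the post), and it applies the dirty ratios to total RAM, which gives an upper bound because the kernel applies them to available memory rather than to the total.

```python
# Convert the sysctl values listed above into GiB.
# Assumptions (not from the post): 4 KiB pages; dirty ratios taken
# against total RAM, which yields an upper bound only.
PAGE_SIZE = 4096
GIB = 1024 ** 3

shmmax_bytes = 33554432000      # kernel.shmmax is expressed in bytes
shmall_pages = 2097152000       # kernel.shmall is expressed in pages
dirty_ratio = 2                 # vm.dirty_ratio, percent
dirty_background_ratio = 1      # vm.dirty_background_ratio, percent
total_ram_bytes = 131651412 * 1024  # "total" column of free, printed in KiB

shmmax_gib = shmmax_bytes / GIB
shmall_gib = shmall_pages * PAGE_SIZE / GIB
dirty_gib = total_ram_bytes * dirty_ratio / 100 / GIB
dirty_background_gib = total_ram_bytes * dirty_background_ratio / 100 / GIB

print(f"shmmax: {shmmax_gib:.2f} GiB per segment")
print(f"shmall: {shmall_gib:.0f} GiB system-wide")
print(f"dirty_ratio ceiling: {dirty_gib:.2f} GiB")
print(f"dirty_background_ratio ceiling: {dirty_background_gib:.2f} GiB")
```

Under these assumptions shmall is far larger than shmmax, so shmmax is the limit that matters for a single shared memory segment.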

$ free
             total       used       free     shared    buffers     cached
Mem:     131651412  104986524   26664888          0     910804   91170764
-/+ buffers/cache:   12904956  118746456
Swap:            0          0          0

(Swap is now off via sudo swapoff -a, which fixed the problem.)

Its twin, the read slave, looks like this:

$ free
             total       used       free     shared    buffers     cached
Mem:     131651412  110364700   21286712          0     702144   96771656
-/+ buffers/cache:   12890900  118760512
Swap:     25388024        940   25387084

So, this morning, the machine goes into 100% swap usage.  Four kswapd
threads are running at 100% CPU, mostly in D state.  Load climbs to 300.
Server gets a little slow.  swapoff -a fixes it.

This makes no sense to me.  The machine had 90G+ in kernel cache, and
was NOT running out of memory in any way.  Swappiness is 0.
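
The claim that the machine was not short of memory can be checked from the first free output above. A minimal sketch that recomputes the "-/+ buffers/cache" row from the "Mem:" row; the variable names are illustrative, and the values are in KiB as free prints them by default:

```python
# Recompute the "-/+ buffers/cache" row from the "Mem:" row of the
# first free output in this post. All values are in KiB.
GIB_IN_KIB = 1024 ** 2

mem = {
    "total": 131651412,
    "used": 104986524,
    "free": 26664888,
    "buffers": 910804,
    "cached": 91170764,
}

# Memory held by applications once reclaimable cache is excluded.
app_used = mem["used"] - mem["buffers"] - mem["cached"]
# Memory that is free or could be reclaimed from buffers and cache.
free_or_reclaimable = mem["free"] + mem["buffers"] + mem["cached"]

print(f"used by applications: {app_used} KiB ({app_used / GIB_IN_KIB:.1f} GiB)")
print(f"free or reclaimable:  {free_or_reclaimable} KiB "
      f"({free_or_reclaimable / GIB_IN_KIB:.1f} GiB)")
```

Both results match the 12904956 and 118746456 that free itself reports, so applications held only about a tenth of the RAM at the time of that snapshot.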

Any advice on this is welcome, including whether to report it to the
kernel developers.

--
To understand recursion, one must first understand recursion.

Re: kswapd 100%, swap full, vm.swappiness=0

From: Allan Kamau
Date: 07 October 2010, 16:47:00

On Thu, Oct 7, 2010 at 9:11 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Hardware:
> 48 core AMD Magny Cours (4x12)
> 128G 1333MHz memory
> 34 15k6 drives, 2 hot spares, rest in RAID-1 pairs, 1 set for OS, 4
> for pg_xlog, rest for /data/base
> LSI 8888 RAID controller
> OS:
> Ubuntu 10.04
>
> uname -a
> Linux bigassdbserver 2.6.32-24-generic #38-Ubuntu SMP Mon Jul 5
> 09:20:59 UTC 2010 x86_64 GNU/Linux
>
> scheduler = noop for all drive sets.
> Settings for sysctl.conf:
>  vm.zone_reclaim_mode = 0
>  kernel.shmmax = 33554432000
>  kernel.shmall = 2097152000
>  kernel.shmmni = 4096
>  vm.swappiness = 0
>  vm.dirty_ratio = 2
>  vm.dirty_background_ratio = 1
>
> $ free
>             total       used       free     shared    buffers     cached
> Mem:     131651412  104986524   26664888          0     910804   91170764
> -/+ buffers/cache:   12904956  118746456
> Swap:            0          0          0
>
> (swap is now off with sudo swapoff -a, it fixed the problem)
>
> It's twin, the read slave, looks like this:
>
> $ free
>             total       used       free     shared    buffers     cached
> Mem:     131651412  110364700   21286712          0     702144   96771656
> -/+ buffers/cache:   12890900  118760512
> Swap:     25388024        940   25387084
>
> So, this morning, the machine goes into 100% swap usage.  four kswapds
> are running at 100% CPU in mostly D state.  Load climbs to 300.
> Server gets a little slow.  Swapoff -a fixes it.
>
> This makes no sense to me.  The machine had 90G+ in kernel cache, and
> was NOT running out of memory in any way.  Swappiness is 0.
>
> Any advice on this, reporting it to the kernel guys etc welcome.



My wild guess is that Ubuntu may be to blame. Try restarting PG;
chances are that it will not solve the problem, which would mean that
it is most likely an OS issue. I had similar experiences with a
PostgreSQL server hosted on Ubuntu: after the computer had been running
for a couple of days, "free -g" would display no (or very few) free GBs
of RAM. With Fedora I have not noticed this problem. For some reason I
seem to have issues with Ubuntu/Kubuntu but not Fedora.


Allan.

Re: kswapd 100%, swap full, vm.swappiness=0

From: Scott Marlowe
Date: 07 October 2010, 16:49:57

On Thu, Oct 7, 2010 at 1:46 PM, Allan Kamau <kamauallan@gmail.com> wrote:

> My wild guess is that Ubuntu may be to blame. Try restarting PG and
> chances are that it would not solve the problem, meaning that it is
> most likely an OS issue. I had similar experiences on PostgreSQL
> server hosted on Ubuntu. After a couple of days having the computer
> running "free -g" would display no (or a very few) free GBs of RAM.
> With Fedora I have not noticed this problem. For some reason I seem to
> have issues with Ubuntu/Kubuntu but not Fedora.

I would definitely tend to agree, but I'm more suspicious of a
late-model kernel than of the specific distro.  Note that this machine
has 60 days of uptime with no behaviour like this before.  For now I'm
just running it with swap turned off.  It's got 128GB of RAM; if it
runs out of that, I've got other problems. :)
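
One way to confirm that swap really is off, and to keep an eye on it if it is turned back on, is to read SwapTotal and SwapFree from /proc/meminfo. A minimal sketch; the helper names parse_meminfo and swap_status are made up for illustration, and the sample text reuses the swap figures from the free outputs earlier in the thread rather than a real /proc/meminfo capture:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_status(fields):
    """Report whether swap is enabled and how many kB of it are in use."""
    total = fields.get("SwapTotal", 0)
    used = total - fields.get("SwapFree", 0)
    return {"enabled": total > 0, "used_kb": used}


# Sample inputs built from the numbers in this thread (illustrative only).
slave_sample = """MemTotal:       131651412 kB
SwapTotal:       25388024 kB
SwapFree:        25387084 kB
"""
master_sample = """MemTotal:       131651412 kB
SwapTotal:              0 kB
SwapFree:               0 kB
"""

print("slave: ", swap_status(parse_meminfo(slave_sample)))
print("master:", swap_status(parse_meminfo(master_sample)))
```

On a live Linux host the same functions could be fed the contents of /proc/meminfo instead of the sample strings.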
--
To understand recursion, one must first understand recursion.