ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine - Mailing list pgsql-performance

From Montana Low
Subject ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine
Date
Msg-id CAJ=goorxHNiQFtOkxP54RuTD3ZJzmK5pK0uz9GG833s6dUYC7Q@mail.gmail.com
Whole thread Raw
Responses Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine
Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine
List pgsql-performance
I'm running postgres-9.3 on a 30GB ec2 xen instance w/ linux kernel 3.16.3. I receive numerous Error: out of memory messages in the log, which are aborting client requests, even though there appears to be 23GB available in the OS cache.

There is no swap on the box. Postgres is behind pgbouncer to protect from the 200 real clients, which limits connections to 32, although there are rarely more than 20 active connections, even though postgres max_connections is set very high for historic reasons. There is also a 4GB java process running on the box.




relevant postgresql.conf:

max_connections = 1000                  # (change requires restart)
shared_buffers = 7GB                    # min 128kB
work_mem = 40MB                         # min 64kB
maintenance_work_mem = 1GB              # min 1MB
effective_cache_size = 20GB



sysctl.conf:

vm.swappiness = 0
vm.overcommit_memory = 2
kernel.shmmax=34359738368
kernel.shmall=8388608



log example:

ERROR:  out of memory
DETAIL:  Failed on request of size 67108864.
STATEMENT:  SELECT  "package_texts".* FROM "package_texts"  WHERE "package_texts"."id" = $1 LIMIT 1



example pg_top, showing 23GB available in cache:

last pid:  6607;  load avg:  3.59,  2.32,  2.61;       up 16+09:17:29 20:49:51
18 processes: 1 running, 17 sleeping
CPU states: 22.5% user,  0.0% nice,  4.9% system, 63.2% idle,  9.4% iowait
Memory: 29G used, 186M free, 7648K buffers, 23G cached
DB activity: 2479 tps,  1 rollbs/s, 217 buffer r/s, 99 hit%,  11994 row r/s, 3820 row w/s  
DB I/O:     0 reads/s,     0 KB/s,     0 writes/s,     0 KB/s  
DB disk: 149.8 GB total, 46.7 GB free (68% used)
Swap:



example top showing the only other significant 4GB process on the box:

top - 21:05:09 up 16 days,  9:32,  2 users,  load average: 2.73, 2.91, 2.88
Tasks: 147 total,   3 running, 244 sleeping,   0 stopped,   0 zombie
%Cpu(s): 22.1 us,  4.1 sy,  0.0 ni, 62.9 id,  9.8 wa,  0.0 hi,  0.7 si,  0.3 st
KiB Mem:  30827220 total, 30642584 used,   184636 free,     7292 buffers
KiB Swap:        0 total,        0 used,        0 free. 23449636 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                   7407 postgres  20   0 7604928  10172   7932 S  29.6  0.0   2:51.27 postgres
10469 postgres  20   0 7617716 176032 160328 R  11.6  0.6   0:01.48 postgres
10211 postgres  20   0 7630352 237736 208704 S  10.6  0.8   0:03.64 postgres
18202 elastic+  20   0 8726984 4.223g   4248 S   9.6 14.4 883:06.79 java
9711 postgres  20   0 7619500 354188 335856 S   7.0  1.1   0:08.03 postgres
3638 postgres  20   0 7634552 1.162g 1.127g S   6.6  4.0   0:50.42 postgres
 

pgsql-performance by date:

Previous
From: Igor Neyman
Date:
Subject: Re: Query with large number of joins
Next
From: Tom Lane
Date:
Subject: Re: ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine