Re: Memory Usage and OpenBSD - Mailing list pgsql-general

From Jeff Ross
Subject Re: Memory Usage and OpenBSD
Date
Msg-id 4B72FCF0.1060504@wykids.org
In response to Re: Memory Usage and OpenBSD  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Memory Usage and OpenBSD  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-general
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> On Tue, Feb 09, 2010 at 08:19:51PM +0500, Anton Maksimenkov wrote:
>>> Can anybody briefly explain to me how one postgres process allocates
>>> memory for its needs?
>>
>> There's no real maximum, as it depends on the exact usage. However, in
>> general postgres tries to keep below the values in work_mem and
>> maintenance_work_mem. Most of the allocations are quite small, but
>> postgresql has an internal allocator which means that the system only
>> sees relatively large allocations. The majority will be in the order of
>> tens of kilobytes, I suspect.
>
> IIRC, the complaint that started this thread was about a VACUUM command
> failing.  Plain VACUUM will in fact start out by trying to acquire a
> single chunk of size maintenance_work_mem.  (On a small table it might
> not be so greedy, but on a large table it will do that.)  So you
> probably shouldn't ever try to set that value as large as 1GB if you're
> working in a 32-bit address space.  You could maybe do it if you've kept
> shared_buffers small, but that seems like the wrong performance tradeoff
> in most cases ...
>
> 		regards, tom lane

That would have been my original message.

I've been running a series of pgbench tests on an i386 dual-processor Xeon server with 4GB of RAM and a RAID10 array on an LSI MegaRAID controller with BBU.  I fixed the original problem by re-enabling better login.conf values for the postgresql user.

I ran the pgtune wizard and started with the settings it produced.  On i386 OpenBSD the recommended settings are far too large and cause a kernel panic in short order.  Here are the settings that pgtune gives for -T web and -c 200 (the invocation I used is sketched after the list):

maintenance_work_mem = 240MB # pgtune wizard 2010-02-10
effective_cache_size = 2816MB # pgtune wizard 2010-02-10
work_mem = 18MB # pgtune wizard 2010-02-10
wal_buffers = 4MB # pgtune wizard 2010-02-10
checkpoint_segments = 8 # pgtune wizard 2010-02-10
shared_buffers = 960MB # pgtune wizard 2010-02-10
max_connections = 200 # pgtune wizard 2010-02-10
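
For reference, the invocation behind those settings was roughly the following; the input and output paths here are only illustrative, but -T web and -c 200 are what I actually passed:

# approximate pgtune invocation; paths are illustrative
pgtune -i /var/postgresql/data/postgresql.conf \
       -o /var/postgresql/data/postgresql.conf.pgtune \
       -T web -c 200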

I've been whittling that back and have got down to this:

maintenance_work_mem = 240MB # pgtune wizard 2010-01-27
checkpoint_completion_target = 0.7 # pgtune wizard 2010-01-27
effective_cache_size = 2816MB # pgtune wizard 2010-01-27
work_mem = 18MB # pgtune wizard 2010-01-27
wal_buffers = 4MB # pgtune wizard 2010-01-27
checkpoint_segments = 8 # pgtune wizard 2010-01-27
full_page_writes = off
synchronous_commit = off  
max_connections = 100
shared_buffers = 250MB # pgtune wizard 2010-01-27
work_mem = 64MB          # overrides the 18MB pgtune value above (the last setting in postgresql.conf wins)
temp_buffers = 32MB
checkpoint_segments = 32 # overrides the 8 above

Additionally, in OpenBSD's sysctl.conf I have this set:
kern.maxproc=10240
kern.maxfiles=20480

kern.shminfo.shmseg=32
kern.seminfo.semmni=256
kern.seminfo.semmns=2048
kern.shminfo.shmmax=283115520
kern.maxvnodes=6000
kern.bufcachepercent=70

The kern.shminfo.shmmax value is just enough to let postgresql start.  kern.bufcachepercent=70 matches the effective_cache_size value.
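
The arithmetic behind those two values, assuming the 4GB of RAM in this box:

# kern.shminfo.shmmax: 283115520 bytes is exactly 270MB, just enough for
# shared_buffers = 250MB plus PostgreSQL's remaining shared memory overhead.
echo $((283115520 / 1024 / 1024))   # -> 270

# kern.bufcachepercent=70: 70% of 4096MB is about 2867MB, in line with
# effective_cache_size = 2816MB.
echo $((4096 * 70 / 100))           # -> 2867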

pgbench is run with this:
pgbench -h varley.openvistas.net -U _postgresql -t 20000 -c $SCALE pgbench
with $SCALE (the number of client connections, via -c) starting at 10 and then incrementing by 10.  I call it three times for each scale.  I've turned on statement logging (log_statement = 'all') to try to help figure out where the system panics, so that may lower the TPS somewhat, but I have not been very favorably impressed with the speed of these U320 15K disks in RAID10 yet.
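
Spelled out, the test loop is roughly this (the wrapper script itself is a reconstruction; the pgbench options are the ones above):

#!/bin/sh
# Three pgbench runs per client count, stepping the client count up by 10.
for SCALE in 10 20 30 40 50 60 70; do
    echo "Scale $SCALE:"
    for run in 1 2 3; do
        pgbench -h varley.openvistas.net -U _postgresql -t 20000 -c $SCALE pgbench
    done
done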

Scale 10:
tps = 644.152616 (including connections establishing)
tps = 644.323919 (excluding connections establishing)

tps = 644.032366 (including connections establishing)
tps = 644.219732 (excluding connections establishing)

tps = 659.320222 (including connections establishing)
tps = 659.506025 (excluding connections establishing)

Scale 20:
tps = 643.830650 (including connections establishing)
tps = 644.001003 (excluding connections establishing)

tps = 631.357346 (including connections establishing)
tps = 631.538591 (excluding connections establishing)

tps = 629.035682 (including connections establishing)
tps = 629.245788 (excluding connections establishing)

Scale 30:
tps = 571.640243 (including connections establishing)
tps = 571.777080 (excluding connections establishing)

tps = 565.742963 (including connections establishing)
tps = 565.888874 (excluding connections establishing)

tps = 564.058710 (including connections establishing)
tps = 564.203138 (excluding connections establishing)

Scale 40:
tps = 525.018290 (including connections establishing)
tps = 525.132745 (excluding connections establishing)

tps = 515.277398 (including connections establishing)
tps = 515.419313 (excluding connections establishing)

tps = 513.006317 (including connections establishing)
tps = 513.129971 (excluding connections establishing)

Scale 50:
tps = 468.323275 (including connections establishing)
tps = 468.415751 (excluding connections establishing)

tps = 453.100701 (including connections establishing)
tps = 453.201980 (excluding connections establishing)

tps = 461.587221 (including connections establishing)
tps = 461.739929 (excluding connections establishing)

Scale 60:
tps = 450.277550 (including connections establishing)
tps = 450.365946 (excluding connections establishing)

tps = 453.268713 (including connections establishing)
tps = 453.363862 (excluding connections establishing)

tps = 448.965514 (including connections establishing)
tps = 449.060461 (excluding connections establishing)

At Scale 70, the kernel panics with a "panic: malloc: out of space in kmem_map" error.

The last few lines of the logs before the panic reveal nothing out of the ordinary to me:

2010-02-10 10:58:07.863133500 172.16.0.1(43152):_postgresql@pgbench:[16586]:LOG:  statement: UPDATE pgbench_tellers SET tbalance = tbalance + -4197 WHERE tid = 328;
2010-02-10 10:58:07.863139500 172.16.0.1(40518):_postgresql@pgbench:[25686]:LOG:  statement: UPDATE pgbench_accounts SET abalance = abalance + 1254 WHERE aid = 3832418;
2010-02-10 10:58:07.863150500 172.16.0.1(25655):_postgresql@pgbench:[4335]:LOG:  statement: SELECT abalance FROM pgbench_accounts WHERE aid = 208539;
2010-02-10 10:58:07.863156500 172.16.0.1(25655):_postgresql@pgbench:[4335]:LOG:  duration: 0.532 ms
2010-02-10 10:58:07.863161500 172.16.0.1(40496):_postgresql@pgbench:[4200]:LOG:  duration: 23.825 ms
2010-02-10 10:58:07.863178500 172.16.0.1(15183):_postgresql@pgbench:[12518]:LOG:  statement: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (268, 38, 4052674, 2403, CURRENT_TIMESTAMP);
2010-02-10 10:58:07.863184500 172.16.0.1(15183):_postgresql@pgbench:[12518]:LOG:  duration: 0.396 ms


I have not yet tried a connection pooler, since the panic happens with only 70 clients connected, but just for fun I'm going to install pgbouncer and run the set again.
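
Something like this minimal pgbouncer.ini is what I have in mind; the listen address, pool sizes, and auth settings below are placeholders, not a tested configuration:

; minimal pgbouncer.ini sketch -- placeholder values, not a tested config
[databases]
pgbench = host=127.0.0.1 port=5432 dbname=pgbench

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = trust
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20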

I've been told that amd64 OpenBSD will not have this problem; however, the amd64 kernel will not run on this particular server.  I think that means I'm going to be shopping for an Opteron-based server before long.

Thanks to all!

Jeff Ross 
