Thread: Memory settings, vm.overcommit, how to get it really safe?
As probably many people, I've been running PostgreSQL on Linux with default overcommit settings and a lot of swap space for safety, though disabling overcommit is recommended by recent documentation.

PG's memory usage is not exactly predictable. For settings like work_mem I always monitored production load and tried to find a safe compromise, so that the box under typical load would never go into swap, while on the other hand users don't need to raise it too often just to get a few OLAP queries to perform OK.

What I'm trying now is to get a safe configuration for vm.overcommit_memory = 2 and, if possible, run with much less or no swap space.

On a clone box I disabled overcommit, lowered PG's memory settings a bit, disabled swap, mirrored production load to it and monitored how it would behave. As I more or less expected, it got into trouble after about 6 hours. All memory was exhausted; it was even unable to fork bash again. To my surprise I haven't found any evidence of the OOM killer going active in the logs.

I blamed this behaviour on the swap space I had taken away, and not on disabling overcommit. However, I just enabled overcommit again and tried to reproduce the behaviour. I was unable to get it into trouble again, even with artificially high load.

Now I have a few questions:

1.) Why does it behave differently when only changing overcommit? To my understanding it should have run out of memory in both cases, or can PG benefit from enabled overcommit? It's a minimal setup with PG being the only one using any noticeable amount of resources.

2.) Is it possible at all to put a cap on the memory PG uses in total from the OS side? kernel.shmmax etc. only limit some of the ways PG might use memory? Of course excluding OS/FS buffers etc.

3.) Can PG be made to use its own temp files when it runs out of memory, without setting memory settings so low that performance for typical load will be worse?
I think it would be nice if I wouldn't need a lot of swap just to be safe under any load. Shouldn't that be more efficient than using paged-out memory anyway?

Currently it seems to me that I have to sacrifice the performance of typical load when disabling overcommit and/or reducing swap, as I have to push PG's memory settings lower to be safe.

What might make my case a little bit more predictable is that the number of backend processes / concurrent connections is fixed at 32. There will never be more or less.

Thanks for any guidance / clarification.

--
Best regards,
Hannes Dorbath
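With a fixed pool of 32 backends, the worst case can at least be estimated on the back of an envelope. A minimal sketch of that arithmetic follows; all figures are hypothetical, and note that work_mem is a per-operation limit, so one complex query can use several multiples of it:

```python
def worst_case_pg_memory(shared_buffers_mb, n_backends, work_mem_mb,
                         ops_per_query, maintenance_work_mem_mb):
    """Very rough upper bound on PostgreSQL's memory footprint, in MB.

    work_mem applies per sort/hash operation, so each backend is charged
    work_mem times an assumed number of concurrent operations per query.
    Ignores temp_buffers and per-backend overhead, so this is only a
    ballpark, not an exact bound.
    """
    return (shared_buffers_mb
            + n_backends * work_mem_mb * ops_per_query
            + maintenance_work_mem_mb)

# Fixed pool of 32 backends as in the post; all other numbers made up:
budget = worst_case_pg_memory(shared_buffers_mb=1024, n_backends=32,
                              work_mem_mb=16, ops_per_query=4,
                              maintenance_work_mem_mb=256)
print(budget)  # 1024 + 32*16*4 + 256 = 3328 MB
```

Keeping such an estimate below what the OS will actually let you commit is the crux of running with overcommit disabled.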
Hannes Dorbath wrote:
> As probably many people I've been running PostgreSQL on Linux with
> default overcommit settings and a lot of swap space for safety, though
> disabling overcommit is recommended by recent documentation.
>
> PG's memory usage is not exactly predictable, for settings like work_mem
> I always monitored production load and tried to find a safe compromise,
> so that the box under typical load would never go into swap and on the
> other hand users don't need to raise it too often just to get a few OLAP
> queries to perform OK.
>
> What I'm trying now is to get a safe configuration for
> vm.overcommit_memory = 2 and if possible run with much less or no swap
> space.

What distro / kernel version of Linux are you running? We have a similar issue with late-model hardware and RHEL4 recently here at work, where our workstations are running out of memory. They aren't running PostgreSQL, they're Java dev workstations, and it appears to be a RHEL4 on 64 bit problem, so that's why I ask.
Scott Marlowe wrote:
> What distro / kernel version of Linux are you running? We have a
> similar issue with late-model hardware and RHEL4 recently here at work,
> where our workstations are running out of memory. They aren't running
> PostgreSQL, they're Java dev workstations and it appears to be a RHEL4
> on 64 bit problem, so that's why I ask.

Linux 2.6.21-gentoo #2 SMP x86_64 Intel(R) Xeon(R) CPU 5130 GNU/Linux

--
Best regards,
Hannes Dorbath
On Thu, May 17, 2007 at 03:46:59PM +0200, Hannes Dorbath wrote:
> On a clone box I disabled overcommit, lowered PG's memory settings a
> bit, disabled swap, mirrored production load to it and monitored how
> it would behave. As I more or less expected, it got into trouble after
> about 6 hours. All memory was exhausted, it was even unable to fork bash
> again. To my surprise I haven't found any evidence of OOM going active
> in the logs.

I think you are misunderstanding what overcommit does. Normally when you're running programs and they fork(), the memory gets marked copy-on-write. The data exists only once in memory, but if it is written by one of the programs, that program gets its own copy. Thus it's memory allocated but not actually used -- hence "overcommit". Normally this is never a problem, but say that some unusual load happens and every process with shared usage actually wants its own copy, and there's not enough memory+swap to hold it: something has to give. Thus the OOM killer.

By disabling overcommit, all that happens is that in the above situation, if the kernel sees it's overcommitting total memory+swap, it returns ENOMEM instead. So instead of an unpredictable OOM failure, you get unpredictable fork()/malloc()/exec() failures. For example, you can't start any more processes.

> I blamed this behaviour on the swap space I've taken away, and not on
> disabling overcommit. However I just enabled overcommit again and tried
> to reproduce the behaviour. I was unable to get it into trouble again,
> even with artificially high load.

The default setting under Linux with overcommit off is that the total "allocated" pages in the system cannot exceed swap + 50% of memory. Thus by removing the swap you severely limited the amount of memory that could be used by programs. You need to give at least as much swap as memory, otherwise you'll never get the most out of your machine.

> 1.) Why does it behave differently when only changing overcommit?
> To my
> understanding it should have run out of memory in both cases, or can PG
> benefit from enabled overcommit? It's a minimal setup with PG being the
> only one using any noticeable amount of resources.

I hope I've answered your question above. Personally I don't disable overcommit, as I find the OOM killer less irritating than not being able to log in when the machine is in trouble.

> 2.) Is it possible at all to put a cap on the memory PG uses in total
> from the OS side? kernel.shmmax, etc. only limit some of the ways PG
> might use memory? Of course excluding OS/FS buffers etc.

You can set limits per process (see ulimit) and limit the number of connections. If a postgres process runs out of memory, it aborts the query.

> 3.) Can PG be made to use its own temp files when it runs out of memory
> without setting memory settings so low that performance for typical load
> will be worse? I think it would be nice if I wouldn't need a lot of
> swap, just to be safe under any load. Shouldn't that be more efficient
> than using paged-out memory anyway?

Nope, letting the OS page is far more efficient than anything postgres can do.

> Currently it seems to me that I have to sacrifice the performance of
> typical load, when disabling overcommit and / or reducing swap, as I
> have to push PG's memory settings lower to be safe.

Make lots and lots of swap. You'll probably never use it, but at least it won't get in your way. I'd say 1-1.5 times your memory at least if you want overcommit off.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
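The "swap + 50% of memory" figure above is the kernel's CommitLimit under vm.overcommit_memory = 2 (the 50% comes from vm.overcommit_ratio, which defaults to 50; the value is visible as CommitLimit in /proc/meminfo). A small sketch of that formula, with made-up machine sizes, shows why removing swap was so punishing:

```python
def commit_limit_mb(ram_mb, swap_mb, overcommit_ratio=50):
    """CommitLimit as Linux computes it with vm.overcommit_memory = 2:
    swap + overcommit_ratio% of RAM. Allocations beyond this fail with
    ENOMEM instead of triggering the OOM killer."""
    return swap_mb + ram_mb * overcommit_ratio // 100

ram = 4096  # hypothetical 4 GB box
print(commit_limit_mb(ram, swap_mb=0))     # 2048 -- no swap: only half the RAM is committable
print(commit_limit_mb(ram, swap_mb=6144))  # 8192 -- 1.5x RAM of swap, as suggested above
```

So with swap removed entirely, processes could commit only about half of physical memory before fork()/malloc() started failing, which matches the clone box falling over without the OOM killer ever firing.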
Hannes Dorbath wrote:
> Scott Marlowe wrote:
>> What distro / kernel version of linux are you running? We have a
>> similar issue with late model hardware and RHEL4 recently here at work,
>> where our workstations are running out of memory. They aren't running
>> postgresql, they're java dev workstations and it appears to be a RHEL4
>> on 64 bit problem, so that's why I ask.
>
> Linux 2.6.21-gentoo #2 SMP x86_64 Intel(R) Xeon(R) CPU 5130 GNU/Linux

I wonder if you could try it with the uniprocessor kernel and see if your problem goes away. FYI, the machines we're having the problem with are RHEL4 with a kernel of:

Linux 2.6.9-55.ELsmp #1 SMP Fri Apr 20 16:36:54 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

Note that 2.6.9-55 in RHEL speak is probably closer to 2.6.21 than 2.6.9, since they back-port tons of stuff but keep the same version number.
Martijn van Oosterhout wrote:
> Make lots and lots of swap. You'll probably never use it, but at least
> it won't get in your way. I'd say 1-1.5 times your memory at least if
> you want overcommit off.

Thanks for your detailed explanations. I had indeed misunderstood how overcommit interacts with shared memory. So I just keep what I always had -- lots of swap and overcommit off.

--
Best regards,
Hannes Dorbath
Hannes Dorbath wrote: > So I just keep what I always had -- lots of swap and overcommit off. Ehrm, overcommit on I mean. -- Best regards, Hannes Dorbath
* Scott Marlowe:
> What distro / kernel version of linux are you running? We have a
> similar issue with late model hardware and RHEL4 recently here at
> work, where our workstations are running out of memory. They aren't
> running postgresql, they're java dev workstations and it appears to be
> a RHEL4 on 64 bit problem, so that's why I ask.

When Java sees that your machine has got plenty of RAM and more than one CPU, it assumes that it's a server and you want to run just a single VM, and configures itself to use a fair chunk of available RAM.

This is more or less a Sun-specific issue. Other Java implementations make different choices.

--
Florian Weimer <fweimer@bfk.de>    BFK edv-consulting GmbH    http://www.bfk.de/
Kriegsstraße 100    tel: +49-721-96201-1
D-76133 Karlsruhe   fax: +49-721-96201-99
Florian Weimer wrote:
> * Scott Marlowe:
>> What distro / kernel version of linux are you running? We have a
>> similar issue with late model hardware and RHEL4 recently here at
>> work, where our workstations are running out of memory. They aren't
>> running postgresql, they're java dev workstations and it appears to be
>> a RHEL4 on 64 bit problem, so that's why I ask.
>
> When Java sees that your machine has got plenty of RAM and more than
> one CPU, it assumes that it's a server and you want to run just a
> single VM, and configures itself to use a fair chunk of available RAM.
>
> This is more or less a Sun-specific issue. Other Java implementations
> make different choices.

Yeah, but these boxes run out of free mem, buffer, cache and swap. On a machine with 4 gigs of RAM and 2 gigs of swap, something is seriously wrong when all that RAM just disappears, and it isn't just with Java apps, though most of what the developers run are Java apps / servers. We've had a machine with Mozilla (no Java extension in it) run out of memory just sitting idle.