Thread: reprise on Linux overcommit handling

reprise on Linux overcommit handling

From: "Andrew Dunstan"
The current developer docs say this:

-------------------
Linux has poor default memory overcommit behavior. Rather than failing if it
can not reserve enough memory, it returns success, but later fails when the
memory can't be mapped and terminates the application with kill -9. To
prevent unpredictable process termination, use:
 sysctl -w vm.overcommit_memory=3
---------------------

This would be true if the kernel in use had the paranoid mode compiled
in. This is not true, AFAICS, of either the stock 2.4 kernels or the
latest RH kernels. It is true of 2.4.21 *with* the -ac4 (and possibly earlier
-ac*) patches. In fact, Alan's patch apparently allows tuning of the amount of
overcommitting allowed. As I read the kernel source I got from RH today
(2.4.20-19.9), setting this will in fact make the kernel freely allow
overcommitting of memory, rather than trying, in a rather unsatisfactory
way, to avoid it. IOW, with many kernels the advice would make things worse,
not better - e.g. the RH source says this in mm/mmap.c:

        if (sysctl_overcommit_memory)
                return 1;


Rather than give bad advice, it might be better to advise users (1) to run
Pg on machines that are likely to be stable and not run into OOM situations,
and (2) to check with their vendors about proper overcommit handling.

Personally, my advice would be to avoid Linux for mission-critical apps
until this is fixed, but that's just my opinion, and I'm happily developing
on Linux, albeit for something that is not mission-critical.

cheers

andrew





Re: reprise on Linux overcommit handling

From: Ang Chin Han
Andrew Dunstan wrote:

> Rather than give bad advice, it might be better to advise users (1) to run
> Pg on machines that are likely to be stable and not run into OOM situations,
> and (2) to check with their vendors about proper overcommit handling.

Would it be possible (or trivial?) to write a small C program to test 
for memory overcommit behaviour? Might be useful to put in contrib, and 
mention it in the Admin docs. There are just too many Linux variants and 
settings to be sure of what exactly the memory overcommit policy is for 
a particular kernel and distribution.

Linux 2.6 will apparently behave better. I guess they have learnt the 
lesson. :)

http://kniggit.net/wwol26.html (Under "Other Improvements").

-- 
Linux homer 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux
 10:30am  up 209 days,  1:35,  5 users,  load average: 5.08, 5.08, 5.08

Re: reprise on Linux overcommit handling

From: Bruce Momjian
Thanks.  Interesting.  Hard to imagine what they were thinking when they
put this code in.

---------------------------------------------------------------------------

Andrew Dunstan wrote:
> The current developer docs say this:
> 
> -------------------
> Linux has poor default memory overcommit behavior. Rather than failing if it
> can not reserve enough memory, it returns success, but later fails when the
> memory can't be mapped and terminates the application with kill -9. To
> prevent unpredictable process termination, use:
> 
>   sysctl -w vm.overcommit_memory=3
> ---------------------
> 
> This would be true if the kernel being used had the paranoid mode compiled
> in. This is not true, AFAICS, of either the stock 2.4 kernels nor of the
> latest RH kernels. It is true of 2.4.21 *with* the -ac4 (and possibly earlier
> -ac*) patch. In fact, Alan's patch apparently allows tuning of the amount of
> overcommitting allowed. As I read the kernel source I got from RH today
> (2.4.20-19.9), doing this will in fact make the kernel freely allow
> overcommitting of memory, rather than it trying in a rather unsatisfactory
> way to avoid it. IOW, with many kernels the advice would make things worse,
> not better - e.g. the RH source says this in mm/mmap.c:
> 
>         if (sysctl_overcommit_memory)
>             return 1;
> 
> 
> Rather than give bad advice, it might be better to advise users (1) to run
> Pg on machines that are likely to be stable and not run into OOM situations,
> and (2) to check with their vendors about proper overcommit handling.
> 
> Personally, my advice would be to avoid Linux for mission critical apps
> until this is fixed, but that's just my opinion, and I'm happily developing
> on Linux, albeit for something that is not mission critical.
> 
> cheers
> 
> andrew

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: reprise on Linux overcommit handling

From: Kevin Brown
Bruce Momjian wrote:
> 
> Thanks.  Interesting.  Hard to imagine what they were thinking when they
> put this code in.

Way back in the day, when dinosaurs ruled the earth, or at least the
server room, many applications were written with rather bad memory
allocation semantics: they'd grab a bunch of memory and not necessarily use
it for anything.  Typically you could specify a maximum memory
allocation for the program, but the problem was that it would grab
exactly that amount, when it's obviously better to be a bit more
dynamic.

That in itself isn't a terribly bad thing ... if you have enough actual
memory to deal with it.

Problem is, back then most systems didn't have enough memory to deal
with multiple programs behaving that way.

Overcommit was designed to account for that behavior.  It's not ideal at
all but it's better to have that option than not.


Overcommit isn't really necessary today because of the huge amount of
memory that you can put into a system for cheap (HP servers excluded,
they want some serious cash for memory).


-- 
Kevin Brown                          kevin@sysexperts.com