On 05/25/2011 03:01 AM, John R Pierce wrote:
> On 05/24/11 5:50 PM, Andrej wrote:
>> Add more RAM? Look at tunables for other processes on
>> the machine? At the end of the day making the kernel shoot
>> anything out of despair shouldn't be the done thing.
>
> somehow, 'real' unix has neither an OOM killer nor does it flat out die
> under heavy loads, it just degrades gracefully. I've seen Solaris and
> AIX and BSD servers happily chugging along with load factors in the
> 100s, significant portions of memory paging, etc, without completely
> crumbling to a halt. Sometimes I wonder why Linux even pretends to
> support virtual memory, as you sure don't want it to be paging.
>
>
http://developers.sun.com/solaris/articles/subprocess/subprocess.html
"Some operating systems (such as Linux, IBM AIX, and HP-UX) have a
feature called memory overcommit (also known as lazy swap allocation).
In a memory overcommit mode, malloc() does not reserve swap space and
always returns a non-NULL pointer, regardless of whether there is enough
VM on the system to support it or not.
The memory overcommit feature has advantages and disadvantages."
(the page goes on with some interesting info) [*]
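For anyone who wants to check which policy a Linux box is actually using, here is a minimal Python sketch. It assumes the Linux-specific /proc/sys/vm/overcommit_memory file; the function names are mine for illustration and do not come from the article.

```python
from pathlib import Path

# Meaning of the vm.overcommit_memory values on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def describe_overcommit(raw):
    """Translate the raw contents of overcommit_memory into a description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the live setting; return a note if the file cannot be read."""
    try:
        return describe_overcommit(Path(path).read_text())
    except OSError:
        return "not available (file missing; probably not Linux)"


if __name__ == "__main__":
    print(current_overcommit())
```

Note that in mode 0 the kernel can still refuse an obviously excessive request, so "always returns a non-NULL pointer" describes mode 1 most accurately.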
It appears by your definition that neither Linux, AIX nor HP-UX are
'real' Unix. Oh, wait, FreeBSD overcommits, too, so can't be 'real' either.
/me wonders now what a 'real' Unix is. :) Must be something related to
'true' SysV derivatives. If memory serves me well, that's where the word
'thrashing' originated, right? Actually in my experience nothing
'thrashes' better than a SysV, Solaris included.
The solution to the OP's problem is to keep the system from reaching an OOM
state in the first place. That is necessary even with overcommitting
turned off. PG not performing its job because malloc() keeps failing
isn't really a "solution".
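One rough way to see how close a Linux machine is to that state is to compare Committed_AS with CommitLimit from /proc/meminfo. The sketch below only parses meminfo-style text; the sample figures are made up, and the helper names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_ratio(fields):
    """Committed_AS / CommitLimit; above 1.0 means more memory has been
    promised than strict accounting (mode 2) would have allowed."""
    return fields["Committed_AS"] / fields["CommitLimit"]


# Made-up sample: 8 GB RAM, 2 GB swap, default overcommit_ratio of 50,
# so CommitLimit = swap + RAM * 50% = 6000000 kB.
SAMPLE = """\
MemTotal:        8000000 kB
SwapTotal:       2000000 kB
CommitLimit:     6000000 kB
Committed_AS:    4500000 kB
"""

if __name__ == "__main__":
    print(f"commit ratio: {commit_ratio(parse_meminfo(SAMPLE)):.2f}")
```

On a real system you would feed it the contents of /proc/meminfo instead of SAMPLE and alert well before things get tight; what threshold to use is a judgment call I am not going to make for you.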
.TM.
[*] One missing piece is that overcommitting actually prevents or delays
the OOM state. The article does mention that "system memory can be used
more flexibly and efficiently" without really elaborating further. It
means that, given the same amount of memory (RAM+swap), a
non-overcommitting system reaches OOM well before an overcommitting one,
because it has to account for every allocation in full whether or not the
pages are ever touched. Also, it is rarely a good idea, when running low
on memory, to switch to an allocation policy that is _less_ efficient,
memory-wise.
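To put toy numbers on the footnote, here is a deliberately crude Python model. It assumes identical processes that each reserve more than they touch; every figure is invented, and real kernels account for memory in far more detail than this.

```python
def procs_until_oom(total_mb, reserved_mb, touched_mb, overcommit):
    """Count how many identical processes fit before memory runs out.

    Strict accounting charges each process its full reservation up front;
    overcommit charges only the memory the process actually touches.
    """
    charge = touched_mb if overcommit else reserved_mb
    return total_mb // charge


TOTAL = 4096     # RAM + swap in MB (made-up figure)
RESERVED = 512   # what each process asks for
TOUCHED = 128    # what each process really uses

if __name__ == "__main__":
    strict = procs_until_oom(TOTAL, RESERVED, TOUCHED, overcommit=False)
    lazy = procs_until_oom(TOTAL, RESERVED, TOUCHED, overcommit=True)
    print(f"strict accounting: {strict} processes")
    print(f"overcommit:        {lazy} processes")
```

With these numbers the strict system refuses the ninth process while the overcommitting one keeps going until the thirty-third, on the same hardware.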