Re: PG Killed by OOM Condition - Mailing list pgsql-hackers

From daveg
Subject Re: PG Killed by OOM Condition
Date
Msg-id 20051025055217.GD8157@sonic.net
In response to Re: PG Killed by OOM Condition  (Bruno Wolff III <bruno@wolff.to>)
Responses Re: PG Killed by OOM Condition
Re: PG Killed by OOM Condition
List pgsql-hackers
On Mon, Oct 24, 2005 at 11:26:52PM -0500, Bruno Wolff III wrote:
> On Mon, Oct 24, 2005 at 23:55:07 -0400,
>   mark@mark.mielke.cc wrote:
> > On Mon, Oct 24, 2005 at 10:20:39PM -0500, Bruno Wolff III wrote:
> > > On Mon, Oct 03, 2005 at 23:03:06 +1000,
> > >   John Hansen <john@geeknet.com.au> wrote:
> > > > Good people,
> > > > Just had a thought!
> > > > Might it be worth while protecting the postmaster from an OOM Kill on
> > > > Linux by setting /proc/{pid}/oom_adj to -17 ?
> > > > (Described vaguely in mm/oom_kill.c)
> > > Wouldn't it be better to use sysctl to tell the kernel not to over commit
> > > memory in the first place?
> > 
> > Only if you don't have large processes in your system that fork()
> > frequently, pushing the reserved memory over the limit, preventing
> > PostgreSQL from allocating memory when it does need it, even though
> > copy-on-write allows plenty of memory to continue to be available -
> > it is just reserved... :-)
> > 
> > There isn't a perfect answer.
> 
> No, but I would think tying up some disk space as swap space would be a
> better solution. The linux oom killer is really dangerous.

I work with a client that runs 16 GB of memory with 16 GB of swap on dual Opterons
dedicated to postgres. They have large tables and like hash joins, as they are
often the fastest way to a result, so work_mem is set fairly large. Sometimes
postgres is very inaccurate at predicting real memory use versus work_mem and
will grow very much larger than expected. This can result in two or more
postgres processes with over 10 GB of virtual memory along with the usual 60
or so normal-sized ones.

When this happens the machine runs out of memory and swap. Without the oom
killer the machine simply hangs, which is inconvenient as it is at a remote
location. The oom killer usually lets the machine recover and postgres restart
without a hard reboot.

A solution is to use ulimit to set the maximum memory available to a
process. Ideally this would be a pg_ctl or postmaster option so that all the
forked postgresql processes would inherit the ulimit. The advantage over the
oom killer is that only the overly large process fails, and it fails with an
out of memory error and exits cleanly as opposed to having the whole set
of backends restarted.
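
As a rough sketch of the ulimit idea, assuming Linux and a small Python
wrapper around whatever command starts the server: the limit is set in the
child between fork() and exec(), so the started process and everything it
forks inherit it. The helper names limit_address_space and
run_with_memory_cap are made up for illustration; they are not part of
pg_ctl or the postmaster.

```python
import resource
import subprocess
import sys


def limit_address_space(max_bytes):
    """Return a callable that caps this process's virtual memory (RLIMIT_AS).

    Hypothetical helper for illustration. The cap is inherited by every
    process forked afterwards, which is what 'ulimit -v' does in a shell.
    """
    def apply():
        # Set both the soft and the hard limit so children cannot raise it.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
    return apply


def run_with_memory_cap(argv, max_bytes):
    """Run argv with the cap applied after fork() and before exec().

    Hypothetical helper for illustration. Only the child is limited;
    the calling process keeps its own limits.
    """
    return subprocess.run(
        argv,
        preexec_fn=limit_address_space(max_bytes),
        capture_output=True,
        text=True,
    )
```

Under these assumptions the server could be started with something like
run_with_memory_cap(["pg_ctl", "start", "-D", datadir], cap), where datadir
and cap are placeholders to be chosen per installation. An allocation that
would push a process past the cap then fails with an out of memory error in
that process only.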

-dg

-- 
David Gould                      daveg@sonic.net
If simplicity worked, the world would be overrun with insects.

