Re: Backends dying due to memory exhaustion--I'm stonkered - Mailing list pgsql-general

From Tom Lane
Subject Re: Backends dying due to memory exhaustion--I'm stonkered
Date
Msg-id 13616.980554899@sss.pgh.pa.us
In response to Backends dying due to memory exhaustion--I'm stonkered  (Doug McNaught <doug@wireboard.com>)
List pgsql-general
Doug McNaught <doug@wireboard.com> writes:
> The problem I'm having is that the backends will crash randomly, after
> the database has been up for a few days, with:
> FATAL 1:  Memory exhausted in AllocSetAlloc()

> The system has plenty of memory and swap, and under normal
> circumstances the backends take up 10-15 megabytes.  If it's a
> runaway situation of some kind, it happens very fast, as I've even
> taken snapshots of the process table at 1 minute intervals, and they
> show no abnormality right up to the time of the crash.

Hmm.  That puts a damper on the idea that it's a memory leak --- doesn't
eliminate the theory entirely, however.  The other likely theory is that
you've got a variable-size column value someplace whose size word has
been corrupted, so that it claims to be umpteen megabytes long.  Any
attempt to copy such a value out of the tuple it's in will result in
an instant "out of memory" complaint.
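
To make the corrupted-size-word theory concrete, here is a minimal sketch in C. It is not PostgreSQL source: the layout (a 4-byte length word that counts itself, followed by the payload), the names stored_size and copy_value, and the 64 MB sanity limit are all assumptions made up for illustration. It shows how a copier that trusts the length word ends up asking for whatever size the damaged word claims, and how a plausibility check would catch that.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a 4-byte length word (counting itself), then payload. */
#define HDR_SIZE      sizeof(uint32_t)
#define MAX_SANE_SIZE (64u * 1024u * 1024u)  /* arbitrary limit for this sketch */

/* Read the length word stored at the front of a value. */
uint32_t stored_size(const unsigned char *value)
{
    uint32_t size;

    assert(value != NULL);
    memcpy(&size, value, HDR_SIZE);
    return size;
}

/*
 * Copy a value out of the buffer that holds it.  A naive copier would
 * simply malloc(stored_size(value)); if the length word is damaged, that
 * request can be enormous and fail at once.  This version refuses any
 * length that is smaller than the header, larger than the bytes actually
 * available, or larger than the sanity limit.
 * Returns NULL when the length is implausible or allocation fails.
 */
unsigned char *copy_value(const unsigned char *value, size_t bytes_available)
{
    uint32_t size = stored_size(value);

    if (size < HDR_SIZE || size > bytes_available || size > MAX_SANE_SIZE)
        return NULL;

    unsigned char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, value, size);
    return copy;
}
```

With a length word of 12 in a 16-byte buffer the copy succeeds; overwrite the word with something like 0x7FFFFFF0 and the same call would be a request for about 2 GB, which this sketch rejects instead of passing on to malloc.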

Is there any consistency about which table is being touched when the
failure occurs?  It's not hard to isolate and delete a damaged tuple
once you know which table it's in, but if you've got a lot of tables
the initial search can be tedious.

One way to get more info is to tweak the code to abort() just before
it would normally report the out-of-memory error.  Then you will get
a coredump and can learn something from the backtrace (don't forget
to compile with -g).
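
The shape of that tweak can be sketched as follows. This is a standalone stand-in, not the backend's allocator: the function name alloc_or_dump_core and the message text are hypothetical, and the real change would go wherever the backend currently reports the error. The idea is only that abort() runs at the point of failure, so the core file still has the stack of whoever asked for the memory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical allocator wrapper, for illustration only.
 * On failure it aborts instead of reporting an ordinary error, so the
 * process dumps core with the requesting call chain still on the stack.
 */
void *alloc_or_dump_core(size_t size)
{
    void *block = malloc(size);

    if (block == NULL)
    {
        /* Temporary debugging aid: say what was asked for, then die here. */
        fprintf(stderr, "allocation of %zu bytes failed, aborting\n", size);
        abort();
        /* The normal out-of-memory report would have followed here. */
    }
    return block;
}
```

For the core file to appear, the core size limit in the environment that starts the server must allow it (ulimit -c in most shells). Opening the core in a debugger such as gdb and asking for a backtrace (bt) then shows which routine made the oversized request.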

            regards, tom lane
