Thread: FATAL 1: Memory exhausted in AllocSetAlloc()

FATAL 1: Memory exhausted in AllocSetAlloc()

From: "Thomas Heller"
Hi there!

The last db crash (kernel panic) caused a little damage to my db files, I
think. After the crash I had to recreate some indexes, but no data was lost.
Everything is running quite OK now, except for one BIG error.

The postgres process crashes every now and then without giving any reason. :(
AND I can't use pg_dumpall or vacuumdb -z. Every time I run a program that is
supposed to dump my database or analyze it, I get the following error:

FATAL 1:  Memory exhausted in AllocSetAlloc()
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
connection to server was lost
vacuumdb: vacuum failed
pg_dumpall: pg_dumpall failed

I'm running postgres-7.0.3 on a P3-800 with 768MB RAM. Postgres is the
only thing running on that machine. The database directory is 500MB big. I
have several tables (some around 5k entries, and about 5 tables have >100k
entries). The DB is heavily loaded (Linux load of 6-20 at peak times) and I
plan to move it to a bigger machine. The only problem is that I can't dump
my db. :(

Does anybody have any advice on how I can dump or vacuum my db again?

thanks in advance,
thomas


Re: FATAL 1: Memory exhausted in AllocSetAlloc()

From: Tom Lane
"Thomas Heller" <th.heller@comtron.net> writes:
> [ after a system crash, I'm getting ]
> FATAL 1:  Memory exhausted in AllocSetAlloc()

An educated guess is that you have a corrupted tuple in some table,
within which a variable-length datum is corrupted in some way that
makes it appear to have a size of hundreds of megabytes (more than
your allowed process swap space, anyway).  VACUUM ANALYZE, or anything
else that tries to examine the corrupted datum, will start out by
trying to palloc() as much space as the datum's length word says it
needs.  Malloc failure leads to the above complaint and emergency
stop.  No (further) harm done to your table, fortunately.

What you want to do is narrow down the location of the bad tuple (or
tuples; there might be more than one) and then delete it.  It should
be easy enough to determine which table(s) contain bad data --- one
way is to VACUUM ANALYZE tables one at a time to see which ones fail.
To locate the bad tuple, one way is to do

    SELECT * FROM foo LIMIT n;

for various n and see which queries succeed and which don't.  With
a little bit of homing in, you can determine that the n'th tuple of
the table is broken.  Then you can do something like

    SELECT oid FROM foo OFFSET n-1 LIMIT 1;

    DELETE FROM foo WHERE oid = value-just-determined;

to zap the tuple without trying to examine its corrupted column.
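
If the table is large, that LIMIT probing is easy to script as a binary
search.  A rough sketch, driving psql from the shell: it assumes a single bad
tuple, that nothing else changes the table between probes (so the scan order
stays stable), and that psql exits nonzero when the backend dies.  The
database name, table name, and row-count bound are placeholders:

    #!/bin/sh
    # Sketch only: binary-search for the first row of $TABLE that cannot be
    # read, then delete it by oid.  DB, TABLE and MAX are placeholders.
    DB=mydb
    TABLE=foo
    MAX=200000            # anything >= the table's row count

    lo=0                  # largest LIMIT known to succeed
    hi=$MAX               # smallest LIMIT assumed to fail
    while [ `expr $hi - $lo` -gt 1 ]; do
        sum=`expr $lo + $hi`
        mid=`expr $sum / 2`
        if psql -q -d $DB -c "SELECT * FROM $TABLE LIMIT $mid" >/dev/null 2>&1
        then
            lo=$mid       # the first $mid rows are readable
        else
            hi=$mid       # a bad row lies within the first $mid rows
        fi
    done
    echo "row number $hi of $TABLE looks corrupted"

    # Grab its oid without touching the corrupted column, then zap it.
    off=`expr $hi - 1`
    badoid=`psql -t -A -d $DB -c "SELECT oid FROM $TABLE OFFSET $off LIMIT 1"`
    psql -d $DB -c "DELETE FROM $TABLE WHERE oid = $badoid"

It is safest to run this with the application stopped, since the LIMIT
positions only mean anything if every probe sees the rows in the same order.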

Repeat if necessary until all tables are clean ...
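
To automate the table-by-table check, something along these lines might do.
Again only a sketch: it assumes vacuumdb's -t option and that vacuumdb
reports failure through its exit status, and it skips the pg_ system
catalogs.  The database name is a placeholder:

    #!/bin/sh
    # Sketch only: VACUUM ANALYZE each user table separately to see which
    # ones fail.  DB is a placeholder for your database name.
    DB=mydb
    Q="SELECT relname FROM pg_class WHERE relkind = 'r' AND relname NOT LIKE 'pg_%'"
    for t in `psql -t -A -d $DB -c "$Q"`; do
        echo "checking $t"
        vacuumdb -z -t $t $DB || echo "*** $t seems to contain bad data"
    done

Once every table passes, pg_dumpall and a full vacuumdb -z should go
through again.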

            regards, tom lane