Thread: Memory exhausted (leak?)

Memory exhausted (leak?)

From
Aaron Birkland
Date:
I have a long but straightforward query (on some very large tables)
that always ends in 'Memory exhausted in AllocSetAlloc(108)'.   Even
stranger are some messages that appear in the logfile, such as the
following (edited for length, repetitions, etc):

TopMemoryContext: 32792 total in 4 blocks; 9712 free (2 chunks); 23080 used
TopTransactionContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
DeferredTriggerXact: 0 total in 0 blocks; 0 free (0 chunks); 0 used
TransactionCommandContext: 1040384 total in 7 blocks; 246544 free (11 chunks); 793840 used
HashTableContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
HashBatchContext: 616608 total in 4 blocks; 27784 free (1 chunks); 588824 used
HashTableContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
HashBatchContext: 1951696808 total in 244 blocks; 26536 free (473 chunks); 1951670272 used
HashTableContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
HashBatchContext: 287840 total in 2 blocks; 8176 free (0 chunks); 279664 used
PlanExprContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
MdSmgr: 24576 total in 2 blocks; 12696 free (0 chunks); 11880 used
DynaHash: 8192 total in 1 blocks; 4376 free (0 chunks); 3816 used
DynaHashTable: 8192 total in 1 blocks; 5080 free (0 chunks); 3112 used
DynaHashTable: 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ERROR:  Memory exhausted in AllocSetAlloc(108)
ERROR:  Query was cancelled.

When running the query, I observe the following:
- memory usage of the postgres process stays stable for a few minutes,
then reaches a point where it quickly grows to about 2GB, at which
point the error occurs
- no space in pgsql_temp is used at any time! even after decreasing
sort_mem to 4MB (see just below)
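
For reference, sort_mem is specified in kilobytes, so lowering it to
~4MB for a session looks roughly like this:

    SET sort_mem = 4096;    -- about 4MB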

I saw a message previously in the lists where a similar problem
occurred, which appeared to stem from a "variable-length datum
whose length word is munged to look like a ridiculously large value".
I don't think that's happening here, given the steady growth to 2GB
memory usage and given the fact that I don't select or join on any
varlena types.

Some facts about the nature of the query:
- All joins are done on Oid types, and only one column (an Oid type) is SELECTed
- the 'r1' table is huge, about 2.5x10^9 rows (and over 500 GB large)
- I expect it to return no more than a few hundred results

So.. is there anything inherent in this execution plan that would
require high memory usage outside of sort_mem?  The plan is below, and
although it's big, it doesn't look too exotic or brain damaged to me
(yes, those two index scans on 'r1' in the inner nested loops are
intentional).

I tried executing this query with hash or merge joins disabled, and
without the 'unique' step, and the same error occurred (note: that plan
had a materialize step in it).
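
For reference, disabling those join methods for a session goes roughly
like this:

    SET enable_hashjoin TO off;
    SET enable_mergejoin TO off;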

postgresql version is 7.3.6

    -Aaron

EXPLAIN output below
--------------------------
Unique  (cost=278362.15..278362.26 rows=12 width=64)
   ->  Sort  (cost=278362.15..278362.20 rows=117 width=64)
         Sort Key: obj1.docid
         ->  Hash Join  (cost=278039.06..278361.34 rows=117 width=64)
               Hash Cond: ("outer".docid = "inner".refers)
               ->  Seq Scan on obj1  (cost=0.00..299.20 rows=22440 width=4)
               ->  Hash  (cost=278039.00..278039.00 rows=117 width=60)
                     ->  Hash Join  (cost=277702.73..278039.00 rows=117 width=60)
                           Hash Cond: ("outer".referrer = "inner".refers)
                           ->  Seq Scan on r2 r_1_2  (cost=0.00..308.81 rows=27161 width=8)
                           ->  Hash  (cost=277702.53..277702.53 rows=395 width=52)
                                 ->  Nested Loop  (cost=343.71..277702.53 rows=395 width=52)
                                       Join Filter: ("outer".refers = "inner".referrer)
                                       ->  Nested Loop  (cost=343.71..258374.97 rows=1 width=44)
                                             ->  Hash Join  (cost=343.71..258371.80 rows=1 width=40)
                                                   Hash Cond: ("outer".refers = "inner".refers)
                                                   ->  Nested Loop  (cost=0.00..258011.22 rows=16859 width=28)
                                                         ->  Nested Loop  (cost=0.00..238683.66 rows=1 width=20)
                                                               ->  Nested Loop  (cost=0.00..219356.09 rows=1 width=12)
                                                                     ->  Index Scan using obj4_word_map_wid_idx on obj4_word_map  (cost=0.00..213099.71 rows=1600 width=4)
                                                                           Index Cond: (wid = 19458::oid)
                                                                           Filter: ("type" = 4)
                                                                     ->  Index Scan using obj4_doc_idx on obj4  (cost=0.00..3.90 rows=1 width=8)
                                                                           Index Cond: ("outer".doc = obj4.descriptor)
                                                                           Filter: (text_field ~* 'ABC'::text)
                                                               ->  Index Scan using r1_referrer_idx on r1  (cost=0.00..19234.83 rows=16860 width=8)
                                                                     Index Cond: (r1.referrer = "outer".objobj3t)
                                                         ->  Index Scan using r1_referrer_idx on r1 r1_2_3  (cost=0.00..19234.83 rows=16860 width=8)
                                                               Index Cond: ("outer".refers = r1_2_3.referrer)
                                                   ->  Hash  (cost=343.70..343.70 rows=23 width=12)
                                                         ->  Nested Loop  (cost=0.00..343.70 rows=23 width=12)
                                                               ->  Index Scan using obj3_field1_idx on obj3  (cost=0.00..3.33 rows=1 width=4)
                                                                     Index Cond: (field1 = 'XYZ'::text)
                                                               ->  Index Scan using r3_referrer_idx on r3 r_2_3  (cost=0.00..339.86 rows=94 width=8)
                                                                     Index Cond: ("outer".docid = r_2_3.referrer)
                                             ->  Index Scan using obj2_docid_idx on obj2  (cost=0.00..3.16 rows=1 width=4)
                                                   Index Cond: (obj2.docid = "outer".referrer)
                                       ->  Index Scan using r1_referrer_idx on r1 r1_1_2  (cost=0.00..19234.83 rows=16860 width=8)
                                             Index Cond: ("outer".referrer = r1_1_2.referrer)

Re: Memory exhausted (leak?)

From
"Joshua D. Drake"
Date:
> I tried executing this query with hash or merge joins disabled, and
> without the 'unique' step, and the same error occurred (note: that plan
> had a materialize step in it).
>
> postgresql version is 7.3.6

1. Are the row estimates accurate? If not, increase your statistics
(rough examples below).
2. Is the explain analyze different?
3. Do these tables receive a lot of update/deletes? Could you possibly
have index bloat?
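
For (1) and (3), roughly along these lines (the column and index names
here are only examples pulled from your plan; substitute whichever ones
drive the bad estimates):

    ALTER TABLE r1 ALTER COLUMN referrer SET STATISTICS 1000;
    ANALYZE r1;

    -- if index bloat is suspected:
    REINDEX INDEX r1_referrer_idx;

For (2), run the query under EXPLAIN ANALYZE and compare the estimated
row counts against the actual ones.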

Sincerely,

Joshua D. Drake




--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL


Re: Memory exhausted (leak?)

From
Aaron Birkland
Date:
> 1. Are the row estimates accurate? If not, increase your statistics.

No, they aren't accurate, though they are up to date.  I've found a
situation or two where there are definite patterns in the data that
tend to skew the gathered statistics, forcing estimates to be off by
an order of magnitude or more even at the highest resolution (1000),
and I suspect this will be such a case.  I'm giving it a try, though
(on such a large table, it'll take a while to finish), but that
should only affect which plan is chosen, right?  I'm perfectly fine
with the plan; I'd just like it to not die!  Even if it does cause a
different plan to be chosen that works, that still doesn't answer the
question of why the earlier query failed.

> 2. Is the explain analyze different?

I cannot get to the point where explain analyze prints out results, as
'explain analyze' involves actually executing the query underneath.  I
discovered this problem while running an 'explain analyze', and when
that failed I ran it straight out in case something in 'explain
analyze' was the problem.  It wasn't.

> 3. Do these tables receive a lot of update/deletes? Could you possibly
> have index bloat?

Nope, the only actions done to them have been loading the data,
creating the index, analyzing, and various SELECTs.  The data's
relatively new, untouched and static :(  However, it took days to
create the index, and the index itself ended up being about 60GB,
which I guess is reasonable for such a ginormous data set.
All the other queries I've run (involving slightly less complex plans)
haven't failed, and have executed in reasonable times.

One more interesting thing:  after the query fails and the
transaction is rolled back, the process still has a 2GB memory
footprint!  Don't most memory allocations palloc'd during query
execution get freed after the transaction is done?  That's not
happening here..

   -Aaron

Re: Memory exhausted (leak?)

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> 1. Are the row estimates accurate? If not, increase your statistics.

Presumably not, since he'd surely never overflow memory with hashes of
only a few hundred rows.

The hash join code is capable of splitting the table into multiple
segments, but it won't do so unless the planner's estimate of the
table size is larger than sort_mem.  So the real problem IMHO is
the bad rowcount estimate.

            regards, tom lane

Re: Memory exhausted (leak?)

From
Aaron Birkland
Date:
> The hash join code is capable of splitting the table into multiple
> segments, but it won't do so unless the planner's estimate of the
> table size is larger than sort_mem.  So the real problem IMHO is
> the bad rowcount estimate.

Ahh, I didn't know that.  I assumed that cost estimates would only
affect the query plan itself, not the implementation details of
individual steps.  That seems like a very plausible explanation of my
problems, especially since I *know* some of the estimates are way off.
In my query plan, there is a hash join that expects to hash only about
395 rows coming out of one of its 'subplans' (for lack of a better
term), which consists of several nested joins (i.e. it's not a simple
relation); that estimate is at least two or three orders of magnitude
off.  I'm still ANALYZEing as we speak to see if I can get it any
better, but that's a pretty tough case for any estimator to correctly
assess.  It seems somewhat dangerous to rely so heavily on cost
estimates if the consequences are not simply a bad query plan, but
outright potential for query failure.

That still leaves a few questions, though:

- why did the amount of memory used for the hash join exceed sort_mem?
 Is sort_mem merely a guideline for planning and not an enforced hard
limit?

- why wasn't the memory cleared after the transaction finished
(failed)?  The process was still 2GB large.  Does memory used for
sorts just stay around, possibly re-used for other sorts or hashes but
never de-allocated?

Re: Memory exhausted (leak?)

From
Tom Lane
Date:
Aaron Birkland <birkie@gmail.com> writes:
> - why did the amount of memory used for the hash join exceed sort_mem?
>  Is sort_mem merely a guideline for planning and not an enforced hard
> limit?

Yup.  The deal is that hash join has to decide in advance on how many
"batches" it wants to have (because it will divide up the hash key range
into that many sub-joins).  It sets the number of batches as
    (estimated total table size) / sort_mem
It will then soldier on trying to stuff one batch's worth of tuples into
memory, no matter how much larger or smaller than sort_mem that turns
out to be.
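
For instance, plugging rough numbers from your plan and log into that
formula just to illustrate the arithmetic: the planner expected the
hashed input to be about 395 rows of width 52, i.e. something like
20 kB, so

    (~20 kB) / sort_mem  =>  1 batch

and that single batch then had to hold the actual input in memory,
which your HashBatchContext dump shows grew to roughly 1.95 GB.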

This doesn't usually present that much of a problem in practice, because
with customary values of sort_mem there is a lot of margin for error.
I am wondering if maybe you are using an optimistically large sort_mem
setting?

> - why wasn't the memory cleared after the transaction finished
> (failed)?  The process was still 2GB large.  Does memory used for
> sorts just stay around, possibly re-used for other sorts or hashes but
> never de-allocated?

That depends on the details of your malloc package and your kernel, but
on many Unixen a process never really shrinks.

            regards, tom lane