Thread: Memory exhausted (leak?)
I have a long but straightforward query (on some very large tables) that always ends in 'Memory exhausted in AllocSetAlloc(108)'. Even stranger are some messages that appear in the logfile, such as the following (edited for length, repetitions, etc):

TopMemoryContext: 32792 total in 4 blocks; 9712 free (2 chunks); 23080 used
TopTransactionContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
DeferredTriggerXact: 0 total in 0 blocks; 0 free (0 chunks); 0 used
TransactionCommandContext: 1040384 total in 7 blocks; 246544 free (11 chunks); 793840 used
HashTableContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
HashBatchContext: 616608 total in 4 blocks; 27784 free (1 chunks); 588824 used
HashTableContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
HashBatchContext: 1951696808 total in 244 blocks; 26536 free (473 chunks); 1951670272 used
HashTableContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
HashBatchContext: 287840 total in 2 blocks; 8176 free (0 chunks); 279664 used
PlanExprContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
MdSmgr: 24576 total in 2 blocks; 12696 free (0 chunks); 11880 used
DynaHash: 8192 total in 1 blocks; 4376 free (0 chunks); 3816 used
DynaHashTable: 8192 total in 1 blocks; 5080 free (0 chunks); 3112 used
DynaHashTable: 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ERROR:  Memory exhausted in AllocSetAlloc(108)
ERROR:  Query was cancelled.

When running the query, I observe the following:
- memory usage of the postgres process stays stable for a few minutes, then reaches a point where it quickly grows to about 2GB, at which point the error occurs
- no space in pgsql_temp is used at any time, even after decreasing sort_mem to 4MB

I saw a message previously on the lists where a similar problem occurred that appeared to stem from a "variable-length datum whose length word is munged to look like a ridiculously large value". I don't think that's happening here, given the steady growth to 2GB memory usage and the fact that I don't select or join on any varlena types.

Some facts about the nature of the query:
- all joins are done on Oid types, and only one column (an Oid type) is SELECTed
- the 'r1' table is huge, about 2.5x10^9 rows (and over 500 GB large)
- I expect it to return no more than a few hundred results

So.. is there anything inherent in this execution plan that would require high memory usage outside of sort_mem? The plan is below, and although it's big, it doesn't look too exotic or brain-damaged to me (yes, those two index scans on 'r1' in the inner nested loops are intentional). I tried executing this query disabling hash or merge joins, and not doing the 'unique', and the same error occurred.
(Note: there was a materialize step in that plan.)

postgresql version is 7.3.6

   -Aaron

EXPLAIN output below
--------------------------
Unique  (cost=278362.15..278362.26 rows=12 width=64)
  ->  Sort  (cost=278362.15..278362.20 rows=117 width=64)
        Sort Key: obj1.docid
        ->  Hash Join  (cost=278039.06..278361.34 rows=117 width=64)
              Hash Cond: ("outer".docid = "inner".refers)
              ->  Seq Scan on obj1  (cost=0.00..299.20 rows=22440 width=4)
              ->  Hash  (cost=278039.00..278039.00 rows=117 width=60)
                    ->  Hash Join  (cost=277702.73..278039.00 rows=117 width=60)
                          Hash Cond: ("outer".referrer = "inner".refers)
                          ->  Seq Scan on r2 r_1_2  (cost=0.00..308.81 rows=27161 width=8)
                          ->  Hash  (cost=277702.53..277702.53 rows=395 width=52)
                                ->  Nested Loop  (cost=343.71..277702.53 rows=395 width=52)
                                      Join Filter: ("outer".refers = "inner".referrer)
                                      ->  Nested Loop  (cost=343.71..258374.97 rows=1 width=44)
                                            ->  Hash Join  (cost=343.71..258371.80 rows=1 width=40)
                                                  Hash Cond: ("outer".refers = "inner".refers)
                                                  ->  Nested Loop  (cost=0.00..258011.22 rows=16859 width=28)
                                                        ->  Nested Loop  (cost=0.00..238683.66 rows=1 width=20)
                                                              ->  Nested Loop  (cost=0.00..219356.09 rows=1 width=12)
                                                                    ->  Index Scan using obj4_word_map_wid_idx on obj4_word_map  (cost=0.00..213099.71 rows=1600 width=4)
                                                                          Index Cond: (wid = 19458::oid)
                                                                          Filter: ("type" = 4)
                                                                    ->  Index Scan using obj4_doc_idx on obj4  (cost=0.00..3.90 rows=1 width=8)
                                                                          Index Cond: ("outer".doc = obj4.descriptor)
                                                                          Filter: (text_field ~* 'ABC'::text)
                                                              ->  Index Scan using r1_referrer_idx on r1  (cost=0.00..19234.83 rows=16860 width=8)
                                                                    Index Cond: (r1.referrer = "outer".objobj3t)
                                                        ->  Index Scan using r1_referrer_idx on r1 r1_2_3  (cost=0.00..19234.83 rows=16860 width=8)
                                                              Index Cond: ("outer".refers = r1_2_3.referrer)
                                                  ->  Hash  (cost=343.70..343.70 rows=23 width=12)
                                                        ->  Nested Loop  (cost=0.00..343.70 rows=23 width=12)
                                                              ->  Index Scan using obj3_field1_idx on obj3  (cost=0.00..3.33 rows=1 width=4)
                                                                    Index Cond: (field1 = 'XYZ'::text)
                                                              ->  Index Scan using r3_referrer_idx on r3 r_2_3  (cost=0.00..339.86 rows=94 width=8)
                                                                    Index Cond: ("outer".docid = r_2_3.referrer)
                                            ->  Index Scan using obj2_docid_idx on obj2  (cost=0.00..3.16 rows=1 width=4)
                                                  Index Cond: (obj2.docid = "outer".referrer)
                                      ->  Index Scan using r1_referrer_idx on r1 r1_1_2  (cost=0.00..19234.83 rows=16860 width=8)
                                            Index Cond: ("outer".referrer = r1_1_2.referrer)
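For reference, a minimal sketch of how those settings are changed in a 7.3 session; the 4MB value and the disabled join types come from the description above, everything else is illustration only:

SET sort_mem = 4096;          -- sort_mem is specified in KB, so 4096 = 4 MB
SET enable_hashjoin = off;    -- planner toggles used to force a different
SET enable_mergejoin = off;   -- join strategy for the retry described above
-- then re-run the query (or EXPLAIN it) in the same session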
> I tried executing this query disabling hash or merge joins, and not
> doing the 'unique', and the same error occurred. (Note: there was a
> materialize step in that plan.)
>
> postgresql version is 7.3.6

1. Are the row estimates accurate? If not, increase your statistics.

2. Is the explain analyze different?

3. Do these tables receive a lot of updates/deletes? Could you possibly
   have index bloat?

Sincerely,

Joshua D. Drake

> [EXPLAIN output snipped]

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
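In 7.3 terms, suggestions 1 and 3 would look roughly like the sketch below; the table, column, and index names are taken from the plan in the original post and may not match the real schema:

ALTER TABLE r1 ALTER COLUMN referrer SET STATISTICS 1000;  -- 1000 is the maximum target
ANALYZE r1 (referrer);                                     -- re-gather stats for that column

-- A rough bloat check: relpages are 8 KB each, so compare the on-disk
-- sizes with what reltuples would suggest for the table and its index.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname IN ('r1', 'r1_referrer_idx');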
> 1. Are the row estimates accurate? If not, increase your statistics.

No, they aren't accurate, though they are up to date. I've found a
situation or two where there are definite patterns in the data that tend
to skew the gathered statistics, forcing estimates to be off by an order
of magnitude or more even at the highest resolution (1000), and I suspect
this will be such a case. I'm giving it a try, though (on such a large
table, it'll take a while to finish), but that should only affect which
plan is chosen, right? I'm perfectly fine with the plan; I'd just like it
to not die! Even if it does cause a different plan to be chosen that
works, that still doesn't answer the question of why the earlier query
failed.

> 2. Is the explain analyze different?

I cannot get to the point where EXPLAIN ANALYZE prints out results, since
'explain analyze' actually executes the query underneath. I discovered
this problem while running an 'explain analyze', and when that failed I
ran the query straight out in case something in 'explain analyze' itself
was the problem. It wasn't.

> 3. Do these tables receive a lot of updates/deletes? Could you possibly
>    have index bloat?

Nope, the only actions done to them have been loading the data, creating
the index, analyzing, and various SELECTs. The data is relatively new,
untouched and static :( However, it took days to create the index, and
the index itself ended up being about 60GB, which I guess is reasonable
on such a ginormous data set. All the other queries I've run (involving
slightly less complex plans) haven't failed, and have executed in
reasonable times.

One more interesting thing: after the process fails and the transaction
is rolled back, the process still has a 2GB memory footprint! Don't most
memory allocations palloc'd during query execution get freed after the
transaction is done? That's not happening here..

   -Aaron
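One way to compare estimates against reality without executing the whole query is to EXPLAIN ANALYZE just the sub-joins that feed the suspect hash. A hedged sketch: the table and column names are lifted from the plan above, and the exact predicates are guesses:

EXPLAIN ANALYZE
SELECT count(*)
FROM obj3, r3 r_2_3
WHERE obj3.field1 = 'XYZ'
  AND r_2_3.referrer = obj3.docid;

Comparing the estimated "rows=" with the actual row count for each such piece shows which estimate is off by orders of magnitude.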
"Joshua D. Drake" <jd@commandprompt.com> writes: > 1. Are the row estimates accurate? If not increase your statistics. Presumably not, since he'd surely never overflow memory with hashes of only a few hundred rows. The hash join code is capable of splitting the table into multiple segments, but it won't do so unless the planner's estimate of the table size is larger than sort_mem. So the real problem IMHO is the bad rowcount estimate. regards, tom lane
> The hash join code is capable of splitting the table into multiple
> segments, but it won't do so unless the planner's estimate of the
> table size is larger than sort_mem. So the real problem IMHO is
> the bad rowcount estimate.

Ahh, I didn't know that. I assumed that cost estimates would only affect
the query plan itself, not the implementation details of the individual
steps. That seems like a very plausible explanation of my problems,
especially since I *know* some of the estimates are way off. In my query
plan, there is a hash join that expects to hash about 395 rows in its
outer 'subplan' (for lack of a better term), which consists of several
nested joins (i.e. it's not a simple relation). That estimate is at least
two or three orders of magnitude off. I'm still ANALYZEing as we speak to
see if I can get it any better, but that's a pretty tough case for any
estimator to correctly assess.

It seems somewhat dangerous to rely so heavily on cost estimates if the
consequences are not simply a bad query plan, but outright potential for
query failure.

That still leaves a few questions, though:

- Why did the amount of memory used for the hash join exceed sort_mem?
  Is sort_mem merely a guideline for planning and not an enforced hard
  limit?

- Why wasn't the memory cleared after the transaction finished (failed)?
  The process was still 2GB large. Does memory used for sorts just stay
  around, possibly re-used for other sorts or hashes but never
  de-allocated?
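If the new ANALYZE still leaves the estimate that far off, it can help to look at what the planner actually believes about the join column. A sketch, with the table and column names taken from the plan (they may not match the real schema):

SELECT attname, null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'r1' AND attname = 'referrer';

-- A badly skewed n_distinct, or a most_common_vals list that misses the
-- hot values, is the usual sign that the estimator is being misled.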
Aaron Birkland <birkie@gmail.com> writes:
> - Why did the amount of memory used for the hash join exceed sort_mem?
>   Is sort_mem merely a guideline for planning and not an enforced hard
>   limit?

Yup. The deal is that hash join has to decide in advance on how many
"batches" it wants to have (because it will divide up the hash key range
into that many sub-joins). It sets the number of batches as

	(estimated total table size) / sort_mem

It will then soldier on trying to stuff one batch's worth of tuples into
memory, no matter how much larger or smaller than sort_mem that turns out
to be.

This doesn't usually present that much of a problem in practice, because
with customary values of sort_mem there is a lot of margin for error. I
am wondering if maybe you are using an optimistically large sort_mem
setting?

> - Why wasn't the memory cleared after the transaction finished
>   (failed)? The process was still 2GB large. Does memory used for
>   sorts just stay around, possibly re-used for other sorts or hashes
>   but never de-allocated?

That depends on the details of your malloc package and your kernel, but
on many Unixen a process never really shrinks.

			regards, tom lane
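A hedged back-of-the-envelope using the numbers from this thread (the plan above estimated the problem hash input at rows=395, width=52, and sort_mem had been tried as low as 4 MB), with the formula simplified:

-- planned batches = ceil(estimated hash input size / sort_mem)
SELECT ceil((395 * 52) / (4096.0 * 1024)) AS planned_batches;   -- returns 1

-- With a single planned batch, the executor tries to hold all of the
-- actual input (roughly 2 GB here) in memory at once, which is where
-- AllocSetAlloc eventually fails.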