Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset |
Date | |
Msg-id | 160523.1757190713@sss.pgh.pa.us Whole thread Raw |
In response to | 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset ("李海洋(陌痕)" <mohen.lhy@alibaba-inc.com>) |
Responses |
回复:回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
|
List | pgsql-bugs |
"=?UTF-8?B?5p2O5rW35rSLKOmZjOeXlSk=?=" <mohen.lhy@alibaba-inc.com> writes: > On 2025-09-03 23:35 Tom Lane <tgl@sss.pgh.pa.us> writes: >> I wonder why ExecInitSubPlan is making a separate context for this at all, >> rather than using some surrounding short-lived context. > This behavior was introduced by commit 133924e to fix one bug. AFAICS, the > tempcxt is only used by hashfunc evaluation. We should reset tempcxt after per > hashtable find if tempcxt is a separate context. I thought it unlikely that this leak has been there unreported since 2010, so I tried the test case in older branches and soon found that v10 and older don't exhibit the leak, or at least it's much less virulent there. A "git bisect" session found that the behavior changed at bf6c614a2f2c58312b3be34a47e7fb7362e07bcb is the first bad commit commit bf6c614a2f2c58312b3be34a47e7fb7362e07bcb Author: Andres Freund <andres@anarazel.de> Date: Thu Feb 15 21:55:31 2018 -0800 Do execGrouping.c via expression eval machinery, take two. Before that commit, TupleHashTableMatch reset hashtable->tempcxt (via execTuplesMatch). Now what it resets is a different context hashtable->exprcontext->ecxt_per_tuple_memory, and so there's no reset of hashtable->tempcxt anywhere in this particular loop. So that leads me to not trust fixing this in nodeSubplan.c, because that's just one caller that can reach TupleHashTableMatch: assuming that there are no other similar leaks seems dangerous. I experimented with putting MemoryContextReset(hashtable->tempcxt) into TupleHashTableMatch, and that does stop this leak. But even though that's a one-line fix, I don't like that solution either, for two reasons: 1. TupleHashTableMatch is likely to be invoked multiple times per hashtable lookup, so that this way results in many more resets than are really necessary. 2. It's not entirely clear that this way can't clobber data we still need, since the hash function outputs are surely longer-lived than individual tuple matching operations. After contemplating things for awhile, I think that feichanghong's idea is the right one after all: in each of the functions that switch into hashtable->tempcxt, let's do a reset on the way out, as attached. That's straightforward and visibly related to the required data lifespan. Interestingly, both in pre-v11 branches and with the one-line fix, I notice that the test query's memory consumption bounces around a little (10MB or so), while it seems absolutely steady with the attached. I interpret that to mean that we weren't resetting the tempcxt quite often enough, so that there was some amount of leakage in between calls of TupleHashTableMatch, even though we got there often enough to avoid disaster in this particular test. That's another reason to prefer this way over other solutions, I think. regards, tom lane diff --git a/src/backend/executor/execGrouping.c b/src/backend/executor/execGrouping.c index b5400749353..600d1ae5a96 100644 --- a/src/backend/executor/execGrouping.c +++ b/src/backend/executor/execGrouping.c @@ -316,6 +316,7 @@ LookupTupleHashEntry(TupleHashTable hashtable, TupleTableSlot *slot, Assert(entry == NULL || entry->hash == local_hash); MemoryContextSwitchTo(oldContext); + MemoryContextReset(hashtable->tempcxt); return entry; } @@ -338,6 +339,7 @@ TupleHashTableHash(TupleHashTable hashtable, TupleTableSlot *slot) hash = TupleHashTableHash_internal(hashtable->hashtab, NULL); MemoryContextSwitchTo(oldContext); + MemoryContextReset(hashtable->tempcxt); return hash; } @@ -365,6 +367,7 @@ LookupTupleHashEntryHash(TupleHashTable hashtable, TupleTableSlot *slot, Assert(entry == NULL || entry->hash == hash); MemoryContextSwitchTo(oldContext); + MemoryContextReset(hashtable->tempcxt); return entry; } @@ -398,7 +401,9 @@ FindTupleHashEntry(TupleHashTable hashtable, TupleTableSlot *slot, /* Search the hash table */ key = NULL; /* flag to reference inputslot */ entry = tuplehash_lookup(hashtable->hashtab, key); + MemoryContextSwitchTo(oldContext); + MemoryContextReset(hashtable->tempcxt); return entry; }
pgsql-bugs by date: