Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset - Mailing list pgsql-bugs

From Tom Lane
Subject Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
Date
Msg-id 1239241.1757456662@sss.pgh.pa.us
Whole thread Raw
In response to 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset  ("李海洋(陌痕)" <mohen.lhy@alibaba-inc.com>)
Responses 回复:回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
List pgsql-bugs
"=?UTF-8?B?5p2O5rW35rSLKOmZjOeXlSk=?=" <mohen.lhy@alibaba-inc.com> writes:
> For v04-0002, I checked the commit history, and before commit 133924e1,
> buildSubPlanHash was using ecxt_per_tuple_memory. It seems that the
> "Subplan HashTable Temp Context" was introduced in order to fix a certain bug.

You're quite right that 0002 looks suspiciously like it's reverting
133924e1.  However, it doesn't cause the test case added by that
commit to fail (even under Valgrind).  I think the reason is what
you say next:

> However, the changed behavior of TupleHashTableMatch introduced by commit
> bf6c614a (noted in [1]) may make the condition:
> ```
> However, the hashtable routines feel free to reset their temp context at any time,
> which'd lead to destroying input data that was still needed.
> ```
> no longer holds true. Then,  the lifespan of tempcxt in buildHashtable is similar
> to that of innercontext->ecxt_per_tuple_memory, so it makes sense to merge the two,
> I think.

Looking around in 133924e1^, the only place in execGrouping.c that
would reset the hashtable's tempcxt was execTuplesMatch (which was
called by TupleHashTableMatch).  But bf6c614a removed execTuplesMatch.
The modern execGrouping.c code resets hashtable->tempcxt nowhere.
Instead, TupleHashTableMatch applies ExecQualAndReset to
hashtable->exprcontext, so that what is reset is the
ecxt_per_tuple_memory of that econtext --- but that's manufactured by
CreateStandaloneExprContext in BuildTupleHashTable, and has no
connection at all to any context that nodeSubplan.c will touch.
It is certainly not the same ecxt_per_tuple_memory that pre-133924e1
was resetting.

So basically I'm saying that bf6c614a made it okay to revert
133924e1.

> BTW, I ran the test case supported in commit 133924e1 on version not contained commit
> 133924e1 (tag REL8_0_26). I did not find any problems. But i can not find more information
> about this issue.

Digging in the archives, I found the discussion leading to 133924e1 at
https://www.postgresql.org/message-id/flat/i2jnbo%241lcj%241%40news.hub.org

As for trying it on 8.0.26, the commit message for 133924e1 says
specifically that the problem isn't there pre-8.1.  I did try to
duplicate your test using 133924e1^, and it didn't fail for me either.
But I'm not too excited about that, because building PG versions this
old on modern platforms is really hard.  I had to compile with -O0
to get a build that worked at all, and that's already a significant
difference from the code we would have been testing back in 2010.
It may be that the reason for non-reproduction is buried somewhere in
that, or in the different compiler toolchain.  I'm not sure it's worth
a lot of effort to dig deeply into the details.

            regards, tom lane



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #19045: Applying custom collation rules appears to erase existing rules
Next
From: "李海洋(陌痕)"
Date:
Subject: 回复:回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset