Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset - Mailing list pgsql-bugs

From feichanghong
Subject Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
Date
Msg-id tencent_8E54DE49323C3A293EE44DFE385011189509@qq.com
In response to BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
List pgsql-bugs
Hi,

On Sep 3, 2025, at 09:43, ocean_li_996 <ocean_li_996@163.com> wrote:
I've attached 'v01_fix_memory_leak_in_hashed_subplan_node.patch' to address this.

It seems this issue has been around for many years. I took a quick
look at the patch that fixes it. Why don't we reset the temp context
inside LookupTupleHashEntry, TupleHashTableHash, LookupTupleHashEntryHash,
and FindTupleHashEntry instead? That seems more robust. Also, the added
test case doesn't seem to be able to detect whether there is actually a
memory leak.
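
Something along the lines of this untested sketch is what I have in mind
(only the added reset is new; the existing lookup logic in execGrouping.c
is elided, and the field names come from TupleHashTableData in execnodes.h):

TupleHashEntry
LookupTupleHashEntry(TupleHashTable hashtable, TupleTableSlot *slot,
                     bool *isnew, uint32 *hash)
{
    TupleHashEntry entry = NULL;    /* set by the elided lookup code */
    MemoryContext oldContext;

    /* discard whatever the previous probe left behind, e.g. detoasted values */
    MemoryContextReset(hashtable->tempcxt);

    /* run the hash and match functions in the short-lived context */
    oldContext = MemoryContextSwitchTo(hashtable->tempcxt);

    /* ... existing hash computation and hash table lookup, unchanged ... */

    MemoryContextSwitchTo(oldContext);
    return entry;
}

The same one-line reset would go at the top of the other three functions.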


----- Original Message -----
From: "PG Bug reporting form" <noreply@postgresql.org>
To: pgsql-bugs@lists.postgresql.org
Cc: mohen.lhy@alibaba-inc.com
Sent: Tue, 02 Sep 2025 15:58:49 +0000
Subject: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

The following bug has been logged on the website:

Bug reference:      19040
Logged by:          haiyang li
Email address:      mohen.lhy@alibaba-inc.com
PostgreSQL version: 18beta3
Operating system:   centos7 5.10.84 x86_64
Description:        

Hello, all!

I found a query that consumes a lot of memory and triggers the OOM killer.
The memory leak occurs in the hashed subplan node.

I was able to create a reproducible test case on a machine with the default
config and PostgreSQL 18beta3:

CREATE TABLE test1(
    a numeric,
    b int);
INSERT INTO
    test1
SELECT
    i,
    i
FROM
    generate_series(1, 30000000) i; -- Make the running time longer
EXPLAIN ANALYZE SELECT
    *
FROM
    test1
WHERE
    a NOT IN(
        SELECT
            i
        FROM
            generate_series(1, 10000) i
);

plan:
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on test1  (cost=125.00..612432.24 rows=15000108 width=10) (actual time=135.191..25832.808 rows=29990000 loops=1)
   Filter: (NOT (hashed SubPlan 1))
   Rows Removed by Filter: 10000
   SubPlan 1
     ->  Function Scan on generate_series i  (cost=0.00..100.00 rows=10000 width=4) (actual time=36.999..38.296 rows=10000 loops=1)
 Planning Time: 0.280 ms
 JIT:
   Functions: 15
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms, Emission 23.018 ms, Total 110.802 ms
 Execution Time: 28217.026 ms
(11 rows)


Using the 'top -p <pid>' command, I observed that the process's RES
(resident memory) increased rapidly during SQL execution.

Furthermore, during SQL execution, I ran 'select
pg_log_backend_memory_contexts(<pid>)' to print memory context statistics.
The context with abnormally high memory usage was
"Subplan HashTable Temp Context". The key part of the log is as follows:

...
LOG:  level: 5; Subplan HashTable Temp Context: 514834432 total in 62849 blocks; 973712 free (60695 chunks); 513860720 used
LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808 free (5 chunks); 400480 used
...
Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks); 516249784 used
...

If I change the SQL from "a NOT IN" to "b NOT IN" and repeat the same
steps, I cannot observe abnormally high memory usage. Again, the key part
of the log is as follows:
...
LOG:  level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808 free (5 chunks); 400480 used
...
Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416 used
...

While analyzing the source code, I found that the hashed subplan node fails
to reset the 'hashtempcxt' context after probing the hash table for each slot.

When variable-length datatypes (e.g., numeric) are processed, probing can
trigger calls to 'detoast_attr', which allocates memory in hashtempcxt.
Without a reset, this memory is not reclaimed until the context itself is
destroyed, resulting in a memory leak when processing a large number of slots.
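
Roughly, the fix I have in mind is to reset that context once per probed
slot in ExecHashSubPlan() (nodeSubplan.c). A simplified, untested sketch of
the idea; everything except the added reset is elided existing code:

static Datum
ExecHashSubPlan(SubPlanState *node, ExprContext *econtext, bool *isNull)
{
    Datum   result = BoolGetDatum(false);   /* set by the elided lookup code */

    /*
     * ... existing code: project the outer slot and probe node->hashtable
     * (and, if needed, node->hashnulls), which runs the hash/equality
     * functions, and any detoast_attr calls, in node->hashtempcxt ...
     */

    /*
     * added: the per-probe allocations in hashtempcxt are dead once the
     * lookup is done, so release them here instead of letting them
     * accumulate across every outer row.
     */
    MemoryContextReset(node->hashtempcxt);

    return result;
}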

Additionally, through testing, I've found that if test1 is a partitioned
table, multiple "Subplan HashTable Context" instances exist, and these
MemoryContexts are only released after execution completes. If the number
of subpartitions is large, this can lead to significant memory usage.
Doesn't this differ from what the execution plan shows? The plan only
displays one occurrence of SubPlan.


Best Regards,
Fei Changhong
