Use BumpContext contexts for TupleHashTables' tablecxt - Mailing list pgsql-hackers

From Tom Lane
Subject Use BumpContext contexts for TupleHashTables' tablecxt
Date
Msg-id 2268409.1761512111@sss.pgh.pa.us
Whole thread Raw
Responses Re: Use BumpContext contexts for TupleHashTables' tablecxt
Re: Use BumpContext contexts for TupleHashTables' tablecxt
List pgsql-hackers
[ starting a new thread to keep this separate from the estimation
issue ]

I looked at the callers of BuildTupleHashTable, and realized that
(a) every one of them can use a BumpContext for the "tablecxt",
and (b) Jeff Davis already noticed that for nodeAgg.c, in commit
cc721c459.  So we have precedent for the idea working.  Here's
a fleshed-out patch to fix the remaining callers.

            regards, tom lane

From 15ef2e50085ac83728cb9189e482964ff02e5aae Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 26 Oct 2025 16:46:57 -0400
Subject: [PATCH v1] Use BumpContext contexts for TupleHashTables' tablecxt.

execGrouping.c itself does nothing with this context except to
allocate new hash entries in it, and the callers do nothing with it
except to reset the whole context.  So this is an ideal use-case for
a BumpContext, and the hash tables are frequently big enough for the
savings to be significant.

Commit cc721c459 already taught nodeAgg.c this idea, but neglected
the other callers of BuildTupleHashTable.

(There is a side issue of the extent to which this change invalidates
the planner's estimates of hashtable memory consumption.  However,
those estimates are already pretty bad, so improving them seems
like it can be a separate project.)

Reported-by: Jeff Janes <jeff.janes@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com
---
 src/backend/executor/execGrouping.c       |  1 +
 src/backend/executor/nodeRecursiveunion.c |  9 +++++----
 src/backend/executor/nodeSetOp.c          | 10 ++++++----
 src/backend/executor/nodeSubplan.c        |  6 +++---
 4 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/src/backend/executor/execGrouping.c b/src/backend/executor/execGrouping.c
index 75087204f0c..75023182325 100644
--- a/src/backend/executor/execGrouping.c
+++ b/src/backend/executor/execGrouping.c
@@ -147,6 +147,7 @@ execTuplesHashPrepare(int numCols,
  *    additionalsize: size of data that may be stored along with the hash entry
  *    metacxt: memory context for long-lived allocation, but not per-entry data
  *    tablecxt: memory context in which to store table entries
+ *        (it's usually desirable for tablecxt to be a BumpContext)
  *    tempcxt: short-lived context for evaluation hash and comparison functions
  *    use_variable_hash_iv: if true, adjust hash IV per-parallel-worker
  *
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 40f66fd0680..6d920955fc5 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -209,7 +209,8 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
      * If hashing, we need a per-tuple memory context for comparisons, and a
      * longer-lived context to store the hash table.  The table can't just be
      * kept in the per-query context because we want to be able to throw it
-     * away when rescanning.
+     * away when rescanning.  We can use a BumpContext to save storage,
+     * because we will have no need to delete individual table entries.
      */
     if (node->numCols > 0)
     {
@@ -218,9 +219,9 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
                                   "RecursiveUnion",
                                   ALLOCSET_DEFAULT_SIZES);
         rustate->tableContext =
-            AllocSetContextCreate(CurrentMemoryContext,
-                                  "RecursiveUnion hash table",
-                                  ALLOCSET_DEFAULT_SIZES);
+            BumpContextCreate(CurrentMemoryContext,
+                              "RecursiveUnion hash table",
+                              ALLOCSET_DEFAULT_SIZES);
     }

     /*
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index 4068481a523..ce42ea6a549 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -589,13 +589,15 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
     /*
      * If hashing, we also need a longer-lived context to store the hash
      * table.  The table can't just be kept in the per-query context because
-     * we want to be able to throw it away in ExecReScanSetOp.
+     * we want to be able to throw it away in ExecReScanSetOp.  We can use a
+     * BumpContext to save storage, because we will have no need to delete
+     * individual table entries.
      */
     if (node->strategy == SETOP_HASHED)
         setopstate->tableContext =
-            AllocSetContextCreate(CurrentMemoryContext,
-                                  "SetOp hash table",
-                                  ALLOCSET_DEFAULT_SIZES);
+            BumpContextCreate(CurrentMemoryContext,
+                              "SetOp hash table",
+                              ALLOCSET_DEFAULT_SIZES);

     /*
      * initialize child nodes
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 53fb56f7388..8930e0398a7 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -891,9 +891,9 @@ ExecInitSubPlan(SubPlan *subplan, PlanState *parent)

         /* We need a memory context to hold the hash table(s) */
         sstate->hashtablecxt =
-            AllocSetContextCreate(CurrentMemoryContext,
-                                  "Subplan HashTable Context",
-                                  ALLOCSET_DEFAULT_SIZES);
+            BumpContextCreate(CurrentMemoryContext,
+                              "Subplan HashTable Context",
+                              ALLOCSET_DEFAULT_SIZES);
         /* and a short-lived exprcontext for function evaluation */
         sstate->innerecontext = CreateExprContext(estate);

--
2.43.7


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Should HashSetOp go away
Next
From: Alexander Korotkov
Date:
Subject: Re: High CPU consumption in cascade replication with large number of walsenders