Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers
From | Jehan-Guillaume de Rorthais |
---|---|
Subject | Re: Memory leak from ExecutorState context? |
Date | |
Msg-id | 20230517191008.7bee6aa7@karst Whole thread Raw |
In response to | Re: Memory leak from ExecutorState context? (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Memory leak from ExecutorState context?
|
List | pgsql-hackers |
On Tue, 16 May 2023 16:00:52 -0400 Melanie Plageman <melanieplageman@gmail.com> wrote: > On Tue, May 16, 2023 at 04:00:51PM +0200, Jehan-Guillaume de Rorthais wrote: > > > From e5ecd466172b7bae2f1be294c1a5e70ce2b43ed8 Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplageman@gmail.com> > > Date: Thu, 30 Apr 2020 07:16:28 -0700 > > Subject: [PATCH v8 1/3] Describe hash join implementation > > > > This is just a draft to spark conversation on what a good comment might > > be like in this file on how the hybrid hash join algorithm is > > implemented in Postgres. I'm pretty sure this is the accepted term for > > this algorithm https://en.wikipedia.org/wiki/Hash_join#Hybrid_hash_join > > I recommend changing the commit message to something like this: > > Describe hash join implementation > > Add details to the block comment in nodeHashjoin.c describing the > hybrid hash join algorithm at a high level. > > Author: Melanie Plageman <melanieplageman@gmail.com> > Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> > Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> > Discussion: https://postgr.es/m/20230516160051.4267a800%40karst Done, but assigning myself as a reviewer as I don't remember having authored anything in this but the reference to the Hybrid hash page, which is questionable according to Tomas :) > > diff --git a/src/backend/executor/nodeHashjoin.c > > b/src/backend/executor/nodeHashjoin.c index 0a3f32f731..93a78f6f74 100644 > > --- a/src/backend/executor/nodeHashjoin.c > > ... > > + * TODO: should we discuss that tuples can only spill forward? > > I would just cut this for now since we haven't started on an agreed-upon > wording. Removed in v9. > > From 309ad354b7a9e4dfa01b2985bd883829f5e0eba0 Mon Sep 17 00:00:00 2001 > > From: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> > > Date: Tue, 16 May 2023 15:42:14 +0200 > > Subject: [PATCH v8 2/3] Allocate hash batches related BufFile in a dedicated > > context > > Here is a draft commit message for the second patch: > > ... Thanks. Adopted with some minor rewording... hopefully it's OK. > I recommend editing and adding links to the various discussions on this > topic from your research. Done in v9. > As for the patch itself, I noticed that there are few things needing > pgindenting. > > I usually do the following to run pgindent (in case you haven't done > this recently). > > ... Thank you for your recipe. > There are some existing indentation issues in these files, but you can > leave those or put them in a separate commit. Reindented in v9. I put existing indentation issues in a separate commit to keep the actual patches clean from distractions. > ... > Add a period at the end of this comment. > > > + /* > > + * Use hash join spill memory context to allocate accessors and > > their > > + * buffers > > + */ Fixed in v9. > Add a period at the end of this comment. > > > + /* Use hash join spill memory context to allocate accessors */ Fixed in v9. > > diff --git a/src/backend/executor/nodeHashjoin.c > > b/src/backend/executor/nodeHashjoin.c @@ -1313,21 +1314,30 @@ > > ExecParallelHashJoinNewBatch(HashJoinState *hjstate) > > * The data recorded in the file for each tuple is its hash value, > > * then the tuple in MinimalTuple format. > > * > > - * Note: it is important always to call this in the regular executor > > - * context, not in a shorter-lived context; else the temp file buffers > > - * will get messed up. > > + * If this is the first write to the batch file, this function first > > + * create it. The associated BufFile buffer is allocated in the given > > + * context. It is important to always give the HashSpillContext > > + * context. First to avoid a shorter-lived context, else the temp file > > + * buffers will get messed up. Second, for a better accounting of the > > + * spilling memory consumption. > > + * > > */ > > Here is my suggested wording fot this block comment: > > The batch file is lazily created. If this is the first tuple written to > this batch, the batch file is created and its buffer is allocated in the > given context. It is important to pass in a memory context which will > live for the entirety of the lifespan of the batch. Reworded. The context must actually survive the batch itself, not just live during the lifespan of the batch. > > void > > ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue, > > - BufFile **fileptr) > > + BufFile **fileptr, MemoryContext > > cxt) { Note that I changed this to: ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue, BufFile **fileptr, HashJoinTable hashtable) { As this function must allocate BufFile buffer in spillCxt, I suppose we should force it explicitly in the function code. Plus, having hashtable locally could be useful for later patch to eg. fine track allocated memory in spaceUsed. > > diff --git a/src/include/executor/hashjoin.h > > b/src/include/executor/hashjoin.h index 8ee59d2c71..ac27222d18 100644 > > --- a/src/include/executor/hashjoin.h > > +++ b/src/include/executor/hashjoin.h > > @@ -23,12 +23,12 @@ > > /* ---------------------------------------------------------------- > > * hash-join hash table structures > > * > > - * Each active hashjoin has a HashJoinTable control block, which is > > - * palloc'd in the executor's per-query context. All other storage needed > > - * for the hashjoin is kept in private memory contexts, two for each > > hashjoin. > > - * This makes it easy and fast to release the storage when we don't need it > > - * anymore. (Exception: data associated with the temp files lives in the > > - * per-query context too, since we always call buffile.c in that context.) > > + * Each active hashjoin has a HashJoinTable structure, which is > > "Other storages" should be "Other storage" > > > + * palloc'd in the executor's per-query context. Other storages needed for Fixed in v9. > ... > > "mainly hash's meta datas" -> "mainly the hashtable's metadata" > > > + * "hashCxt" (mainly hash's meta datas). Also, the "hashCxt" context is the Fixed in v9. > Suggested alternative wording for the below: > > * Data associated with temp files is allocated in the "spillCxt" context > * which lives for the duration of the entire join as batch files' > * creation and usage may span batch execution. These files are > * explicitly destroyed by calling BufFileClose() when the code is done > * with them. The aim of this context is to help accounting for the > * memory allocated for temp files and their buffers. Adopted in v9. > Suggested alternative wording for the below: > > * Finally, data used only during a single batch's execution is allocated > * in the "batchCxt". By resetting the batchCxt at the end of each batch, > * we free all the per-batch storage reliably and without tedium. Adopted in v9. Thank you for your review! Regards,
Attachment
pgsql-hackers by date: