Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: Memory leak from ExecutorState context?
Date
Msg-id 20230517191008.7bee6aa7@karst
Whole thread Raw
In response to Re: Memory leak from ExecutorState context?  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Memory leak from ExecutorState context?
List pgsql-hackers
On Tue, 16 May 2023 16:00:52 -0400
Melanie Plageman <melanieplageman@gmail.com> wrote:

> On Tue, May 16, 2023 at 04:00:51PM +0200, Jehan-Guillaume de Rorthais wrote:
> 
> > From e5ecd466172b7bae2f1be294c1a5e70ce2b43ed8 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Thu, 30 Apr 2020 07:16:28 -0700
> > Subject: [PATCH v8 1/3] Describe hash join implementation
> > 
> > This is just a draft to spark conversation on what a good comment might
> > be like in this file on how the hybrid hash join algorithm is
> > implemented in Postgres. I'm pretty sure this is the accepted term for
> > this algorithm https://en.wikipedia.org/wiki/Hash_join#Hybrid_hash_join  
> 
> I recommend changing the commit message to something like this:
> 
>     Describe hash join implementation
> 
>     Add details to the block comment in nodeHashjoin.c describing the
>     hybrid hash join algorithm at a high level.
> 
>     Author: Melanie Plageman <melanieplageman@gmail.com>
>     Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
>     Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
>     Discussion: https://postgr.es/m/20230516160051.4267a800%40karst

Done, but assigning myself as a reviewer as I don't remember having authored
anything in this but the reference to the Hybrid hash page, which is
questionable according to Tomas :)

> > diff --git a/src/backend/executor/nodeHashjoin.c
> > b/src/backend/executor/nodeHashjoin.c index 0a3f32f731..93a78f6f74 100644
> > --- a/src/backend/executor/nodeHashjoin.c
> > ...
> > + * TODO: should we discuss that tuples can only spill forward?  
> 
> I would just cut this for now since we haven't started on an agreed-upon
> wording.

Removed in v9.

> > From 309ad354b7a9e4dfa01b2985bd883829f5e0eba0 Mon Sep 17 00:00:00 2001
> > From: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
> > Date: Tue, 16 May 2023 15:42:14 +0200
> > Subject: [PATCH v8 2/3] Allocate hash batches related BufFile in a dedicated
> >  context  
> 
> Here is a draft commit message for the second patch:
> 
>     ...

Thanks. Adopted with some minor rewording... hopefully it's OK.

> I recommend editing and adding links to the various discussions on this
> topic from your research.

Done in v9.

> As for the patch itself, I noticed that there are few things needing
> pgindenting.
>
> I usually do the following to run pgindent (in case you haven't done
> this recently).
> 
> ...

Thank you for your recipe.

> There are some existing indentation issues in these files, but you can
> leave those or put them in a separate commit.

Reindented in v9.

I put existing indentation issues in a separate commit to keep the actual
patches clean from distractions.

> ...
> Add a period at the end of this comment.
> 
> > +    /*
> > +     * Use hash join spill memory context to allocate accessors and
> > their
> > +     * buffers
> > +     */

Fixed in v9.

> Add a period at the end of this comment.
> 
> > +    /* Use hash join spill memory context to allocate accessors */

Fixed in v9.

> > diff --git a/src/backend/executor/nodeHashjoin.c
> > b/src/backend/executor/nodeHashjoin.c @@ -1313,21 +1314,30 @@
> > ExecParallelHashJoinNewBatch(HashJoinState *hjstate)
> >   * The data recorded in the file for each tuple is its hash value,
> >   * then the tuple in MinimalTuple format.
> >   *
> > - * Note: it is important always to call this in the regular executor
> > - * context, not in a shorter-lived context; else the temp file buffers
> > - * will get messed up.
> > + * If this is the first write to the batch file, this function first
> > + * create it.  The associated BufFile buffer is allocated in the given
> > + * context.  It is important to always give the HashSpillContext
> > + * context.  First to avoid a shorter-lived context, else the temp file
> > + * buffers will get messed up.  Second, for a better accounting of the
> > + * spilling memory consumption.
> > + *
> >   */  
> 
> Here is my suggested wording fot this block comment:
> 
> The batch file is lazily created. If this is the first tuple written to
> this batch, the batch file is created and its buffer is allocated in the
> given context. It is important to pass in a memory context which will
> live for the entirety of the lifespan of the batch.

Reworded. The context must actually survive the batch itself, not just live
during the lifespan of the batch.

> >  void
> >  ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
> > -                      BufFile **fileptr)
> > +                      BufFile **fileptr, MemoryContext
> > cxt) {  

Note that I changed this to:

 ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
                       BufFile **fileptr, HashJoinTable hashtable) {

As this function must allocate BufFile buffer in spillCxt, I suppose
we should force it explicitly in the function code.

Plus, having hashtable locally could be useful for later patch to eg. fine track
allocated memory in spaceUsed.

> > diff --git a/src/include/executor/hashjoin.h
> > b/src/include/executor/hashjoin.h index 8ee59d2c71..ac27222d18 100644
> > --- a/src/include/executor/hashjoin.h
> > +++ b/src/include/executor/hashjoin.h
> > @@ -23,12 +23,12 @@
> >  /* ----------------------------------------------------------------
> >   *                hash-join hash table structures
> >   *
> > - * Each active hashjoin has a HashJoinTable control block, which is
> > - * palloc'd in the executor's per-query context.  All other storage needed
> > - * for the hashjoin is kept in private memory contexts, two for each
> > hashjoin.
> > - * This makes it easy and fast to release the storage when we don't need it
> > - * anymore.  (Exception: data associated with the temp files lives in the
> > - * per-query context too, since we always call buffile.c in that context.)
> > + * Each active hashjoin has a HashJoinTable structure, which is  
> 
> "Other storages" should be "Other storage"
> 
> > + * palloc'd in the executor's per-query context.  Other storages needed for

Fixed in v9.

> ... 
> 
> "mainly hash's meta datas" -> "mainly the hashtable's metadata"
> 
> > + * "hashCxt" (mainly hash's meta datas). Also, the "hashCxt" context is the

Fixed in v9.

> Suggested alternative wording for the below:
> 
> * Data associated with temp files is allocated in the "spillCxt" context
> * which lives for the duration of the entire join as batch files'
> * creation and usage may span batch execution. These files are
> * explicitly destroyed by calling BufFileClose() when the code is done
> * with them. The aim of this context is to help accounting for the
> * memory allocated for temp files and their buffers.

Adopted in v9.

> Suggested alternative wording for the below:
> 
> * Finally, data used only during a single batch's execution is allocated
> * in the "batchCxt". By resetting the batchCxt at the end of each batch,
> * we free all the per-batch storage reliably and without tedium.

Adopted in v9.

Thank you for your review!

Regards,

Attachment

pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: Re: Possible regression setting GUCs on \connect
Next
From: Alena Rybakina
Date:
Subject: Fwd: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)