Re: Performance of aggregates over set-returning functions - Mailing list pgsql-performance

From Bruce Momjian
Subject Re: Performance of aggregates over set-returning functions
Date
Msg-id 200803061723.m26HNLN24410@momjian.us
Whole thread Raw
In response to Re: Performance of aggregates over set-returning functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Performance of aggregates over set-returning functions
List pgsql-performance
This this a bug or TODO item?

---------------------------------------------------------------------------

Tom Lane wrote:
> "John Smith" <sodgodofall@gmail.com> writes:
> >> It's pipelined either way.  But int8 is a pass-by-reference data type,
> >> and it sounds like we have a memory leak for this case.
>
> > Thanks for your reply. How easy is it to fix this? Which portion of
> > the code should we look to change?
>
> I was just looking at that.  The issue looks to me that nodeResult.c
> (and other plan node types that support SRFs in their targetlists)
> do this:
>
>     /*
>      * Check to see if we're still projecting out tuples from a previous scan
>      * tuple (because there is a function-returning-set in the projection
>      * expressions).  If so, try to project another one.
>      */
>     if (node->ps.ps_TupFromTlist)
>     {
>         resultSlot = ExecProject(node->ps.ps_ProjInfo, &isDone);
>         if (isDone == ExprMultipleResult)
>             return resultSlot;
>         /* Done with that source tuple... */
>         node->ps.ps_TupFromTlist = false;
>     }
>
>     /*
>      * Reset per-tuple memory context to free any expression evaluation
>      * storage allocated in the previous tuple cycle.  Note this can't happen
>      * until we're done projecting out tuples from a scan tuple.
>      */
>     ResetExprContext(econtext);
>
> whereas there would be no memory leak if these two chunks of code were
> in the other order.  The question is whether resetting the context first
> would throw away any data that we *do* still need for the repeated
> ExecProject calls.  That second comment block implies there's something
> we do need.
>
> I'm not sure why it's like this.  Some digging in the CVS history shows
> that indeed the code used to be in the other order, and I switched it
> (and added the second comment block) in this old commit:
>
> http://archives.postgresql.org/pgsql-committers/2000-08/msg00218.php
>
> I suppose that the SQL-function support at the time required that its
> calling memory context be persistent until it returned ExprEndResult,
> but I sure don't recall any details.  It's entirely possible that that
> requirement no longer exists, or could easily be eliminated given all
> the other changes that have happened since then.  nodeFunctionscan.c
> seems to reset the current context for each call of a SRF, so I'd think
> that anything that can't cope with that should have been flushed out
> by now.
>
> If you feel like poking at this, I *strongly* recommend doing your
> testing in an --enable-cassert build.  You'll have no idea whether you
> freed stuff too early if you don't have CLOBBER_FREED_MEMORY enabled.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

pgsql-performance by date:

Previous
From: "D'Arcy J.M. Cain"
Date:
Subject: Re: count * performance issue
Next
From: Josh Berkus
Date:
Subject: Re: Why the difference in plans ?