Thread: memory leak in auto_explain

memory leak in auto_explain

From
Jeff Janes
Date:
I accidentally tried to populate a test case while auto_explain.log_min_duration was set to zero.  auto_explain.log_nested_statements was also on.  

create or replace function gibberish(int) returns text language SQL as $_$ select left(string_agg(md5(random()::text),$$$$),$1) from generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500) as t12 from generate_series(1,20e6) f(x);

I got logorrhea of course, but I also got a memory leak into the SQL function context:

  TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
    PortalContext: 16384 total in 5 blocks; 5328 free (1 chunks); 11056 used: <unnamed>
      ExecutorState: 4810120 total in 13 blocks; 4167160 free (74922 chunks); 642960 used
        SQL function: 411058232 total in 60 blocks; 4916568 free (4 chunks); 406141664 used: gibberish

The memory usage grew until OOM killer stepped in.

Cheers,

Jeff

Re: memory leak in auto_explain

From
Jeff Janes
Date:
On Mon, Feb 1, 2021 at 6:09 PM Jeff Janes <jeff.janes@gmail.com> wrote: 
  

create or replace function gibberish(int) returns text language SQL as $_$ select left(string_agg(md5(random()::text),$$$$),$1) from generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500) as t12 from generate_series(1,20e6) f(x);

I should have added, found it on HEAD, verified it also in 12.5.

Cheers,

Jeff

Re: memory leak in auto_explain

From
japin
Date:
On Tue, 02 Feb 2021 at 07:12, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, Feb 1, 2021 at 6:09 PM Jeff Janes <jeff.janes@gmail.com> wrote:
>
>>
>>
>> create or replace function gibberish(int) returns text language SQL as $_$
>> select left(string_agg(md5(random()::text),$$$$),$1) from
>> generate_series(0,$1/32) $_$;
>>
>> create table j1 as select x, md5(random()::text) as t11, gibberish(1500)
>> as t12 from generate_series(1,20e6) f(x);
>>
>
> I should have added, found it on HEAD, verified it also in 12.5.
>

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

    /*
     * Switch to context in which the fcache lives.  This ensures that our
     * tuplestore etc will have sufficient lifetime.  The sub-executor is
     * responsible for deleting per-tuple information.  (XXX in the case of a
     * long-lived FmgrInfo, this policy represents more memory leakage, but
     * it's not entirely clear where to keep stuff instead.)
     */
    oldcontext = MemoryContextSwitchTo(fcache->fcontext);

2) I try to call pfree() to release ExplainState memory, however, it does not
make sence, I do not know why this does not work? So I try to create it in
queryDesc->estate->es_query_cxt memory context like queryDesc->totaltime, and
it works.

Attached fix the memory leakage in auto_explain. Any thoughts?

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.


Attachment

Re: memory leak in auto_explain

From
Tom Lane
Date:
japin <japinli@hotmail.com> writes:
> Here's my analysis:
> 1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
> memory context, which is a long-lived, cause the memory grow up.

Yeah, agreed.  I think people looking at this have assumed that the
ExecutorEnd hook would automatically be running in the executor's
per-query context, but that's not so; we haven't yet entered
standard_ExecutorEnd where the context switch is.  The caller's
context is likely to be much longer-lived than the executor's.

I think we should put the switch one level further out than you have
it here, just to be sure that InstrEndLoop is covered too (that doesn't
allocate memory, but auto_explain shouldn't assume that).  Otherwise
seems like a good fix.

            regards, tom lane



Re: memory leak in auto_explain

From
japin
Date:
On Wed, 03 Feb 2021 at 02:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> japin <japinli@hotmail.com> writes:
>> Here's my analysis:
>> 1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
>> memory context, which is a long-lived, cause the memory grow up.
>
> Yeah, agreed.  I think people looking at this have assumed that the
> ExecutorEnd hook would automatically be running in the executor's
> per-query context, but that's not so; we haven't yet entered
> standard_ExecutorEnd where the context switch is.  The caller's
> context is likely to be much longer-lived than the executor's.
>
> I think we should put the switch one level further out than you have
> it here, just to be sure that InstrEndLoop is covered too (that doesn't
> allocate memory, but auto_explain shouldn't assume that).  Otherwise
> seems like a good fix.
>

Thanks for your review and clarification.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.