Re: Memory leak on subquery as scalar operand - Mailing list pgsql-bugs

From Daniel Gustafsson
Subject Re: Memory leak on subquery as scalar operand
Date
Msg-id 71272FCF-3DD2-440A-B849-4ACB8B16F213@yesql.se
Whole thread Raw
In response to Re: Memory leak on subquery as scalar operand  (Andres Freund <andres@anarazel.de>)
Responses Re: Memory leak on subquery as scalar operand  (Daniel Gustafsson <daniel@yesql.se>)
List pgsql-bugs
> On 1 Nov 2022, at 06:51, Andres Freund <andres@anarazel.de> wrote:
>> David Rowley <dgrowleyml@gmail.com> writes:

>> For me, the behavior seems similar to what the OP reported: there's a
>> per-query leakage but it's less than 100kB per query.  It'd take more than a
>> handful of repetitions to get to an OOM failure.  This is with LLVM 13.0.1
>> on RHEL 8.6.
>
> This I can reproduce. Here's an updated patchset addressing this. This query,
> for some reason, leaks a lot more aggressively than what I've seen in the
> past, so I needed to reduce the amount of time until an llvm context is
> recycled substantially. A bit of benchmarking showed no negative consequences
> of going to 100 uses till recycling, even with absurd settings
> (i.e. jit_*_cost = 0), but did show impact on lower values.

I had a look at this and I concur with the findings in this thread.  I didn't
do any benchmarking but running various tests I was unable to trigger an OOM.
Will do more testing and stressing of it.  A few small comments on the
patchset:

From reading it seems that patch 0002 and 0003 can be committed regardless of
the other patches in this series.  Were they included because they were found
while looking at this, or is there a deeper connection I'm missing?

0004:

The commit message states: "That incurs some overhead, so only do so after
10000 JITed queries.", but I fail to see how that's implemented.  There is
currently recreation after 100 reuses, was the intention to have a different
number here or is this just a leftover from an earlier patch-version?


+    /*
+     * The LLVM Context used by this JIT context. An LLVM context is reused
+     * across many compilations, but occasionally reset to prevent it using
+     * too much memory due to more and more types accumulating.
+     */
+    LLVMContextRef llvm_context;

llvm_context is added as a member in LLVMJitContext but is never set or read,
the static llvmjit.c:llvm_context is still used for everything.  Is this a
lefover or was the plan to move this to LLVMJitContext?


+     * FIXME: should split the handling of llvm_triple / llvm_layout out
+     * of llvm_create_types() - that doesn't need to be redone.

Agreed, that seems wasteful.  AFAICT there would not be any reason to recreate
this once set?


+    /*
+     * Consider as cleaned up even if we skip doing so below, that way we can
+     * verify the tracking is correct (see llvm_shutdown()).
+     */
+    llvm_jit_context_in_use_count--;

Since this doesn't actually release even if llvm_jit_context_in_use_count goes
to zero here, this hunk in llvm_release_context is seemingly a bit at odds with
the following from jit/README: "If it is desirable to release resources
earlier, jit_release_context() can be used".  If we want to verify the usage
tracking in llvm_shutdown it's hard to see how we could much else, but maybe a
note in the README that not all resources are guaranteed to be released could
be in order?

Regarding llvm_release_context we currently have this:

static void
llvm_release_context(JitContext *context)
{
    LLVMJitContext *llvm_context = (LLVMJitContext *) context;

..which shadows the "static LLVMContextRef llvm_context;" declared in this
patchset, we should probably rename the local var in llvm_release_context.
(this is referred to in the 0004 commit message.)

The attached 0005 is a WIP attempt to address a few of the FIXME's in this
patchset on top of your the 0001-0004.

--
Daniel Gustafsson


Attachment

pgsql-bugs by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Next
From: Tom Lane
Date:
Subject: Re: plpython does not honour max-rows