Thread: Re: Getting better results from valgrind leak tracking

Re: Getting better results from valgrind leak tracking

From: Andres Freund
Date:
Hi,

(really need to fix my mobile phone mail program to keep the CC list...)

On 2021-03-17 08:15:43 -0700, Andres Freund wrote:
> On Wed, Mar 17, 2021, at 07:16, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > On 2021-03-16 20:50:17 -0700, Andres Freund wrote:
> > Meanwhile, I'm still trying to understand why valgrind is whining
> > about the rd_indexcxt identifier strings.  AFAICS it shouldn't.
> 
> I found a way around that late last night. Need to mark the context
> itself as an allocation. But I made a mess on the way to that and need
> to clean the patch up before sending it (and need to drop my
> girlfriend off first).

Unfortunately I didn't immediately find a way to do this while keeping
the MEMPOOL_CREATE/DESTROY in mcxt.c. The attached patch moves the pool
creation into the memory context implementations, "allocates" the
context itself as part of that pool, and changes the reset
implementation from MEMPOOL_DESTROY + MEMPOOL_CREATE to instead do
MEMPOOL_TRIM. That leaves the memory context itself valid (and thus
tracking ->ident etc), but marks all the other memory as freed.
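To illustrate the shape of the change (this is a simplified toy, not the actual aset.c code; the VALGRIND_* stubs stand in for the macros PostgreSQL wraps in src/include/utils/memdebug.h):

```c
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

/* No-op stand-ins for <valgrind/memcheck.h>, so this sketch compiles
 * without valgrind headers installed. */
#ifndef VALGRIND_CREATE_MEMPOOL
#define VALGRIND_CREATE_MEMPOOL(pool, rz, zeroed) ((void) 0)
#define VALGRIND_MEMPOOL_ALLOC(pool, addr, size)  ((void) 0)
#define VALGRIND_MEMPOOL_TRIM(pool, addr, size)   ((void) 0)
#endif

/* Illustrative context header; not the real MemoryContextData layout. */
typedef struct ToyContext
{
    const char *ident;       /* like MemoryContextData->ident */
    char       *block;       /* one allocation block, for illustration */
    size_t      block_used;
} ToyContext;

/* The pool is created inside the context implementation, and the header
 * itself is "allocated" from that pool, so leak checking keeps tracking
 * pointers stored in it (like ->ident). */
static ToyContext *
toy_context_create(const char *ident)
{
    ToyContext *cxt = malloc(sizeof(ToyContext));

    VALGRIND_CREATE_MEMPOOL(cxt, 0, false);
    VALGRIND_MEMPOOL_ALLOC(cxt, cxt, sizeof(ToyContext));

    cxt->ident = ident;
    cxt->block = malloc(1024);
    cxt->block_used = 0;
    return cxt;
}

/* Reset trims the pool down to just the header, instead of
 * MEMPOOL_DESTROY + MEMPOOL_CREATE: the header stays valid while all
 * other memory is marked freed. */
static void
toy_context_reset(ToyContext *cxt)
{
    VALGRIND_MEMPOOL_TRIM(cxt, cxt, sizeof(ToyContext));
    cxt->block_used = 0;
}
```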

This is just a first version, it probably needs more work, and
definitely a few comments...

After this, your changes, and the previously mentioned fixes, I get far
fewer false positives. Also found a crash / memory leak in pgstat.c due
to the new replication slot stats, but I'll start a separate thread.


There are a few leak warnings around guc.c that look like they might be
real, not false positives, and thus a bit concerning. Looks like several
guc check hooks don't bother to free the old *extra before allocating a
new one.


I suspect we might get better results from valgrind, not just for leaks
but also undefined value tracking, if we changed the way we represent
pools to utilize VALGRIND_MEMPOOL_METAPOOL |
VALGRIND_MEMPOOL_AUTO_FREE. E.g. aset.c would associate AllocBlock using
VALGRIND_MEMPOOL_ALLOC and then mcxt.c would use
VALGRIND_MALLOCLIKE_BLOCK for the individual chunk allocation.

https://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools
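Roughly, the two-level scheme would look like this (a hedged sketch only; the types and helper names are invented for illustration, and stubs stand in for <valgrind/memcheck.h>):

```c
#include <stddef.h>
#include <stdlib.h>
#include <stdbool.h>

/* No-op stand-ins so the sketch compiles without valgrind headers. */
#ifndef VALGRIND_CREATE_MEMPOOL_EXT
#define VALGRIND_MEMPOOL_METAPOOL  1
#define VALGRIND_MEMPOOL_AUTO_FREE 2
#define VALGRIND_CREATE_MEMPOOL_EXT(pool, rz, zeroed, flags) ((void) 0)
#define VALGRIND_MEMPOOL_ALLOC(pool, addr, size)             ((void) 0)
#define VALGRIND_MALLOCLIKE_BLOCK(addr, size, rz, zeroed)    ((void) 0)
#endif

typedef struct ToyBlock
{
    size_t used;
    char   data[1024];
} ToyBlock;

/* Context creation: the pool is a metapool; AUTO_FREE means freeing a
 * metapool chunk implicitly frees the malloc-like blocks inside it. */
static void *
context_create(void)
{
    void *pool = malloc(16);

    VALGRIND_CREATE_MEMPOOL_EXT(pool, 0, false,
                                VALGRIND_MEMPOOL_METAPOOL |
                                VALGRIND_MEMPOOL_AUTO_FREE);
    return pool;
}

/* aset.c side: each AllocBlock becomes a chunk of the metapool. */
static ToyBlock *
block_alloc(void *context)
{
    ToyBlock *block = malloc(sizeof(ToyBlock));

    VALGRIND_MEMPOOL_ALLOC(context, block, sizeof(ToyBlock));
    block->used = 0;
    return block;
}

/* mcxt.c side: each chunk handed to callers is a malloc-like block
 * carved out of a metapool chunk. */
static void *
chunk_alloc(ToyBlock *block, size_t size)
{
    void *chunk = block->data + block->used;

    block->used += size;
    VALGRIND_MALLOCLIKE_BLOCK(chunk, size, 0, false);
    return chunk;
}
```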


I played with naming the allocations underlying aset.c using
VALGRIND_CREATE_BLOCK(block, strlen(context->name), context->name).
That does produce better undefined-value warnings, but it seems that
e.g. the leak detector doesn't have that information available. Nor does
it seem usable for use-after-free detection; the latter is likely
because I had to VALGRIND_DISCARD by that point...

Greetings,

Andres Freund

Attachment

Re: Getting better results from valgrind leak tracking

From: Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
>> I found a way around that late last night. Need to mark the context
>> itself as an allocation. But I made a mess on the way to that and need
>> to clean the patch up before sending it (and need to drop my
>> girlfriend off first).

> Unfortunately I didn't immediately find a way to do this while keeping
> the MEMPOOL_CREATE/DESTROY in mcxt.c. The attached patch moves the pool
> creation into the memory context implementations, "allocates" the
> context itself as part of that pool, and changes the reset
> implementation from MEMPOOL_DESTROY + MEMPOOL_CREATE to instead do
> MEMPOOL_TRIM. That leaves the memory context itself valid (and thus
> tracking ->ident etc), but marks all the other memory as freed.

Huh, interesting.  I wonder why that makes the ident problem go away?
I'd supposed that valgrind would see the context headers as ordinary
memory belonging to the global "malloc" pool, so that any pointers
inside them ought to be considered valid.

Anyway, I don't have a problem with rearranging the responsibility
like this.  It gives the individual allocators more freedom to do
odd stuff, at the cost of very minor duplication of valgrind calls.

I agree we need more comments -- would you like me to have a go at
writing them?

One thing I was stewing over last night is that a MemoryContextReset
will mess up any context identifier assigned with
MemoryContextCopyAndSetIdentifier.  I'd left that as a problem to
fix later, because we don't currently have a need to reset contexts
that use copied identifiers.  But that assumption obviously will bite
us someday, so maybe now is a good time to think about it.

The very simplest fix would be to allocate non-constant idents with
malloc; which'd require adding a flag to track whether context->ident
needs to be free()d.  We have room for another bool near the top of
struct MemoryContextData (and at some point we could turn those
bool fields into a flags word).  The only real cost here is one
more free() while destroying a labeled context, which is probably
negligible.
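A hedged sketch of that simplest fix (the struct and function names here are illustrative, not the real MemoryContextData layout or mcxt.c code):

```c
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

/* Illustrative context header with the proposed flag. */
typedef struct ToyMemoryContext
{
    bool        ident_is_malloced;  /* does ->ident need free()? */
    const char *ident;
} ToyMemoryContext;

/* Constant idents are just pointed to, as today... */
static void
set_ident_const(ToyMemoryContext *cxt, const char *ident)
{
    cxt->ident = ident;
    cxt->ident_is_malloced = false;
}

/* ...while copied idents go through malloc, so they survive any reset
 * of the context's own memory. */
static void
set_ident_copied(ToyMemoryContext *cxt, const char *ident)
{
    char *copy = malloc(strlen(ident) + 1);

    strcpy(copy, ident);
    cxt->ident = copy;
    cxt->ident_is_malloced = true;
}

/* The one extra cost: a free() when destroying a labeled context. */
static void
context_delete(ToyMemoryContext *cxt)
{
    if (cxt->ident_is_malloced)
        free((char *) cxt->ident);
    cxt->ident = NULL;
}
```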

Other ideas are possible but they seem to require getting the individual
mcxt methods involved, and I doubt it's worth the complexity.

> There are a few leak warnings around guc.c that look like they might be
> real, not false positives, and thus a bit concerning. Looks like several
> guc check hooks don't bother to free the old *extra before allocating a
> new one.

I'll take a look, but I'm pretty certain that guc.c, not the hooks, is
responsible for freeing those.  Might be another case of valgrind not
understanding what's happening.

> I suspect we might get better results from valgrind, not just for leaks
> but also undefined value tracking, if we changed the way we represent
> pools to utilize VALGRIND_MEMPOOL_METAPOOL |
> VALGRIND_MEMPOOL_AUTO_FREE.

Don't really see why that'd help?  I mean, it could conceivably help
catch bugs in the allocators themselves, but I don't follow the argument
that it'd improve anything else.  Defined is defined, as far as I can
tell from the valgrind manual.

            regards, tom lane



Re: Getting better results from valgrind leak tracking

From: Andres Freund
Date:
Hi,

On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> >> I found a way around that late last night. Need to mark the context
> >> itself as an allocation. But I made a mess on the way to that and need
> >> to clean the patch up before sending it (and need to drop my
> >> girlfriend off first).
> 
> > Unfortunately I didn't immediately find a way to do this while keeping
> > the MEMPOOL_CREATE/DESTROY in mcxt.c. The attached patch moves the pool
> > creation into the memory context implementations, "allocates" the
> > context itself as part of that pool, and changes the reset
> > implementation from MEMPOOL_DESTROY + MEMPOOL_CREATE to instead do
> > MEMPOOL_TRIM. That leaves the memory context itself valid (and thus
> > tracking ->ident etc), but marks all the other memory as freed.
> 
> Huh, interesting.  I wonder why that makes the ident problem go away?
> I'd supposed that valgrind would see the context headers as ordinary
> memory belonging to the global "malloc" pool, so that any pointers
> inside them ought to be considered valid.


I didn't quite understand it either at the time of writing the change.
It just seemed like the only explanation for some of the behaviour I was
seeing, so I tried it, only to be initially rebuffed by errors when
accessing the recycled sets...

I spent a bit of time looking at valgrind, and it looks to be explicit
behaviour:

memcheck/mc_leakcheck.c

static MC_Chunk**
find_active_chunks(Int* pn_chunks)
{
   // Our goal is to construct a set of chunks that includes every
   // mempool chunk, and every malloc region that *doesn't* contain a
   // mempool chunk.
...
   // Then we loop over the mempool tables. For each chunk in each
   // pool, we set the entry in the Bool array corresponding to the
   // malloc chunk containing the mempool chunk.
   VG_(HT_ResetIter)(MC_(mempool_list));
   while ( (mp = VG_(HT_Next)(MC_(mempool_list))) ) {
      VG_(HT_ResetIter)(mp->chunks);
      while ( (mc = VG_(HT_Next)(mp->chunks)) ) {

         // We'll need to record this chunk.
         n_chunks++;

         // Possibly invalidate the malloc holding the beginning of this chunk.
         m = find_chunk_for(mc->data, mallocs, n_mallocs);
         if (m != -1 && malloc_chunk_holds_a_pool_chunk[m] == False) {
            tl_assert(n_chunks > 0);
            n_chunks--;
            malloc_chunk_holds_a_pool_chunk[m] = True;
         }

I think that means it explicitly ignores the entire malloced allocation
whenever there's *any* mempool chunk in it, instead considering only the
mempool chunks. So once aset allocates something in the init block, the
context itself is ignored for leak checking purposes. But that wouldn't
be the case if we didn't have the init block...

I guess that makes sense, but it would definitely have been nice to have
it documented...



> Anyway, I don't have a problem with rearranging the responsibility
> like this.  It gives the individual allocators more freedom to do
> odd stuff, at the cost of very minor duplication of valgrind calls.

Yea, similar.


> I agree we need more comments -- would you like me to have a go at
> writing them?

Gladly!


> One thing I was stewing over last night is that a MemoryContextReset
> will mess up any context identifier assigned with
> MemoryContextCopyAndSetIdentifier.

Valgrind should catch such problems. Not having the danger would be
better, of course.

We could also add an assertion at reset time that the identifier isn't
in the current context.


> The very simplest fix would be to allocate non-constant idents with
> malloc; which'd require adding a flag to track whether context->ident
> needs to be free()d.  We have room for another bool near the top of
> struct MemoryContextData (and at some point we could turn those
> bool fields into a flags word).  The only real cost here is one
> more free() while destroying a labeled context, which is probably
> negligible.

Hm. A separate malloc()/free() could conceivably show up in profiles at
some point.

What if we instead used that flag to indicate that MemoryContextReset()
needs to save the identifier? Then any cost would only be paid if the
context is actually reset.


> > There are a few leak warnings around guc.c that look like they might be
> > real, not false positives, and thus a bit concerning. Looks like several
> > guc check hooks don't bother to free the old *extra before allocating a
> > new one.
> 
> I'll take a look, but I'm pretty certain that guc.c, not the hooks, is
> responsible for freeing those.

Yea, I had misattributed the leak to the callbacks. One of the things I
saw definitely is a leak: if call_string_check_hook() ereport(ERROR)s,
the guc_strdup() of newval->stringval is lost.

There's another set of them that seems to be related to parallelism. But
I've not hunted it down yet.

I think it might be worth adding a VALGRIND_DO_CHANGED_LEAK_CHECK() at
the end of a transaction or such? That way it'd not be quite as hard to
pinpoint the source of a leak to individual statements as it is right
now.
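Roughly like this (a sketch only; the function name and the exact hook point are open questions, and the stub stands in for <valgrind/memcheck.h>, where VALGRIND_DO_CHANGED_LEAK_CHECK reports only loss records that changed since the previous check):

```c
/* No-op stand-in so the sketch compiles without valgrind headers. */
#ifndef VALGRIND_DO_CHANGED_LEAK_CHECK
#define VALGRIND_DO_CHANGED_LEAK_CHECK ((void) 0)
#endif

static int leak_checks_requested = 0;   /* illustrative bookkeeping only */

/* Would be called at the end of each transaction, so newly-appearing
 * loss records can be pinned on individual statements. */
static void
report_new_leaks_at_xact_end(void)
{
    VALGRIND_DO_CHANGED_LEAK_CHECK;
    leak_checks_requested++;
}
```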


> Might be another case of valgrind not understanding what's happening.

Those allocations seem very straightforward, plain malloc/free that is
referenced by a pointer to the start of the allocation. So that doesn't
seem very likely?


> > I suspect we might get better results from valgrind, not just for leaks
> > but also undefined value tracking, if we changed the way we represent
> > pools to utilize VALGRIND_MEMPOOL_METAPOOL |
> > VALGRIND_MEMPOOL_AUTO_FREE.
> 
> Don't really see why that'd help?  I mean, it could conceivably help
> catch bugs in the allocators themselves, but I don't follow the argument
> that it'd improve anything else.  Defined is defined, as far as I can
> tell from the valgrind manual.

I think it might improve the attribution of some use-after-free errors
to the memory context. Looks like it also might reduce the cost of
running valgrind a bit.

Greetings,

Andres Freund



Re: Getting better results from valgrind leak tracking

From: Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
>> Huh, interesting.  I wonder why that makes the ident problem go away?

> I spent a bit of time looking at valgrind, and it looks to be explicit
> behaviour:
>
>    // Our goal is to construct a set of chunks that includes every
>    // mempool chunk, and every malloc region that *doesn't* contain a
>    // mempool chunk.

Ugh.

> I guess that makes sense, but would definitely be nice to have had
> documented...

Indeed.  So this started happening to us when we merged the aset
header with its first allocation block.

>>> There are a few leak warnings around guc.c that look like they might be
>>> real, not false positives, and thus a bit concerning. Looks like several
>>> guc check hooks don't bother to free the old *extra before allocating a
>>> new one.

>> I'll take a look, but I'm pretty certain that guc.c, not the hooks, is
>> responsible for freeing those.

I believe I've just tracked down the cause of that.  Those errors
are (as far as I've seen) only happening in parallel workers, and
the reason is this gem in RestoreGUCState:

    /* See comment at can_skip_gucvar(). */
    for (i = 0; i < num_guc_variables; i++)
        if (!can_skip_gucvar(guc_variables[i]))
            InitializeOneGUCOption(guc_variables[i]);

where InitializeOneGUCOption zeroes out that GUC variable, causing
it to lose track of any allocations it was responsible for.

At minimum, somebody didn't understand the difference between "initialize"
and "reset".  But I suspect we could just nuke this loop altogether.
I've not yet tried to grok the comment that purports to justify it,
but I fail to see why it'd ever be useful to drop values inherited
from the postmaster.

            regards, tom lane



Re: Getting better results from valgrind leak tracking

From: Andres Freund
Date:
Hi,

On 2021-03-17 21:30:48 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
> >> Huh, interesting.  I wonder why that makes the ident problem go away?
> 
> > I spent a bit of time looking at valgrind, and it looks to be explicit
> > behaviour:
> >
> >    // Our goal is to construct a set of chunks that includes every
> >    // mempool chunk, and every malloc region that *doesn't* contain a
> >    // mempool chunk.
> 
> Ugh.
> 
> > I guess that makes sense, but would definitely be nice to have had
> > documented...
> 
> Indeed.  So this started happening to us when we merged the aset
> header with its first allocation block.

Yea.

I have the feeling that valgrind's error detection capability in our
codebase used to be higher too. I wonder if that could be related
somehow. Or maybe it's just a figment of my imagination.


> I believe I've just tracked down the cause of that.  Those errors
> are (as far as I've seen) only happening in parallel workers, and
> the reason is this gem in RestoreGUCState:
> 
>     /* See comment at can_skip_gucvar(). */
>     for (i = 0; i < num_guc_variables; i++)
>         if (!can_skip_gucvar(guc_variables[i]))
>             InitializeOneGUCOption(guc_variables[i]);
> 
> where InitializeOneGUCOption zeroes out that GUC variable, causing
> it to lose track of any allocations it was responsible for.

Ouch. At least it's a short-lived issue rather than something permanently
leaking memory on every SIGHUP...



> At minimum, somebody didn't understand the difference between "initialize"
> and "reset".  But I suspect we could just nuke this loop altogether.
> I've not yet tried to grok the comment that purports to justify it,
> but I fail to see why it'd ever be useful to drop values inherited
> from the postmaster.

I can't really make sense of it either. I think it may be trying to
restore the GUC state to what it would have been in postmaster,
disregarding all the settings that were set as part of PostgresInit()
etc?

Greetings,

Andres Freund



Re: Getting better results from valgrind leak tracking

From: Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2021-03-17 21:30:48 -0400, Tom Lane wrote:
>> I believe I've just tracked down the cause of that.  Those errors
>> are (as far as I've seen) only happening in parallel workers, and
>> the reason is this gem in RestoreGUCState: ...

> Ouch. At least it's a short-lived issue rather than something permanently
> leaking memory on every SIGHUP...

Yeah.  This leak is really insignificant in practice, although I'm
suspicious that randomly init'ing GUCs like this might have semantic
issues that we've not detected yet.

>> I've not yet tried to grok the comment that purports to justify it,
>> but I fail to see why it'd ever be useful to drop values inherited
>> from the postmaster.

> I can't really make sense of it either. I think it may be trying to
> restore the GUC state to what it would have been in postmaster,
> disregarding all the settings that were set as part of PostgresInit()
> etc?

At least in a non-EXEC_BACKEND build, the most reliable way to reproduce
the postmaster's settings is to do nothing whatsoever.  And I think the
same is true for EXEC_BACKEND, really, because the guc.c subsystem is
responsible for restoring what would have been the inherited-via-fork
settings.  So I'm really not sure what this is on about, and I'm too
tired to try to figure it out tonight.

In other news, I found that there's a genuine leak in
RelationBuildLocalRelation: RelationInitTableAccessMethod
was added in a spot where CurrentMemoryContext is CacheMemoryContext,
which is bad because it does a syscache lookup that can lead to
a table access which can leak some memory.  Seems easy to fix though.

The plpgsql situation looks like a mess.  As a short-term answer,
I'm inclined to recommend adding an exclusion that will ignore anything
allocated within plpgsql_compile().  Doing better will require a fair
amount of rewriting.  (Although I suppose we could also consider adding
an on_proc_exit callback that runs through and frees all the function
cache entries.)

            regards, tom lane



Re: Getting better results from valgrind leak tracking

From: Andres Freund
Date:
On 2021-03-17 22:33:59 -0400, Tom Lane wrote:
> >> I've not yet tried to grok the comment that purports to justify it,
> >> but I fail to see why it'd ever be useful to drop values inherited
> >> from the postmaster.
> 
> > I can't really make sense of it either. I think it may be trying to
> > restore the GUC state to what it would have been in postmaster,
> > disregarding all the settings that were set as part of PostgresInit()
> > etc?
> 
> At least in a non-EXEC_BACKEND build, the most reliable way to reproduce
> the postmaster's settings is to do nothing whatsoever.  And I think the
> same is true for EXEC_BACKEND, really, because the guc.c subsystem is
> responsible for restoring what would have been the inherited-via-fork
> settings.  So I'm really not sure what this is on about, and I'm too
> tired to try to figure it out tonight.

The restore thing runs after we've already set and initialized GUCs,
including things like user/database default GUCs. I see things like

==2251779== 4,560 bytes in 1 blocks are definitely lost in loss record 383 of 406
==2251779==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==2251779==    by 0x714A45: ConvertTimeZoneAbbrevs (datetime.c:4556)
==2251779==    by 0x88DE95: load_tzoffsets (tzparser.c:465)
==2251779==    by 0x88511D: check_timezone_abbreviations (guc.c:11565)
==2251779==    by 0x884633: call_string_check_hook (guc.c:11232)
==2251779==    by 0x87CB57: parse_and_validate_value (guc.c:7012)
==2251779==    by 0x87DD5F: set_config_option (guc.c:7630)
==2251779==    by 0x88397F: ProcessGUCArray (guc.c:10784)
==2251779==    by 0x32BCCF: ApplySetting (pg_db_role_setting.c:256)
==2251779==    by 0x874CA2: process_settings (postinit.c:1163)
==2251779==    by 0x874A0B: InitPostgres (postinit.c:1048)
==2251779==    by 0x60129A: BackgroundWorkerInitializeConnectionByOid (postmaster.c:5681)
==2251779==    by 0x2BB283: ParallelWorkerMain (parallel.c:1374)
==2251779==    by 0x5EBA6A: StartBackgroundWorker (bgworker.c:864)
==2251779==    by 0x6014FE: do_start_bgworker (postmaster.c:5802)
==2251779==    by 0x6018D2: maybe_start_bgworkers (postmaster.c:6027)
==2251779==    by 0x600811: sigusr1_handler (postmaster.c:5190)
==2251779==    by 0x4DD513F: ??? (in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so)
==2251779==    by 0x556E865: select (select.c:41)
==2251779==    by 0x5FC0CB: ServerLoop (postmaster.c:1700)
==2251779==    by 0x5FBA68: PostmasterMain (postmaster.c:1408)
==2251779==    by 0x4F8BFD: main (main.c:209)

The BackgroundWorkerInitializeConnectionByOid() in that trace is before
the RestoreGUCState() later in ParallelWorkerMain().  But that doesn't
really explain the approach.


> In other news, I found that there's a genuine leak in
> RelationBuildLocalRelation: RelationInitTableAccessMethod
> was added in a spot where CurrentMemoryContext is CacheMemoryContext,
> which is bad because it does a syscache lookup that can lead to
> a table access which can leak some memory.  Seems easy to fix though.

Yea, that's the one I was talking about re
ParallelBlockTableScanWorkerData not being freed. While I think we
should also add that pfree, I think you're right, and we shouldn't do
RelationInitTableAccessMethod() while in CacheMemoryContext.


> The plpgsql situation looks like a mess.  As a short-term answer,
> I'm inclined to recommend adding an exclusion that will ignore anything
> allocated within plpgsql_compile().  Doing better will require a fair
> amount of rewriting.  (Although I suppose we could also consider adding
> an on_proc_exit callback that runs through and frees all the function
> cache entries.)

The error variant of this one seems like it might actually be a
practically relevant leak, besides unnecessarily increasing memory usage
for compiled plpgsql functions. The latter would be good to fix, but the
former seems like it might be a practical issue for poolers and the
like?

So I think we should do at least the reparenting thing to address that?

Greetings,

Andres Freund