Re: Getting better results from valgrind leak tracking - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Getting better results from valgrind leak tracking
Date
Msg-id 20210318030250.xlqxcppknzrakcs5@alap3.anarazel.de
Whole thread Raw
Responses Re: Getting better results from valgrind leak tracking
List pgsql-hackers
Hi,

On 2021-03-17 00:01:55 -0400, Tom Lane wrote:
> As for the particular point about ParallelBlockTableScanWorkerData,
> I agree with your question to David about why that's in TableScanDesc
> not HeapScanDesc, but I can't get excited about it not being freed in
> heap_endscan. That's mainly because I do not believe that anything as
> complex as a heap or indexscan should be counted on to be zero-leakage.
> The right answer is to not do such operations in long-lived contexts.
> So if we're running such a thing while switched into CacheContext,
> *that* is the bug, not that heap_endscan didn't free this particular
> allocation.

I agree that it's a bad idea to do scans in non-transient contexts. It
does however seems like there's a number of places that do...

I added the following hacky definition of "permanent" contexts

/*
 * NB: Only for assertion use.
 *
 * TopMemoryContext itself obviously is permanent. Treat CacheMemoryContext
 * and all its children as permanent too.
 *
 * XXX: Might be worth adding this as an explicit flag on the context?
 */
bool
MemoryContextIsPermanent(MemoryContext c)
{
    if (c == TopMemoryContext)
        return true;

    while (c)
    {
        if (c == CacheMemoryContext)
            return true;
        c = c->parent;
    }

    return false;
}

and checked that the CurrentMemoryContext is not permanent in
SearchCatCacheInternal() and systable_beginscan(). Hit a number of
times.

The most glaring case is the RelationInitTableAccessMethod() call in
RelationBuildLocalRelation(). Seems like the best fix is to just move
the MemoryContextSwitchTo() to just before the
RelationInitTableAccessMethod().  Although I wonder if we shouldn't go
further, and move it to much earlier, somewhere after the rd_rel
allocation.

There's plenty other hits, but I think I should get back to working on
making the shared memory stats patch committable. I really wouldn't want
it to slip yet another year.

But I think it might make sense to add a flag indicating contexts that
shouldn't be used for non-transient data. Seems like we fairly regularly
have "bugs" around this?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Permission failures with WAL files in 13~ on Windows
Next
From: Andres Freund
Date:
Subject: Re: Getting better results from valgrind leak tracking