Re: Getting better results from valgrind leak tracking - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Getting better results from valgrind leak tracking
Date
Msg-id 3665254.1616034839@sss.pgh.pa.us
Whole thread Raw
In response to Re: Getting better results from valgrind leak tracking  (Andres Freund <andres@anarazel.de>)
Responses Re: Getting better results from valgrind leak tracking
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2021-03-17 21:30:48 -0400, Tom Lane wrote:
>> I believe I've just tracked down the cause of that.  Those errors
>> are (as far as I've seen) only happening in parallel workers, and
>> the reason is this gem in RestoreGUCState: ...

> Ouch. At least it's a short lived issue rather than something permanently
> leaking memory on every SIGHUP...

Yeah.  This leak is really insignificant in practice, although I'm
suspicious that randomly init'ing GUCs like this might have semantic
issues that we've not detected yet.

>> I've not yet tried to grok the comment that purports to justify it,
>> but I fail to see why it'd ever be useful to drop values inherited
>> from the postmaster.

> I can't really make sense of it either. I think it may be trying to
> restore the GUC state to what it would have been in postmaster,
> disregarding all the settings that were set as part of PostgresInit()
> etc?

At least in a non-EXEC_BACKEND build, the most reliable way to reproduce
the postmaster's settings is to do nothing whatsoever.  And I think the
same is true for EXEC_BACKEND, really, because the guc.c subsystem is
responsible for restoring what would have been the inherited-via-fork
settings.  So I'm really not sure what this is on about, and I'm too
tired to try to figure it out tonight.

In other news, I found that there's a genuine leak in
RelationBuildLocalRelation: RelationInitTableAccessMethod
was added in a spot where CurrentMemoryContext is CacheMemoryContext,
which is bad because it does a syscache lookup that can lead to
a table access which can leak some memory.  Seems easy to fix though.

The plpgsql situation looks like a mess.  As a short-term answer,
I'm inclined to recommend adding an exclusion that will ignore anything
allocated within plpgsql_compile().  Doing better will require a fair
amount of rewriting.  (Although I suppose we could also consider adding
an on_proc_exit callback that runs through and frees all the function
cache entries.)

            regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Permission failures with WAL files in 13~ on Windows
Next
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: libpq debug log