Hi,
On 2025-05-08 22:04:06 -0400, Tom Lane wrote:
> A nearby thread [1] reminded me to wonder why we seem to have
> so many false-positive leaks reported by Valgrind these days.
> For example, at exit of a backend that's executed a couple of
> trivial queries, I see
>
> ==00:00:00:25.515 260013== LEAK SUMMARY:
> ==00:00:00:25.515 260013== definitely lost: 3,038 bytes in 90 blocks
> ==00:00:00:25.515 260013== indirectly lost: 4,431 bytes in 61 blocks
> ==00:00:00:25.515 260013== possibly lost: 390,242 bytes in 852 blocks
> ==00:00:00:25.515 260013== still reachable: 579,139 bytes in 1,457 blocks
> ==00:00:00:25.515 260013== suppressed: 0 bytes in 0 blocks
>
> so about a thousand "leaked" blocks, all but a couple of which
> are false positives --- including nearly all the "definitely"
> leaked ones.
>
> Some testing and reading of the Valgrind manual [2] turned up a
> number of answers, which mostly boil down to us using very
> Valgrind-unfriendly data structures. Per [2],
>
> There are two ways a block can be reached. The first is with a
> "start-pointer", i.e. a pointer to the start of the block. The
> second is with an "interior-pointer", i.e. a pointer to the middle
> of the block.
>
> [ A block is reported as "possibly lost" if ] a chain of one or
> more pointers to the block has been found, but at least one of the
> pointers is an interior-pointer.

Huh. We use the memory pool client requests to inform Valgrind about memory
contexts. I seem to recall that this "hid" many leak warnings from Valgrind. I
wonder if we somehow broke (or weakened) that.
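
For anyone not familiar with that mechanism, here's a rough standalone sketch
(a hypothetical bump allocator, not the real mcxt.c / aset.c code) of how the
Valgrind memory-pool client requests let an allocator report its own chunks;
the macros compile to no-ops when not running under Valgrind:

    /* hypothetical allocator used only to illustrate the client requests */
    #include <stdlib.h>
    #include <valgrind/memcheck.h>

    typedef struct DemoContext
    {
        char   *block;          /* one big block we carve chunks out of */
        size_t  used;
        size_t  size;
    } DemoContext;

    static DemoContext *
    demo_context_create(size_t size)
    {
        DemoContext *cxt = malloc(sizeof(DemoContext));

        cxt->block = malloc(size);
        cxt->used = 0;
        cxt->size = size;
        /* declare the context as a pool: no redzones, chunks not zeroed */
        VALGRIND_CREATE_MEMPOOL(cxt, 0, 0);
        return cxt;
    }

    static void *
    demo_alloc(DemoContext *cxt, size_t size)
    {
        void   *chunk = cxt->block + cxt->used;

        cxt->used += size;
        /* tell memcheck that this chunk now belongs to the pool */
        VALGRIND_MEMPOOL_ALLOC(cxt, chunk, size);
        return chunk;
    }

    static void
    demo_context_delete(DemoContext *cxt)
    {
        /* destroying the pool marks all of its chunks as freed at once */
        VALGRIND_DESTROY_MEMPOOL(cxt);
        free(cxt->block);
        free(cxt);
    }
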
We currently don't reset TopMemoryContext at exit, which obviously increases
the number of reported leaks massively. But OTOH, without that there's not a
whole lot of value in the leak check...
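
To make the quoted interior-pointer case concrete: a block gets demoted to
"possibly lost" whenever every live reference to it points into its middle,
e.g. when a list head points at a node embedded partway into a struct. A
minimal standalone sketch (hypothetical types, nothing from the tree):

    #include <stdlib.h>

    typedef struct ListNode
    {
        struct ListNode *next;
    } ListNode;

    typedef struct Widget
    {
        int        id;      /* 'node' is not at offset 0 ...          */
        ListNode   node;    /* ... so &w->node is an interior pointer */
    } Widget;

    static ListNode *head;

    static void
    push_widget(int id)
    {
        Widget *w = malloc(sizeof(Widget));

        w->id = id;
        w->node.next = head;
        head = &w->node;    /* the only pointer we keep is interior */
    }

    int
    main(void)
    {
        push_widget(1);
        push_widget(2);

        /* At exit, memcheck finds no start-pointer to either Widget block,
         * only interior pointers reachable via 'head', so both blocks are
         * reported as "possibly lost" even though they are still in use. */
        return 0;
    }

Presumably that shape of data structure is part of what makes us so
Valgrind-unfriendly here.
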
Greetings,
Andres Freund