On Mon, May 12, 2025 at 12:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> And, since there's nothing new under the sun around here,
> we already had a discussion about that back in 2021:
> https://www.postgresql.org/message-id/flat/3471359.1615937770%40sss.pgh.pa.us
> That thread seems to have led to fixing some specific bugs,
> but we never committed any of the discussed valgrind infrastructure
> improvements. I'll have a go at resurrecting that...
Okay, here is a patch series that updates the 0001-Make-memory-contexts-themselves-more-visible-to-valg.patch patch you posted in that thread, and makes various follow-up fixes that either fix or paper over various leaks. Some of it is committable I think, but other parts are just WIP. Anyway, as of the 0010 patch we can run through the core regression tests and see no more than a couple of kilobytes total reported leakage in any process, except for two tests that expose leaks in TS dictionary building. (That's fixable but I ran out of time, and I wanted to get this posted before Montreal.) There is work left to do before we can remove the suppressions added in 0002, but this is already huge progress compared to where we were.
A couple of these patches are bug fixes that need to be applied and even back-patched. In particular, I had not realized that autovacuum leaks a nontrivial amount of memory per relation processed (cf 0009), and apparently has done for a few releases now. This is horrid in databases with many tables, and I'm surprised that we've not gotten complaints about it.
regards, tom lane
Thanks for sharing the patch series. I've applied the patches on my end and rerun the tests. Valgrind now reports only 8 bytes definitely lost, and the previously noisy output is almost entirely gone. Here's the valgrind output:
==00:00:01:50.385 90463== LEAK SUMMARY:
==00:00:01:50.385 90463==    definitely lost: 8 bytes in 1 blocks
==00:00:01:50.385 90463==    indirectly lost: 0 bytes in 0 blocks
==00:00:01:50.385 90463==      possibly lost: 0 bytes in 0 blocks
==00:00:01:50.385 90463==    still reachable: 1,182,132 bytes in 2,989 blocks
==00:00:01:50.385 90463==         suppressed: 0 bytes in 0 blocks
==00:00:01:50.385 90463== Rerun with --leak-check=full to see details of leaked memory
==00:00:01:50.385 90463==
==00:00:01:50.385 90463== For lists of detected and suppressed errors, rerun with: -s
==00:00:01:50.385 90463== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 34 from 3)
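
In case it helps anyone compare runs, here is a rough sketch of how a summary like the one above could be reduced to numbers. It is not part of the patch series. The names parse_leak_summary and leaked_bytes are made up for illustration, and counting only "definitely lost" plus "indirectly lost" as leaked is my own assumption about what is worth tracking.

```python
import re

# Matches the per-category lines of a valgrind LEAK SUMMARY, e.g.
# "definitely lost: 8 bytes in 1 blocks". The "(suppressed: 34 from 3)"
# part of ERROR SUMMARY does not match, since it has no "bytes in".
LEAK_LINE = re.compile(
    r"(definitely lost|indirectly lost|possibly lost|still reachable|suppressed):"
    r"\s+([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(text):
    """Return {category: (bytes, blocks)} for a single LEAK SUMMARY."""
    summary = {}
    for category, nbytes, nblocks in LEAK_LINE.findall(text):
        summary[category] = (
            int(nbytes.replace(",", "")),
            int(nblocks.replace(",", "")),
        )
    return summary


def leaked_bytes(summary):
    """Assumption: treat definitely + indirectly lost as the real leakage."""
    return sum(
        summary.get(key, (0, 0))[0]
        for key in ("definitely lost", "indirectly lost")
    )


sample = """
==00:00:01:50.385 90463== LEAK SUMMARY:
==00:00:01:50.385 90463==    definitely lost: 8 bytes in 1 blocks
==00:00:01:50.385 90463==    indirectly lost: 0 bytes in 0 blocks
==00:00:01:50.385 90463==      possibly lost: 0 bytes in 0 blocks
==00:00:01:50.385 90463==    still reachable: 1,182,132 bytes in 2,989 blocks
==00:00:01:50.385 90463==         suppressed: 0 bytes in 0 blocks
"""

summary = parse_leak_summary(sample)
print(leaked_bytes(summary))  # 8
```

The parser expects one summary per call, so a log covering several processes would need to be split per PID first.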