Thread: Re: Getting better results from valgrind leak tracking
Hi,

(really need to fix my mobile phone mail program to keep the CC list...)

On 2021-03-17 08:15:43 -0700, Andres Freund wrote:
> On Wed, Mar 17, 2021, at 07:16, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > On 2021-03-16 20:50:17 -0700, Andres Freund wrote:
> > Meanwhile, I'm still trying to understand why valgrind is whining
> > about the rd_indexcxt identifier strings. AFAICS it shouldn't.
>
> I found a way around that late last night. Need to mark the context
> itself as an allocation. But I made a mess on the way to that and need
> to clean the patch up before sending it (and need to drop my
> girlfriend off first).

Unfortunately I didn't immediately find a way to do this while keeping
the MEMPOOL_CREATE/DESTROY in mcxt.c. The attached patch moves the pool
creation into the memory context implementations, "allocates" the
context itself as part of that pool, and changes the reset
implementation from MEMPOOL_DESTROY + MEMPOOL_CREATE to instead do
MEMPOOL_TRIM. That leaves the memory context itself valid (and thus
tracking ->ident etc), but marks all the other memory as freed.

This is just a first version, it probably needs more work, and
definitely a few comments...

After this, your changes, and the previously mentioned fixes, I get far
fewer false positives. Also found a crash / memory leak in pgstat.c due
to the new replication slot stats, but I'll start a separate thread.

There are a few leak warnings around guc.c that look like they might be
real, not false positives, and thus a bit concerning. Looks like several
guc check hooks don't bother to free the old *extra before allocating a
new one.

I suspect we might get better results from valgrind, not just for leaks
but also undefined value tracking, if we changed the way we represent
pools to utilize VALGRIND_MEMPOOL_METAPOOL | VALGRIND_MEMPOOL_AUTO_FREE.
E.g. aset.c would associate AllocBlock using VALGRIND_MEMPOOL_ALLOC and
then mcxt.c would use VALGRIND_MALLOCLIKE_BLOCK for the individual chunk
allocation.

https://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools

I played with naming the allocations underlying aset.c using
VALGRIND_CREATE_BLOCK(block, strlen(context->name), context->name).
That does produce better undefined-value warnings, but it seems that
e.g. the leak detector doesn't have that information around. Nor does it
seem to be usable for use-after-free. At least the latter likely because
I had to VALGRIND_DISCARD by that point...

Greetings,

Andres Freund
Attachment
Andres Freund <andres@anarazel.de> writes:
>> I found a way around that late last night. Need to mark the context
>> itself as an allocation. But I made a mess on the way to that and need
>> to clean the patch up before sending it (and need to drop my
>> girlfriend off first).

> Unfortunately I didn't immediately find a way to do this while keeping
> the MEMPOOL_CREATE/DESTROY in mcxt.c. The attached patch moves the pool
> creation into the memory context implementations, "allocates" the
> context itself as part of that pool, and changes the reset
> implementation from MEMPOOL_DESTROY + MEMPOOL_CREATE to instead do
> MEMPOOL_TRIM. That leaves the memory context itself valid (and thus
> tracking ->ident etc), but marks all the other memory as freed.

Huh, interesting. I wonder why that makes the ident problem go away?
I'd supposed that valgrind would see the context headers as ordinary
memory belonging to the global "malloc" pool, so that any pointers
inside them ought to be considered valid.

Anyway, I don't have a problem with rearranging the responsibility
like this. It gives the individual allocators more freedom to do
odd stuff, at the cost of very minor duplication of valgrind calls.
I agree we need more comments -- would you like me to have a go at
writing them?

One thing I was stewing over last night is that a MemoryContextReset
will mess up any context identifier assigned with
MemoryContextCopyAndSetIdentifier. I'd left that as a problem to fix
later, because we don't currently have a need to reset contexts that
use copied identifiers. But that assumption obviously will bite us
someday, so maybe now is a good time to think about it.

The very simplest fix would be to allocate non-constant idents with
malloc; which'd require adding a flag to track whether context->ident
needs to be free()d. We have room for another bool near the top of
struct MemoryContextData (and at some point we could turn those
bool fields into a flags word). The only real cost here is one
more free() while destroying a labeled context, which is probably
negligible.

Other ideas are possible but they seem to require getting the
individual mcxt methods involved, and I doubt it's worth the
complexity.

> There are a few leak warnings around guc.c that look like they might be
> real, not false positives, and thus a bit concerning. Looks like several
> guc check hooks don't bother to free the old *extra before allocating a
> new one.

I'll take a look, but I'm pretty certain that guc.c, not the hooks, is
responsible for freeing those. Might be another case of valgrind not
understanding what's happening.

> I suspect we might get better results from valgrind, not just for leaks
> but also undefined value tracking, if we changed the way we represent
> pools to utilize VALGRIND_MEMPOOL_METAPOOL |
> VALGRIND_MEMPOOL_AUTO_FREE.

Don't really see why that'd help? I mean, it could conceivably help
catch bugs in the allocators themselves, but I don't follow the argument
that it'd improve anything else. Defined is defined, as far as I can
tell from the valgrind manual.

			regards, tom lane
Hi,

On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> >> I found a way around that late last night. Need to mark the context
> >> itself as an allocation. But I made a mess on the way to that and need
> >> to clean the patch up before sending it (and need to drop my
> >> girlfriend off first).
>
> > Unfortunately I didn't immediately find a way to do this while keeping
> > the MEMPOOL_CREATE/DESTROY in mcxt.c. The attached patch moves the pool
> > creation into the memory context implementations, "allocates" the
> > context itself as part of that pool, and changes the reset
> > implementation from MEMPOOL_DESTROY + MEMPOOL_CREATE to instead do
> > MEMPOOL_TRIM. That leaves the memory context itself valid (and thus
> > tracking ->ident etc), but marks all the other memory as freed.
>
> Huh, interesting. I wonder why that makes the ident problem go away?
> I'd supposed that valgrind would see the context headers as ordinary
> memory belonging to the global "malloc" pool, so that any pointers
> inside them ought to be considered valid.

I didn't quite understand it either at the time of writing the change.
It just seemed the only explanation for some of the behaviour I was
seeing, so I tried it. Just to initially be rebuffed due to errors when
accessing the recycled sets...

I spent a bit of time looking at valgrind, and it looks to be explicit
behaviour. From memcheck/mc_leakcheck.c:

    static MC_Chunk**
    find_active_chunks(Int* pn_chunks)
    {
       // Our goal is to construct a set of chunks that includes every
       // mempool chunk, and every malloc region that *doesn't* contain a
       // mempool chunk.
       ...
       // Then we loop over the mempool tables. For each chunk in each
       // pool, we set the entry in the Bool array corresponding to the
       // malloc chunk containing the mempool chunk.
       VG_(HT_ResetIter)(MC_(mempool_list));
       while ( (mp = VG_(HT_Next)(MC_(mempool_list))) ) {
          VG_(HT_ResetIter)(mp->chunks);
          while ( (mc = VG_(HT_Next)(mp->chunks)) ) {
             // We'll need to record this chunk.
             n_chunks++;

             // Possibly invalidate the malloc holding the beginning of
             // this chunk.
             m = find_chunk_for(mc->data, mallocs, n_mallocs);
             if (m != -1 && malloc_chunk_holds_a_pool_chunk[m] == False) {
                tl_assert(n_chunks > 0);
                n_chunks--;
                malloc_chunk_holds_a_pool_chunk[m] = True;
             }

I think that means it explicitly ignores the entire malloced allocation
whenever there's *any* mempool chunk in it, instead considering only the
mempool chunks. So once aset allocates something in the init block, the
context itself is ignored for leak checking purposes. But that wouldn't
be the case if we didn't have the init block...

I guess that makes sense, but would definitely be nice to have had
documented...

> Anyway, I don't have a problem with rearranging the responsibility
> like this. It gives the individual allocators more freedom to do
> odd stuff, at the cost of very minor duplication of valgrind calls.

Yea, similar.

> I agree we need more comments -- would you like me to have a go at
> writing them?

Gladly!

> One thing I was stewing over last night is that a MemoryContextReset
> will mess up any context identifier assigned with
> MemoryContextCopyAndSetIdentifier.

Valgrind should catch such problems. Not having the danger would be
better, of course. We could also add an assertion at reset time that
the identifier isn't in the current context.

> The very simplest fix would be to allocate non-constant idents with
> malloc; which'd require adding a flag to track whether context->ident
> needs to be free()d. We have room for another bool near the top of
> struct MemoryContextData (and at some point we could turn those
> bool fields into a flags word). The only real cost here is one
> more free() while destroying a labeled context, which is probably
> negligible.

Hm. A separate malloc()/free() could conceivably actually show up in
profiles at some point. What if we instead used that flag to indicate
that MemoryContextReset() needs to save the identifier? Then any cost
would only be paid if the context is actually reset.

> > There are a few leak warnings around guc.c that look like they might be
> > real, not false positives, and thus a bit concerning. Looks like several
> > guc check hooks don't bother to free the old *extra before allocating a
> > new one.
>
> I'll take a look, but I'm pretty certain that guc.c, not the hooks, is
> responsible for freeing those.

Yea, I had misattributed the leak to the callbacks. One of the things I
saw definitely is a leak: if call_string_check_hook() ereport(ERROR)s,
the guc_strdup() of newval->stringval is lost.

There's another set of them that seems to be related to parallelism.
But I've not hunted it down yet.

I think it might be worth adding a VALGRIND_DO_CHANGED_LEAK_CHECK() at
the end of a transaction or such? That way it'd not be quite as hard to
pinpoint the source of a leak to individual statements as it is right
now.

> Might be another case of valgrind not understanding what's happening.

Those allocations seem very straightforward, plain malloc/free that is
referenced by a pointer to the start of the allocation. So that doesn't
seem very likely?

> > I suspect we might get better results from valgrind, not just for leaks
> > but also undefined value tracking, if we changed the way we represent
> > pools to utilize VALGRIND_MEMPOOL_METAPOOL |
> > VALGRIND_MEMPOOL_AUTO_FREE.
>
> Don't really see why that'd help? I mean, it could conceivably help
> catch bugs in the allocators themselves, but I don't follow the argument
> that it'd improve anything else. Defined is defined, as far as I can
> tell from the valgrind manual.

I think it might improve the attribution of some use-after-free errors
to the memory context. Looks like it also might reduce the cost of
running valgrind a bit.

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
>> Huh, interesting. I wonder why that makes the ident problem go away?

> I spent a bit of time looking at valgrind, and it looks to be explicit
> behaviour:
>
> // Our goal is to construct a set of chunks that includes every
> // mempool chunk, and every malloc region that *doesn't* contain a
> // mempool chunk.

Ugh.

> I guess that makes sense, but would definitely be nice to have had
> documented...

Indeed. So this started happening to us when we merged the aset header
with its first allocation block.

>>> There are a few leak warnings around guc.c that look like they might be
>>> real, not false positives, and thus a bit concerning. Looks like several
>>> guc check hooks don't bother to free the old *extra before allocating a
>>> new one.

>> I'll take a look, but I'm pretty certain that guc.c, not the hooks, is
>> responsible for freeing those.

I believe I've just tracked down the cause of that. Those errors are
(as far as I've seen) only happening in parallel workers, and the
reason is this gem in RestoreGUCState:

    /* See comment at can_skip_gucvar(). */
    for (i = 0; i < num_guc_variables; i++)
        if (!can_skip_gucvar(guc_variables[i]))
            InitializeOneGUCOption(guc_variables[i]);

where InitializeOneGUCOption zeroes out that GUC variable, causing it
to lose track of any allocations it was responsible for. At minimum,
somebody didn't understand the difference between "initialize" and
"reset". But I suspect we could just nuke this loop altogether. I've
not yet tried to grok the comment that purports to justify it, but I
fail to see why it'd ever be useful to drop values inherited from the
postmaster.

			regards, tom lane
Hi,

On 2021-03-17 21:30:48 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2021-03-17 18:07:36 -0400, Tom Lane wrote:
> >> Huh, interesting. I wonder why that makes the ident problem go away?
>
> > I spent a bit of time looking at valgrind, and it looks to be explicit
> > behaviour:
> >
> > // Our goal is to construct a set of chunks that includes every
> > // mempool chunk, and every malloc region that *doesn't* contain a
> > // mempool chunk.
>
> Ugh.
>
> > I guess that makes sense, but would definitely be nice to have had
> > documented...
>
> Indeed. So this started happening to us when we merged the aset
> header with its first allocation block.

Yea. I have the feeling that valgrind's error detection capability in
our codebase used to be higher too. I wonder if that could be related
somehow. Or maybe it's just a figment of my imagination.

> I believe I've just tracked down the cause of that. Those errors
> are (as far as I've seen) only happening in parallel workers, and
> the reason is this gem in RestoreGUCState:
>
>     /* See comment at can_skip_gucvar(). */
>     for (i = 0; i < num_guc_variables; i++)
>         if (!can_skip_gucvar(guc_variables[i]))
>             InitializeOneGUCOption(guc_variables[i]);
>
> where InitializeOneGUCOption zeroes out that GUC variable, causing
> it to lose track of any allocations it was responsible for.

Ouch. At least it's a short lived issue rather than something
permanently leaking memory on every SIGHUP...

> At minimum, somebody didn't understand the difference between
> "initialize" and "reset". But I suspect we could just nuke this loop
> altogether. I've not yet tried to grok the comment that purports to
> justify it, but I fail to see why it'd ever be useful to drop values
> inherited from the postmaster.

I can't really make sense of it either. I think it may be trying to
restore the GUC state to what it would have been in postmaster,
disregarding all the settings that were set as part of PostgresInit()
etc?

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2021-03-17 21:30:48 -0400, Tom Lane wrote:
>> I believe I've just tracked down the cause of that. Those errors
>> are (as far as I've seen) only happening in parallel workers, and
>> the reason is this gem in RestoreGUCState: ...

> Ouch. At least it's a short lived issue rather than something
> permanently leaking memory on every SIGHUP...

Yeah. This leak is really insignificant in practice, although I'm
suspicious that randomly init'ing GUCs like this might have semantic
issues that we've not detected yet.

>> I've not yet tried to grok the comment that purports to justify it,
>> but I fail to see why it'd ever be useful to drop values inherited
>> from the postmaster.

> I can't really make sense of it either. I think it may be trying to
> restore the GUC state to what it would have been in postmaster,
> disregarding all the settings that were set as part of PostgresInit()
> etc?

At least in a non-EXEC_BACKEND build, the most reliable way to
reproduce the postmaster's settings is to do nothing whatsoever. And I
think the same is true for EXEC_BACKEND, really, because the guc.c
subsystem is responsible for restoring what would have been the
inherited-via-fork settings. So I'm really not sure what this is on
about, and I'm too tired to try to figure it out tonight.

In other news, I found that there's a genuine leak in
RelationBuildLocalRelation: RelationInitTableAccessMethod was added in
a spot where CurrentMemoryContext is CacheMemoryContext, which is bad
because it does a syscache lookup that can lead to a table access which
can leak some memory. Seems easy to fix though.

The plpgsql situation looks like a mess. As a short-term answer, I'm
inclined to recommend adding an exclusion that will ignore anything
allocated within plpgsql_compile(). Doing better will require a fair
amount of rewriting. (Although I suppose we could also consider adding
an on_proc_exit callback that runs through and frees all the function
cache entries.)

			regards, tom lane
On 2021-03-17 22:33:59 -0400, Tom Lane wrote:
> >> I've not yet tried to grok the comment that purports to justify it,
> >> but I fail to see why it'd ever be useful to drop values inherited
> >> from the postmaster.
>
> > I can't really make sense of it either. I think it may be trying to
> > restore the GUC state to what it would have been in postmaster,
> > disregarding all the settings that were set as part of PostgresInit()
> > etc?
>
> At least in a non-EXEC_BACKEND build, the most reliable way to reproduce
> the postmaster's settings is to do nothing whatsoever. And I think the
> same is true for EXEC_BACKEND, really, because the guc.c subsystem is
> responsible for restoring what would have been the inherited-via-fork
> settings. So I'm really not sure what this is on about, and I'm too
> tired to try to figure it out tonight.

The restore thing runs after we've already set and initialized GUCs,
including things like user/database default GUCs. I see things like

==2251779== 4,560 bytes in 1 blocks are definitely lost in loss record 383 of 406
==2251779==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==2251779==    by 0x714A45: ConvertTimeZoneAbbrevs (datetime.c:4556)
==2251779==    by 0x88DE95: load_tzoffsets (tzparser.c:465)
==2251779==    by 0x88511D: check_timezone_abbreviations (guc.c:11565)
==2251779==    by 0x884633: call_string_check_hook (guc.c:11232)
==2251779==    by 0x87CB57: parse_and_validate_value (guc.c:7012)
==2251779==    by 0x87DD5F: set_config_option (guc.c:7630)
==2251779==    by 0x88397F: ProcessGUCArray (guc.c:10784)
==2251779==    by 0x32BCCF: ApplySetting (pg_db_role_setting.c:256)
==2251779==    by 0x874CA2: process_settings (postinit.c:1163)
==2251779==    by 0x874A0B: InitPostgres (postinit.c:1048)
==2251779==    by 0x60129A: BackgroundWorkerInitializeConnectionByOid (postmaster.c:5681)
==2251779==    by 0x2BB283: ParallelWorkerMain (parallel.c:1374)
==2251779==    by 0x5EBA6A: StartBackgroundWorker (bgworker.c:864)
==2251779==    by 0x6014FE: do_start_bgworker (postmaster.c:5802)
==2251779==    by 0x6018D2: maybe_start_bgworkers (postmaster.c:6027)
==2251779==    by 0x600811: sigusr1_handler (postmaster.c:5190)
==2251779==    by 0x4DD513F: ??? (in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so)
==2251779==    by 0x556E865: select (select.c:41)
==2251779==    by 0x5FC0CB: ServerLoop (postmaster.c:1700)
==2251779==    by 0x5FBA68: PostmasterMain (postmaster.c:1408)
==2251779==    by 0x4F8BFD: main (main.c:209)

The BackgroundWorkerInitializeConnectionByOid() in that trace is before
the RestoreGUCState() later in ParallelWorkerMain(). But that doesn't
really explain the approach.

> In other news, I found that there's a genuine leak in
> RelationBuildLocalRelation: RelationInitTableAccessMethod
> was added in a spot where CurrentMemoryContext is CacheMemoryContext,
> which is bad because it does a syscache lookup that can lead to
> a table access which can leak some memory. Seems easy to fix though.

Yea, that's the one I was talking about re
ParallelBlockTableScanWorkerData not being freed. While I think we
should also add that pfree, I think you're right, and we shouldn't do
RelationInitTableAccessMethod() while in CacheMemoryContext.

> The plpgsql situation looks like a mess. As a short-term answer,
> I'm inclined to recommend adding an exclusion that will ignore anything
> allocated within plpgsql_compile(). Doing better will require a fair
> amount of rewriting. (Although I suppose we could also consider adding
> an on_proc_exit callback that runs through and frees all the function
> cache entries.)

The error variant of this one seems like it might actually be a
practically relevant leak? As well as increasing memory usage for
compiled plpgsql functions unnecessarily, of course. The latter would
be good to fix, but the former seems like it might be a practical issue
for poolers and the like? So I think we should do at least the
reparenting thing to address that?

Greetings,

Andres Freund