Thread: BUG #16348: Memory leak when parsing config
The following bug has been logged on the website: Bug reference: 16348 Logged by: Hugh Wang Email address: hghwng@gmail.com PostgreSQL version: 12.2 Operating system: Arch Linux Description: SUMMARY: Memory leaked by gui-file.l:ProcessConfig accumulates across repeated SIGHUP cycles. ANALYSIS: When parsing config file, gui-file.l:ProcessConfig sets up memory context. It invokes guc-file.l:ProcessConfigFileInternal, which is expected to leak memory. It's okay because the leaked memory will be freed when cleaning up the context. Next, guc-file.l:ProcessConfigFileInternal associates each config item with filename by calling guc.c:set_config_option, which calls guc.c:guc_strdup. However, guc.c:guc_strdup is not aware of the memory context, and invokes libc.so:strdup directly.
PG Bug reporting form <noreply@postgresql.org> writes: > ANALYSIS: > When parsing config file, gui-file.l:ProcessConfig sets up memory context. > It invokes guc-file.l:ProcessConfigFileInternal, which is expected to leak > memory. It's okay because the leaked memory will be freed when cleaning up > the context. Next, guc-file.l:ProcessConfigFileInternal associates each > config item with filename by calling guc.c:set_config_option, which calls > guc.c:guc_strdup. However, guc.c:guc_strdup is not aware of the memory > context, and invokes libc.so:strdup directly. And? The places that use guc_strdup are expecting that the values will be tracked and then freed when discarded. Perhaps there is a code path where that's not happening, but you haven't either shown it or given any reason to believe one exists. As a quick cross-check, I set up a shell loop to send a constant stream of SIGHUPs to an idle backend. I don't see any change in the backend's memory consumption in "top". Maybe this test case is missing some necessary factor ... but, again, you haven't given a reason to believe there is one. regards, tom lane
On Tue, Apr 7, 2020 at 5:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I've analyzed all related code and can't see any code that's responsible > > for freeing the variable. > > If you are talking about set_config_sourcefile, that does its very > own cleanup: > > sourcefile = guc_strdup(elevel, sourcefile); > if (record->sourcefile) > free(record->sourcefile); > record->sourcefile = sourcefile; > record->sourceline = sourceline; > > That "free()" removes any previously-allocated sourcefile string. > > [...] > > I know what the high-level design is here: the temp memory context is > for data that we don't need anymore after parsing the config file. > But the sourceline string needs to be kept around in case somebody > wants to see it (eg for the pg_settings view). What you propose > *will* break that. Thanks for your detailed analysis -- it's pretty educational! Basing on your analysis, I'm able to fix the error in my previous analysis: we need to consider two scenarios now. ParseConfigFileInternal returns struct ConfigVariable *, and it has two use (or direct invokers). One is show_all_file_settings (for pg_settings view). The returned linked list and sourcefile referenced by it should not be freed, because it's used to construct query results. Another invoker is ProcessConfigFile. Here's the problem: ProcessConfigFile does NOT use the result, which means it won't free the filename referenced in the returned linked list. The returned filename is allocated by set_config_sourcefile exactly. > Generally, the results of automated leak analysis tools need to be taken > with a mountain of salt, particularly if what you are paying attention > to is what remains allocated at program exit. Such data can only fairly > be called a leak if there is no accessible pointer to it --- but the tools > are pretty bad at making that determination, at least with C programs. Your consideration is reasonable: remaining heap object can still be referenced when exit; in this case, it's not a leak. But have you tried LeakSanitizer, a "modern" memory leak detector? LeakSanitizer works like a garbage collector --- it does trace objects from the "root set", including the globals, the registers, each thread's stack, and TLS variables. As long as you stay away from pointer black magics, LeakSanitizer is unlikely to produce false-positives. Even if I believe in LeakSanitizer, it's absolutely stupid to ignore the recommendations from an experienced developer. To really make sure that there's memory leak, I verified the leak with gdb. When ProcessConfigFile is entered, I set a breakpoint to print guc_strdup's return value ($rax) in set_config_sourcefile. I also set a breakpoint on free to print the argument ($rdi). To sum up: for any returned pointer ofguc_strdup, if a corresponding free cannot be found, then we can confirm that there's a leak. Attached is the output of gdb, which confirms the leak.
Attachment
Hugh Wang <hghwng@gmail.com> writes: > Even if I believe in LeakSanitizer, it's absolutely stupid to ignore the > recommendations from an experienced developer. To really make sure that there's > memory leak, I verified the leak with gdb. When ProcessConfigFile is entered, I > set a breakpoint to print guc_strdup's return value ($rax) in > set_config_sourcefile. I also set a breakpoint on free to print the argument > ($rdi). To sum up: for any returned pointer ofguc_strdup, if a > corresponding free > cannot be found, then we can confirm that there's a leak. [ shrug... ] Nonetheless, your analysis is faulty, because it's experimentally demonstrable that there is no leak. I just left a backend run for quite some time, hitting it with a stream of SIGHUPs, and for good measure having it read the pg_file_settings view over and over so that the show_all_file_settings code path is being exercised as well. By now, if either of those code paths was leaking even one single byte per execution, I could see the difference in "top". But the process memory size is not moving. I speculate that you may be confusing the transient ConfigVariable data structures with the non-transient GUC storage (struct config_generic and its overlays). The former data structure, including attached strings such as filenames, is palloc'd in a short-lived memory context and we get rid of it by dropping the context. The actual GUC storage is malloc'd so we have to take care to free() any string we replace. As far as I can tell, we do. set_config_sourcefile certainly does so. regards, tom lane