Thread: ecpg assertion on windows
Hi, I occasionally Running the ecpg regression tests interactively (to try to find a different issue), triggered a crash on windows due to an uninitialized variable (after pressing "ignore" in that stupid gui window that we've only disabled for the backend). "The variable 'replace_val' is being used without being initialized." Child-SP RetAddr Call Site 000000b3`3bcfe140 00007ff9`03f9cd74 libpgtypes!failwithmessage( void * retaddr = 0x00007ff9`03f96133, int crttype = 0n1, int errnum = 0n3, char * msg = 0x000000b3`3bcff050 "The variable 'replace_val' is being used without being initialized.")+0x234[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\rtc\error.cpp @ 213] 000000b3`3bcff030 00007ff9`03f96133 libpgtypes!_RTC_UninitUse( char * varname = 0x00007ff9`03fa8a90 "replace_val")+0xa4 [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\rtc\error.cpp@ 362] 000000b3`3bcff470 00007ff9`03f94acd libpgtypes!dttofmtasc_replace( int64 * ts = 0x000000b3`3bcff778, long dDate = 0n0, int dow = 0n6, struct tm * tm = 0x000000b3`3bcff598, char * output = 0x0000026e`9c223de0 "abc-00:00:00", int * pstr_len = 0x000000b3`3bcff620, char * fmtstr = 0x00007ff7`b01ae5c0 "abc-%X-def-%x-ghi%%")+0xe53 [C:\dev\postgres-meson\src\interfaces\ecpg\pgtypeslib\timestamp.c@ 759] *** WARNING: Unable to verify checksum for C:\dev\postgres-meson\build-msbuild\src\interfaces\ecpg\test\pgtypeslib\dt_test.exe 000000b3`3bcff550 00007ff7`b01a23c9 libpgtypes!PGTYPEStimestamp_fmt_asc( int64 * ts = 0x000000b3`3bcff778, char * output = 0x0000026e`9c223de0 "abc-00:00:00", int str_len = 0n19, char * fmtstr = 0x00007ff7`b01ae5c0 "abc-%X-def-%x-ghi%%")+0xed [C:\dev\postgres-meson\src\interfaces\ecpg\pgtypeslib\timestamp.c@ 794] 000000b3`3bcff610 00007ff7`b01a4499 dt_test!main(void)+0xe59 [C:\dev\postgres-meson\src\interfaces\ecpg\test\pgtypeslib\dt_test.pgc@ 200] 000000b3`3bcff860 00007ff7`b01a433e dt_test!invoke_main(void)+0x39 [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl@ 79] 000000b3`3bcff8b0 00007ff7`b01a41fe dt_test!__scrt_common_main_seh(void)+0x12e [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl@ 288] 000000b3`3bcff920 00007ff7`b01a452e dt_test!__scrt_common_main(void)+0xe [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl@ 331] 000000b3`3bcff950 00007ff9`1d987034 dt_test!mainCRTStartup( void * __formal = 0x000000b3`3bbe8000)+0xe [d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp@ 17] 000000b3`3bcff980 00007ff9`1f842651 KERNEL32!BaseThreadInitThunk+0x14 000000b3`3bcff9b0 00000000`00000000 ntdll!RtlUserThreadStart+0x21 I haven't analyzed this further. CI also shows ecpg itself occasionally crashing, but I haven't managed to catch it in the act. Greetings, Andres Freund
Hi, On 2022-08-23 20:36:55 -0700, Andres Freund wrote: > Running the ecpg regression tests interactively (to try to find a different > issue), triggered a crash on windows due to an uninitialized variable (after > pressing "ignore" in that stupid gui window that we've only disabled for the > backend). > > "The variable 'replace_val' is being used without being initialized." Looks to me like that's justified. The paths in dttofmtasc_replace using PGTYPES_TYPE_NOTHING don't set replace_val, but call pgtypes_fmt_replace() - with replace_val passed by value. If that's the first replacement, an unitialized variable is passed... Seems either the caller should skip calling pgtypes_fmt_replace() in the NOTHING case, or replace_val should be zero initialized? - Andres
Andres Freund <andres@anarazel.de> writes: > On 2022-08-23 20:36:55 -0700, Andres Freund wrote: >> Running the ecpg regression tests interactively (to try to find a different >> issue), triggered a crash on windows due to an uninitialized variable (after >> pressing "ignore" in that stupid gui window that we've only disabled for the >> backend). >> "The variable 'replace_val' is being used without being initialized." > Looks to me like that's justified. Hmm ... that message sounded like it is a run-time detection not from static analysis. But if the regression tests are triggering use of uninitialized values, how could we have failed to detect that? Either valgrind or unstable behavior should have found this ages ago. Seeing that replace_val is a union of differently-sized types, I was wondering if this message is a false positive based on struct assignment transferring a few uninitialized bytes, or something like that. regards, tom lane
Hi, On 2022-08-24 00:18:27 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-08-23 20:36:55 -0700, Andres Freund wrote: > >> Running the ecpg regression tests interactively (to try to find a different > >> issue), triggered a crash on windows due to an uninitialized variable (after > >> pressing "ignore" in that stupid gui window that we've only disabled for the > >> backend). > >> "The variable 'replace_val' is being used without being initialized." > > > Looks to me like that's justified. > > Hmm ... that message sounded like it is a run-time detection not from > static analysis. Yes, it's a runtime error. > But if the regression tests are triggering use of uninitialized values, how > could we have failed to detect that? Either valgrind or unstable behavior > should have found this ages ago. I think it's just different criteria for when to report issues. Valgrind reports uninitialized memory only when there's a conditional branch depending on it or such. Whereas this seems to trigger when passing an uninitialized value to a function by value, even if it's then not relied upon. I don't think we regularly test all client tests with valgrind, btw. Skink only runs the server under valgrind at least. > Seeing that replace_val is a union of differently-sized types, > I was wondering if this message is a false positive based on > struct assignment transferring a few uninitialized bytes, or > something like that. I think it's genuinely uninitialized - if you track what happens if the first parameter is e.g. %X: It'll not initialize replace_val, but then call pgtypes_fmt_replace(). So an uninit value is passed. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-08-24 00:18:27 -0400, Tom Lane wrote: >> But if the regression tests are triggering use of uninitialized values, how >> could we have failed to detect that? Either valgrind or unstable behavior >> should have found this ages ago. > I think it's just different criteria for when to report issues. Valgrind > reports uninitialized memory only when there's a conditional branch depending > on it or such. Whereas this seems to trigger when passing an uninitialized > value to a function by value, even if it's then not relied upon. If the value is not actually relied on, then it's a false positive. I don't say we shouldn't fix it, because we routinely jump through hoops to silence various sorts of functionally-harmless warnings. But let's be clear about whether there's a real bug here. regards, tom lane
Hi, On 2022-08-24 00:32:53 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-08-24 00:18:27 -0400, Tom Lane wrote: > >> But if the regression tests are triggering use of uninitialized values, how > >> could we have failed to detect that? Either valgrind or unstable behavior > >> should have found this ages ago. > > > I think it's just different criteria for when to report issues. Valgrind > > reports uninitialized memory only when there's a conditional branch depending > > on it or such. Whereas this seems to trigger when passing an uninitialized > > value to a function by value, even if it's then not relied upon. > > If the value is not actually relied on, then it's a false positive. My understanding is that formally speaking passing an undefined value by value to a function is "relying on it" and undefined behaviour. Hard to believe it'll cause any compiler go haywire and eat the computer, but ... > I don't say we shouldn't fix it, because we routinely jump through > hoops to silence various sorts of functionally-harmless warnings. > But let's be clear about whether there's a real bug here. Yea. Greetings, Andres Freund