Thread: ecpg assertion on windows

ecpg assertion on windows

From
Andres Freund
Date:
Hi,

I occasionally

Running the ecpg regression tests interactively (to try to find a different
issue), triggered a crash on windows due to an uninitialized variable (after
pressing "ignore" in that stupid gui window that we've only disabled for the
backend).

"The variable 'replace_val' is being used without being initialized."

Child-SP          RetAddr               Call Site
000000b3`3bcfe140 00007ff9`03f9cd74     libpgtypes!failwithmessage(
                        void * retaddr = 0x00007ff9`03f96133,
                        int crttype = 0n1,
                        int errnum = 0n3,
                        char * msg = 0x000000b3`3bcff050 "The variable 'replace_val' is being used without being
initialized.")+0x234[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\rtc\error.cpp @ 213]
 
000000b3`3bcff030 00007ff9`03f96133     libpgtypes!_RTC_UninitUse(
                        char * varname = 0x00007ff9`03fa8a90 "replace_val")+0xa4
[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\rtc\error.cpp@ 362]
 
000000b3`3bcff470 00007ff9`03f94acd     libpgtypes!dttofmtasc_replace(
                        int64 * ts = 0x000000b3`3bcff778,
                        long dDate = 0n0,
                        int dow = 0n6,
                        struct tm * tm = 0x000000b3`3bcff598,
                        char * output = 0x0000026e`9c223de0 "abc-00:00:00",
                        int * pstr_len = 0x000000b3`3bcff620,
                        char * fmtstr = 0x00007ff7`b01ae5c0 "abc-%X-def-%x-ghi%%")+0xe53
[C:\dev\postgres-meson\src\interfaces\ecpg\pgtypeslib\timestamp.c@ 759]
 
*** WARNING: Unable to verify checksum for
C:\dev\postgres-meson\build-msbuild\src\interfaces\ecpg\test\pgtypeslib\dt_test.exe
000000b3`3bcff550 00007ff7`b01a23c9     libpgtypes!PGTYPEStimestamp_fmt_asc(
                        int64 * ts = 0x000000b3`3bcff778,
                        char * output = 0x0000026e`9c223de0 "abc-00:00:00",
                        int str_len = 0n19,
                        char * fmtstr = 0x00007ff7`b01ae5c0 "abc-%X-def-%x-ghi%%")+0xed
[C:\dev\postgres-meson\src\interfaces\ecpg\pgtypeslib\timestamp.c@ 794]
 
000000b3`3bcff610 00007ff7`b01a4499     dt_test!main(void)+0xe59
[C:\dev\postgres-meson\src\interfaces\ecpg\test\pgtypeslib\dt_test.pgc@ 200]
 
000000b3`3bcff860 00007ff7`b01a433e     dt_test!invoke_main(void)+0x39
[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl@ 79]
 
000000b3`3bcff8b0 00007ff7`b01a41fe     dt_test!__scrt_common_main_seh(void)+0x12e
[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl@ 288]
 
000000b3`3bcff920 00007ff7`b01a452e     dt_test!__scrt_common_main(void)+0xe
[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl@ 331]
 
000000b3`3bcff950 00007ff9`1d987034     dt_test!mainCRTStartup(
                        void * __formal = 0x000000b3`3bbe8000)+0xe
[d:\a01\_work\12\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp@ 17]
 
000000b3`3bcff980 00007ff9`1f842651     KERNEL32!BaseThreadInitThunk+0x14
000000b3`3bcff9b0 00000000`00000000     ntdll!RtlUserThreadStart+0x21

I haven't analyzed this further.


CI also shows ecpg itself occasionally crashing, but I haven't managed to
catch it in the act.

Greetings,

Andres Freund



Re: ecpg assertion on windows

From
Andres Freund
Date:
Hi,

On 2022-08-23 20:36:55 -0700, Andres Freund wrote:
> Running the ecpg regression tests interactively (to try to find a different
> issue), triggered a crash on windows due to an uninitialized variable (after
> pressing "ignore" in that stupid gui window that we've only disabled for the
> backend).
> 
> "The variable 'replace_val' is being used without being initialized."

Looks to me like that's justified. The paths in dttofmtasc_replace using
PGTYPES_TYPE_NOTHING don't set replace_val, but call pgtypes_fmt_replace() -
with replace_val passed by value. If that's the first replacement, an
unitialized variable is passed...

Seems either the caller should skip calling pgtypes_fmt_replace() in the
NOTHING case, or replace_val should be zero initialized?

- Andres



Re: ecpg assertion on windows

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-08-23 20:36:55 -0700, Andres Freund wrote:
>> Running the ecpg regression tests interactively (to try to find a different
>> issue), triggered a crash on windows due to an uninitialized variable (after
>> pressing "ignore" in that stupid gui window that we've only disabled for the
>> backend).
>> "The variable 'replace_val' is being used without being initialized."

> Looks to me like that's justified.

Hmm ... that message sounded like it is a run-time detection not from
static analysis.  But if the regression tests are triggering use of
uninitialized values, how could we have failed to detect that?
Either valgrind or unstable behavior should have found this ages ago.

Seeing that replace_val is a union of differently-sized types,
I was wondering if this message is a false positive based on
struct assignment transferring a few uninitialized bytes, or
something like that.

            regards, tom lane



Re: ecpg assertion on windows

From
Andres Freund
Date:
Hi,

On 2022-08-24 00:18:27 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-08-23 20:36:55 -0700, Andres Freund wrote:
> >> Running the ecpg regression tests interactively (to try to find a different
> >> issue), triggered a crash on windows due to an uninitialized variable (after
> >> pressing "ignore" in that stupid gui window that we've only disabled for the
> >> backend).
> >> "The variable 'replace_val' is being used without being initialized."
> 
> > Looks to me like that's justified.
> 
> Hmm ... that message sounded like it is a run-time detection not from
> static analysis.

Yes, it's a runtime error.


> But if the regression tests are triggering use of uninitialized values, how
> could we have failed to detect that?  Either valgrind or unstable behavior
> should have found this ages ago.

I think it's just different criteria for when to report issues. Valgrind
reports uninitialized memory only when there's a conditional branch depending
on it or such. Whereas this seems to trigger when passing an uninitialized
value to a function by value, even if it's then not relied upon.

I don't think we regularly test all client tests with valgrind, btw. Skink
only runs the server under valgrind at least.


> Seeing that replace_val is a union of differently-sized types,
> I was wondering if this message is a false positive based on
> struct assignment transferring a few uninitialized bytes, or
> something like that.

I think it's genuinely uninitialized - if you track what happens if the first
parameter is e.g. %X: It'll not initialize replace_val, but then call
pgtypes_fmt_replace(). So an uninit value is passed.

Greetings,

Andres Freund



Re: ecpg assertion on windows

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-08-24 00:18:27 -0400, Tom Lane wrote:
>> But if the regression tests are triggering use of uninitialized values, how
>> could we have failed to detect that?  Either valgrind or unstable behavior
>> should have found this ages ago.

> I think it's just different criteria for when to report issues. Valgrind
> reports uninitialized memory only when there's a conditional branch depending
> on it or such. Whereas this seems to trigger when passing an uninitialized
> value to a function by value, even if it's then not relied upon.

If the value is not actually relied on, then it's a false positive.

I don't say we shouldn't fix it, because we routinely jump through
hoops to silence various sorts of functionally-harmless warnings.
But let's be clear about whether there's a real bug here.

            regards, tom lane



Re: ecpg assertion on windows

From
Andres Freund
Date:
Hi,

On 2022-08-24 00:32:53 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-08-24 00:18:27 -0400, Tom Lane wrote:
> >> But if the regression tests are triggering use of uninitialized values, how
> >> could we have failed to detect that?  Either valgrind or unstable behavior
> >> should have found this ages ago.
> 
> > I think it's just different criteria for when to report issues. Valgrind
> > reports uninitialized memory only when there's a conditional branch depending
> > on it or such. Whereas this seems to trigger when passing an uninitialized
> > value to a function by value, even if it's then not relied upon.
> 
> If the value is not actually relied on, then it's a false positive.

My understanding is that formally speaking passing an undefined value by value
to a function is "relying on it" and undefined behaviour. Hard to believe
it'll cause any compiler go haywire and eat the computer, but ...


> I don't say we shouldn't fix it, because we routinely jump through
> hoops to silence various sorts of functionally-harmless warnings.
> But let's be clear about whether there's a real bug here.

Yea.

Greetings,

Andres Freund