Hi,
On 2025-03-07 00:03:47 +0100, Tomas Vondra wrote:
> while running check-world on 64-bit arm (rpi5 with Debian 12.9), I got a
> couple reports like this:
>
> ==64550== Use of uninitialised value of size 8
> ==64550== at 0xA62FE0: wrapper_handler (pqsignal.c:107)
> ==64550== by 0x580BB9E7: ??? (in
> /usr/libexec/valgrind/memcheck-arm64-linux)
> ==64550== Uninitialised value was created by a stack allocation
> ==64550== at 0x4F94660: strcoll_l (strcoll_l.c:258)
> ==64550==
> {
> <insert_a_suppression_name_here>
> Memcheck:Value8
> fun:wrapper_handler
> obj:/usr/libexec/valgrind/memcheck-arm64-linux
> }
> **64550** Valgrind detected 1 error(s) during execution of "ANALYZE
> mcv_lists;"
> The exact command varies, I don't think it's necessarily about analyze
> or extended stats.
Do you have a few other examples from where it was triggered?
Is the source of the uninitialized value always strcoll_l?
Can you reliably reproduce it in certain scenarios or is it probabilistic in
some form?
Do you know what signal was delivered (I think that could be detected using
valgrind's --vgdb)?
> The line the report refers to is this:
>
> (*pqsignal_handlers[postgres_signal_arg]) (postgres_signal_arg);
>
> so I guess it can't be about postgres_signal_arg (as that's an int). But
> that leaves just pqsignal_handlers, and why would that be uninitialized?
Is it possible that the signal number we're getting called for is above
PG_NSIG? That'd explain why the source value is something fairly random?
ISTM that we should add an Assert() to wrapper_handler() that ensures that the
signal arg is below PG_NSIG.
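Roughly what I have in mind, as a standalone sketch rather than a patch
against pqsignal.c. SKETCH_NSIG and all the sketch_* names are made up here
as stand-ins for PG_NSIG, pqsignal_handlers and wrapper_handler, and plain
assert() stands in for Assert():

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for PG_NSIG; the real value comes from the tree. */
#define SKETCH_NSIG 64

typedef void (*sketch_handler) (int signo);

/* Stand-in for the per-signal handler table indexed by signal number. */
static volatile sketch_handler sketch_handlers[SKETCH_NSIG];

/* Records the last signal number seen, so the dispatch can be observed. */
static volatile sig_atomic_t sketch_last_signo = 0;

static void
sketch_record(int signo)
{
	sketch_last_signo = signo;
}

/* True if signo is a valid index into sketch_handlers. */
static int
sketch_signal_in_range(int signo)
{
	return signo > 0 && signo < SKETCH_NSIG;
}

static void
sketch_wrapper_handler(int signo)
{
	/*
	 * Catch an out-of-range signal number before it is used as an array
	 * index; without this, the lookup below would read whatever happens to
	 * sit past the end of the table.
	 */
	assert(sketch_signal_in_range(signo));
	assert(sketch_handlers[signo] != NULL);

	(*sketch_handlers[signo]) (signo);
}
```

With something like that in place, an out-of-range signal number would fail
the assertion in an assert-enabled build instead of silently indexing past
the array, which should at least tell us whether that theory holds.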
Might also be worth trying to run without valgrind but with address and
undefined behaviour sanitizers enabled. I don't currently have access to an
armv8 machine that's not busy doing other stuff...
Greetings,
Andres Freund