openssl valgrind failures on skink are due to openssl issue - Mailing list pgsql-hackers

From Andres Freund
Subject openssl valgrind failures on skink are due to openssl issue
Date
Msg-id 20190611205131.e5trufqahzpiwut6@alap3.anarazel.de
Whole thread Raw
Responses Re: openssl valgrind failures on skink are due to openssl issue
List pgsql-hackers
Hi,

Skink a few days ago started failing [1][2] with errors like:

==2732== Conditional jump or move depends on uninitialised value(s)
==2732==    at 0x4C612E3: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x4C621FA: RAND_DRBG_generate (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x4C63620: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x4C61A09: RAND_DRBG_instantiate (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x4C62937: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x4C62CC9: RAND_DRBG_get0_public (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x4C62CEF: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==2732==    by 0x69C6AF: pg_strong_random (pg_strong_random.c:135)
==2732==    by 0x4A2841: InitProcessGlobals (postmaster.c:2581)
==2732==    by 0x661137: InitStandaloneProcess (miscinit.c:322)
==2732==    by 0x2943CF: AuxiliaryProcessMain (bootstrap.c:209)
==2732==    by 0x4005D1: main (main.c:220)
==2732==  Uninitialised value was created by a stack allocation
==2732==    at 0x4C633B0: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)

and then a lot of followup error.

Reproduced that locally to get a nicer trace:
==7146== Conditional jump or move depends on uninitialised value(s)
==7146==    at 0x4B122E3: inc_128 (drbg_ctr.c:32)
==7146==    by 0x4B122E3: drbg_ctr_generate (drbg_ctr.c:330)
==7146==    by 0x4B131FA: RAND_DRBG_generate (drbg_lib.c:638)
==7146==    by 0x4B14620: rand_drbg_get_entropy (rand_lib.c:172)
==7146==    by 0x4B12A09: RAND_DRBG_instantiate (drbg_lib.c:338)
==7146==    by 0x4B13937: drbg_setup (drbg_lib.c:892)
==7146==    by 0x4B13CC9: RAND_DRBG_get0_public (drbg_lib.c:1120)
==7146==    by 0x4B13CC9: RAND_DRBG_get0_public (drbg_lib.c:1109)
==7146==    by 0x4B13CEF: drbg_bytes (drbg_lib.c:963)
==7146==    by 0x87BD60: pg_strong_random (pg_strong_random.c:139)
==7146==    by 0x5CAFFC: InitProcessGlobals (postmaster.c:2581)
==7146==    by 0x81AAD7: InitStandaloneProcess (miscinit.c:322)
==7146==    by 0x681A61: PostgresMain (postgres.c:3732)
==7146==    by 0x4E3D86: main (main.c:224)
==7146==  Uninitialised value was created by a stack allocation
==7146==    at 0x4B143B0: rand_drbg_get_nonce (rand_lib.c:231)

reading through the code lead me to figure out that that's due to a
recent openssl change:
https://github.com/openssl/openssl/commit/b3d113ed2993801ee643126118ccf6592ad18ef7
as explained in
https://github.com/openssl/openssl/issues/8460
and fixed since in
https://github.com/openssl/openssl/commit/15d7e7997e219fc5fef3f6003cc6bd7b2e7379d4

For reasons I do not understand the "cosmetic change" was backpatched
into 1.1.1 And the fix for the cosmetic change, made on master at the
end of March, was only backpatched to 1.1.1 *after* the 1.1.1c release
was made in late May.  I mean, huh.

That release was then installed on skink recently:
2019-06-01 06:39:20 upgrade libssl1.1:amd64 1.1.1b-2 1.1.1c-1

And finally the reason for the issue only being visible on master: I am
stupid, and matched branch names like REL9_6 instead of REL9_6_STABLE
etc to enable openssl (9.4 doesn't work against current openssl).


I can't think of a better way to fix skink for now than just disabling
openssl for skink, until 1.1.1d is released.

Greetings,

Andres Freund

[1] https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=skink&br=HEAD
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2019-06-10%2001%3A36%3A12



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)
Next
From: Tom Lane
Date:
Subject: Re: openssl valgrind failures on skink are due to openssl issue