Thread: pgcrypto related backend crash on solaris 10/x86_64
I brought back clownfish (still a bit dubious about the unexplained failures, which seem to be VMware emulation bugs, but this one seems to be easily reproducible) onto the buildfarm and enabled --with-openssl after the recent openssl/pgcrypto related fixes, but I'm still getting a backend crash during the pgcrypto regression tests:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50

backtrace looks like:

program terminated by signal SEGV (no mapping at the fault address)
0xfffffd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
(dbx) where
=>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49, 0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfffffd7fff241b61
  [2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0

Stefan
On 9/9/07, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> I brought back clownfish(still a bit dubious about the unexplained
> failures which seem vmware emulation bugs but this one seems to be
> easily reproduceable) onto the buildfarm and enabled --with-openssl
> after the the recent openssl/pgcrypto related fixes but I'm still
> getting a backend crash during the pgcrypto regression tests:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50
>
> backtrace looks like:
>
> program terminated by signal SEGV (no mapping at the fault address)
> 0xfffffd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
> (dbx) where
> =>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49,
> 0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfffffd7fff241b61
> [2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0

This is crashing because of the crippled OpenSSL on some versions of Solaris. Zdenek Kotala posted a workaround for that; I am cleaning it up but have not found the time to finalize it.

I'll try to post v03 of Zdenek's patch ASAP.

--
marko
"Marko Kreen" <markokr@gmail.com> writes:
> On 9/9/07, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>> I brought back clownfish(still a bit dubious about the unexplained
>> failures which seem vmware emulation bugs but this one seems to be
>> easily reproduceable) onto the buildfarm and enabled --with-openssl
>> after the the recent openssl/pgcrypto related fixes but I'm still
>> getting a backend crash during the pgcrypto regression tests:

> This is crashing because of the crippled OpenSSL on some version
> of Solaris. Zdenek Kotala posted a workaround for that, I am
> cleaning it but have not found the time to finalize it.

But clownfish was working fine up through Aug 2, and the only change in pgcrypto since then could hardly have introduced this failure:
http://archives.postgresql.org/pgsql-committers/2007-08/msg00306.php

So I think there's more to it than Marko's explanation. Maybe clownfish now has a different OpenSSL version installed than before?

regards, tom lane
Tom Lane wrote:
> "Marko Kreen" <markokr@gmail.com> writes:
>> On 9/9/07, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>>> I brought back clownfish(still a bit dubious about the unexplained
>>> failures which seem vmware emulation bugs but this one seems to be
>>> easily reproduceable) onto the buildfarm and enabled --with-openssl
>>> after the the recent openssl/pgcrypto related fixes but I'm still
>>> getting a backend crash during the pgcrypto regression tests:
>
>> This is crashing because of the crippled OpenSSL on some version
>> of Solaris. Zdenek Kotala posted a workaround for that, I am
>> cleaning it but have not found the time to finalize it.
>
> But clownfish was working fine up through Aug 2, and the only change in
> pgcrypto since then could hardly have introduced this failure:
> http://archives.postgresql.org/pgsql-committers/2007-08/msg00306.php
>
> So I think there's more to it than Marko's explanation. Maybe clownfish
> now has a different OpenSSL version installed than before?

No, clownfish was not building with openssl before because of that "crippled openssl" issue. I was under the assumption that the above commit was actually incorporating the complete fix from Zdenek, so I added it back again, only to find that it is still not working ...

Stefan
Marko Kreen wrote:
> On 9/9/07, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>> I brought back clownfish(still a bit dubious about the unexplained
>> failures which seem vmware emulation bugs but this one seems to be
>> easily reproduceable) onto the buildfarm and enabled --with-openssl
>> after the the recent openssl/pgcrypto related fixes but I'm still
>> getting a backend crash during the pgcrypto regression tests:
>>
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50
>>
>> backtrace looks like:
>>
>> program terminated by signal SEGV (no mapping at the fault address)
>> 0xfffffd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
>> (dbx) where
>> =>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49,
>> 0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfffffd7fff241b61
>> [2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0
>
> This is crashing because of the crippled OpenSSL on some version
> of Solaris. Zdenek Kotala posted a workaround for that, I am
> cleaning it but have not found the time to finalize it.
>
> I'll try to post v03 of Zdenek's patch ASAP.
>

However, I guess there will still be a problem with the regression tests: pgcrypto will report an error when the user tries to use a stronger cipher, and that produces a diff between the expected and the actual output.

I don't know whether it is possible to select a different expected output depending on whether strong crypto is installed. Maybe some magic in Makefile/configure. The test should be:

# ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
# libcrypto_extra.so.0.9.8 => (file not found)

If the output contains (file not found), the library is not installed or not in the path (/usr/sfw/lib).

Zdenek
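The ldd check described above could be wrapped in a small shell function so that a Makefile or buildfarm script can act on the result. This is only a sketch and not part of any posted patch: the function name classify_libcrypto_extra and the three status strings are invented for illustration, and it assumes a POSIX shell (the stock /bin/sh on Solaris 10 is an older Bourne shell and may not accept `$(...)`).

```shell
#!/bin/sh
# Sketch only: classify the libcrypto_extra dependency from `ldd` output
# read on stdin. The function name and status strings are hypothetical.
# Prints one of: "missing", "present", "not-linked".
classify_libcrypto_extra() {
    # Keep only the line(s) mentioning libcrypto_extra; empty if none.
    line=$(grep 'libcrypto_extra' || true)
    case "$line" in
        '')                  echo "not-linked" ;;  # pgcrypto does not reference it
        *'file not found'*)  echo "missing" ;;     # referenced but not resolvable
        *)                   echo "present" ;;     # referenced and resolved
    esac
}

# Possible usage, with the path taken from the message above:
#   ldd /usr/postgres/8.2/lib/pgcrypto.so | classify_libcrypto_extra
```

Reading the ldd output from stdin keeps the function independent of where pgcrypto.so is installed, so the caller decides which library to inspect.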
On 9/11/07, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
> Marko Kreen wrote:
> > This is crashing because of the crippled OpenSSL on some version
> > of Solaris. Zdenek Kotala posted a workaround for that, I am
> > cleaning it but have not found the time to finalize it.
> >
> > I'll try to post v03 of Zdenek's patch ASAP.
>
> However, I guess there still will be a problem with regression tests,
> because pg_crypto will reports error in case when user tries to use
> stronger cipher, but it generates diff between expected and real output.
>
> I don't know if is possible select different output based on test if
> strong crypto is installed or not. Maybe some magic in
> Makefile/Configure. Test should be:
>
> # ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
> # libcrypto_extra.so.0.9.8 => (file not found)
>
> if output contains (file not found) library is not installed or not in
> path (/usr/sfw/lib).

Failing regression tests are fine - it is good if the user can easily see that the OS is broken.

--
marko
Marko Kreen wrote:
> On 9/11/07, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
>> Marko Kreen wrote:
>>> This is crashing because of the crippled OpenSSL on some version
>>> of Solaris. Zdenek Kotala posted a workaround for that, I am
>>> cleaning it but have not found the time to finalize it.
>>>
>>> I'll try to post v03 of Zdenek's patch ASAP.
>
>> However, I guess there still will be a problem with regression tests,
>> because pg_crypto will reports error in case when user tries to use
>> stronger cipher, but it generates diff between expected and real output.
>>
>> I don't know if is possible select different output based on test if
>> strong crypto is installed or not. Maybe some magic in
>> Makefile/Configure. Test should be:
>>
>> # ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
>> # libcrypto_extra.so.0.9.8 => (file not found)
>>
>> if output contains (file not found) library is not installed or not in
>> path (/usr/sfw/lib).
>
> Failing regression tests are fine - it is good if user can
> easily see that the os is broken.
>

But if a build machine keeps complaining about this problem, we can easily overlook other problems. There are two possible solutions: 1) modify the regression test, or 2) recommend installing the crypto package on all affected build machines.

Anyway, I plan to add a mention of this to the Solaris FAQ once we have the final patch. I also think it would be good to mention it in the pgcrypto README, or to add a comment to the regression test's expected output file, which will be visible in regression.diff.

Zdenek
Zdenek Kotala wrote:
> Marko Kreen wrote:
>> On 9/11/07, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
>>> Marko Kreen wrote:
>>>> This is crashing because of the crippled OpenSSL on some version
>>>> of Solaris. Zdenek Kotala posted a workaround for that, I am
>>>> cleaning it but have not found the time to finalize it.
>>>>
>>>> I'll try to post v03 of Zdenek's patch ASAP.
>>
>>> However, I guess there still will be a problem with regression tests,
>>> because pg_crypto will reports error in case when user tries to use
>>> stronger cipher, but it generates diff between expected and real output.
>>>
>>> I don't know if is possible select different output based on test if
>>> strong crypto is installed or not. Maybe some magic in
>>> Makefile/Configure. Test should be:
>>>
>>> # ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
>>> # libcrypto_extra.so.0.9.8 => (file not found)
>>>
>>> if output contains (file not found) library is not installed or not in
>>> path (/usr/sfw/lib).
>>
>> Failing regression tests are fine - it is good if user can
>> easily see that the os is broken.
>>
>
> But if build machine still complain about problem we can easily
> overlook another problems. There are two possible solution 1) modify reg
> test or 2) recommend to install crypto package on all affected build
> machine.
>
> Anyway I plan to add some mention into solaris FAQ when we will have
> final patch. I also think It should be good to mention in pg_crypto
> README or add comment into regression test expected output file which
> will be visible in regression.diff.

Well, in my opinion we should simply fail the regression test (not crash like we do now) when we have to deal with such a crippled openssl installation. Adding information about that issue to the Solaris FAQ also seems like a good thing.

Stefan