Thread: libpq does not manage SSL callbacks properly when other libraries are involved.

Hi,

After experiencing a seg fault on RHEL5's command line php, I did the
following investigation.

As I do not have a RHEL5 box available with debugging tools, I
reproduced the bug on CentOS5.

Reproduction on CentOS5
-----------------------
[root@unknown-00-16-3e-30-f0-0d ~]# php -f test-pgsql.php
PHP Fatal error:  Uncaught exception 'PDOException' with message
'SQLSTATE[08006] [7] FATAL:  no pg_hba.conf entry for host
"59.167.146.4", user "nouser", database "fsd", SSL off' in
/root/test-pgsql.php:3
Stack trace:
#0 /root/test-pgsql.php(3): PDO->__construct('pgsql:host=thoe...',
'nouser', 'nopass')
#1 {main}
  thrown in /root/test-pgsql.php on line 3
Segmentation fault

[root@unknown-00-16-3e-30-f0-0d ~]# cat test-pgsql.php
<?php

$x = new
PDO("pgsql:host=connectableserver;port=5437;dbname=fsd","nouser","nopass");

?>
[root@unknown-00-16-3e-30-f0-0d ~]# php -f test-pgsql.php
PHP Fatal error:  Uncaught exception 'PDOException' with message
'SQLSTATE[08006] [7] FATAL:  no pg_hba.conf entry for host
"59.167.146.4", user "nouser", database "fsd", SSL off' in
/root/test-pgsql.php:3
Stack trace:
#0 /root/test-pgsql.php(3): PDO->__construct('pgsql:host=thoe...',
'nouser', 'nopass')
#1 {main}
  thrown in /root/test-pgsql.php on line 3
Segmentation fault
[root@unknown-00-16-3e-30-f0-0d ~]# gdb php
GNU gdb Red Hat Linux (6.5-25.el5_1.1rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging
symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) run -f test-pgsql.php
Starting program: /usr/bin/php -f test-pgsql.php
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread -1208912160 (LWP 27037)]
PHP Fatal error:  Uncaught exception 'PDOException' with message
'SQLSTATE[08006] [7] FATAL:  no pg_hba.conf entry for host
"59.167.146.4", user "nouser", database "fsd", SSL off' in
/root/test-pgsql.php:3
Stack trace:
#0 /root/test-pgsql.php(3): PDO->__construct('pgsql:host=thoe...',
'nouser', 'nopass')
#1 {main}
  thrown in /root/test-pgsql.php on line 3

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208912160 (LWP 27037)]
0x003248b0 in ?? ()
(gdb) bt
#0  0x003248b0 in ?? ()
#1  0x003996c5 in CRYPTO_lock () from /lib/libcrypto.so.6
#2  0x003efdbd in ERR_get_implementation () from /lib/libcrypto.so.6
#3  0x003efb90 in ERR_free_strings () from /lib/libcrypto.so.6
#4  0x00588107 in Curl_ossl_cleanup () from /usr/lib/libcurl.so.3
#5  0x00599370 in Curl_ssl_cleanup () from /usr/lib/libcurl.so.3
#6  0x005910c3 in curl_global_cleanup () from /usr/lib/libcurl.so.3
#7  0x0808d85b in zm_shutdown_curl ()
#8  0x081cecbe in module_destructor ()
#9  0x081d42a7 in zend_hash_quick_find ()
#10 0x081d4538 in zend_hash_graceful_reverse_destroy ()
#11 0x081cb89e in zend_shutdown ()
#12 0x0818e0ff in php_module_shutdown ()
#13 0x0824168b in main ()


After searching around a bit, I found these relevant bug reports:

http://bugs.php.net/bug.php?id=40926
https://bugs.launchpad.net/ubuntu/+source/php5/+bug/63141
http://bugs.debian.org/411982

The PHP bug report turned out to be the most useful.  This is part of a
comment from its comment history:

*[12 Nov 2007 2:45pm UTC] sam at zoy dot org*

Hello, I did read the sources and studied them, and I can confirm
that it is a matter of callback jumping to an invalid address.

libpq's init_ssl_system() installs callbacks by calling
CRYPTO_set_id_callback() and CRYPTO_set_locking_callback(). This
function is called each time initialize_SSL() is called (for instance
through the PHP pg_connect() function) and does not keep a reference
counter, so libpq's destroy_SSL() has no way to know that it should
call a destroy_ssl_system() function, and there is no such function
anyway. So the callbacks are never removed.

But then, upon cleanup, PHP calls zend_shutdown() which properly
unloads pgsql.so and therefore the unused libpq.

Finally, the zend_shutdown procedure calls zm_shutdown_curl()
which in turn calls curl_global_cleanup() which leads to an
ERR_free_strings() call and eventually a CRYPTO_lock() call.
CRYPTO_lock() checks whether there are any callbacks to call,
finds one (the one installed by libpq), calls it, and crashes
because libpq was unloaded and hence the callback is no longer
in mapped memory.
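
To make the sequence in that comment concrete, here is a minimal C sketch
of the failure mode.  It does not use OpenSSL or libpq at all: the global
slot and the registry_* and client_* functions are hypothetical stand-ins
for OpenSSL's process-wide locking-callback slot and for libpq's SSL
setup and teardown.  It only shows that a callback installed on init and
never removed is still registered after teardown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenSSL's process-wide locking-callback slot. */
typedef void (*lock_cb)(int mode, int n);
static lock_cb global_lock_cb = NULL;

static void registry_set_lock_cb(lock_cb cb) { global_lock_cb = cb; }
static lock_cb registry_get_lock_cb(void) { return global_lock_cb; }

/* Stand-in for a callback whose code lives inside a shared library. */
static void client_lock_cb(int mode, int n) { (void)mode; (void)n; }

/* Mirrors the behaviour described above: install on every SSL init... */
static void client_init_ssl(void) { registry_set_lock_cb(client_lock_cb); }

/* ...and do nothing about the callback on teardown. */
static void client_destroy_ssl(void) { /* callback is left in place */ }

/* Returns 1 if the registry still points into client code after teardown. */
static int callback_left_behind(void)
{
    client_init_ssl();
    assert(registry_get_lock_cb() != NULL);
    client_destroy_ssl();
    return registry_get_lock_cb() == client_lock_cb;
}
```

In the real crash the library holding the callback has been unloaded by
this point, so the next caller that goes through the slot jumps to an
unmapped address.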



Since the problem is SSL related, I adjusted my test script by adding
sslmode=disable to the connection string:


[root@unknown-00-16-3e-30-f0-0d ~]# cat test-pgsql.php
<?php

$x = new PDO("pgsql:host=connectablehost;port=5437;dbname=fsd;sslmode=disable","nouser","nopass");

?>
[root@unknown-00-16-3e-30-f0-0d ~]# php -f test-pgsql.php
PHP Fatal error:  Uncaught exception 'PDOException' with message 'SQLSTATE[08006] [7] FATAL:  no pg_hba.conf entry for
host "xx.xx.xx.xx", user "nouser", database "fsd", SSL off' in /root/test-pgsql.php:3
Stack trace:
#0 /root/test-pgsql.php(3): PDO->__construct('pgsql:host=thoe...', 'nouser', 'nopass')
#1 {main}
  thrown in /root/test-pgsql.php on line 3


With SSL disabled, the crash goes away.  Is the analysis in the PHP bug
comment accurate, and is it reasonable to count SSL initializations in
the way suggested?  Currently the callback remains registered even after
libpq is unloaded from memory, which is what causes this problem.
Shouldn't the callback be unregistered when we shut down our own SSL
state?

Is it possible to get this fixed and possibly backported?

Thanks

Russell Smith
Russell Smith wrote:
> Hi,
>
> After experiencing a seg fault on RHEL5's command line php, I did the
> following investigation.
>
> As I do not have a RHEL5 box available with debugging tools, I
> reproduced the bug on CentOS5.
Hi,

I've not received any feedback on this bug in a week.  Is anybody looking
at it?  Is there anything I'm doing wrong with my report of this bug?

Thanks

Russell.
***PUSH***

This bug is a real annoyance if you use automated build environments.
I'm using PHPUnit to run tests, and as soon as Postgres is involved the PHP
CLI environment segfaults at the end.  This can be worked around by disabling
SSL, but it would be great if the underlying bug got fixed.
PoolSnoopy wrote:
>
> ***PUSH***
>
> this bug is really some annoyance if you use automatic build environments.
> I'm using phpunit to run tests and as soon as postgres is involved the php
> cli environment segfaults at the end. this can be worked around by disabling
> ssl but it would be great if the underlying bug got fixed.

This is PHP's bug, isn't it?  Why are you complaining here?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: libpq does not manage SSL callbacks properly when other libraries are involved.

From: Russell Smith

Alvaro Herrera wrote:
> PoolSnoopy wrote:
>
>> ***PUSH***
>>
>> this bug is really some annoyance if you use automatic build environments.
>> I'm using phpunit to run tests and as soon as postgres is involved the php
>> cli environment segfaults at the end. this can be worked around by disabling
>> ssl but it would be great if the underlying bug got fixed.
>>
>
> This is PHP's bug, isn't it?  Why are you complaining here

No, this is a problem with the callback/exit functions used by
PostgreSQL.  We set up callback functions when we use SSL; if another
library also uses SSL, we can create a problem.

I thought my original report was detailed enough to explain where the
problem is coming from.  Excerpt from the original report:

This is part of a comment from the PHP bug comment history:

*[12 Nov 2007 2:45pm UTC] sam at zoy dot org*

Hello, I did read the sources and studied them, and I can confirm
that it is a matter of callback jumping to an invalid address.

libpq's init_ssl_system() installs callbacks by calling
CRYPTO_set_id_callback() and CRYPTO_set_locking_callback(). This
function is called each time initialize_SSL() is called (for instance
through the PHP pg_connect() function) and does not keep a reference
counter, so libpq's destroy_SSL() has no way to know that it should
call a destroy_ssl_system() function, and there is no such function
anyway. So the callbacks are never removed.

But then, upon cleanup, PHP calls zend_shutdown() which properly
unloads pgsql.so and therefore the unused libpq.

Finally, the zend_shutdown procedure calls zm_shutdown_curl()
which in turn calls curl_global_cleanup() which leads to an
ERR_free_strings() call and eventually a CRYPTO_lock() call.
CRYPTO_lock() checks whether there are any callbacks to call,
finds one (the one installed by libpq), calls it, and crashes
because libpq was unloaded and hence the callback is no longer
in mapped memory.

--

Basically, PostgreSQL doesn't remove the callbacks that point into libpq
when the pg connection is shut down.  So if the libpq library is unloaded
before other libraries that use SSL, you get a crash as described above.
PHP has suggested the fix is to keep a reference counter in libpq so it
knows when to remove the callbacks.

This is a complicated bug, but without real evidence there is no way to
go back to PHP and say it's their fault.  Their analysis is
relatively comprehensive compared to the feedback that's been posted
here so far.  I'm not sure how best to set up an environment to replicate
the bug in a way that lets me debug it.  And even if I get to the point of
nailing it down, I'll just be back asking questions about how you would
fix it, because I know very little about SSL.

All that said, a quick poke in the PostgreSQL source shows that
fe-secure.c sets callbacks using CRYPTO_set_xx_callback(...).  It appears
these are only set in the threaded version, which is pretty much the
default in all the installations I encounter.

My Google research indicated we need to call
CRYPTO_set_xx_callback(NULL) when we exit, but that's not done.  One
idea for a fix is to add a counter that is incremented in
initialize_SSL() and decremented when destroy_SSL() is called.  If it
reaches 0, call CRYPTO_set_xx_callback(NULL) to remove the callbacks.
This thread about a Windows SSL crash in iexplore testifies to the same
problem:
http://www.mail-archive.com/openssl-users@openssl.org/msg53869.html
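
A rough C sketch of the counter idea, to show the intended behaviour
rather than an actual patch.  Nothing here is libpq or OpenSSL code: the
global slot, registry_set_lock_cb() and the client_* functions are
hypothetical stand-ins, and the "only clear the slot if it still holds
our callback" check is my own assumption about what would be polite to
other SSL users in the same process.  Real code would also need a mutex
around the counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenSSL's process-wide locking-callback slot. */
typedef void (*lock_cb)(int mode, int n);
static lock_cb global_lock_cb = NULL;

static void registry_set_lock_cb(lock_cb cb) { global_lock_cb = cb; }

/* Stand-in for the callback that lives inside the client library. */
static void client_lock_cb(int mode, int n) { (void)mode; (void)n; }

/* Number of live SSL users; unprotected here, needs a mutex in real code. */
static int ssl_open_connections = 0;

/* Install the callback only when the first SSL user appears. */
static void client_init_ssl(void)
{
    if (ssl_open_connections++ == 0)
        registry_set_lock_cb(client_lock_cb);
}

/* Remove the callback when the last SSL user goes away. */
static void client_destroy_ssl(void)
{
    assert(ssl_open_connections > 0);
    /* Assumption: only clear the slot if it still holds our callback. */
    if (--ssl_open_connections == 0 && global_lock_cb == client_lock_cb)
        registry_set_lock_cb(NULL);
}

/* Returns 1 while the client's callback is the registered one. */
static int callback_installed(void)
{
    return global_lock_cb == client_lock_cb;
}
```

With two connections open, the first client_destroy_ssl() leaves the
callback in place and the second one clears it, so nothing points into
the library once it is no longer using SSL.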

Thoughts?


Regards

Russell Smith

Re: libpq does not manage SSL callbacks properly when other libraries are involved.

From: Alvaro Herrera

Russell Smith wrote:
> Alvaro Herrera wrote:
> > PoolSnoopy wrote:
> >
> >> this bug is really some annoyance if you use automatic build environments.
> >> I'm using phpunit to run tests and as soon as postgres is involved the php
> >> cli environment segfaults at the end. this can be worked around by disabling
> >> ssl but it would be great if the underlying bug got fixed.
> >>
> >
> > This is PHP's bug, isn't it?  Why are you complaining here
> No, this is a problem with the callback/exit functions used by
> PostgreSQL.  We setup callback functions when we use SSL, if somebody
> else uses SSL we can create a problem.

Ok, so it seems you're correct; there is more evidence to be found by
searching other projects' mailing lists, for example as a starting point
http://markmail.org/search/?q=+CRYPTO_set_locking_callback%28NULL%29

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support