SSL connections don't cope with server crash very well at all - Mailing list pgsql-hackers

From Tom Lane
Subject SSL connections don't cope with server crash very well at all
Date
Msg-id 15808.1201482550@sss.pgh.pa.us
Responses Re: SSL connections don't cope with server crash very well at all  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
If you do a manual "kill -9" (for testing purposes) on its connected
server process, psql normally recovers nicely:

regression=# select 1;
 ?column?
----------
        1
(1 row)

-- issue kill here in another window
regression=# select 1;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
regression=# 

But try it with an SSL-enabled connection, and psql just dies rudely.
Investigation shows that it's being killed by SIGPIPE while attempting
to clean up the failed connection:

Program received signal SIGPIPE, Broken pipe.
0x00000030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
(gdb) bt
#0  0x00000030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
#1  0x0000003102497a27 in rl_filename_completion_function ()  from /lib64/libcrypto.so.6
#2  0x0000003102495e5e in BIO_write () from /lib64/libcrypto.so.6
#3  0x0000003877a1f449 in ssl3_write_pending () from /lib64/libssl.so.6
#4  0x0000003877a1f8b6 in ssl3_dispatch_alert () from /lib64/libssl.so.6
#5  0x0000003877a1d602 in ssl3_shutdown () from /lib64/libssl.so.6
#6  0x00002aaaaaac2675 in close_SSL (conn=0x642d60) at fe-secure.c:1095
#7  0x00002aaaaaabb483 in pqReadData (conn=0x642d60) at fe-misc.c:719
#8  0x00002aaaaaaba9b8 in PQgetResult (conn=0x642d60) at fe-exec.c:1223
#9  0x00002aaaaaabaa8e in PQexecFinish (conn=0x642d60) at fe-exec.c:1452
#10 0x00000000004075b7 in SendQuery (query=<value optimized out>)
    at common.c:853
#11 0x0000000000409cf3 in MainLoop (source=0x30f8151680) at mainloop.c:225
#12 0x000000000040c3dc in main (argc=<value optimized out>, argv=0x100)
    at startup.c:352

Apparently we need to do the SIGPIPE disable/enable dance around
SSL_shutdown() as well as SSL_write().  I wonder whether we don't need
it around SSL_read() as well --- I seem to recall that OpenSSL might
either read or write the socket within SSL_read(), due to various corner
cases in the SSL protocol.
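
For illustration only, here is a minimal sketch of what such a
disable/enable dance could look like. It is not the libpq code: the names
with_sigpipe_ignored, io_op, write_to_dead_peer and demo_dead_peer_write
are made up for this example, and a plain write() to a pipe whose read end
has been closed stands in for SSL_shutdown() trying to send its close
alert to a dead backend. With SIGPIPE ignored, the write fails with EPIPE
instead of killing the process.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback type: any operation that may write to a socket. */
typedef int (*io_op) (void *arg);

/*
 * Run op(arg) with SIGPIPE ignored, then restore the caller's previous
 * disposition.  errno as set by op() is preserved across the restore.
 */
static int
with_sigpipe_ignored(io_op op, void *arg)
{
    struct sigaction ignore_act;
    struct sigaction saved_act;
    int         result;
    int         save_errno;

    assert(op != NULL);

    ignore_act.sa_handler = SIG_IGN;
    sigemptyset(&ignore_act.sa_mask);
    ignore_act.sa_flags = 0;
    if (sigaction(SIGPIPE, &ignore_act, &saved_act) != 0)
        return -1;

    result = op(arg);
    save_errno = errno;

    /* Put back whatever the application had installed. */
    sigaction(SIGPIPE, &saved_act, NULL);
    errno = save_errno;
    return result;
}

/* Stand-in for SSL_shutdown(): write to a descriptor whose peer is gone. */
static int
write_to_dead_peer(void *arg)
{
    int         fd = *(int *) arg;

    return (int) write(fd, "x", 1);
}

/* Returns -1 with errno == EPIPE, rather than dying from SIGPIPE. */
int
demo_dead_peer_write(void)
{
    int         fds[2];
    int         result;
    int         save_errno;

    if (pipe(fds) != 0)
        return -2;
    close(fds[0]);              /* nobody will ever read from the pipe */

    result = with_sigpipe_ignored(write_to_dead_peer, &fds[1]);
    save_errno = errno;

    close(fds[1]);
    errno = save_errno;
    return result;
}
```

In the case described above, the callback would presumably be a thin
wrapper that calls SSL_shutdown() on the connection's SSL handle. Note
that signal dispositions are process-wide, so a sketch like this is only
safe as written in a single-threaded client.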

Comments?
        regards, tom lane

