Thread: libpq thread locking during SSL connection start
All, I wanted to highlight the below commit as being a significant enough change that it warrents being seen on -hackers andnot just -committers. If you use SSL with libpq, particularly in a threaded mode/environment, please take a look/testthis change. Prior to the patch, we would crash pretty easily when using client certificates with multiple threads;I was unable to get it to crash after the patch. That said, there's also risk that this patch will cause slow-downsin SSL connections when using threads due to the additional locking. I'm not sure that's a heavily used case,but none-the-less, if you do that, please considering testing this patch. Thanks! Stephen ----- Forwarded message from Stephen Frost <sfrost@snowman.net> ----- Date: Thu, 01 Aug 2013 05:33:12 +0000 From: Stephen Frost <sfrost@snowman.net> To: pgsql-committers@postgresql.org X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=hamversion=3.3.2 Subject: [COMMITTERS] pgsql: Add locking around SSL_context usage in libpq Add locking around SSL_context usage in libpq I've been working with Nick Phillips on an issue he ran into when trying to use threads with SSL client certificates. As it turns out, the call in initialize_SSL() to SSL_CTX_use_certificate_chain_file() will modify our SSL_context without any protection from other threads also calling that function or being at some other point and trying to read from SSL_context. To protect against this, I've written up the attached (based on an initial patch from Nick and much subsequent discussion) which puts locks around SSL_CTX_use_certificate_chain_file() and all of the other users of SSL_context which weren't already protected. Nick Phillips, much reworked by Stephen Frost Back-patch to 9.0 where we started loading the cert directly instead of using a callback. Branch ------ master Details ------- http://git.postgresql.org/pg/commitdiff/aad2a630b1b163038ea904e16a59e409020f5828 Modified Files -------------- src/interfaces/libpq/fe-secure.c | 56 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 53 insertions(+), 3 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers ----- End forwarded message -----
Stephen Frost wrote: > All, > > I wanted to highlight the below commit as being a significant enough > change that it warrents being seen on -hackers and not just > -committers. If you use SSL with libpq, particularly in a threaded > mode/environment, please take a look/test this change. Prior to the > patch, we would crash pretty easily when using client certificates > with multiple threads; I was unable to get it to crash after the > patch. That said, there's also risk that this patch will cause > slow-downs in SSL connections when using threads due to the additional > locking. I'm not sure that's a heavily used case, but none-the-less, > if you do that, please considering testing this patch. Now that I look at it, I think there are a couple of problems with this patch, particularly when pthread_mutex_lock fails. I grant you that that is a very improbable condition, but if so then we should Assert() against it or something like that. (Since I am not sure what would constitute good behavior from Assert() in libpq, I tend to think this is not a good idea.) pgsecure_open_client() returns -1 if it can't lock the mutex. This is a problem because the callers are not prepared for that return value. I think it should return PGRES_POLLING_FAILED instead, after setting an appropriate error message in conn->errorMessage. initialize_SSL() fails to set an error message. The return code of -1 seems fine here. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > pgsecure_open_client() returns -1 if it can't lock the mutex. This is a > problem because the callers are not prepared for that return value. I > think it should return PGRES_POLLING_FAILED instead, after setting an > appropriate error message in conn->errorMessage. Ah, right, adding it there was a bit of a late addition, tbh. > initialize_SSL() fails to set an error message. The return code of -1 > seems fine here. Right, will improve. Thanks! Stephen
On 8/1/13 1:42 PM, Stephen Frost wrote: > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: >> pgsecure_open_client() returns -1 if it can't lock the mutex. This is a >> problem because the callers are not prepared for that return value. I >> think it should return PGRES_POLLING_FAILED instead, after setting an >> appropriate error message in conn->errorMessage. > > Ah, right, adding it there was a bit of a late addition, tbh. Elsewhere in libpq, we use PGTHREAD_ERROR and abort if pthread_mutex_lock fails. Shouldn't we handle that consistently? Alternatively, if we want to just print an error message and proceed, we should put the strerror based on the return value into the message. Overall, the response to these kinds of errors doesn't look very well thought through. (Not to speak of ecpg, which ignores these errors altogether.)
* Peter Eisentraut (peter_e@gmx.net) wrote: > On 8/1/13 1:42 PM, Stephen Frost wrote: > > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > >> pgsecure_open_client() returns -1 if it can't lock the mutex. This is a > >> problem because the callers are not prepared for that return value. I > >> think it should return PGRES_POLLING_FAILED instead, after setting an > >> appropriate error message in conn->errorMessage. > > > > Ah, right, adding it there was a bit of a late addition, tbh. > > Elsewhere in libpq, we use PGTHREAD_ERROR and abort if > pthread_mutex_lock fails. Shouldn't we handle that consistently? The calls which use PGTHREAD_ERROR can't return normally and have to abort (they're callbacks). > Alternatively, if we want to just print an error message and proceed, we > should put the strerror based on the return value into the message. That could certainly be added. Should we also be adding an error message+strerror in cases where pthread_mutex_unlock() fails for some reason? > Overall, the response to these kinds of errors doesn't look very well > thought through. (Not to speak of ecpg, which ignores these errors > altogether.) Agreed. Thanks, Stephen
On Mon, 2013-08-12 at 10:49 -0400, Stephen Frost wrote: > > Alternatively, if we want to just print an error message and > proceed, we > > should put the strerror based on the return value into the message. > > That could certainly be added. Here is a patch for that. I also adjusted the message wording to be more in line with project style. > Should we also be adding an > error message+strerror in cases where pthread_mutex_unlock() fails for > some reason? > Ideally yes, I think. But it's probably less urgent, because it if fails the next lock request will probably error?