Thread: libpq thread locking during SSL connection start

libpq thread locking during SSL connection start

From
Stephen Frost
Date:
All,
 I wanted to highlight the below commit as being a significant enough change that it warrents being seen on -hackers
andnot just -committers.  If you use SSL with libpq, particularly in a threaded mode/environment, please take a
look/testthis change.  Prior to the patch, we would crash pretty easily when using client certificates with multiple
threads;I was unable to get it to crash after the patch.  That said, there's also risk that this patch will cause
slow-downsin SSL connections when using threads due to the additional locking.  I'm not sure that's a heavily used
case,but none-the-less, if you do that, please considering testing this patch. 
  Thanks!
      Stephen

----- Forwarded message from Stephen Frost <sfrost@snowman.net> -----

Date: Thu, 01 Aug 2013 05:33:12 +0000
From: Stephen Frost <sfrost@snowman.net>
To: pgsql-committers@postgresql.org
X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=hamversion=3.3.2
Subject: [COMMITTERS] pgsql: Add locking around SSL_context usage in libpq

Add locking around SSL_context usage in libpq

I've been working with Nick Phillips on an issue he ran into when
trying to use threads with SSL client certificates.  As it turns out,
the call in initialize_SSL() to SSL_CTX_use_certificate_chain_file()
will modify our SSL_context without any protection from other threads
also calling that function or being at some other point and trying to
read from SSL_context.

To protect against this, I've written up the attached (based on an
initial patch from Nick and much subsequent discussion) which puts
locks around SSL_CTX_use_certificate_chain_file() and all of the other
users of SSL_context which weren't already protected.

Nick Phillips, much reworked by Stephen Frost

Back-patch to 9.0 where we started loading the cert directly instead of
using a callback.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/aad2a630b1b163038ea904e16a59e409020f5828

Modified Files
--------------
src/interfaces/libpq/fe-secure.c |   56 ++++++++++++++++++++++++++++++++++++--
1 file changed, 53 insertions(+), 3 deletions(-)


--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

----- End forwarded message -----

Re: libpq thread locking during SSL connection start

From
Alvaro Herrera
Date:
Stephen Frost wrote:
> All,
> 
>   I wanted to highlight the below commit as being a significant enough
>   change that it warrents being seen on -hackers and not just
>   -committers.  If you use SSL with libpq, particularly in a threaded
>   mode/environment, please take a look/test this change.  Prior to the
>   patch, we would crash pretty easily when using client certificates
>   with multiple threads; I was unable to get it to crash after the
>   patch.  That said, there's also risk that this patch will cause
>   slow-downs in SSL connections when using threads due to the additional
>   locking.  I'm not sure that's a heavily used case, but none-the-less,
>   if you do that, please considering testing this patch.

Now that I look at it, I think there are a couple of problems with this
patch, particularly when pthread_mutex_lock fails.  I grant you that
that is a very improbable condition, but if so then we should Assert()
against it or something like that.  (Since I am not sure what would
constitute good behavior from Assert() in libpq, I tend to think this is
not a good idea.)

pgsecure_open_client() returns -1 if it can't lock the mutex.  This is a
problem because the callers are not prepared for that return value.  I
think it should return PGRES_POLLING_FAILED instead, after setting an
appropriate error message in conn->errorMessage.

initialize_SSL() fails to set an error message.  The return code of -1
seems fine here.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: libpq thread locking during SSL connection start

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> pgsecure_open_client() returns -1 if it can't lock the mutex.  This is a
> problem because the callers are not prepared for that return value.  I
> think it should return PGRES_POLLING_FAILED instead, after setting an
> appropriate error message in conn->errorMessage.

Ah, right, adding it there was a bit of a late addition, tbh.

> initialize_SSL() fails to set an error message.  The return code of -1
> seems fine here.

Right, will improve.
Thanks!
    Stephen

Re: libpq thread locking during SSL connection start

From
Peter Eisentraut
Date:
On 8/1/13 1:42 PM, Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> pgsecure_open_client() returns -1 if it can't lock the mutex.  This is a
>> problem because the callers are not prepared for that return value.  I
>> think it should return PGRES_POLLING_FAILED instead, after setting an
>> appropriate error message in conn->errorMessage.
> 
> Ah, right, adding it there was a bit of a late addition, tbh.

Elsewhere in libpq, we use PGTHREAD_ERROR and abort if
pthread_mutex_lock fails.  Shouldn't we handle that consistently?

Alternatively, if we want to just print an error message and proceed, we
should put the strerror based on the return value into the message.

Overall, the response to these kinds of errors doesn't look very well
thought through.  (Not to speak of ecpg, which ignores these errors
altogether.)




Re: libpq thread locking during SSL connection start

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> On 8/1/13 1:42 PM, Stephen Frost wrote:
> > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> >> pgsecure_open_client() returns -1 if it can't lock the mutex.  This is a
> >> problem because the callers are not prepared for that return value.  I
> >> think it should return PGRES_POLLING_FAILED instead, after setting an
> >> appropriate error message in conn->errorMessage.
> >
> > Ah, right, adding it there was a bit of a late addition, tbh.
>
> Elsewhere in libpq, we use PGTHREAD_ERROR and abort if
> pthread_mutex_lock fails.  Shouldn't we handle that consistently?

The calls which use PGTHREAD_ERROR can't return normally and have to
abort (they're callbacks).

> Alternatively, if we want to just print an error message and proceed, we
> should put the strerror based on the return value into the message.

That could certainly be added.  Should we also be adding an
error message+strerror in cases where pthread_mutex_unlock() fails for
some reason?

> Overall, the response to these kinds of errors doesn't look very well
> thought through.  (Not to speak of ecpg, which ignores these errors
> altogether.)

Agreed.
Thanks,
    Stephen

Re: libpq thread locking during SSL connection start

From
Peter Eisentraut
Date:
On Mon, 2013-08-12 at 10:49 -0400, Stephen Frost wrote:
> > Alternatively, if we want to just print an error message and
> proceed, we
> > should put the strerror based on the return value into the message.
>
> That could certainly be added.

Here is a patch for that.  I also adjusted the message wording to be
more in line with project style.

> Should we also be adding an
> error message+strerror in cases where pthread_mutex_unlock() fails for
> some reason?
>
Ideally yes, I think.  But it's probably less urgent, because it if
fails the next lock request will probably error?


Attachment