Re: Strange hanging bug in a simple milter - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Strange hanging bug in a simple milter
Date
Msg-id 20130913183325.GX2706@tamriel.snowman.net
In response to Re: Strange hanging bug in a simple milter  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Strange hanging bug in a simple milter
List pgsql-hackers
* Stephen Frost (sfrost@snowman.net) wrote:
> * Andres Freund (andres@2ndquadrant.com) wrote:
> > Hm. close_SSL() first does pqsecure_destroy() which will unset the
> > callbacks, and the count and then goes on to do X509_free() and
> > ENGINE_finish(), ENGINE_free() if either is used.
> >
> > It's not implausible that one of those actually needs locking. I doubt
> > engines play a role here, but, without having looked at the testcase,
> > X509_free() might be a possibility.
>
> Unfortunately, while I can still easily get the deadlock to happen when
> the hooks are reset, the hooks don't appear to ever get called when
> ssl_open_connections is set to zero.  You have a good point about the
> additional SSL calls after the hooks are unloaded, though; I wonder if
> holding the ssl_config_mutex lock over all of close_SSL() might be more
> sensible.

I went ahead and moved the locks to be around all of close_SSL() and
haven't been able to reproduce the deadlock since.  So perhaps those
calls are the issue: another thread is dropping or adding the hooks in
a common place while X509_free(), etc., are trying to figure out
whether they should be calling the locking functions, and that's a race
because there's no higher-level locking happening around those calls.

Attached is a patch which moves those locks and which doesn't deadlock
for me.
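
To illustrate the idea (this is only a sketch, not the attached patch and
not the libpq source): the hook removal and the later free call both sit
under one mutex for the whole of the close path, so no other thread can add
or drop the hooks in between.  Every name below (config_mutex,
open_connections, hooks_installed, conn_open, conn_close, free_peer_cert,
run_stress) is a hypothetical stand-in, and free_peer_cert() merely stands
in for calls like X509_free().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shared SSL configuration state. */
static pthread_mutex_t config_mutex = PTHREAD_MUTEX_INITIALIZER;
static int open_connections = 0;
static int hooks_installed = 0;   /* 1 while the locking hooks are set */

typedef struct { void *peer_cert; } conn_t;

static void install_hooks(void) { hooks_installed = 1; }
static void remove_hooks(void)  { hooks_installed = 0; }

/* Stand-in for X509_free() and friends, which may consult the hooks. */
static void free_peer_cert(conn_t *c) { c->peer_cert = NULL; }

void conn_open(conn_t *c)
{
    pthread_mutex_lock(&config_mutex);
    if (open_connections++ == 0)
        install_hooks();          /* first connection sets the hooks */
    c->peer_cert = c;             /* any non-NULL marker will do here */
    pthread_mutex_unlock(&config_mutex);
}

void conn_close(conn_t *c)
{
    /* Hold the mutex across the whole teardown, not just the counter. */
    pthread_mutex_lock(&config_mutex);
    assert(open_connections > 0);
    if (--open_connections == 0)
        remove_hooks();           /* last connection drops the hooks */
    free_peer_cert(c);            /* still under the lock */
    pthread_mutex_unlock(&config_mutex);
}

static void *worker(void *arg)
{
    int iters = *(int *) arg;
    for (int i = 0; i < iters; i++) {
        conn_t c;
        conn_open(&c);
        conn_close(&c);
    }
    return NULL;
}

/* Run up to 16 threads opening and closing; return the final count. */
int run_stress(int nthreads, int iters)
{
    pthread_t tids[16];
    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return open_connections;
}
```

The trade-off is that every close now serializes on the one mutex, which
seems acceptable for a path that runs once per connection.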

Thoughts?

    Thanks,

        Stephen

Attachment
