Re: Support for NSS as a libpq TLS backend - Mailing list pgsql-hackers

From Jacob Champion
Subject Re: Support for NSS as a libpq TLS backend
Date
Msg-id c8d4bc0dfd266799ab4213f1673a813786ac0c70.camel@vmware.com
Whole thread Raw
In response to Re: Support for NSS as a libpq TLS backend  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Support for NSS as a libpq TLS backend  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Fri, 2021-03-26 at 18:05 -0400, Stephen Frost wrote:
> * Jacob Champion (pchampion@vmware.com) wrote:
> > Yeah. I was hoping to avoid implementing our own locks and refcounts,
> > but it seems like it's going to be required.
> 
> Yeah, afraid so.

I think it gets worse, after having debugged some confusing crashes.
There's already been a discussion on PR_Init upthread a bit:

> Once we settle on a version we can confirm if PR_Init is/isn't needed and
> remove all traces of it if not.

What the NSPR documentation omits is that implicit initialization is
not threadsafe. So NSS_InitContext() is technically "threadsafe"
because it's built on PR_CallOnce(), but if you haven't called
PR_Init() yet, multiple simultaneous PR_CallOnce() calls can crash into
each other.

So, fine. We just add our own locks around NSS_InitContext() (or around
a single call to PR_Init()). Well, the first thread to win and
successfully initialize NSPR gets marked as the "primordial" thread
using thread-local state. And it gets a pthread destructor that does...
something. So lazy initialization seems a bit dangerous regardless of
whether or not we add locks, but I can't really prove whether it's
dangerous or not in practice.

I do know that only the primordial thread is allowed to call
PR_Cleanup(), and of course we wouldn't be able to control which thread
does what for libpq clients. I don't know what other assumptions are
made about the primordial thread, or if there are any platform-specific 
behaviors with older versions of NSPR that we'd need to worry about. It
used to be that the primordial thread was not allowed to exit before
any other threads, but that restriction was lifted at some point [1].

I think we're going to need some analogue to PQinitOpenSSL() to help
client applications cut through the mess, but I'm not sure what it
should look like, or how we would maintain any sort of API
compatibility between the two flavors. And does libpq already have some
notion of a "main thread" that I'm missing?

--Jacob

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=294955

pgsql-hackers by date:

Previous
From: "'alvherre@alvh.no-ip.org'"
Date:
Subject: Re: libpq debug log
Next
From: Tom Lane
Date:
Subject: Re: libpq debug log