Re: Support for NSS as a libpq TLS backend - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Support for NSS as a libpq TLS backend
Date
Msg-id 20210323190427.GN20766@tamriel.snowman.net
Whole thread Raw
In response to Re: Support for NSS as a libpq TLS backend  (Daniel Gustafsson <daniel@yesql.se>)
Responses Re: Support for NSS as a libpq TLS backend  (Daniel Gustafsson <daniel@yesql.se>)
List pgsql-hackers
Greetings,

* Daniel Gustafsson (daniel@yesql.se) wrote:
> > On 22 Mar 2021, at 00:49, Stephen Frost <sfrost@snowman.net> wrote:
>
> Thanks for the review!  Below is a partial response, I haven't had time to
> address all your review comments yet but I wanted to submit a rebased patchset
> directly since the current version doesn't work after recent changes in the
> tree. I will address the remaining comments tomorrow or the day after.

Great, thanks!

> This rebase also includes a fix for pgtls_init which was sent offlist by Jacob.
> The changes in pgtls_init can potentially be used to initialize the crypto
> context for NSS to clean up this patch, Jacob is currently looking at that.

Ah, cool, sounds good.

> > They aren't the same and it might not be
> > clear what's going on if one was to somehow mix them (at least if pr_fd
> > continues to sometimes be a void*, but I wonder why that's being
> > done..?  more on that later..).
>
> To paraphrase from a later in this email, there are collisions between nspr and
> postgres on things like BITS_PER_BYTE, and there were also collisions on basic
> types until I learned about NO_NSPR_10_SUPPORT.  By moving the juggling of this
> into common/nss.h we can use proper types without introducing that pollution
> everywhere. I will address these places.

Ah, ok, and great, that sounds good.

> >> +++ b/src/backend/libpq/be-secure-nss.c
> > [...]
> >> +/* default init hook can be overridden by a shared library */
> >> +static void default_nss_tls_init(bool isServerStart);
> >> +nss_tls_init_hook_type nss_tls_init_hook = default_nss_tls_init;
> >
> >> +static PRDescIdentity pr_id;
> >> +
> >> +static PRIOMethods pr_iomethods;
> >
> > Happy to be told I'm missing something, but the above two variables seem
> > to only be used in init_iolayer.. is there a reason they're declared
> > here instead of just being declared in that function?
>
> They must be there since NSPR doesn't copy these but reference them.

Ah, ok, interesting.

> >> +    /*
> >> +     * Set the fallback versions for the TLS protocol version range to a
> >> +     * combination of our minimal requirement and the library maximum. Error
> >> +     * messages should be kept identical to those in be-secure-openssl.c to
> >> +     * make translations easier.
> >> +     */
> >
> > Should we pull these error messages out into another header so that
> > they're in one place to make sure they're kept consistent, if we really
> > want to put the effort in to keep them the same..?  I'm not 100% sure
> > that it's actually necessary to do so, but defining these in one place
> > would help maintain this if we want to.  Also alright with just keeping
> > the comment, not that big of a deal.
>
> It might make sense to pull them into common/nss.h, but seeing the error
> message right there when reading the code does IMO make it clearer so it's a
> doubleedged sword.  Not sure what is the best option, but I'm not married to
> the current solution so if there is consensus to pull them out somewhere I'm
> happy to do so.

My thought was to put them into some common/ssl.h or something along
those lines but I don't see it as a big deal either way really.  You
make a good point that having the error message there when reading the
code is nice.

> > Maybe we should put a stake in the ground that says "we only support
> > back to version X of NSS", test with that and a few more recent versions
> > and the most recent, and then rip out anything that's needed for
> > versions which are older than that?
>
> Yes, right now there is very little in the patch which caters for old versions,
> the PR_Init call might be one of the few offenders.  There has been discussion
> upthread about settling for a required version, combining the insights learned
> there with a survey of which versions are commonly available packaged.
>
> Once we settle on a version we can confirm if PR_Init is/isn't needed and
> remove all traces of it if not.

I don't really see this as all that hard to do- I'd suggest we look at
what systems someone might reasonably deploy v14 on.  To that end, I'd
say "only systems which are presently supported", so: RHEL7+, Debian 9+,
Ubuntu 16.04+.  Looking at those, I see:

Ubuntu 16.04: 3.28.4
RHEL6: v3.28.4
Debian: 3.26.2

> > I have a pretty hard time imagining that someone is going to want to build PG
> > v14 w/ NSS 2.0 ...
>
> Let alone compiling 2.0 at all on a recent system..

Indeed, and given the above, it seems entirely reasonable to make the
requirement be NSS v3+, no?  I wouldn't be against making that even
tighter if we thought it made sense to do so.

> > Also- we don't seem to complain at all about a cipher being specified that we
> > don't find?  Guess I would think that we might want to throw a WARNING in such
> > a case, but I could possibly be convinced otherwise.
>
> No, I think you're right, we should throw WARNING there or possibly even a
> higher elevel. Should that be a COMMERROR even?

I suppose the thought I was having was that we might want to allow some
string that covered all the OpenSSL and NSS ciphers that someone feels
comfortable with and we'd just ignore the ones that don't make sense for
the particular library we're currently built with.  Making it a
COMMERROR seems like overkill and I'm not entirely sure we actually want
any warning since we might then be constantly bleating about it.

> > Kind of wonder just what happens with the current code, I'm guessing ciphercode
> > is zero and therefore doesn't complain but also doesn't do what we want.  I
> > wonder if there's a way to test this?
>
> We could extend the test suite to set ciphers in postgresql.conf, I'll give it
> a go.

That'd be great, thanks!

> > I do think we should probably throw an error if we end up with *no*
> > ciphers being set, which doesn't seem to be happening here..?
>
> Yeah, that should be a COMMERROR. Fixed.

I do think it makes sense to throw a COMMERROR here since the connection
is going to end up failing anyway.

> >> +pg_ssl_read(PRFileDesc *fd, void *buf, PRInt32 amount, PRIntn flags,
> >> +            PRIntervalTime timeout)
> >> +{
> >> +    PRRecvFN    read_fn;
> >> +    PRInt32        n_read;
> >> +
> >> +    read_fn = fd->lower->methods->recv;
> >> +    n_read = read_fn(fd->lower, buf, amount, flags, timeout);
> >> +
> >> +    return n_read;
> >> +}
> >> +
> >> +static PRInt32
> >> +pg_ssl_write(PRFileDesc *fd, const void *buf, PRInt32 amount, PRIntn flags,
> >> +             PRIntervalTime timeout)
> >> +{
> >> +    PRSendFN    send_fn;
> >> +    PRInt32        n_write;
> >> +
> >> +    send_fn = fd->lower->methods->send;
> >> +    n_write = send_fn(fd->lower, buf, amount, flags, timeout);
> >> +
> >> +    return n_write;
> >> +}
> >> +
> >> +static PRStatus
> >> +pg_ssl_close(PRFileDesc *fd)
> >> +{
> >> +    /*
> >> +     * Disconnect our private Port from the fd before closing out the stack.
> >> +     * (Debug builds of NSPR will assert if we do not.)
> >> +     */
> >> +    fd->secret = NULL;
> >> +    return PR_GetDefaultIOMethods()->close(fd);
> >> +}
> >
> > Regarding these, I find myself wondering how they're different from the
> > defaults..?  I mean, the above just directly called
> > PR_GetDefaultIOMethods() to then call it's close() function- are the
> > fd->lower_methods->recv/send not the default methods?  I don't quite get
> > what the point is from having our own callbacks here if they just do
> > exactly what the defaults would do (or are there actually no defined
> > defaults and you have to provide these..?).
>
> It's really just to cope with debug builds of NSPR which assert that fd->secret
> is null before closing.

And we have to override the recv/send functions for this too..?  Sorry,
my comment wasn't just about the close() method but about the others
too.

> >> +    /*
> >> +     * Return the underlying PRFileDesc which can be used to access
> >> +     * information on the connection details. There is no SSL context per se.
> >> +     */
> >> +    if (strcmp(struct_name, "NSS") == 0)
> >> +        return conn->pr_fd;
> >> +    return NULL;
> >> +}
> >
> > Is there never a reason someone might want the pointer returned by
> > NSS_InitContext?  I don't know that there is but it might be something
> > to consider (we could even possibly have our own structure returned by
> > this function which includes both, maybe..?).  Not sure if there's a
> > sensible use-case for that or not just wanted to bring it up as it's
> > something I asked myself while reading through this patch.
>
> Not sure I understand what you're asking for here, did you mean "is there ever
> a reason"?

Eh, poor wording on my part.  You're right, the question, reworded
again, was "Would someone want to get the context returned by
NSS_InitContext?".  If we think there's a reason that someone might want
that context then perhaps we should allow getting it, in addition to the
pr_fd.  If there's really no reason to ever want the context from
NSS_InitContext then what you have here where we're returning pr_fd is
probably fine.

> >> diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
> >> index c601071838..7f10da3010 100644
> >> --- a/src/interfaces/libpq/fe-secure.c
> >> +++ b/src/interfaces/libpq/fe-secure.c
> >> @@ -448,6 +448,27 @@ PQdefaultSSLKeyPassHook_OpenSSL(char *buf, int size, PGconn *conn)
> >> }
> >> #endif                            /* USE_OPENSSL */
> >>
> >> +#ifndef USE_NSS
> >> +
> >> +PQsslKeyPassHook_nss_type
> >> +PQgetSSLKeyPassHook_nss(void)
> >> +{
> >> +    return NULL;
> >> +}
> >> +
> >> +void
> >> +PQsetSSLKeyPassHook_nss(PQsslKeyPassHook_nss_type hook)
> >> +{
> >> +    return;
> >> +}
> >> +
> >> +char *
> >> +PQdefaultSSLKeyPassHook_nss(PK11SlotInfo * slot, PRBool retry, void *arg)
> >> +{
> >> +    return NULL;
> >> +}
> >> +#endif                            /* USE_NSS */
> >
> > Isn't this '!USE_NSS'?
>
> Technically it is, but using just /* USE_NSS */ is consistent with the rest of
> blocks in the file.

Hrmpf.  I guess it seems a bit confusing to me to have to go find the
opening #ifndef to realize that it's actally !USE_NSS..  In other words,
I would think we'd actually want to fix all of these, heh.  I only
actually see one case on a quick grep where it's wrong for USE_OPENSSL
and so that doesn't seem like it's really a precedent and is more of a
bug.  We certainly say 'not OPENSSL' in one place today too and also
have a number of places where we have: #endif ... /* ! WHATEVER */.

> >> diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
> >> index 0c9e95f1a7..f15af39222 100644
> >> --- a/src/interfaces/libpq/libpq-int.h
> >> +++ b/src/interfaces/libpq/libpq-int.h
> >> @@ -383,6 +383,7 @@ struct pg_conn
> >>     char       *sslrootcert;    /* root certificate filename */
> >>     char       *sslcrl;            /* certificate revocation list filename */
> >>     char       *sslcrldir;        /* certificate revocation list directory name */
> >> +    char       *cert_database;    /* NSS certificate/key database */
> >>     char       *requirepeer;    /* required peer credentials for local sockets */
> >>     char       *gssencmode;        /* GSS mode (require,prefer,disable) */
> >>     char       *krbsrvname;        /* Kerberos service name */
> >> @@ -507,6 +508,28 @@ struct pg_conn
> >>                                  * OpenSSL version changes */
> >> #endif
> >> #endif                            /* USE_OPENSSL */
> >> +
> >> +/*
> >> + * The NSS/NSPR specific types aren't used to avoid pulling in the required
> >> + * headers here, as they are causing conflicts with PG definitions.
> >> + */
> >
> > I'm a bit confused- what are the conflicts being caused here..?
> > Certainly under USE_OPENSSL we use the actual OpenSSL types..
>
> It's referring to collisions with for example BITS_PER_BYTE which is defined
> both by postgres and nspr.  Since writing this I've introduced src/common/nss.h
> to handle it in a single place, so we can indeed use the proper types without
> polluting the file.  Fixed.

Great, thanks!

> >> Subject: [PATCH v30 2/9] Refactor SSL testharness for multiple library
> >>
> >> The SSL testharness was fully tied to OpenSSL in the way the server was
> >> set up and reconfigured. This refactors the SSLServer module into a SSL
> >> library agnostic SSL/Server module which in turn use SSL/Backend/<lib>
> >> modules for the implementation details.
> >>
> >> No changes are done to the actual tests, this only change how setup and
> >> teardown is performed.
> >
> > Presumably this could be committed ahead of the main NSS support?
>
> Correct, I think this has merits even if NSS support is ultimately rejected.

Ok- could you break it out on to its own thread and I'll see about
committing it soonish, to get it out of the way?

> >> Subject: [PATCH v30 5/9] nss: Documentation
> >> +++ b/doc/src/sgml/acronyms.sgml
> >> @@ -684,6 +717,16 @@
> >>     </listitem>
> >>    </varlistentry>
> >>
> >> +   <varlistentry>
> >> +    <term><acronym>TLS</acronym></term>
> >> +    <listitem>
> >> +     <para>
> >> +      <ulink url="https://en.wikipedia.org/wiki/Transport_Layer_Security">
> >> +      Transport Layer Security</ulink>
> >> +     </para>
> >> +    </listitem>
> >> +   </varlistentry>
> >
> > We don't have this already..?  Surely we should..
>
> We really should, especially since we've had <acronym>TLS</acronym> in
> config.sgml since 2014 (c6763156589).  That's another small piece that could be
> committed on it's own to cut down the size of this patchset (even if only by a
> tiny amount).

Ditto on this. :)

> >> @@ -1288,7 +1305,9 @@ include_dir 'conf.d'
> >>         connections using TLS version 1.2 and lower are affected.  There is
> >>         currently no setting that controls the cipher choices used by TLS
> >>         version 1.3 connections.  The default value is
> >> -        <literal>HIGH:MEDIUM:+3DES:!aNULL</literal>.  The default is usually a
> >> +        <literal>HIGH:MEDIUM:+3DES:!aNULL</literal> for servers which have
> >> +        been built with <productname>OpenSSL</productname> as the
> >> +        <acronym>SSL</acronym> library.  The default is usually a
> >>         reasonable choice unless you have specific security requirements.
> >>        </para>
> >
> > Shouldn't we say something here wrt NSS?
>
> We should, but I'm not entirely what just yet. Need to revisit that.

Not sure if we really want to do this but at least with ssllabs.com,
postgresql.org gets an 'A' rating with this set:

ECDHE-ECDSA-CHACHA20-POLY1305
ECDHE-RSA-CHACHA20-POLY1305
ECDHE-ECDSA-AES128-GCM-SHA256
ECDHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES256-GCM-SHA384
ECDHE-RSA-AES256-GCM-SHA384
DHE-RSA-AES128-GCM-SHA256
DHE-RSA-AES256-GCM-SHA384
ECDHE-ECDSA-AES128-SHA256
ECDHE-RSA-AES128-SHA256
ECDHE-ECDSA-AES128-SHA
ECDHE-RSA-AES256-SHA384
ECDHE-RSA-AES128-SHA
ECDHE-ECDSA-AES256-SHA384
ECDHE-ECDSA-AES256-SHA
ECDHE-RSA-AES256-SHA
DHE-RSA-AES128-SHA256
DHE-RSA-AES128-SHA
DHE-RSA-AES256-SHA256
DHE-RSA-AES256-SHA
ECDHE-ECDSA-DES-CBC3-SHA
ECDHE-RSA-DES-CBC3-SHA
EDH-RSA-DES-CBC3-SHA
AES128-GCM-SHA256
AES256-GCM-SHA384
AES128-SHA256
AES256-SHA256
AES128-SHA
AES256-SHA
DES-CBC3-SHA
!DSS

Which also seems kinda close to what the default when built with OpenSSL
ends up being?  Thought the ssllabs report does list which ones it
thinks are weak and so we might consider excluding those by default too:

https://www.ssllabs.com/ssltest/analyze.html?d=postgresql.org&s=2a02%3a16a8%3adc51%3a0%3a0%3a0%3a0%3a50

> >> @@ -1490,8 +1509,11 @@ include_dir 'conf.d'
> >>        <para>
> >>         Sets an external command to be invoked when a passphrase for
> >>         decrypting an SSL file such as a private key needs to be obtained.  By
> >> -        default, this parameter is empty, which means the built-in prompting
> >> -        mechanism is used.
> >> +        default, this parameter is empty. When the server is using
> >> +        <productname>OpenSSL</productname>, this means the built-in prompting
> >> +        mechanism is used. When using <productname>NSS</productname>, there is
> >> +        no default prompting so a blank callback will be used returning an
> >> +        empty password.
> >>        </para>
> >
> > Maybe we should point out here that this requires the database to not
> > require a password..?  So if they have one, they need to set this, or
> > maybe we should provide a default one..
>
> I've added a sentence on not using a password for the cert database.  I'm not
> sure if providing a default one is a good idea but it's no less insecure than
> having no password really..

I was meaning a default callback to prompt, not sure if that was clear.

> > Just a, well, not so quick read-through.  Generally it's looking pretty
> > good to me.  Will see about playing with it this week.
>
> Thanks again for reviewing, another version which addresses the remaining
> issues will be posted soon but I wanted to get this out to give further reviews
> something that properly works.

Fantastic, thanks again!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Next
From: Robert Haas
Date:
Subject: Re: pg_amcheck contrib application