Re: Support for NSS as a libpq TLS backend - Mailing list pgsql-hackers
From | Daniel Gustafsson |
---|---|
Subject | Re: Support for NSS as a libpq TLS backend |
Date | |
Msg-id | 051B5E92-066F-4079-8C40-01009CFEBC4B@yesql.se Whole thread Raw |
In response to | Re: Support for NSS as a libpq TLS backend (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Support for NSS as a libpq TLS backend
|
List | pgsql-hackers |
> On 23 Mar 2021, at 20:04, Stephen Frost <sfrost@snowman.net> wrote: > > Greetings, > > * Daniel Gustafsson (daniel@yesql.se) wrote: >>> On 22 Mar 2021, at 00:49, Stephen Frost <sfrost@snowman.net> wrote: >> >> Thanks for the review! Below is a partial response, I haven't had time to >> address all your review comments yet but I wanted to submit a rebased patchset >> directly since the current version doesn't work after recent changes in the >> tree. I will address the remaining comments tomorrow or the day after. > > Great, thanks! > >> This rebase also includes a fix for pgtls_init which was sent offlist by Jacob. >> The changes in pgtls_init can potentially be used to initialize the crypto >> context for NSS to clean up this patch, Jacob is currently looking at that. > > Ah, cool, sounds good. > >>> They aren't the same and it might not be >>> clear what's going on if one was to somehow mix them (at least if pr_fd >>> continues to sometimes be a void*, but I wonder why that's being >>> done..? more on that later..). >> >> To paraphrase from a later in this email, there are collisions between nspr and >> postgres on things like BITS_PER_BYTE, and there were also collisions on basic >> types until I learned about NO_NSPR_10_SUPPORT. By moving the juggling of this >> into common/nss.h we can use proper types without introducing that pollution >> everywhere. I will address these places. > > Ah, ok, and great, that sounds good. >>>> + /* >>>> + * Set the fallback versions for the TLS protocol version range to a >>>> + * combination of our minimal requirement and the library maximum. Error >>>> + * messages should be kept identical to those in be-secure-openssl.c to >>>> + * make translations easier. >>>> + */ >>> >>> Should we pull these error messages out into another header so that >>> they're in one place to make sure they're kept consistent, if we really >>> want to put the effort in to keep them the same..? I'm not 100% sure >>> that it's actually necessary to do so, but defining these in one place >>> would help maintain this if we want to. Also alright with just keeping >>> the comment, not that big of a deal. >> >> It might make sense to pull them into common/nss.h, but seeing the error >> message right there when reading the code does IMO make it clearer so it's a >> doubleedged sword. Not sure what is the best option, but I'm not married to >> the current solution so if there is consensus to pull them out somewhere I'm >> happy to do so. > > My thought was to put them into some common/ssl.h or something along > those lines but I don't see it as a big deal either way really. You > make a good point that having the error message there when reading the > code is nice. Thinking more on this, I think my vote will be to keep them duplicated in the code for readability. Unless there are strong feelings against I think we at least should start there. >>> Maybe we should put a stake in the ground that says "we only support >>> back to version X of NSS", test with that and a few more recent versions >>> and the most recent, and then rip out anything that's needed for >>> versions which are older than that? >> >> Yes, right now there is very little in the patch which caters for old versions, >> the PR_Init call might be one of the few offenders. There has been discussion >> upthread about settling for a required version, combining the insights learned >> there with a survey of which versions are commonly available packaged. >> >> Once we settle on a version we can confirm if PR_Init is/isn't needed and >> remove all traces of it if not. > > I don't really see this as all that hard to do- I'd suggest we look at > what systems someone might reasonably deploy v14 on. To that end, I'd > say "only systems which are presently supported", so: RHEL7+, Debian 9+, > Ubuntu 16.04+. Sounds reasonable. > Looking at those, I see: > > Ubuntu 16.04: 3.28.4 > RHEL6: v3.28.4 > Debian: 3.26.2 I assume these have matching NSPR versions placing the Debian 9 NSPR package as the lowest required version for that? >>> I have a pretty hard time imagining that someone is going to want to build PG >>> v14 w/ NSS 2.0 ... >> >> Let alone compiling 2.0 at all on a recent system.. > > Indeed, and given the above, it seems entirely reasonable to make the > requirement be NSS v3+, no? I wouldn't be against making that even > tighter if we thought it made sense to do so. I think anything but doing that would be incredibly unreasonable. >>> Also- we don't seem to complain at all about a cipher being specified that we >>> don't find? Guess I would think that we might want to throw a WARNING in such >>> a case, but I could possibly be convinced otherwise. >> >> No, I think you're right, we should throw WARNING there or possibly even a >> higher elevel. Should that be a COMMERROR even? > > I suppose the thought I was having was that we might want to allow some > string that covered all the OpenSSL and NSS ciphers that someone feels > comfortable with and we'd just ignore the ones that don't make sense for > the particular library we're currently built with. Making it a > COMMERROR seems like overkill and I'm not entirely sure we actually want > any warning since we might then be constantly bleating about it. Right, with a string like that we'd induce WARNING fatigue quickly. Catching the case of *no* ciphers enabled with a COMMERROR is going some way towards being helpful to the user in debugging the failed connection here. >>>> +pg_ssl_read(PRFileDesc *fd, void *buf, PRInt32 amount, PRIntn flags, >>>> + PRIntervalTime timeout) >>>> +{ >>>> + PRRecvFN read_fn; >>>> + PRInt32 n_read; >>>> + >>>> + read_fn = fd->lower->methods->recv; >>>> + n_read = read_fn(fd->lower, buf, amount, flags, timeout); >>>> + >>>> + return n_read; >>>> +} >>>> + >>>> +static PRInt32 >>>> +pg_ssl_write(PRFileDesc *fd, const void *buf, PRInt32 amount, PRIntn flags, >>>> + PRIntervalTime timeout) >>>> +{ >>>> + PRSendFN send_fn; >>>> + PRInt32 n_write; >>>> + >>>> + send_fn = fd->lower->methods->send; >>>> + n_write = send_fn(fd->lower, buf, amount, flags, timeout); >>>> + >>>> + return n_write; >>>> +} >>>> + >>>> +static PRStatus >>>> +pg_ssl_close(PRFileDesc *fd) >>>> +{ >>>> + /* >>>> + * Disconnect our private Port from the fd before closing out the stack. >>>> + * (Debug builds of NSPR will assert if we do not.) >>>> + */ >>>> + fd->secret = NULL; >>>> + return PR_GetDefaultIOMethods()->close(fd); >>>> +} >>> >>> Regarding these, I find myself wondering how they're different from the >>> defaults..? I mean, the above just directly called >>> PR_GetDefaultIOMethods() to then call it's close() function- are the >>> fd->lower_methods->recv/send not the default methods? I don't quite get >>> what the point is from having our own callbacks here if they just do >>> exactly what the defaults would do (or are there actually no defined >>> defaults and you have to provide these..?). >> >> It's really just to cope with debug builds of NSPR which assert that fd->secret >> is null before closing. > > And we have to override the recv/send functions for this too..? Sorry, > my comment wasn't just about the close() method but about the others > too. Ah, no we can ditch the .send and .recv functions and stick with the default built-ins, I just confirmed this and removed them. I think they are leftovers from when I injected debug code there during development, they were as you say copies of the default. >>>> + /* >>>> + * Return the underlying PRFileDesc which can be used to access >>>> + * information on the connection details. There is no SSL context per se. >>>> + */ >>>> + if (strcmp(struct_name, "NSS") == 0) >>>> + return conn->pr_fd; >>>> + return NULL; >>>> +} >>> >>> Is there never a reason someone might want the pointer returned by >>> NSS_InitContext? I don't know that there is but it might be something >>> to consider (we could even possibly have our own structure returned by >>> this function which includes both, maybe..?). Not sure if there's a >>> sensible use-case for that or not just wanted to bring it up as it's >>> something I asked myself while reading through this patch. >> >> Not sure I understand what you're asking for here, did you mean "is there ever >> a reason"? > > Eh, poor wording on my part. You're right, the question, reworded > again, was "Would someone want to get the context returned by > NSS_InitContext?". If we think there's a reason that someone might want > that context then perhaps we should allow getting it, in addition to the > pr_fd. If there's really no reason to ever want the context from > NSS_InitContext then what you have here where we're returning pr_fd is > probably fine. I can't think of any reason, maybe Jacob who has been knee-deep in NSS contexts have insights which tell a different story? >>>> diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c >>>> index c601071838..7f10da3010 100644 >>>> --- a/src/interfaces/libpq/fe-secure.c >>>> +++ b/src/interfaces/libpq/fe-secure.c >>>> @@ -448,6 +448,27 @@ PQdefaultSSLKeyPassHook_OpenSSL(char *buf, int size, PGconn *conn) >>>> } >>>> #endif /* USE_OPENSSL */ >>>> >>>> +#ifndef USE_NSS >>>> + >>>> +PQsslKeyPassHook_nss_type >>>> +PQgetSSLKeyPassHook_nss(void) >>>> +{ >>>> + return NULL; >>>> +} >>>> + >>>> +void >>>> +PQsetSSLKeyPassHook_nss(PQsslKeyPassHook_nss_type hook) >>>> +{ >>>> + return; >>>> +} >>>> + >>>> +char * >>>> +PQdefaultSSLKeyPassHook_nss(PK11SlotInfo * slot, PRBool retry, void *arg) >>>> +{ >>>> + return NULL; >>>> +} >>>> +#endif /* USE_NSS */ >>> >>> Isn't this '!USE_NSS'? >> >> Technically it is, but using just /* USE_NSS */ is consistent with the rest of >> blocks in the file. > > Hrmpf. I guess it seems a bit confusing to me to have to go find the > opening #ifndef to realize that it's actally !USE_NSS.. In other words, > I would think we'd actually want to fix all of these, heh. I only > actually see one case on a quick grep where it's wrong for USE_OPENSSL > and so that doesn't seem like it's really a precedent and is more of a > bug. We certainly say 'not OPENSSL' in one place today too and also > have a number of places where we have: #endif ... /* ! WHATEVER */. No disagreement from me. To cut down the size of this patchset however I propose that we tackle this separately and leave this as is in this thread since it's in line with the rest of the file (for now). >>>> Subject: [PATCH v30 2/9] Refactor SSL testharness for multiple library >>>> >>>> The SSL testharness was fully tied to OpenSSL in the way the server was >>>> set up and reconfigured. This refactors the SSLServer module into a SSL >>>> library agnostic SSL/Server module which in turn use SSL/Backend/<lib> >>>> modules for the implementation details. >>>> >>>> No changes are done to the actual tests, this only change how setup and >>>> teardown is performed. >>> >>> Presumably this could be committed ahead of the main NSS support? >> >> Correct, I think this has merits even if NSS support is ultimately rejected. > > Ok- could you break it out on to its own thread and I'll see about > committing it soonish, to get it out of the way? It was already on it's own thread, as we discussed offlist. I have since rebased and expanded that patch over in that thread which has gotten review that needs to be addressed. As such, I will not update that patch in the series in this thread but keep the changes on that thread, and then pull them back into here when ready. >>>> Subject: [PATCH v30 5/9] nss: Documentation >>>> +++ b/doc/src/sgml/acronyms.sgml >>>> @@ -684,6 +717,16 @@ >>>> </listitem> >>>> </varlistentry> >>>> >>>> + <varlistentry> >>>> + <term><acronym>TLS</acronym></term> >>>> + <listitem> >>>> + <para> >>>> + <ulink url="https://en.wikipedia.org/wiki/Transport_Layer_Security"> >>>> + Transport Layer Security</ulink> >>>> + </para> >>>> + </listitem> >>>> + </varlistentry> >>> >>> We don't have this already..? Surely we should.. >> >> We really should, especially since we've had <acronym>TLS</acronym> in >> config.sgml since 2014 (c6763156589). That's another small piece that could be >> committed on it's own to cut down the size of this patchset (even if only by a >> tiny amount). > > Ditto on this. :) Done in https://postgr.es/m/27109504-82DB-41A8-8E63-C0498314F5B0@yesql.se >>>> @@ -1288,7 +1305,9 @@ include_dir 'conf.d' >>>> connections using TLS version 1.2 and lower are affected. There is >>>> currently no setting that controls the cipher choices used by TLS >>>> version 1.3 connections. The default value is >>>> - <literal>HIGH:MEDIUM:+3DES:!aNULL</literal>. The default is usually a >>>> + <literal>HIGH:MEDIUM:+3DES:!aNULL</literal> for servers which have >>>> + been built with <productname>OpenSSL</productname> as the >>>> + <acronym>SSL</acronym> library. The default is usually a >>>> reasonable choice unless you have specific security requirements. >>>> </para> >>> >>> Shouldn't we say something here wrt NSS? >> >> We should, but I'm not entirely what just yet. Need to revisit that. > > Not sure if we really want to do this but at least with ssllabs.com, > postgresql.org gets an 'A' rating with this set: > > ECDHE-ECDSA-CHACHA20-POLY1305 > ECDHE-RSA-CHACHA20-POLY1305 > ECDHE-ECDSA-AES128-GCM-SHA256 > ECDHE-RSA-AES128-GCM-SHA256 > ECDHE-ECDSA-AES256-GCM-SHA384 > ECDHE-RSA-AES256-GCM-SHA384 > DHE-RSA-AES128-GCM-SHA256 > DHE-RSA-AES256-GCM-SHA384 > ECDHE-ECDSA-AES128-SHA256 > ECDHE-RSA-AES128-SHA256 > ECDHE-ECDSA-AES128-SHA > ECDHE-RSA-AES256-SHA384 > ECDHE-RSA-AES128-SHA > ECDHE-ECDSA-AES256-SHA384 > ECDHE-ECDSA-AES256-SHA > ECDHE-RSA-AES256-SHA > DHE-RSA-AES128-SHA256 > DHE-RSA-AES128-SHA > DHE-RSA-AES256-SHA256 > DHE-RSA-AES256-SHA > ECDHE-ECDSA-DES-CBC3-SHA > ECDHE-RSA-DES-CBC3-SHA > EDH-RSA-DES-CBC3-SHA > AES128-GCM-SHA256 > AES256-GCM-SHA384 > AES128-SHA256 > AES256-SHA256 > AES128-SHA > AES256-SHA > DES-CBC3-SHA > !DSS > > Which also seems kinda close to what the default when built with OpenSSL > ends up being? Thought the ssllabs report does list which ones it > thinks are weak and so we might consider excluding those by default too: > > https://www.ssllabs.com/ssltest/analyze.html?d=postgresql.org&s=2a02%3a16a8%3adc51%3a0%3a0%3a0%3a0%3a50 Agreed, maintaining parity (or thereabouts) with OpenSSL defaults taking industry best practices into account is probably what we should aim for. >>>> @@ -1490,8 +1509,11 @@ include_dir 'conf.d' >>>> <para> >>>> Sets an external command to be invoked when a passphrase for >>>> decrypting an SSL file such as a private key needs to be obtained. By >>>> - default, this parameter is empty, which means the built-in prompting >>>> - mechanism is used. >>>> + default, this parameter is empty. When the server is using >>>> + <productname>OpenSSL</productname>, this means the built-in prompting >>>> + mechanism is used. When using <productname>NSS</productname>, there is >>>> + no default prompting so a blank callback will be used returning an >>>> + empty password. >>>> </para> >>> >>> Maybe we should point out here that this requires the database to not >>> require a password..? So if they have one, they need to set this, or >>> maybe we should provide a default one.. >> >> I've added a sentence on not using a password for the cert database. I'm not >> sure if providing a default one is a good idea but it's no less insecure than >> having no password really.. > > I was meaning a default callback to prompt, not sure if that was clear. Ah, no that's not what I thought you meant. Do you have any thoughts on what that callback would look like? Take a password on a TTY input? Below are a few fixes addressed from the original review email: >>> + /* >>> + * Set up the custom IO layer. >>> + */ >> >> Might be good to mention that the IO Layer is what sets up the >> read/write callbacks to be used. > > Good point, will do in the next version of the patchset. Fixed. >>> + port->pr_fd = SSL_ImportFD(model, pr_fd); >>> + if (!port->pr_fd) >>> + { >>> + ereport(COMMERROR, >>> + (errmsg("unable to initialize"))); >>> + return -1; >>> + } >> >> Maybe a comment and a better error message for this? > > Will do. Fixed. >>> >>> + PR_Close(model); >> >> This might deserve one also, the whole 'model' construct is a bit >> different. :) > > Agreed. will do. Fixed. >> Also, I get that they do similar jobs and that one is in the frontend >> and the other is in the backend, but I'm not a fan of having two >> 'ssl_protocol_version_to_nss()'s functions that take different argument >> types but have exact same name and do functionally different things.. > > Good point, I'll change that. Fixed. >>> + /* >>> + * Configure cipher policy. >>> + */ >>> + status = NSS_SetDomesticPolicy(); >>> + if (status != SECSuccess) >>> + { >>> + printfPQExpBuffer(&conn->errorMessage, >>> + libpq_gettext("unable to configure cipher policy: %s"), >>> + pg_SSLerrmessage(PR_GetError())); >>> + >>> + return PGRES_POLLING_FAILED; >>> + } >> >> Probably good to pull over at least some parts of the comments made in >> the backend code about SetDomesticPolicy() actually enabling everything >> (just like all the policies apparently do)... > > Good point, will do. Fixed. >>> +int >>> +be_tls_open_server(Port *port) >>> +{ >>> + SECStatus status; >>> + PRFileDesc *model; >>> + PRFileDesc *pr_fd; >> >> pr_fd here is materially different from port->pr_fd, no? As in, one is >> the NSS raw TCP fd while the other is the SSL fd, right? Maybe we >> should use two different variable names to try and make sure they don't >> get confused? Might even set this to NULL after we are done with it >> too.. Then again, I see later on that when we do the dance with the >> 'model' PRFileDesc that we just use the same variable- maybe we should >> do that? That is, just get rid of this 'pr_fd' and use port->pr_fd >> always? > > Hmm, I think you're right. I will try that for the next patchset version. >> Similar comments to the backend code- should we just always use >> conn->pr_fd? Or should we rename pr_fd to something else? > > Renaming is probably not a bad idea, will fix. Both fixed. Additionally, a few other off-list reported issues are also fixed in this version (such as fixing the silly markup doc error and testplan off-by-one etc). -- Daniel Gustafsson https://vmware.com/
Attachment
- v32-0009-nss-Build-infrastructure.patch
- v32-0008-nss-Support-NSS-in-cryptohash.patch
- v32-0007-nss-Support-NSS-in-sslinfo.patch
- v32-0006-nss-Support-NSS-in-pgcrypto.patch
- v32-0005-nss-Documentation.patch
- v32-0004-nss-pg_strong_random-support.patch
- v32-0003-nss-Add-NSS-specific-tests.patch
- v32-0002-Refactor-SSL-testharness-for-multiple-library.patch
- v32-0001-nss-Support-libnss-as-TLS-library-in-libpq.patch
pgsql-hackers by date: