Thread: scram-sha-256 broken with FIPS and OpenSSL 1.0.2
Hi all, Enabling FIPS with OpenSSL 1.0.2 causes direct calls to the SHAXXX routines to fail: "Low level API call to digest SHA256 forbidden in fips mode" This got discussed back in 2018, but I never got back to it: https://www.postgresql.org/message-id/20180911030250.GA27115@paquier.xyz One thing I did not like back in the past patch was that we did not handle failures if one of OpenSSL's call failed, but this can easily be handled by using a trick similar to jsonapi.c to fail hard if that happens. It is worth noting that the low-level SHA routines are not recommended for years in OpenSSL, and that these have been officially marked as deprecated in 3.0.0. So, while the changes in sha2.h don't make this stuff back-patchable per the ABI breakage it introduces, switching sha2_openssl.c to use EVP is a better move in the long term, even if that means that SCRAM+FIPS would not work with PG 10~13, so the attached is something for HEAD, even if this would be possible to do in older releases as the routines used in the attached are available in versions of OpenSSL older than 1.0.1. Any thoughts? -- Michael
Attachment
> On 24 Sep 2020, at 04:53, Michael Paquier <michael@paquier.xyz> wrote: > Enabling FIPS with OpenSSL 1.0.2 causes direct calls to the SHAXXX > routines to fail: > "Low level API call to digest SHA256 forbidden in fips mode" Confirmed by running 1.0.2s-fips with fips_mode=yes in the alg_section in openssl.cnf. > This got discussed back in 2018, but I never got back to it: > https://www.postgresql.org/message-id/20180911030250.GA27115@paquier.xyz > > One thing I did not like back in the past patch was that we did not > handle failures if one of OpenSSL's call failed, but this can easily > be handled by using a trick similar to jsonapi.c to fail hard if that > happens. > > It is worth noting that the low-level SHA routines are not recommended > for years in OpenSSL, and that these have been officially marked as > deprecated in 3.0.0. So, while the changes in sha2.h don't make this > stuff back-patchable per the ABI breakage it introduces, switching > sha2_openssl.c to use EVP is a better move in the long term, Agreed. Commit 5ff4a67f63f [0] moved contrib/pgcrypto from using low-level functions for pretty much exactly the same reasons: they are advised against and break FIPS. With your patch I can run the SSL tests successfully both with and without FIPS. I'm in favor of moving to the EVP API. > ..even if > that means that SCRAM+FIPS would not work with PG 10~13, so the > attached is something for HEAD, even if this would be possible to do > in older releases as the routines used in the attached are available > in versions of OpenSSL older than 1.0.1. Thats kind of a shame, but given the low volume of reports to -bugs and -hackers, the combination SCRAM and FIPS might not be all too common. Since OpenSSL FIPS is EOL it might also not increase until 3.0.0 comes with FIPS certification? If we really want to support it (which would require more evidence of it being a problem IMO), using the non-OpenSSL sha256 code would be one option I guess? cheers ./daniel [0] https://postgr.es/m/561274F1.1030000@iki.fi
On 24/09/2020 17:21, Daniel Gustafsson wrote: > If we really want to support it (which would require more evidence of it being > a problem IMO), using the non-OpenSSL sha256 code would be one option I guess? That would technically work, but wouldn't it make the product as whole not FIPS compliant? I'm not a FIPS lawyer, but as I understand it the point of FIPS is that all the crypto code is encapsulated in a certified module. Having your own SHA-256 implementation would defeat that. - Heikki
> On 24 Sep 2020, at 18:21, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 24/09/2020 17:21, Daniel Gustafsson wrote: >> If we really want to support it (which would require more evidence of it being >> a problem IMO), using the non-OpenSSL sha256 code would be one option I guess? > > That would technically work, but wouldn't it make the product as whole not FIPS compliant? I'm not a FIPS lawyer, but asI understand it the point of FIPS is that all the crypto code is encapsulated in a certified module. Having your own SHA-256implementation would defeat that. Doh, of course, I blame a lack of caffeine this afternoon. Having a private local sha256 implementation using the EVP_* API inside scram-common would maintain FIPS compliance and ABI compatibility, but would also be rather ugly. cheers ./daniel
On 2020-09-24 18:21, Heikki Linnakangas wrote: > That would technically work, but wouldn't it make the product as whole > not FIPS compliant? I'm not a FIPS lawyer, but as I understand it the > point of FIPS is that all the crypto code is encapsulated in a certified > module. Having your own SHA-256 implementation would defeat that. Depends on what one considers to be covered by FIPS. The entire rest of SCRAM is custom code, so running it on top of the world's greatest SHA-256 implementation isn't going to make the end product any more trustworthy. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Sep 24, 2020 at 1:57 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > Depends on what one considers to be covered by FIPS. The entire rest of > SCRAM is custom code, so running it on top of the world's greatest > SHA-256 implementation isn't going to make the end product any more > trustworthy. I mean, the issue here, as is so often the case, is not what is actually more secure, but what meets the terms of some security standard. At least in the US, FIPS 140-2 compliance is a reasonably common need, so if we can make it easier for people who have that need to be compliant, they are more likely to use PostgreSQL, which seems like something that we should want. Our opinions about that standard do not matter to the users who are legally required to comply with it; the opinions of their lawyers and auditors do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On 24 Sep 2020, at 21:22, Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Sep 24, 2020 at 1:57 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> Depends on what one considers to be covered by FIPS. The entire rest of >> SCRAM is custom code, so running it on top of the world's greatest >> SHA-256 implementation isn't going to make the end product any more >> trustworthy. > > I mean, the issue here, as is so often the case, is not what is > actually more secure, but what meets the terms of some security > standard. Correct, IIUC in order to be FIPS compliant all cryptographic modules used must be FIPS certified. > At least in the US, FIPS 140-2 compliance is a reasonably > common need, so if we can make it easier for people who have that need > to be compliant, they are more likely to use PostgreSQL, which seems > like something that we should want. The proposed patch makes SCRAM+FIPS work for 14, question is if we need/want to try and address v10-13. cheers ./daniel
On Thu, Sep 24, 2020 at 06:28:25PM +0200, Daniel Gustafsson wrote: > Doh, of course, I blame a lack of caffeine this afternoon. Having a private > local sha256 implementation using the EVP_* API inside scram-common would > maintain FIPS compliance and ABI compatibility, but would also be rather ugly. Even if we'd try to force our internal implementation of SHA256 on already-released branches instead of the one of OpenSSL, this would be an ABI break for compiled modules expected to work on this released branch as OpenSSL's internal SHA structures don't exactly match with our own implementation (think just about sizeof() or such). -- Michael
Attachment
On Thu, Sep 24, 2020 at 09:44:57PM +0200, Daniel Gustafsson wrote: > On 24 Sep 2020, at 21:22, Robert Haas <robertmhaas@gmail.com> wrote: >> I mean, the issue here, as is so often the case, is not what is >> actually more secure, but what meets the terms of some security >> standard. > > Correct, IIUC in order to be FIPS compliant all cryptographic modules used must > be FIPS certified. /me whitles, thinking about not using src/common/md5.c when building with OpenSSL to actually get a complain if FIPS gets used. >> At least in the US, FIPS 140-2 compliance is a reasonably >> common need, so if we can make it easier for people who have that need >> to be compliant, they are more likely to use PostgreSQL, which seems >> like something that we should want. > > The proposed patch makes SCRAM+FIPS work for 14, question is if we need/want to > try and address v10-13. Unfortunately, I don't have a good answer for that, except for the answers involving an ABI breakage. FWIW, the only users of those APIs I can find in the open wild are pgpool, which actually bundles a copy of the code in src/common/ so it does not matter, and pgbouncer, that uses a different compatibility layer to make the code compilable: https://sources.debian.org/src/pgbouncer/1.14.0-1/include/common/postgres_compat.h/?hl=26#L26 But it is not really possible to say if there is any closed code relying on that, so I'd like to fix that just on HEAD, about which I guess there would be no objections? -- Michael
Attachment
On Fri, Sep 25, 2020 at 12:19:44PM +0900, Michael Paquier wrote: > Even if we'd try to force our internal implementation of SHA256 on > already-released branches instead of the one of OpenSSL, this would be > an ABI break for compiled modules expected to work on this released > branch as OpenSSL's internal SHA structures don't exactly match with > our own implementation (think just about sizeof() or such). Well, we could as well add one extra SHA API layer pointing to the EVP structures and APIs with new names, leaving the original ones in place, and then have SCRAM use the new ones, but I'd rather not go down that road for the back-branches. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Fri, Sep 25, 2020 at 12:19:44PM +0900, Michael Paquier wrote: >> Even if we'd try to force our internal implementation of SHA256 on >> already-released branches instead of the one of OpenSSL, this would be >> an ABI break for compiled modules expected to work on this released >> branch as OpenSSL's internal SHA structures don't exactly match with >> our own implementation (think just about sizeof() or such). > Well, we could as well add one extra SHA API layer pointing to the EVP > structures and APIs with new names, leaving the original ones in > place, and then have SCRAM use the new ones, but I'd rather not go > down that road for the back-branches. Given the tiny number of complaints to date, it seems sufficient to me to deal with this in HEAD. regards, tom lane
On 2020-09-24 21:44, Daniel Gustafsson wrote: >> On 24 Sep 2020, at 21:22, Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Thu, Sep 24, 2020 at 1:57 PM Peter Eisentraut >> <peter.eisentraut@2ndquadrant.com> wrote: >>> Depends on what one considers to be covered by FIPS. The entire rest of >>> SCRAM is custom code, so running it on top of the world's greatest >>> SHA-256 implementation isn't going to make the end product any more >>> trustworthy. >> >> I mean, the issue here, as is so often the case, is not what is >> actually more secure, but what meets the terms of some security >> standard. > > Correct, IIUC in order to be FIPS compliant all cryptographic modules used must > be FIPS certified. As I read FIPS 140-2, it just specifies what must be true of cryptographic modules that claim to follow that standard, it doesn't say that all cryptographic activity in an application or platform must only use such modules. (Notably, it doesn't even seem to define "cryptographic".) The latter may well be a requirement of a user or customer on top of the actual standard. However, again, the SCRAM implementation would already appear to fail that requirement because it uses a custom HMAC implementation, and HMAC is listed in FIPS 140-2 as a covered algorithm. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2020-09-24 21:44, Daniel Gustafsson wrote: >> Correct, IIUC in order to be FIPS compliant all cryptographic modules used must >> be FIPS certified. > As I read FIPS 140-2, it just specifies what must be true of > cryptographic modules that claim to follow that standard, it doesn't say > that all cryptographic activity in an application or platform must only > use such modules. (Notably, it doesn't even seem to define > "cryptographic".) The latter may well be a requirement of a user or > customer on top of the actual standard. Hm ... AFAICT, the intent of the "FIPS mode" in Red Hat's implementation, and probably other Linux distros, is exactly that thou shalt not use any non-FIPS-approved crypto code. By your reading above, there would be no need at all for a kernel-level enforcement switch. > However, again, the SCRAM > implementation would already appear to fail that requirement because it > uses a custom HMAC implementation, and HMAC is listed in FIPS 140-2 as a > covered algorithm. Ugh. But is there any available FIPS-approved library code that could be used instead? regards, tom lane
On Fri, Sep 25, 2020 at 01:36:44AM -0400, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> However, again, the SCRAM >> implementation would already appear to fail that requirement because it >> uses a custom HMAC implementation, and HMAC is listed in FIPS 140-2 as a >> covered algorithm. > > Ugh. But is there any available FIPS-approved library code that could be > used instead? That's a good point, and I think that this falls down to use OpenSSL's HMAC_* interface for this job when building with OpenSSL: https://www.openssl.org/docs/man1.1.1/man3/HMAC.html Worth noting that these have been deprecated in 3.0.0 as per the rather-recent commit dbde472, where they recommend the use of EVP_MAC_*() instead. -- Michael
Attachment
On Fri, Sep 25, 2020 at 03:56:53PM +0900, Michael Paquier wrote: > On Fri, Sep 25, 2020 at 01:36:44AM -0400, Tom Lane wrote: > > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > >> However, again, the SCRAM > >> implementation would already appear to fail that requirement because it > >> uses a custom HMAC implementation, and HMAC is listed in FIPS 140-2 as a > >> covered algorithm. > > > > Ugh. But is there any available FIPS-approved library code that could be > > used instead? > > That's a good point, and I think that this falls down to use OpenSSL's > HMAC_* interface for this job when building with OpenSSL: > https://www.openssl.org/docs/man1.1.1/man3/HMAC.html > > Worth noting that these have been deprecated in 3.0.0 as per the > rather-recent commit dbde472, where they recommend the use of > EVP_MAC_*() instead. Would a FIPS server only be able to talk to a FIPS client, or would our internal code produce the same output? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Bruce, In my experience, any client is permitted to connect to FIPS140-2 compliant server. I set this up when I worked at SSA,at management’s request. — Jay Sent from my iPad > On Sep 25, 2020, at 3:13 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Fri, Sep 25, 2020 at 03:56:53PM +0900, Michael Paquier wrote: >>> On Fri, Sep 25, 2020 at 01:36:44AM -0400, Tom Lane wrote: >>> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >>>> However, again, the SCRAM >>>> implementation would already appear to fail that requirement because it >>>> uses a custom HMAC implementation, and HMAC is listed in FIPS 140-2 as a >>>> covered algorithm. >>> >>> Ugh. But is there any available FIPS-approved library code that could be >>> used instead? >> >> That's a good point, and I think that this falls down to use OpenSSL's >> HMAC_* interface for this job when building with OpenSSL: >> https://www.openssl.org/docs/man1.1.1/man3/HMAC.html >> >> Worth noting that these have been deprecated in 3.0.0 as per the >> rather-recent commit dbde472, where they recommend the use of >> EVP_MAC_*() instead. > > Would a FIPS server only be able to talk to a FIPS client, or would our > internal code produce the same output? > > -- > Bruce Momjian <bruce@momjian.us> https://momjian.us > EnterpriseDB https://enterprisedb.com > > The usefulness of a cup is in its emptiness, Bruce Lee > > >
On Fri, Sep 25, 2020 at 03:38:22PM -0400, John Scalia wrote: > Bruce, > > In my experience, any client is permitted to connect to FIPS140-2 compliant server. I set this up when I worked at SSA,at management’s request. My question is whether the hash output would match if using different code. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
FIPS only specifies which algorithms are approved for use on it. For instance, MD-5 is NOT approved at all under FIPS. Iwould say any algorithm should produce the same result regardless of where it is run. BTW, on Redhat servers, the firstalgorithm listed for use with SSH is MD-5. This causes the sshd daemon to abort when FIPS is enabled and that configfile has not been edited. So, you can no longer connect with an SSH client as the daemon isn’t running. Ask me howI know this. Sent from my iPad > On Sep 25, 2020, at 3:39 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Fri, Sep 25, 2020 at 03:38:22PM -0400, John Scalia wrote: >> Bruce, >> >> In my experience, any client is permitted to connect to FIPS140-2 compliant server. I set this up when I worked at SSA,at management’s request. > > My question is whether the hash output would match if using different > code. > > -- > Bruce Momjian <bruce@momjian.us> https://momjian.us > EnterpriseDB https://enterprisedb.com > > The usefulness of a cup is in its emptiness, Bruce Lee >
On Fri, Sep 25, 2020 at 12:27:03AM -0400, Tom Lane wrote: > Given the tiny number of complaints to date, it seems sufficient to me > to deal with this in HEAD. Thanks. I have done more tests with the range of OpenSSL versions we support on HEAD, and applied this one. I have noticed that the previous patch forgot two fail-and-abort code paths as of EVP_DigestInit_ex() and EVP_DigestUpdate(). -- Michael
Attachment
On Mon, Sep 28, 2020 at 12:55:06PM +0900, Michael Paquier wrote: > Thanks. I have done more tests with the range of OpenSSL versions we > support on HEAD, and applied this one. I have noticed that the > previous patch forgot two fail-and-abort code paths as of > EVP_DigestInit_ex() and EVP_DigestUpdate(). As this got reverted with fe0a1dc because of the lack of correct error reporting in libpq, I have restarted this work from scratch, and finished with the set of two patches attached. 0001 is a redesign of the APIs we use for the SHA2 implementations. The origin of the problem is that we cannot have a control of the memory context used by OpenSSL to allocate any of the EVP-related data, so we need to add some routines to be able to allocate and free the SHA2 contexts, basically. We have too many routines to do the work now, so I reduced the whole to 5, instead of 12 originally (this number would become 20 if we'd add the free/alloc routines for each SHA2 part), giving the following structure: /* Context Structures for SHA224/256/384/512 */ typedef enum { PG_SHA224 = 0, PG_SHA256, PG_SHA384, PG_SHA512 } pg_sha2_type; typedef struct pg_sha2_ctx { pg_sha2_type type; /* private area used by each SHA2 implementation */ void *data; } pg_sha2_ctx; extern pg_sha2_ctx *pg_sha2_create(pg_sha2_type type); extern int pg_sha2_init(pg_sha2_ctx *ctx); extern int pg_sha2_update(pg_sha2_ctx *ctx, const uint8 *data, size_t len); extern int pg_sha2_final(pg_sha2_ctx *ctx, uint8 *dest); extern void pg_sha2_free(pg_sha2_ctx *ctx); A huge advantage of this approach is that the keep all the details of the SHA2 implementations within each file, so we have nothing related to OpenSSL in sha2.h, which is rather clean. All the internal structures part of the fallback implementations are also moved into their own file sha2.c. I have made the choice to limit the number of ifdef FRONTEND in the files of src/common/ for clarity, meaning that the callers of those routines can handle errors as they want, in the frontend and the backend. The areas making use of the SHA2 implementations are SCRAM (libpq and backend) and the checksum manifests, so this has required some rework of the existing interfaces to pass down errors correctly, but at the end the changes needed in libpq and pg_verifybackup are straight-forward. With 0001 in place, switching the SHA2 implementation of OpenSSL to use EVP is straight-forward, as the only thing that's actually needed here is to put in place a callback to clean up the EVP contexts allocated by OpenSSL. This is rather similar to what we do in pgcrypto in some ways, but that's actually simpler and I made things so as we only track down the EVP_MD_CTX members to free on abort. I'll add that to the next CF for review. -- Michael
Attachment
On 14/10/2020 06:29, Michael Paquier wrote: > With 0001 in place, switching the SHA2 implementation of OpenSSL to > use EVP is straight-forward, as the only thing that's actually needed > here is to put in place a callback to clean up the EVP contexts > allocated by OpenSSL. This is rather similar to what we do in > pgcrypto in some ways, but that's actually simpler and I made things > so as we only track down the EVP_MD_CTX members to free on abort. Since this is going to be core backend code (and also frontend), we don't need to use the generic reource owner callback mechanism, we could add a built-in ResourceOwnerData field and functions in resowner.c. The callback mechanism is a bit clunky. - Heikki
On Wed, Oct 14, 2020 at 10:40:12AM +0300, Heikki Linnakangas wrote: > Since this is going to be core backend code (and also frontend), we don't > need to use the generic reource owner callback mechanism, we could add a > built-in ResourceOwnerData field and functions in resowner.c. The callback > mechanism is a bit clunky. Sure, thanks. I wanted to keep things isolated in sha2_openssl.c as that's something specific to the implementation. Thinking more about it, your suggestion makes a lot of sense in the long-term by including MD5 and HMAC in the picture. These also go through EVP in OpenSSL, and we are kind of incorrect currently to not use the OpenSSL flavor if available (MD5 is not authorized in FIPS, but we still allow it to be used with the in-core implementation). -- Michael
Attachment
On Wed, Oct 14, 2020 at 05:18:51PM +0900, Michael Paquier wrote: > Sure, thanks. I wanted to keep things isolated in sha2_openssl.c as > that's something specific to the implementation. Thinking more about > it, your suggestion makes a lot of sense in the long-term by including > MD5 and HMAC in the picture. These also go through EVP in OpenSSL, > and we are kind of incorrect currently to not use the OpenSSL flavor > if available (MD5 is not authorized in FIPS, but we still allow it to > be used with the in-core implementation). I got my hands on that, and this proves to simplify a lot things. In bonus, attached is a 0003 that cleans up some code in pgcrypto so as it uses the in-core resowner facility to handle EVP contexts. -- Michael
Attachment
On Thu, Oct 15, 2020 at 03:56:21PM +0900, Michael Paquier wrote: > I got my hands on that, and this proves to simplify a lot things. In > bonus, attached is a 0003 that cleans up some code in pgcrypto so as > it uses the in-core resowner facility to handle EVP contexts. This conflicted on HEAD with pgcrypto. Please find attached a rebased set. -- Michael
Attachment
On Thu, Nov 05, 2020 at 03:41:23PM +0900, Michael Paquier wrote: > This conflicted on HEAD with pgcrypto. Please find attached a rebased > set. I got to think more about this stuff and attached is a new patch set that redesigns the generic interface used for the crypto hash functions, in order to use the same entry point at the end for SHA2, SHA1, MD5 or even HMAC. This is part of 0001: - Introduction of a single file called cryptohash[_openssl].c, which includes five functions to create, initialize, update, finalize and free a crypto hash context. The attached does the work for SHA2. - The fallback implementations are in their own file in src/common/, and get included in cryptohash.c. cryptohash_openssl.c is much more simple as it needs to use EVP for everything. - Adding a new crypto function in the set is simple once this is done, as a type needs to be added with the correct options plugged in. 0002 and 0003 don't have any changes. I think that we could also rename the existing cryptohashes.c to crypohashfuncs.c to be more consistent, but I have left that out for now. -- Michael
Attachment
> On 13 Nov 2020, at 04:14, Michael Paquier <michael@paquier.xyz> wrote: > I got to think more about this stuff and attached is a new patch set > that redesigns the generic interface used for the crypto hash > functions I've spent some time on this, and the gist of the patchset is clearly something we want to do: move to the EVP API and ensure that we don't drop any contexts. IIUC, this patchset moves the allocation of the context inside the API rather than having the caller be responsible for providing the memory (and thus be able to use the stack). This in turn changes the API to returning int rather than being void. As an effect of this we get the below types of statements that aren't easy on the eyes: + /* + * Calculate ClientSignature. Note that we don't log directly a failure + * here even when processing the calculations as this could involve a mock + * authentication. + */ + if (scram_HMAC_init(&ctx, state->StoredKey, SCRAM_KEY_LEN) < 0 || + scram_HMAC_update(&ctx, + state->client_first_message_bare, + strlen(state->client_first_message_bare)) < 0 || + scram_HMAC_update(&ctx, ",", 1) < 0 || + scram_HMAC_update(&ctx, + state->server_first_message, + strlen(state->server_first_message)) < 0 || + scram_HMAC_update(&ctx, ",", 1) < 0 || + scram_HMAC_update(&ctx, + state->client_final_message_without_proof, + strlen(state->client_final_message_without_proof)) < 0 || + scram_HMAC_final(ClientSignature, &ctx) < 0) + elog(ERROR, "could not calculate client signature"); This seems like a complication which brings little benefit since the only real errorcase is OOM in creating the context? The built-in crypto support is designed to never fail, and reading the OpenSSL code the only failure cases are ENGINE initialization (which we don't do) and memory allocation. Did you consider using EVP_MD_CTX_init instead which place the memory allocation responsibility with the caller? Keeping this a void API and leaving the caller to decide on memory allocation would make the API a lot simpler IMHO. Are there other error cases and/or errorpaths you're considering that I'm missing here? If it's just OOM on allocation it seems a high price to pay. +#ifndef _PG_CRYPTOHASH_H_ +#define _PG_CRYPTOHASH_H_ This should probably be CRYPTOHASH_H to be consistent? + status = EVP_DigestInit_ex((EVP_MD_CTX *) ctx->data, + EVP_sha224(), NULL); Since we always use the default implementation and never select an ENGINE, why not use EVP_DigestInit instead? +/* Include internals of each implementation here */ +#include "sha2.c" Do we really want to implement this by including a .c file? Are there no other options you see? > I think that we could also rename the existing cryptohashes.c to > crypohashfuncs.c to be more consistent, but I have left that out for now. +1. That can be done as a separate patch submission though as it's unrelated. cheers ./daniel
On Thu, Nov 19, 2020 at 02:05:35PM +0100, Daniel Gustafsson wrote: > IIUC, this patchset moves the allocation of the context inside the API rather > than having the caller be responsible for providing the memory (and thus be > able to use the stack). This in turn changes the API to returning int rather > than being void. > > As an effect of this we get the below types of statements that aren't easy on > the eyes: > > [... code ...] We use this style in other places of the code. Slitting each part can also be done, if that's easier for the eye. Personal taste likely ;) > This seems like a complication which brings little benefit since the only real > errorcase is OOM in creating the context? The built-in crypto support is > designed to never fail, and reading the OpenSSL code the only failure cases are > ENGINE initialization (which we don't do) and memory allocation. Did you > consider using EVP_MD_CTX_init instead which place the memory allocation > responsibility with the caller? Keeping this a void API and leaving the caller > to decide on memory allocation would make the API a lot simpler IMHO. Yes. That's actually the main take and why EVP callers *have* to let OpenSSL do the allocation: we cannot know the size of EVP_MD_CTX in advance in newer versions, and OpenSSL visibly want to keep the right to do extra allocations if necessary within what's set in the context. This is based on my reads of the upstream code as of crypto/evp/digest.c and crypto/evp/evp.h. evp.h includes env_md_ctx_st for OpenSSL <= 1.0.2, but this has been moved to evp_local.h, header not installed, in newer versions which refers only to the internals of the thing, with ossl_typ.h still making EVP_MD_CTX point to env_md_ctx_st, but it becomes an incomplete type. On top of that, knowing that we still need to deal with the OOM failure handling and pass down the error to the callers playing with SHA2, it feels like the most consistent API to me for the frontend and the backend. If you want, it is easy to test that with stuff like that in cryptohashes.c: + EVP_MD_CTX *ctx; [...] + ctx = palloc(sizeof(EVP_MD_CTX)); + + EVP_MD_CTX_init(ctx); + + EVP_DigestInit_ex(ctx, EVP_sha256(), NULL); + EVP_DigestUpdate(ctx, data, len); + EVP_DigestFinal_ex(ctx, buf, 0); FWIW, this thread has dealt with the problem for pgcrypto: https://www.postgresql.org/message-id/20160701094159.GA17882@msg.df7cb.de > +#ifndef _PG_CRYPTOHASH_H_ > +#define _PG_CRYPTOHASH_H_ > This should probably be CRYPTOHASH_H to be consistent? cryptohash.h sounds like a header file we could find somewhere else, hence the extra PG_. > + status = EVP_DigestInit_ex((EVP_MD_CTX *) ctx->data, > + EVP_sha224(), NULL); > Since we always use the default implementation and never select an ENGINE, why > not use EVP_DigestInit instead? Indeed, let's do that. > +/* Include internals of each implementation here */ > +#include "sha2.c" > Do we really want to implement this by including a .c file? Are there no other > options you see? That was the least intrusive option I could figure out. Two other options I have thought about: - Compile those fallback implementations independently and publish the internals in a separate header, but nobody is going to need that if we have a generic entry point. - Include the internals of each implementation in cryptohash.c, but this bloats the file with more implementations added (HMAC and MD5 still need to be added on top of SHA2), and it messes up with the existing copyright entries. So splitting and just including them sounds like the cleanest option of the set. -- Michael
Attachment
> On 20 Nov 2020, at 01:33, Michael Paquier <michael@paquier.xyz> wrote: >> This seems like a complication which brings little benefit since the only real >> errorcase is OOM in creating the context? The built-in crypto support is >> designed to never fail, and reading the OpenSSL code the only failure cases are >> ENGINE initialization (which we don't do) and memory allocation. Did you >> consider using EVP_MD_CTX_init instead which place the memory allocation >> responsibility with the caller? Keeping this a void API and leaving the caller >> to decide on memory allocation would make the API a lot simpler IMHO. > > Yes. That's actually the main take and why EVP callers *have* to let > OpenSSL do the allocation: we cannot know the size of EVP_MD_CTX in > advance in newer versions, Yeah, there is that. > knowing that we still need to deal with the OOM failure handling > and pass down the error to the callers playing with SHA2, it feels > like the most consistent API to me for the frontend and the backend. For the backend I'd prefer an API where the allocation worked like palloc, if not then it's a straight exit through the giftshop. But if we want an API for both the frontend and backend, I guess this is what we'll have to do. >> +#ifndef _PG_CRYPTOHASH_H_ >> +#define _PG_CRYPTOHASH_H_ >> This should probably be CRYPTOHASH_H to be consistent? > > cryptohash.h sounds like a header file we could find somewhere else, > hence the extra PG_. Ok, then at least I think we should use PG_CRYPTOHASH_H to be more consistent with the tree, and since leading underscores in C are problematic spec-wise. >> +/* Include internals of each implementation here */ >> +#include "sha2.c" >> Do we really want to implement this by including a .c file? Are there no other >> options you see? > > That was the least intrusive option I could figure out. Two other > options I have thought about: > - Compile those fallback implementations independently and publish the > internals in a separate header, but nobody is going to need that if we > have a generic entry point. > - Include the internals of each implementation in cryptohash.c, but > this bloats the file with more implementations added (HMAC and MD5 > still need to be added on top of SHA2), and it messes up with the > existing copyright entries. > So splitting and just including them sounds like the cleanest option > of the set. Personally I think the first option of using an internal header seems cleaner, but MMV so I'll leave it to others to weigh in too. cheers ./daniel
On Fri, Nov 20, 2020 at 11:17:44PM +0100, Daniel Gustafsson wrote: > On 20 Nov 2020, at 01:33, Michael Paquier <michael@paquier.xyz> wrote: >> knowing that we still need to deal with the OOM failure handling >> and pass down the error to the callers playing with SHA2, it feels >> like the most consistent API to me for the frontend and the backend. > > For the backend I'd prefer an API where the allocation worked like palloc, if > not then it's a straight exit through the giftshop. But if we want an API for > both the frontend and backend, I guess this is what we'll have to do. The fallback implementations can somewhat achieve that as we know all the internals of the structures used. > Ok, then at least I think we should use PG_CRYPTOHASH_H to be more consistent > with the tree, and since leading underscores in C are problematic spec-wise. Makes sense. Will fix. >> That was the least intrusive option I could figure out. Two other >> options I have thought about: >> - Compile those fallback implementations independently and publish the >> internals in a separate header, but nobody is going to need that if we >> have a generic entry point. >> - Include the internals of each implementation in cryptohash.c, but >> this bloats the file with more implementations added (HMAC and MD5 >> still need to be added on top of SHA2), and it messes up with the >> existing copyright entries. >> So splitting and just including them sounds like the cleanest option >> of the set. > > Personally I think the first option of using an internal header seems cleaner, > but MMV so I'll leave it to others to weigh in too. What you meant and what I meant was slightly different here. I meant publishing a header in src/include/common/ that would get installed, and I'd rather avoid that. And you mean to have the header for local consumption in src/common/. I would be fine with your third option as well. Your suggestion is more consistent with what we do for the rest of src/common/ and libpq actually. So I don't mind switching to that. -- Michael
Attachment
On Sat, Nov 21, 2020 at 10:19:42AM +0900, Michael Paquier wrote: > What you meant and what I meant was slightly different here. I meant > publishing a header in src/include/common/ that would get installed, > and I'd rather avoid that. And you mean to have the header for local > consumption in src/common/. I would be fine with your third option as > well. Your suggestion is more consistent with what we do for the rest > of src/common/ and libpq actually. So I don't mind switching to > that. I got to look at your suggestion, and finished with the attached which is pretty close my previous set, except that MSVC scripts as well as the header includes needed a slight refresh. Please note that the OpenSSL docs tell that EVP_DigestInit() is obsolete and that applications should just use EVP_DigestInit_ex(), so I have kept the original: https://www.openssl.org/docs/man1.1.1/man3/EVP_DigestInit.html The PG_CRYPTOHASH macro in cryptohash.h has been changed as you suggested. What do you think? -- Michael
Attachment
> On 24 Nov 2020, at 11:52, Michael Paquier <michael@paquier.xyz> wrote: > I got to look at your suggestion, and finished with the attached which > is pretty close my previous set, except that MSVC scripts as well as > the header includes needed a slight refresh. Looks good, nothing major sticks out. I'm not excited about all the allocations needed here now, but it seems the only optipn. > Please note that the OpenSSL docs tell that EVP_DigestInit() is > obsolete and that applications should just use EVP_DigestInit_ex(), so > I have kept the original: > https://www.openssl.org/docs/man1.1.1/man3/EVP_DigestInit.html Fair enough. > What do you think? A few comments in no particular order: + if (scram_HMAC_init(&ctx, state->ServerKey, SCRAM_KEY_LEN) < 0 || + scram_HMAC_update(&ctx, + state->client_first_message_bare, ..snip.. + scram_HMAC_final(ServerSignature, &ctx) < 0) + { + elog(ERROR, "could not calculate server signature"); + } Some of these long if-statements omit the {} around the elog, while some (like this one) don't. I think it makes sense from a readability POV to use the braces. + * pg_cryptohash_create + * + * Allocate a hash context. Returns NULL on failure. This comment should perhaps be amended to specify that it returns NULL on failure in the frontend, in the backend it won't return on error. + case PG_SHA224: + pg_sha224_update((pg_sha224_ctx *) ctx->data, data, len); + break; + case PG_SHA256: + pg_sha256_update((pg_sha256_ctx *) ctx->data, data, len); + break; Would codepaths like these become more readable if pg_cryptohash_ctx used a union with shaxxx_ctx's rather than a void pointer for the data part? + /* Ditto for EVP contexts */ + while (ResourceArrayGetAny(&(owner->evparr), &foundres)) + { + if (isCommit) + PrintEVPLeakWarning(foundres); +#ifdef USE_OPENSSL + EVP_MD_CTX_destroy((EVP_MD_CTX *) DatumGetPointer(foundres)); +#endif + ResourceOwnerForgetEVP(owner, foundres); + } Since the cryptohash support is now generalized behind an abstraction layer, wouldn't it make sense to roll the resource ownership there as well kind of like how JIT is handled? It would make it easier to implement TLS backend support, and we won't have to inject OpenSSL headers here. cheers ./daniel
On Mon, Nov 30, 2020 at 01:43:24PM +0100, Daniel Gustafsson wrote: > Looks good, nothing major sticks out. Thanks for the review. > I'm not excited about all the allocations needed here now, but it > seems the only option. Yeah, OpenSSL forces a bit our hand here I am afraid.. > Some of these long if-statements omit the {} around the elog, while some (like > this one) don't. I think it makes sense from a readability POV to use the > braces. Agreed. > + * pg_cryptohash_create > + * > + * Allocate a hash context. Returns NULL on failure. > This comment should perhaps be amended to specify that it returns NULL on > failure in the frontend, in the backend it won't return on error. Okay. Let's be precise. > + case PG_SHA224: > + pg_sha224_update((pg_sha224_ctx *) ctx->data, data, len); > + break; > + case PG_SHA256: > + pg_sha256_update((pg_sha256_ctx *) ctx->data, data, len); > + break; > Would codepaths like these become more readable if pg_cryptohash_ctx used a > union with shaxxx_ctx's rather than a void pointer for the data part? Hmm. I am not sure that this would help much, because we'd still need to cast to the local structures in this case as the pg_shaXXX_ctx are local to sha2.c, and we still need to keep some void* in cryptohash.h. > Since the cryptohash support is now generalized behind an abstraction layer, > wouldn't it make sense to roll the resource ownership there as well kind of > like how JIT is handled? It would make it easier to implement TLS backend > support, and we won't have to inject OpenSSL headers here. So, you are referring here about using a new API in the abstraction layer. This makes sense. What about naming that pg_cryptohash_context_free(void *)? -- Michael
Attachment
> On 30 Nov 2020, at 14:13, Michael Paquier <michael@paquier.xyz> wrote: > On Mon, Nov 30, 2020 at 01:43:24PM +0100, Daniel Gustafsson wrote: >> Since the cryptohash support is now generalized behind an abstraction layer, >> wouldn't it make sense to roll the resource ownership there as well kind of >> like how JIT is handled? It would make it easier to implement TLS backend >> support, and we won't have to inject OpenSSL headers here. > > So, you are referring here about using a new API in the abstraction > layer. This makes sense. What about naming that > pg_cryptohash_context_free(void *)? Yeah, that's along the lines of what I was thinking of. cheers ./daniel
On Mon, Nov 30, 2020 at 02:29:29PM +0100, Daniel Gustafsson wrote: > Yeah, that's along the lines of what I was thinking of. Hmm. I have looked at that, and thought first about having directly a reference to the resowner directly in pg_cryptohash_ctx, but that's not a good plan for two reasons: - common/cryptohash.h would get knowledge of that, requiring bundling in a bunch of dependencies. - There is no need for that in the non-OpenSSL case. So, instead, I have been thinking about using an extra context layer only for cryptohash_openssl.c with a structure saved as pg_cryptohash_context->data that stores the information about EVP_MD_CTX* and the resource owner. Then, I was thinking about storing directly pg_cryptohash_ctx in the resowner EVP array and just call pg_cryptohash_free() from resowner.c without the need of an extra routine. I have not tested this idea but that should work. What's your take? In parallel, I have spent more time today polishing and reviewing 0001 (indented, adjusted a couple of areas and added also brackets and extra comments as you suggested) and tested it on Linux and Windows, with and without OpenSSL down to 1.0.1, the oldest version supported on HEAD. So I'd like to apply the attached first and sort out the resowner stuff in a next step. -- Michael
Attachment
> On 1 Dec 2020, at 06:38, Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Nov 30, 2020 at 02:29:29PM +0100, Daniel Gustafsson wrote: >> Yeah, that's along the lines of what I was thinking of. > > Hmm. I have looked at that, and thought first about having directly a > reference to the resowner directly in pg_cryptohash_ctx, but that's > not a good plan for two reasons: > - common/cryptohash.h would get knowledge of that, requiring bundling > in a bunch of dependencies. > - There is no need for that in the non-OpenSSL case. > So, instead, I have been thinking about using an extra context layer > only for cryptohash_openssl.c with a structure saved as > pg_cryptohash_context->data that stores the information about > EVP_MD_CTX* and the resource owner. Then, I was thinking about > storing directly pg_cryptohash_ctx in the resowner EVP array and just > call pg_cryptohash_free() from resowner.c without the need of an > extra routine. I have not tested this idea but that should work. > What's your take? That sounds like it would work. Since the cryptohash implementation owns and controls the void *data contents it can store whatever it needs to properly free it. > In parallel, I have spent more time today polishing and reviewing 0001 > (indented, adjusted a couple of areas and added also brackets and > extra comments as you suggested) and tested it on Linux and Windows, > with and without OpenSSL down to 1.0.1, the oldest version supported > on HEAD. So I'd like to apply the attached first and sort out the > resowner stuff in a next step. +1 on separating the API, EVP migration, resowner patches. Reading through the v6 patch I see nothing sticking out and all review comments addressed, +1 on applying that one and then we'll take if from there with the remaining ones in the patchset. cheers ./daniel
On Tue, Dec 01, 2020 at 10:10:45AM +0100, Daniel Gustafsson wrote: > That sounds like it would work. Since the cryptohash implementation owns and > controls the void *data contents it can store whatever it needs to properly > free it. That seems to work properly. I have injected some elog(ERROR) with the attached in some of the SQL functions from cryptohashes.c and the cleanup happens with the correct resowner references. It felt a bit strange to use directly the term EVP in resowner.c once I did this switch, so this is renamed to "CryptoHash" instead. > Reading through the v6 patch I see nothing sticking out and all review comments > addressed, +1 on applying that one and then we'll take if from there with the > remaining ones in the patchset. Thanks. 0001 has been applied and the buildfarm does not complain, so it looks like we are good (I'll take care of any issues, like the one Fujii-san has just reported). Attached are new patches for 0002, the EVP switch. One thing I noticed is that we need to free the backup manifest a bit earlier once we begin to use resource owner in basebackup.c as there is a specific step that may do a double-free. This would not happen when not using OpenSSL or on HEAD. It would be easy to separate the resowner and cryptohash portions of the patch here, but both are tightly linked, so I'd prefer to keep them together. Another thing to note is that this makes 0003 for pgcrypto meaningless, because this would track down only states created by cryptohash_openssl.c, and not directly EVP_MD_CTX. Consistency in the backend code is for the best, and considering that pgcrypto has similar linked lists for ciphers it feels a bit weird to switch only half of it to use something in resowner.c. So, I am not sure if we need to do anything here, and I am discarding this part. -- Michael
Attachment
On Wed, Dec 02, 2020 at 12:03:49PM +0900, Michael Paquier wrote: > Thanks. 0001 has been applied and the buildfarm does not complain, so > it looks like we are good (I'll take care of any issues, like the one > Fujii-san has just reported). Attached are new patches for 0002, the > EVP switch. One thing I noticed is that we need to free the backup > manifest a bit earlier once we begin to use resource owner in > basebackup.c as there is a specific step that may do a double-free. > This would not happen when not using OpenSSL or on HEAD. It would be > easy to separate the resowner and cryptohash portions of the patch > here, but both are tightly linked, so I'd prefer to keep them > together. Attached is a rebased version to take care of the conflicts introduced by 91624c2f. -- Michael
Attachment
> On 3 Dec 2020, at 02:47, Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Dec 02, 2020 at 12:03:49PM +0900, Michael Paquier wrote: >> Thanks. 0001 has been applied and the buildfarm does not complain, so >> it looks like we are good (I'll take care of any issues, like the one >> Fujii-san has just reported). Attached are new patches for 0002, the >> EVP switch. One thing I noticed is that we need to free the backup >> manifest a bit earlier once we begin to use resource owner in >> basebackup.c as there is a specific step that may do a double-free. >> This would not happen when not using OpenSSL or on HEAD. It would be >> easy to separate the resowner and cryptohash portions of the patch >> here, but both are tightly linked, so I'd prefer to keep them >> together. > > Attached is a rebased version to take care of the conflicts introduced > by 91624c2f. This version looks good to me, and builds/tests without any issues. While I didn't try to adapt the libnss patch to the resowner machinery, I don't see any reasons off the cuff why it wouldn't work with the scaffolding provided here. My only question is: +#ifndef FRONTEND + elog(ERROR, "out of memory"); Shouldn't that be an ereport using ERRCODE_OUT_OF_MEMORY? cheers ./daniel
On Thu, Dec 03, 2020 at 10:58:39PM +0100, Daniel Gustafsson wrote: > This version looks good to me, and builds/tests without any issues. While I > didn't try to adapt the libnss patch to the resowner machinery, I don't see any > reasons off the cuff why it wouldn't work with the scaffolding provided here. Based on my read of the code in lib/freebl/, SHA256ContextStr & co hold the context data for SHA2, but are headers like sha256.h installed? I don't know enough of NSS to be able to answer to that. If, like OpenSSL, the context internals are not provided, I think that you could use SHA256_NewContext() and track the allocation with the resource owner callbacks, but doing a palloc() would be much simpler if the context internals are available. > My only question is: > > +#ifndef FRONTEND > + elog(ERROR, "out of memory"); > Shouldn't that be an ereport using ERRCODE_OUT_OF_MEMORY? That makes sense, fixed. I have done more testing across all versions of OpenSSL, and applied this one, meaning that we are done for SHA2. Thanks for the reviews! Now, moving back to MD5.. -- Michael