Thread: Refactoring HMAC in the core code

Refactoring HMAC in the core code

From
Michael Paquier
Date:
Hi all,
(Added Bruce and Daniel in CC:)

$subject has been mentioned a couple of times lately, mainly by me for
the recent cryptohash refactoring that has been done.  We use in the
core code HMAC with SHA256 for SCRAM, but this logic should go through
SSL libraries able to support it (OpenSSL and libnss allow that) so as
requirements like FIPS can be pushed down to any lower-level library
we are building with and not Postgres.

FWIW, I have also bumped into this stuff as being a requirement for
the recent thread about file-level encryption in [1] where the code
makes use of HMAC with SHA512.

So, please find attached a patch set to rework all that.  This
provides a split similar to what I have done recently for cryptohash
functions, with a fallback implementation located as of
src/common/hmac.c, that depends itself on the fallback implementations
of cryptohash functions.  The OpenSSL part is done hmac_openssl.c.

There are five APIs to be able to plug in HMAC implementations to
create, initialize, update, finalize and free a HMAC context, in a
fashion similar to cryptohashes.

Regarding OpenSSL, upstream has changed lately the way it is possible
to control HMACs.  3.0.0 has introduced a new set of APIs, with
compatibility macros for older versions, as mentioned here:
https://www.openssl.org/docs/manmaster/man3/EVP_MAC_CTX_new.html
The new APIs are named EVP_MAC_CTX_new() and such.

I think that this is a bit too new to use though, as we need to
support OpenSSL down to 1.0.1 on HEAD and because there are
compatibility macros.  So instead I have decided to rely on the older
interface based on HMAC_Init_ex() and co:
https://www.openssl.org/docs/manmaster/man3/HMAC.html

After that there is another point to note.  In 1.1.0, any consumers of
HMAC *have* to let OpenSSL allocate the HMAC context, like
cryptohashes because the internals of the HMAC context are only known
to OpenSSL.  In 1.0.2 and older versions, it is possible to have
access to HMAC_CTX.  This requires an extra tweak in hmac_openssl.c
where we need to {m,p}alloc by ourselves instead of calling
HMAC_CTX_new() for 1.1.0 and 1.1.1 but some extra configure switches
are able to do the trick.  That means that we could use resowners only
when building with OpenSSL >= 1.1.0, and not for older versions but
note that the patch uses resowners anyway, as a matter of simplicity.
As the changes are local to a single file, that's easy enough to
follow and update.  It would be easy enough to rip off this code once
support for older OpenSSL versions is removed.

Please note that I have added code that should be enough for the
compilation on Windows, but I have not taken the time to check that.
I have checked that things compiled and that check-world passes
with and without OpenSSL 1.1.1 on Linux though, so I guess that I have
not messed up too badly.  This stuff requires much more tests, like
making sure that we are able to connect to PG correctly with SCRAM
when using combinations like libpq based on OpenSSL and the backend
using the fallback, or just check the consistency of the results of
computations with SQL functions or such.  An extra thing that can be
done is to clean up pgcrypto's px-hmac.c but this also requires SHA1
in cryptohash.c, something that I have submitted separately in [2].
So this could just be done later.  This patch has updated the code of
SCRAM so as we don't use anymore all the SCRAM/HMAC business but the
generic HMAC routines instead for this work.

Thoughts are welcome.  I am adding that to the next CF.

[1]: https://www.postgresql.org/message-id/X9lhi1ht04I+v/rV@paquier.xyz
[2]: https://commitfest.postgresql.org/31/2868/

Thanks,
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Bruce Momjian
Date:
On Wed, Dec 16, 2020 at 04:17:50PM +0900, Michael Paquier wrote:
> Please note that I have added code that should be enough for the
> compilation on Windows, but I have not taken the time to check that.
> I have checked that things compiled and that check-world passes
> with and without OpenSSL 1.1.1 on Linux though, so I guess that I have
> not messed up too badly.  This stuff requires much more tests, like
> making sure that we are able to connect to PG correctly with SCRAM
> when using combinations like libpq based on OpenSSL and the backend
> using the fallback, or just check the consistency of the results of
> computations with SQL functions or such.  An extra thing that can be
> done is to clean up pgcrypto's px-hmac.c but this also requires SHA1
> in cryptohash.c, something that I have submitted separately in [2].
> So this could just be done later.  This patch has updated the code of
> SCRAM so as we don't use anymore all the SCRAM/HMAC business but the
> generic HMAC routines instead for this work.
> 
> Thoughts are welcome.  I am adding that to the next CF.

Very nice.  Are you planning to apply this soon?  If so, I will delay
my key management patch until this is applied.  If not, I will update my
HMAC call when you apply this.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Thu, Dec 17, 2020 at 12:53:06PM -0500, Bruce Momjian wrote:
> Very nice.  Are you planning to apply this soon?  If so, I will delay
> my key management patch until this is applied.  If not, I will update my
> HMAC call when you apply this.

Knowing that we are in a period of vacations for a lot of people, and
that this is a sensitive area of the code that involves
authentication, I think that it is better to let this thread brew
longer and get more eyes to look at it.  As this also concerns
external SSL libraries like libnss, making sure that the APIs have a
shape flexible enough would be good.  Based on my own checks with
OpenSSL and libnss, I think that's more than enough.  But let's be
sure.

I don't think that this prevents to update your code to rely on this
new API as you could post a copy of this patch in your own patch
series (the CF bot can pick up a set of patches labeled with
format-patch), making your own feature a bit smaller in size.  But I
guess that depends on how you want to maintain a live patch series.

What should be really avoided is to commit in the code tree any code
that we know could have been refactored out first, so as we always
have in the tree what we consider as a clean state and don't
accumulate duplications.  That pays off a lot when it comes to the
buildfarm turning suddenly red where a revert is necessary, as
incremental changes reduce the number of things to work on at once,
and the number of changes to revert.
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Fri, Dec 18, 2020 at 08:41:01AM +0900, Michael Paquier wrote:
> Knowing that we are in a period of vacations for a lot of people, and
> that this is a sensitive area of the code that involves
> authentication, I think that it is better to let this thread brew
> longer and get more eyes to look at it.  As this also concerns
> external SSL libraries like libnss, making sure that the APIs have a
> shape flexible enough would be good.  Based on my own checks with
> OpenSSL and libnss, I think that's more than enough.  But let's be
> sure.

FWIW, I got my eyes on this stuff again today, and please find
attached a v2, where I have fixed a certain number of issues:
- Fixed a memory leak with the shrink buffer in the fallback
implementation.
- Fixed a couple of incorrect comments.
- The logic around the resowner was a bit busted with OpenSSL <=
1.0.2.  So I haev reorganized the code a bit.

This has been tested on Windows and Linux across all the versions of
OpenSSL we support on HEAD.  I am also attaching a small module called
hmacfuncs that I used as a way to validate this patch across all the
versions of OpenSSL and the fallback implementation.  As a reference,
this matches with the results from Wikipedia here:
https://en.wikipedia.org/wiki/HMAC#Examples
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Bruce Momjian
Date:
On Fri, Dec 18, 2020 at 03:46:42PM +0900, Michael Paquier wrote:
> On Fri, Dec 18, 2020 at 08:41:01AM +0900, Michael Paquier wrote:
> > Knowing that we are in a period of vacations for a lot of people, and
> > that this is a sensitive area of the code that involves
> > authentication, I think that it is better to let this thread brew
> > longer and get more eyes to look at it.  As this also concerns
> > external SSL libraries like libnss, making sure that the APIs have a
> > shape flexible enough would be good.  Based on my own checks with
> > OpenSSL and libnss, I think that's more than enough.  But let's be
> > sure.
...
> This has been tested on Windows and Linux across all the versions of
> OpenSSL we support on HEAD.  I am also attaching a small module called
> hmacfuncs that I used as a way to validate this patch across all the
> versions of OpenSSL and the fallback implementation.  As a reference,
> this matches with the results from Wikipedia here:
> https://en.wikipedia.org/wiki/HMAC#Examples

Great.  See my questions in the key manager thread about whether I
should use the init/update/final API or just keep the one-line version.
As far as when to commit this, I think the quiet time is actually better
because if you break something, it is less of a disruption while you fix
it.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Fri, Dec 18, 2020 at 10:48:00AM -0500, Bruce Momjian wrote:
> Great.  See my questions in the key manager thread about whether I
> should use the init/update/final API or just keep the one-line version.
> As far as when to commit this, I think the quiet time is actually better
> because if you break something, it is less of a disruption while you fix
> it.

Please note that on a related thread that I have begun yesterday,
Heikki has suggested some changes in the way we handle the opaque data
used by each cryptohash implementation.
https://www.postgresql.org/message-id/6ebe7f1f-bf37-2688-2ac1-a081d278367c@iki.fi

As the design used on this thread for HMAC is similar to what I did
for cryptohashes, it would be good to conclude first on the interface
there, and then come back here so as a consistent design is used.  As
a whole, I don't think that there is any need to rush for this stuff.
I would rather wait more and make sure that we agree on an interface
people are happy enough with.
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Bruce Momjian
Date:
On Sat, Dec 19, 2020 at 09:35:57AM +0900, Michael Paquier wrote:
> On Fri, Dec 18, 2020 at 10:48:00AM -0500, Bruce Momjian wrote:
> > Great.  See my questions in the key manager thread about whether I
> > should use the init/update/final API or just keep the one-line version.
> > As far as when to commit this, I think the quiet time is actually better
> > because if you break something, it is less of a disruption while you fix
> > it.
> 
> Please note that on a related thread that I have begun yesterday,
> Heikki has suggested some changes in the way we handle the opaque data
> used by each cryptohash implementation.
> https://www.postgresql.org/message-id/6ebe7f1f-bf37-2688-2ac1-a081d278367c@iki.fi
> 
> As the design used on this thread for HMAC is similar to what I did
> for cryptohashes, it would be good to conclude first on the interface
> there, and then come back here so as a consistent design is used.  As
> a whole, I don't think that there is any need to rush for this stuff.
> I would rather wait more and make sure that we agree on an interface
> people are happy enough with.

Others are waiting to continue working. I am not going to hold up a
patch over a one function, two-line API issue.  I will deal with
whatever new API you choose, and mine will work fine using the OpenSSL
API until I do.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Refactoring HMAC in the core code

From
Bruce Momjian
Date:
On Fri, Dec 18, 2020 at 07:42:02PM -0500, Bruce Momjian wrote:
> > Please note that on a related thread that I have begun yesterday,
> > Heikki has suggested some changes in the way we handle the opaque data
> > used by each cryptohash implementation.
> > https://www.postgresql.org/message-id/6ebe7f1f-bf37-2688-2ac1-a081d278367c@iki.fi
> > 
> > As the design used on this thread for HMAC is similar to what I did
> > for cryptohashes, it would be good to conclude first on the interface
> > there, and then come back here so as a consistent design is used.  As
> > a whole, I don't think that there is any need to rush for this stuff.
> > I would rather wait more and make sure that we agree on an interface
> > people are happy enough with.
> 
> Others are waiting to continue working. I am not going to hold up a
> patch over a one function, two-line API issue.  I will deal with
> whatever new API you choose, and mine will work fine using the OpenSSL
> API until I do.

I will also point out that my patch is going to be bigger and bigger,
and harder to review, the longer I work on it.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Fri, Dec 18, 2020 at 03:46:42PM +0900, Michael Paquier wrote:
> This has been tested on Windows and Linux across all the versions of
> OpenSSL we support on HEAD.  I am also attaching a small module called
> hmacfuncs that I used as a way to validate this patch across all the
> versions of OpenSSL and the fallback implementation.  As a reference,
> this matches with the results from Wikipedia here:
> https://en.wikipedia.org/wiki/HMAC#Examples

Please find attached a rebased version.  I have simplified the
implementation to use an opaque pointer similar to the cryptohash
part, leading to a large cleanup of the allocation logic for both
implementations, with and without OpenSSL.
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Fri, Jan 08, 2021 at 04:11:53PM +0900, Michael Paquier wrote:
> Please find attached a rebased version.  I have simplified the
> implementation to use an opaque pointer similar to the cryptohash
> part, leading to a large cleanup of the allocation logic for both
> implementations, with and without OpenSSL.

Rebased patch is attached wiht SHA1 added as of a8ed6bb.  Now that
SHA1 is part of the set of options for cryptohashes, a lot of code of
pgcrypto can be cleaned up thanks to the refactoring done here, but
I am leaving that as a separate item to address later.
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Sat, Jan 23, 2021 at 01:43:20PM +0900, Michael Paquier wrote:
> Rebased patch is attached wiht SHA1 added as of a8ed6bb.  Now that
> SHA1 is part of the set of options for cryptohashes, a lot of code of
> pgcrypto can be cleaned up thanks to the refactoring done here, but
> I am leaving that as a separate item to address later.

Again a new rebase, giving v5:
- Fixed the APIs to return -1 if the caller gives NULL in input, to be
consistent with cryptohash.
- Added a length argument to pg_hmac_final(), wiht sanity checks.
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Mon, Feb 15, 2021 at 08:25:27PM +0900, Michael Paquier wrote:
> Again a new rebase, giving v5:
> - Fixed the APIs to return -1 if the caller gives NULL in input, to be
> consistent with cryptohash.
> - Added a length argument to pg_hmac_final(), wiht sanity checks.

So, this patch has been around for a couple of weeks now, and I would
like to get this part done in 14 to close the loop with the parts of
the code that had better rely on what the crypto libs have.  The main
advantage of this change is for SCRAM so as it does not use its own
implementation of HMAC whenever possible.

Any objections?
--
Michael

Attachment

Re: Refactoring HMAC in the core code

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 07:04:18PM +0900, Michael Paquier wrote:
> On Mon, Feb 15, 2021 at 08:25:27PM +0900, Michael Paquier wrote:
> > Again a new rebase, giving v5:
> > - Fixed the APIs to return -1 if the caller gives NULL in input, to be
> > consistent with cryptohash.
> > - Added a length argument to pg_hmac_final(), wiht sanity checks.
> 
> So, this patch has been around for a couple of weeks now, and I would
> like to get this part done in 14 to close the loop with the parts of
> the code that had better rely on what the crypto libs have.  The main
> advantage of this change is for SCRAM so as it does not use its own
> implementation of HMAC whenever possible.
> 
> Any objections?

Works for me.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Refactoring HMAC in the core code

From
Michael Paquier
Date:
On Fri, Apr 02, 2021 at 10:10:36AM -0400, Bruce Momjian wrote:
> Works for me.

Thanks.  I got to spend some time on this stuff again today and did a
complete review, without noticing any issues except some indentation
that was strange so I have applied it.  Attached is a small extension
I have used for some of my tests to validate the implementations.
This uses some result samples one can find on Wikipedia at [1], for
instance.

[1]: https://en.wikipedia.org/wiki/HMAC
--
Michael

Attachment