Thread: BUG #17760: SCRAM authentication fails with "modern" (rsassaPss signature) server certificate


From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      17760
Logged by:          Gunnar "Nick" Bluth
Email address:      gunnar.bluth@pro-open.de
PostgreSQL version: 13.8
Operating system:   Ubuntu 20.04
Description:

My client recently started rolling out new server certificates, which, when
dumped with "openssl x509 [...]", show slightly different information
regarding signature, hash etc.

Old:
Signature Algorithm: sha256WithRSAEncryption
New:
Signature Algorithm: rsassaPss
         Hash Algorithm: sha512
         Mask Algorithm: mgf1 with sha512
          Salt Length: 0x40
         Trailer Field: 0xBC (default)

When trying to authenticate with scram-sha-256 against a server using such
a certificate, we receive an error message: "could not find digest for NID
UNDEF"

I tried to boil this down a bit. The error comes from
src/interfaces/libpq/fe-secure-openssl.c, pgtls_get_peer_certificate_hash(),
line 440+.
The comment there says "If something else is used, the same hash as the
signature algorithm is used."
And obviously, "EVP_get_digestbynid(NID_rsassaPss)" doesn't return a
result.
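
For reference, the relevant logic looks roughly like this (my simplified
sketch, not the verbatim source):

#include <openssl/evp.h>
#include <openssl/objects.h>
#include <openssl/x509.h>

/* Simplified sketch of the hash selection in
 * pgtls_get_peer_certificate_hash(); not the verbatim source. */
static const EVP_MD *
digest_for_certificate(X509 *peer_cert)
{
    int algo_nid;

    /* Map the certificate's signature algorithm to a digest. */
    if (!OBJ_find_sigid_algs(X509_get_signature_nid(peer_cert),
                             &algo_nid, NULL))
        return NULL;            /* signature algorithm undetermined */

    switch (algo_nid)
    {
        case NID_md5:
        case NID_sha1:
            /* MD5 and SHA-1 are upgraded to SHA-256, per RFC 5929 */
            return EVP_sha256();
        default:
            /*
             * For rsassaPss, algo_nid is NID_undef here, so this returns
             * NULL; that is where "could not find digest for NID UNDEF"
             * comes from.
             */
            return EVP_get_digestbynid(algo_nid);
    }
}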

Now, openssl's "crypto/objects/obj_xref.txt" contains this:
# OID cross reference table.
# Links signatures OIDs to their corresponding public key algorithms
# and digests.
<snip>
sha256WithRSAEncryption sha256  rsaEncryption
<snip>
# For PSS the digest algorithm can vary and depends on the included
# AlgorithmIdentifier. The digest "undef" indicates the public key
# method should handle this explicitly.
rsassaPss               undef   rsassaPss


That explains the "UNDEF" in the error message (i.e., I *think* that's where
it's coming from).

I have to say that I'm not all too deep into crypto stuff. I couldn't even
tell how to create one such certificate, let alone what that PSS stuff is
all about ;-/

Maybe this is even fixed with recent OpenSSL versions (client has 1.1.1f,
Ubuntu 20.04)? Though that line was introduced in 2010...

I do think however that this is an oversight on our side and has to be
addressed. If not in code, the docs should point out that certain server
certificate types (PSS) may not work with SCRAM auth (or libpq needs to be
compiled against a minimum version of OpenSSL, if that's the root cause).

PS: the "obj_xref.txt" also lists "undef" for ED25519 and ED448

PPS: excerpt from "openssl s_client  -starttls postgres -showcerts -connect
[...]":
Requested Signature Algorithms:

ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:Ed25519:Ed448:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA224:RSA+SHA224
Shared Requested Signature Algorithms:

ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:Ed25519:Ed448:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512:RSA+SHA256:RSA+SHA384:RSA+SHA512
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: ECDH, P-256, 256 bits


On 25.01.23 at 12:32, PG Bug reporting form wrote:

> I have to say that I'm not all too deep into crypto stuff. I couldn't even
> tell how to create one such certificate, let alone what that PSS stuff is
> all about ;-/
> 
> Maybe this is even fixed with recent OpenSSL versions (client has 1.1.1f,
> Ubuntu 20.04)? Though that line was introduced in 2010...
> 
> I do think however that this is an oversight on our side and has to be
> addressed. If not in code, the docs should point out that certain server
> certificate types (PSS) may not work with SCRAM auth (or libpq needs to be
> compiled against a minimum version of OpenSSL, if that's the root cause).

Nobody willing to enlighten me on this one? :(

-- 
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar.bluth@pro-open.de
__________________________________________________________________________
"Ceterum censeo SystemD esse delendam" - Cato


On 25/01/2023 13:32, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      17760
> Logged by:          Gunnar "Nick" Bluth
> Email address:      gunnar.bluth@pro-open.de
> PostgreSQL version: 13.8
> Operating system:   Ubuntu 20.04
> Description:
> 
> My client recently started rolling out new server certificates, which, when
> dumped with "openssl x509 [...]", show slightly different information
> regarding signature, hash etc.
> 
> Old:
> Signature Algorithm: sha256WithRSAEncryption
> New:
> Signature Algorithm: rsassaPss
>           Hash Algorithm: sha512
>           Mask Algorithm: mgf1 with sha512
>            Salt Length: 0x40
>           Trailer Field: 0xBC (default)
> 
> When trying to authenticate with scram-sha-256 against a server using such
> a certificate, we receive an error message: "could not find digest for NID
> UNDEF"

I was able to create a certificate like that with these commands:

openssl genpkey -algorithm rsa-pss -pkeyopt rsa_keygen_bits:2048 \
    -pkeyopt rsa_keygen_pubexp:65537 -out CA.key

openssl req -new -x509 -days 365 -nodes -text -out server.crt \
    -key CA.key -keyout server.key -subj "/CN=rsapss.localtest.me"

And was able to reproduce the problem with that.

As a workaround, you can disable channel binding with the 
"channel_binding=disable" libpq option.

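For example, a minimal libpq program (the connection parameters are just
placeholders):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /*
     * channel_binding=disable makes libpq skip the certificate hash
     * entirely, so SCRAM works even with an rsassaPss server
     * certificate.  Host and database names are placeholders.
     */
    PGconn *conn = PQconnectdb("host=rsapss.localtest.me dbname=postgres "
                               "sslmode=verify-full channel_binding=disable");

    if (PQstatus(conn) != CONNECTION_OK)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQfinish(conn);
    return 0;
}
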
> I tried to boil this down a bit. The error comes from
> src/interfaces/libpq/fe-secure-openssl.c, pgtls_get_peer_certificate_hash(),
> line 440+.
> The comment there says "If something else is used, the same hash as the
> signature algorithm is used."
> And obviously, "EVP_get_digestbynid(NID_rsassaPss)" doesn't return a
> result.
> 
> Now, openssl's "crypto/objects/obj_xref.txt" contains this:
> # OID cross reference table.
> # Links signatures OIDs to their corresponding public key algorithms
> # and digests.
> <snip>
> sha256WithRSAEncryption sha256  rsaEncryption
> <snip>
> # For PSS the digest algorithm can vary and depends on the included
> # AlgorithmIdentifier. The digest "undef" indicates the public key
> # method should handle this explicitly.
> rsassaPss               undef   rsassaPss
> 
> 
> That explains the "UNDEF" in the error message (i.e., I *think* that's where
> it's coming from).
> 
> I have to say that I'm not all too deep into crypto stuff. I couldn't even
> tell how to create one such certificate, let alone what that PSS stuff is
> all about ;-/
> 
> Maybe this is even fixed with recent OpenSSL versions (client has 1.1.1f,
> Ubuntu 20.04)? Though that line was introduced in 2010...

Thanks for the investigation!

> I do think however that this is an oversight on our side and has to be
> addressed. If not in code, the docs should point out that certain server
> certificate types (PSS) may not work with SCRAM auth (or libpq needs to be
> compiled against a minimum version of OpenSSL, if that's the root cause).

There are a few things we can change in PostgreSQL for this:

1. If the server cannot compute a hash of the certificate, because it 
cannot unambiguously determine the hash algorithm to use, it should not 
offer the SCRAM-SHA-256-PLUS authentication method to the client. In 
other words, if the server cannot do channel binding with its 
certificate, it should say so, instead of erroring out later.

2. Similarly in the client: if libpq cannot determine the hash 
algorithm to use with the server's certificate, it should not attempt 
channel binding.

I believe this is OK from a security point of view. If the server is 
using a certificate that cannot be used with channel binding, and the 
client doesn't require channel binding, it's OK to not do it. A man in 
the middle can present a certificate to the client that cannot be used 
with channel binding, but if they can do that they could also just not 
offer the SCRAM-SHA-256-PLUS option to the client. So I don't see risk 
of downgrade attacks here.

3. Add support for channel binding with RSA-PSS. The problem is that 
be_tls_get_certificate_hash() doesn't know which digest algorithm to 
use. As you noted, OBJ_find_sigid_algs() returns "undef" (NID_undef) for 
rsassaPss certificates. I did some googling: when a certificate uses 
RSASSA-PSS as the signature algorithm, there is a separate field, 
RSASSA-PSS-params that specifies the hash algorithm 
(https://www.rfc-editor.org/rfc/rfc4055#section-3.1). If you ask me, 
OBJ_find_sigid_algs() should look into RSASSA-PSS-params and figure it 
out, but I'm no expert in the OpenSSL codebase so maybe that would be 
the wrong place to do it.

OpenSSL 3.0 actually has a function called X509_digest_sig(), which 
parses the RSASSA-PSS-params field. It also has a fallback mechanism to 
use SHA512 for ED25519, SHAKE256 for ED448, and SHA256 for anything 
else. We could use that function, perhaps without accepting the 
fallbacks, though. Or call lower level OpenSSL functions to get the hash 
algorithm from the RSASSA-PSS-params field ourselves.
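
If I read the 3.0 documentation right, using it would look roughly like
this (an untested sketch, assuming the documented signature):

#include <openssl/asn1.h>
#include <openssl/evp.h>
#include <openssl/x509.h>

/*
 * Untested sketch against OpenSSL 3.0.  Returns the certificate hash,
 * or NULL if only a fallback digest would have been used.
 */
static ASN1_OCTET_STRING *
cert_hash_no_fallback(const X509 *server_cert)
{
    EVP_MD *md = NULL;
    int     md_is_fallback = 0;
    ASN1_OCTET_STRING *hash;

    hash = X509_digest_sig(server_cert, &md, &md_is_fallback);
    EVP_MD_free(md);            /* we only need the hash itself */
    if (hash != NULL && md_is_fallback)
    {
        /* digest not derived from the certificate itself; reject */
        ASN1_OCTET_STRING_free(hash);
        return NULL;
    }
    return hash;                /* hash->data / hash->length */
}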

I'll write a patch for 1 and 2, at least, and maybe 3 if it's not too 
complicated. Please poke me if you don't hear from me in a few days.

- Heikki




cc'ing Jacob and Michael who have poked around channel binding before. 
If you have a chance to review this patch, I would appreciate that.

On 05/02/2023 03:28, Heikki Linnakangas wrote:
> On 25/01/2023 13:32, PG Bug reporting form wrote:
>> I do think however that this is an oversight on our side and has to be
>> addressed. If not in code, the docs should point out that certain server
>> certificate types (PSS) may not work with SCRAM auth (or libpq needs to be
>> compiled against a minimum version of OpenSSL, if that's the root cause).
> 
> There are a few things we can change in PostgreSQL for this:
> 
> 1. If the server cannot compute a hash of the certificate, because it
> cannot unambiguously determine the hash algorithm to use, it should not
> offer the SCRAM-SHA-256-PLUS authentication method to the client. In
> other words, if the server cannot do channel binding with its
> certificate, it should say so, instead of erroring out later.

Here's a patch for that.

> 2. Similarly in the client: if libpq cannot determine the hash
> algorithm to use with the server's certificate, it should not attempt
> channel binding.
> 
> I believe this is OK from a security point of view. If the server is
> using a certificate that cannot be used with channel binding, and the
> client doesn't require channel binding, it's OK to not do it. A man in
> the middle can present a certificate to the client that cannot be used
> with channel binding, but if they can do that they could also just not
> offer the SCRAM-SHA-256-PLUS option to the client. So I don't see risk
> of downgrade attacks here.

On second thoughts, the above is wrong. We cannot safely downgrade to 
not using channel binding just because we don't support the 
certificate's signature algorithm. Otherwise a MITM could indeed do what 
I said above and send a certificate that uses an unsupported signature 
algorithm, to force downgrade. It cannot just modify the SASL mechanism 
negotiation by leaving out the SCRAM-SHA-256-PLUS, because the SCRAM 
handshake catches that case. The client sends a different gs2-cbind-flag 
when it chooses to not do channel binding even though the server 
supports it, and when it doesn't use channel binding because the server 
didn't support it.
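
To illustrate, these are the three gs2-cbind-flag variants of the SCRAM
client-first-message (per RFC 5802; nonces elided, "..." is a placeholder):

/* client uses channel binding */
const char *cb_used = "p=tls-server-end-point,,n=,r=...";

/* client supports channel binding but believes the server does not;
 * if the server did advertise SCRAM-SHA-256-PLUS, it must fail the
 * exchange, which is what catches the downgrade */
const char *cb_declined = "y,,n=,r=...";

/* client does not use channel binding at all */
const char *cb_unsupported = "n,,n=,r=...";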

I improved the error message in the client, but I'm afraid that's the 
best we can do on the client side. Fortunately, it is enough to upgrade 
the server to include this fix, to allow clients to connect. (Albeit 
without channel binding).

> 3. Add support for channel binding with RSA-PSS. The problem is that
> be_tls_get_certificate_hash() doesn't know which digest algorithm to
> use. As you noted, OBJ_find_sigid_algs() returns "undef" (NID_undef) for
> rsassaPss certificates. I did some googling: when a certificate uses
> RSASSA-PSS as the signature algorithm, there is a separate field,
> RSASSA-PSS-params that specifies the hash algorithm
> (https://www.rfc-editor.org/rfc/rfc4055#section-3.1). If you ask me,
> OBJ_find_sigid_algs() should look into RSASSA-PSS-params and figure it
> out, but I'm no expert in the OpenSSL codebase so maybe that would be
> the wrong place to do it.

I didn't dare to make any such changes for now. Maybe we could do that 
in 'master', but I would be wary of backpatching.

- Heikki

On Wed, Feb 08, 2023 at 06:11:24PM +0200, Heikki Linnakangas wrote:
> On 05/02/2023 03:28, Heikki Linnakangas wrote:
>> There are a few things we can change in PostgreSQL for this:

Nice digging in finding the set of openssl commands that would be
needed to produce certs able to trigger the error.

>> 1. If the server cannot compute a hash of the certificate, because it
>> cannot unambiguously determine the hash algorithm to use, it should not
>> offer the SCRAM-SHA-256-PLUS authentication method to the client. In
>> other words, if the server cannot do channel binding with its
>> certificate, it should say so, instead of erroring out later.
>
> Here's a patch for that.

If the server cannot do it, not offering -PLUS makes sense and sending
the mechanisms available back to the client is the first step of the
SASL exchange as far as I recall.

>> 2. Similarly in the client: if libpq cannot determine the hash
>> algorithm to use with the server's certificate, it should not attempt
>> channel binding.
>>
>> I believe this is OK from a security point of view. If the server is
>> using a certificate that cannot be used with channel binding, and the
>> client doesn't require channel binding, it's OK to not do it. A man in
>> the middle can present a certificate to the client that cannot be used
>> with channel binding, but if they can do that they could also just not
>> offer the SCRAM-SHA-256-PLUS option to the client. So I don't see risk
>> of downgrade attacks here.
>
> On second thoughts, the above is wrong. We cannot safely downgrade to not
> using channel binding just because we don't support the certificate's
> signature algorithm. Otherwise a MITM could indeed do what I said above and
> send a certificate that uses an unsupported signature algorithm, to force
> downgrade. It cannot just modify the SASL mechanism negotiation by leaving
> out the SCRAM-SHA-256-PLUS, because the SCRAM handshake catches that case.
> The client sends a different gs2-cbind-flag when it chooses to not do
> channel binding even though the server supports it, and when it doesn't use
> channel binding because the server didn't support it.
>
> I improved the error message in the client, but I'm afraid that's the best
> we can do on the client side. Fortunately, it is enough to upgrade the
> server to include this fix, to allow clients to connect. (Albeit without
> channel binding).

But the client has the choice to decide if it wants to use channel
binding, doesn't it?  In this case, it would send the non-PLUS mechanism
followed by 'n' as gs2-cbind-flag, no?  If the client requires channel
binding with channel_binding=require and the server does not support
it, the client should fail if the server has a certificate that does
not support channel binding.

>> 3. Add support for channel binding with RSA-PSS. The problem is that
>> be_tls_get_certificate_hash() doesn't know which digest algorithm to
>> use. As you noted, OBJ_find_sigid_algs() returns "undef" (NID_undef) for
>> rsassaPss certificates. I did some googling: when a certificate uses
>> RSASSA-PSS as the signature algorithm, there is a separate field,
>> RSASSA-PSS-params that specifies the hash algorithm
>> (https://www.rfc-editor.org/rfc/rfc4055#section-3.1). If you ask me,
>> OBJ_find_sigid_algs() should look into RSASSA-PSS-params and figure it
>> out, but I'm no expert in the OpenSSL codebase so maybe that would be
>> the wrong place to do it.
>
> I didn't dare to make any such changes for now. Maybe we could do that in
> 'master', but I would be wary of backpatching.

Perhaps that's something worth reporting to upstream?  Not having the
interface to be able to find out the hash algo to use for an rsassaPss
certificate sounds like an issue on its own.  Or is it just that
rsassaPss has nothing of the kind?

be_tls_get_certificate_hash() as presented just returns the hash
calculated in the TLS init phase, cached in TopMemoryContext.  So you
are doing so because you need to know at a stage earlier than the SASL
exchange if channel binding would be a possibility or not.  I am
wondering how this would complicate a potential implementation of
tls-exporter, as deciding if -PLUS should be sent depends on more
parameters.  Anyway, this part is fine here.

+ * or TLS v1.3 to be secure.  In the future, we should implement the
+ * tls-exporter channel binding type as defined in RFC9266.
One thing here is the complexity that we get in terms of channel
binding negotiation, and what to do with tls-server-end-point once
tls-exporter is
around.  See recent patch https://commitfest.postgresql.org/40/3850/
and its associated thread.

+   /*
+    * Pre-calculate the hash of the certificate, for use in channel binding.
+    */
+   new_cert = SSL_CTX_get0_certificate(context);
The minimal version supported on HEAD is OpenSSL 1.0.1, so this will
not compile, as this routine only exists since 1.0.2.  I am afraid that
this is going to require an extra configure check.  A wrapper around
calculate_certificate_hash() would do the work as well.
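
Something like this, perhaps (an untested sketch, with a hypothetical
HAVE_SSL_CTX_GET0_CERTIFICATE symbol coming from the new configure
check):

#ifdef HAVE_SSL_CTX_GET0_CERTIFICATE
    new_cert = SSL_CTX_get0_certificate(context);
    oldcxt = MemoryContextSwitchTo(TopMemoryContext);
    new_hash = calculate_certificate_hash(new_cert, &new_hash_len,
                                          isServerStart);
    MemoryContextSwitchTo(oldcxt);
#else
    /* too old to pre-compute the hash; channel binding stays disabled */
    new_hash = NULL;
#endif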

+       oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+       new_hash = calculate_certificate_hash(new_cert, &new_hash_len, isServerStart);
+       MemoryContextSwitchTo(oldcxt);
Shouldn't this bit in be_tls_init() have a "goto error" if the result
of calculate_certificate_hash() is NULL in some cases?  On server
startup, that would be a FATAL, but not on reloads where we would have
a LOG.  There are indeed two code paths where you still want to
proceed with the next TLS init steps, so this had better return a
boolean to let its caller know if an error should be triggered or not?

+           ereport(LOG,
+                   (errmsg("channel binding is disabled; server certificate does not unambiguously specify a signature algorithm")));
[...]
+       ereport(LOG,
+               (errmsg("channel binding is disabled; server certificate has unknown signature algorithm")));
No FATAL if isServerStart, because the certificate in use does not
provide a hash algorithm that can be used.  I think that we'd better
document that, with a warning for example.  On the client side, it is
fortunate that we have channel_binding in libpq, so one would know
when the server does not support it; still, it worries me that if
things are not set up correctly, instances could end up being
unprotected without knowing about it, especially since the default is
channel_binding=prefer.

It seems so wrong to me that we would just silently disable this
feature because RSA-PSS does not give you an algo type to do the
computation work.  I'll try to look a bit at the OpenSSL code base
and see if we could not extract the algo information in this case.
Unfortunately, RFC 5929 is very clear:
"if the certificate's signatureAlgorithm uses no hash functions or
uses multiple hash functions, then this channel binding type's
channel bindings are undefined at this time (updates to this channel
binding type may occur to address this issue if it ever arises)."

I understand from this sentence that if a certificate has no hash
functions, then there's nothing you can do about it.

So as much as I'd like to be aggressive and potentially enforce the
use of SHA256 to compute the certificate hash, what you are doing is
RFC-compliant.  I am wondering whether it would make sense to force a
failure with a GUC if the server cannot compute the hash of the
certificate at startup, though, rather than hope that a user knows
about that via a client forcing channel_binding=require at connection
time.
--
Michael

On 09/02/2023 10:24, Michael Paquier wrote:
> It seems so wrong to me that we would just silently disable this
> feature because RSA-PSS does not give you an algo type to do the
> computation work.  I'll try to look a bit at the OpenSSL code base
> and see if we could not extract the algo information in this case.
> Unfortunately, RFC 5929 is very clear:
> "if the certificate's signatureAlgorithm uses no hash functions or
> uses multiple hash functions, then this channel binding type's
> channel bindings are undefined at this time (updates to this channel
> binding type may occur to address this issue if it ever arises)."
> 
> I understand from this sentence that if a certificate has no hash
> functions, then there's nothing you can do about it.
> 
> So as much as I'd like to be aggressive and potentially enforce the
> use of SHA256 to compute the certificate hash, what you are doing is
> RFC-compliant.

Yeah, it seems silly to use anything other than SHA-256, given that 
SCRAM-SHA-256 is totally dependent on the security of SHA-256 anyway.

Reading RFC5929 section "10.1.  Cryptographic Algorithm Agility" [1], I 
think the idea behind deriving the hash algorithm from the certificate's 
signature algorithm is to allow the algorithm to be upgraded 
automatically as new certificates with stronger algorithms are 
introduced. But again, that seems silly with SCRAM-SHA-256.

I'm a little uncomfortable with this patch as I wrote it. It's not 
trivial. The LOG messages might not be seen by admins. If we add a GUC 
as you suggested, then an admin might need to change the GUC as part of 
a minor version upgrade, or the server will refuse to start up. And we 
cannot do anything about the client. And with all of this, we still 
won't actually support channel binding with those certificates.

Perhaps we should just ignore the RFC and use SHA256 for these known 
"undef" cases, after all. Having slept on this, I feel that would be 
the best option. What do you think?

We could use a new "tls-server-end-point-sha-256" channel binding name 
for that, instead of "tls-server-end-point", to make it clear when we're 
violating the RFC.

[1] https://www.rfc-editor.org/rfc/rfc5929#section-10.1

- Heikki




On Thu, Feb 9, 2023 at 1:54 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 09/02/2023 10:24, Michael Paquier wrote:
> > It seems so wrong to me that we would just silently disable this
> > feature because RSA-PSS does not give you an algo type to do the
> > computation work.  I'll try to look a bit at the OpenSSL code base
> > and see if we could not extract the algo information in this case.

+1. I'm still coming up to speed on RSA-PSS and the conversation at

    https://github.com/openssl/openssl/issues/15477

but it makes me really uncomfortable to fail open rather than shut in
this case, as part of a backport.

> > Unfortunately, RFC 5929 is very clear:
> > "if the certificate's signatureAlgorithm uses no hash functions or
> > uses multiple hash functions, then this channel binding type's
> > channel bindings are undefined at this time (updates to this channel
> > binding type may occur to address this issue if it ever arises)."
> >
> > I understand from this sentence that if a certificate has no hash
> > functions, then there's nothing you can do about it.

But we also don't have to proceed as if everything is okay.

> > So as much as I'd like to be aggressive and potentially enforce the
> > use of SHA256 to compute the certificate hash, what you are doing is
> > RFC-compliant.
>
> Yeah, it seems silly to use anything other than SHA-256, given that
> SCRAM-SHA-256 is totally dependent on the security of SHA-256 anyway.

How will this evolve when SCRAM-SHA-512 is implemented? Do you plan to
upgrade the undef-hash in that case to SHA-512?

Knee-jerk reaction: this departure feels unnecessary since Gunnar's
certificate does in fact contain a hash algorithm...

--Jacob



On Thu, Feb 09, 2023 at 02:22:28PM -0800, Jacob Champion wrote:
> +1. I'm still coming up to speed on RSA-PSS and the conversation at
>
>     https://github.com/openssl/openssl/issues/15477
>
> but it makes me really uncomfortable to fail open rather than shut in
> this case, as part of a backport.

I could get that some users would want to be able to use such certs,
though.  At least we have one such user as of this thread.

>>> Unfortunately, RFC 5929 is very clear:
>>> "if the certificate's signatureAlgorithm uses no hash functions or
>>> uses multiple hash functions, then this channel binding type's
>>> channel bindings are undefined at this time (updates to this channel
>>> binding type may occur to address this issue if it ever arises)."
>>>
>>> I understand from this sentence that if a certificate has no hash
>>> functions, then there's nothing you can do about it.
>
> But we also don't have to proceed as if everything is okay.

The RFC is kind of silly here?  It just looks like wording meant to
give a lot of freedom about future changes, dropping the
responsibility down to the folks using it.  Reading the thread you
mentioned, it looks like even the OpenSSL folks are unsure about the
choice to make: libcrypto should have little knowledge about which
hash type to pass down depending on the cert type, yet they are
enforcing the hash type to use for some cert types.

>>> So as much as I'd like to be aggressive and potentially enforce the
>>> use of SHA256 to compute the certificate hash, what you are doing is
>>> RFC-compliant.
>>
>> Yeah, it seems silly to use anything other than SHA-256, given that
>> SCRAM-SHA-256 is totally dependent on the security of SHA-256 anyway.
>
> How will this evolve when SCRAM-SHA-512 is implemented? Do you plan to
> upgrade the undef-hash in that case to SHA-512?

Yes, that's the second part of the patch that makes me uncomfortable,
actually.  If there is a SASL mechanism that supports channel binding
with a hash higher than SHA-256, the default would likely be such a
hash for functions that are weaker.  For example, SCRAM-SHA-512-PLUS
would enforce SHA-512 as the hash function when a cert is linked to MD5,
SHA-1 or SHA-256.  So, there is a possibility that we need to compute
the hash of a certificate depending on the mechanism name, which is
known at authentication time.  As proposed, the patch moves the
computation to TLS init time.  It should not make reloads much slower,
but removes the possibility to switch on-the-fly depending on the
mechanism name received from the client.  It is not a problem with
only SCRAM-SHA-256 supported, but could become one in the future.

> Knee-jerk reaction: this departure feels unnecessary since Gunnar's
> certificate does in fact contain a hash algorithm...

What do you mean by Gunnar here?
--
Michael

On 10/02/2023 00:45, Michael Paquier wrote:
> On Thu, Feb 09, 2023 at 02:22:28PM -0800, Jacob Champion wrote:
>> +1. I'm still coming up to speed on RSA-PSS and the conversation at
>>
>>      https://github.com/openssl/openssl/issues/15477
>>
>> but it makes me really uncomfortable to fail open rather than shut in
>> this case, as part of a backport.
> 
> I could get that some users would want to be able to use such certs,
> though.  At least we have one such user as of this thread.

Note that the discussion in that thread is in a slightly different 
context. They are not talking about channel binding, but something else 
(https://github.com/openssl/openssl/issues/15477#issuecomment-852928702). 
The problem is similar: what hash algorithm to use if it's not well 
defined. But there's no guarantee that the rules they come up with will 
be the same as the rules for tls-server-end-point. So I'm not very 
comfortable relying on X509_digest_sig() to do the right thing for us in 
the future, even if it does today.

>>>> So as much as I'd like to be aggressive and potentially enforce the
>>>> use of SHA256 to compute the certificate hash, what you are doing is
>>>> RFC-compliant.
>>>
>>> Yeah, it seems silly to use anything other than SHA-256, given that
>>> SCRAM-SHA-256 is totally dependent on the security of SHA-256 anyway.
>>
>> How will this evolve when SCRAM-SHA-512 is implemented? Do you plan to
>> upgrade the undef-hash in that case to SHA-512?

Yes, that's what I had in mind. Or for SCRAM-SHA-512, we could go 
further and define it as "always use SHA-512", because we wouldn't need 
to worry about backwards compatibility anymore.

>> Knee-jerk reaction: this departure feels unnecessary since Gunnar's
>> certificate does in fact contain a hash algorithm...
> 
> What do you mean by Gunnar here?

Gunnar reported the bug; he's cc'd on this thread.


Ok, let me refine my earlier idea to make this more concrete:

> We could use a new "tls-server-end-point-sha-256" channel binding name 
> for that, instead of "tls-server-end-point", to make it clear when we're 
> violating the RFC.


Let's define a new channel binding type: tls-server-end-point-sha-256. 
With "tls-server-end-point-sha-256", the SHA-256 hash of the certificate 
is used, regardless of what algorithm is in the certificate's signature.

1. When the client sees a "normal" server certificate that uses SHA-256, 
SHA-512, or something else that it recognizes and handles fine today, it 
does the same thing as it does today. It sends "tls-server-end-point" as 
the cb-name, and calculates the hash using the certificate's hash 
algorithm.

2. When the client sees a certificate with RSA-PSS, ED25519, ED448, or 
any other signature method that it doesn't recognize, it selects the 
tls-server-end-point-sha-256 channel binding type. It sends 
"tls-server-end-point-sha-256" as the cb-name, and calculates the 
SHA-256 hash of the certificate.

3. The server follows the current rules, or uses SHA-256, depending on 
whether the client sent "tls-server-end-point" or 
"tls-server-end-point-sha-256" in the cb-name.

These rules ensure that with certificates that work today, nothing 
changes. Rule 1 ensures that you can connect with a new client to an old 
server, and it still works. And rule 3. ensures that you can still 
connect to a new server with an old client.

At the same time, we are using a new channel binding type to make it 
clear when we are deviating from the standard rules of 
"tls-server-end-point".

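In rough pseudo-C, the client-side choice would be (illustrative only,
not from an actual patch; server_cert is the peer's X509 from the TLS
connection):

const char *cb_name;
const EVP_MD *digest;
int algo_nid = NID_undef;

OBJ_find_sigid_algs(X509_get_signature_nid(server_cert), &algo_nid, NULL);

if (algo_nid == NID_md5 || algo_nid == NID_sha1)
{
    /* rule 1, with the usual RFC 5929 upgrade of MD5/SHA-1 */
    cb_name = "tls-server-end-point";
    digest = EVP_sha256();
}
else if (algo_nid != NID_undef && EVP_get_digestbynid(algo_nid) != NULL)
{
    /* rule 1: a digest the client recognizes, same behavior as today */
    cb_name = "tls-server-end-point";
    digest = EVP_get_digestbynid(algo_nid);
}
else
{
    /* rule 2: RSA-PSS, ED25519, ED448, or anything unrecognized */
    cb_name = "tls-server-end-point-sha-256";
    digest = EVP_sha256();
}
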
If you have an RSA-PSS certificate, you need to upgrade the client and 
the server, or you get an error when you try to connect. (Unless you set 
channel_binding=disable). But after upgrading, it will work.

- Heikki




On Fri, Feb 10, 2023 at 02:24:12AM +0200, Heikki Linnakangas wrote:
> Note that the discussion in that thread is in a slightly different context.
> They are not talking about channel binding, but something else
> (https://github.com/openssl/openssl/issues/15477#issuecomment-852928702).
> The problem is similar: what hash algorithm to use if it's not well defined.
> But there's no guarantee that the rules they come up with will be the same
> as the rules for tls-server-end-point. So I'm not very comfortable relying
> on X509_digest_sig() to do the right thing for us in the future, even if it
> does today.

Okay.  Yes, it does not seem like a good idea in the long run to avoid
that.

>>> How will this evolve when SCRAM-SHA-512 is implemented? Do you plan to
>>> upgrade the undef-hash in that case to SHA-512?
>
> Yes, that's what I had in mind. Or for SCRAM-SHA-512, we could go further
> and define it as "always use SHA-512", because we wouldn't need to worry
> about backwards compatibility anymore.

Hmm.  There is a risk of breakage here with older versions of libpq
because we'd still need to support both 256 and 512, no?  Or do you
mean to drop SCRAM-SHA-256 and do a drop-in replacement with
SCRAM-SHA-512?  Doing that in the server may be fine, not in libpq.
As long as you control that with the channel binding name, there
should be enough control I guess.

> Let's define a new channel binding type: tls-server-end-point-sha-256. With
> "tls-server-end-point-sha-256", the SHA-256 hash of the certificate is used,
> regardless of what algorithm is in the certificate's signature.
>
> 1. When the client sees a "normal" server certificate that uses SHA-256,
> SHA-512, or something else that it recognizes and handles fine today, it
> does the same thing as it does today. It sends "tls-server-end-point" as the
> cb-name, and calculates the hash using the certificate's hash algorithm.
>
> 2. When the client sees a certificate with RSA-PSS, ED25519, ED448, or any
> other signature method that it doesn't recognize, it selects the
> tls-server-end-point-sha-256 channel binding type. It sends
> "tls-server-end-point-sha-256" as the cb-name, and calculates the SHA-256
> hash of the certificate.
>
> 3. The server follows the current rules, or uses SHA-256, depending on
> whether the client sent "tls-server-end-point" or
> "tls-server-end-point-sha-256" in the cb-name.
>
> These rules ensure that with certificates that work today, nothing changes.
> Rule 1 ensures that you can connect with a new client to an old server, and
> it still works. And rule 3 ensures that you can still connect to a new
> server with an old client.
>
> At the same time, we are using a new channel binding type to make it clear
> when we are deviating from the standard rules of "tls-server-end-point".

So, using a new channel binding name with tls-server-end-point-sha-256
is actually the way to violate the RFC, while tls-server-end-point
still tries to stick with it?  I am not sure there's much benefit
in a second channel binding type; enforcing SHA-256 for undefined
signatures could just be better, and simpler. :)

There may be a risk of a downgrade attack here, where the client could
trick the server into doing a SHA-256 computation where SHA-512 or higher
should be used.  Or, if the server certificate has a hash function but
the client feigns not knowing that, do you mean to go through a
mock authentication in this case and fail?

> If you have an RSA-PSS certificate, you need to upgrade the client and the
> server, or you get an error when you try to connect. (Unless you set
> channel_binding=disable). But after upgrading, it will work.

Indeed.
--
Michael

On 10/02/2023 08:32, Michael Paquier wrote:
> On Fri, Feb 10, 2023 at 02:24:12AM +0200, Heikki Linnakangas wrote:
>>>> How will this evolve when SCRAM-SHA-512 is implemented? Do you
>>>> plan to upgrade the undef-hash in that case to SHA-512?
>> 
>> Yes, that's what I had in mind. Or for SCRAM-SHA-512, we could go
>> further and define it as "always use SHA-512", because we wouldn't
>> need to worry about backwards compatibility anymore.
> 
> Hmm.  There is a risk of breakage here with older versions of libpq 
> because we'd still need to support both 256 and 512, no?  Or do you 
> mean to drop SCRAM-SHA-256 and do a drop-in replacement with 
> SCRAM-SHA-512?  Doing that in the server may be fine, not in libpq. 
> As long as you control that with the channel binding name, there 
> should be enough control I guess.

We would still support SCRAM-SHA-256-PLUS in client and server. But 
*if* the server supports and advertises SCRAM-SHA-512-PLUS as a 
supported mechanism, and the client supports and selects it, then with 
SCRAM-SHA-512-PLUS, we could always use SHA-512 as the hash algorithm 
for the channel binding.

Yeah, a MITM could force a downgrade from SCRAM-SHA-512 to 
SCRAM-SHA-256, by modifying the list of supported SASL mechanisms that 
the server sends. Then again, usually the server would not advertise 
both SCRAM-SHA-256 and SCRAM-SHA-512, because pg_authid contains only one 
password verifier. Unless we change that too. I don't have a solution 
for these issues right now, but they are not related to channel binding.

The point is that if we add a new SASL mechanism, like 
SCRAM-SHA-512-PLUS, we are free to define how to calculate the 
certificate hash with that mechanism however we want, without breaking 
backwards compatibility.

> So, using a new channel binding name with
> tls-server-end-point-sha-256 is actually the way to violate the RFC,
> while tls-server-end-point still tries to stick with it?

Correct.

> I am not sure there's much benefit in a second channel binding type;
> enforcing SHA-256 for undefined signatures could just be better, and
> simpler. :)

The benefit is that it's more explicit, which helps to reduce the 
confusion. As new signature algorithms are invented in the future, who 
knows what OBJ_find_sigid_algs() in different versions of OpenSSL, or 
other TLS libraries, will do with them. With a new 
"tls-server-end-point-sha-256" channel binding type, it's clear.

> There may be a risk of a downgrade attack here, where the client could
> trick the server into doing a SHA-256 computation where SHA-512 or higher
> should be used.  Or, if the server certificate has a hash function
> but the client feigns not knowing that, do you mean to go through
> a mock authentication in this case and fail?

Let's always allow tls-server-end-point-sha-256 in the server. The point 
of channel binding is to protect the client from a MITM. The client 
chooses the channel binding type to use, so a MITM cannot force it to 
downgrade.

To be honest, I am not at all worried about losing security by using 
SHA-256 instead of SHA-512 or whatever else is in the certificate. If 
you don't trust SHA-256, you wouldn't be comfortable using SCRAM-SHA-256 
at all.

"tls-server-end-point-sha-256" is simpler than the current rules, so new 
client implementations that are not worried about 
backwards-compatibility might not bother with the current rules at all, 
and only implement "tls-server-end-point-sha-256".

- Heikki



The new "tls-server-end-point-sha-256" channel binding type now seems 
like the best approach to me, but I still wanted to reply to this:

On 09/02/2023 10:24, Michael Paquier wrote:
> But the client has the choice to decide if it wants to use channel
> binding, doesn't it?  In this case, it would send the non-PLUS mechanism
> followed by 'n' as gs2-cbind-flag, no?

No, that makes a downgrade attack possible with channel_binding=prefer. 
Imagine that you have a server with a typical RSA certificate that 
supports channel binding. And a client with channel_binding=prefer. 
There is a Man-In-The-Middle between the client and the server. The MITM 
works as a proxy, and opens two separate TLS connections: server <-> 
MITM and MITM <-> client. To the client, it presents a TLS server 
certificate that uses 'rsassaPss'. The client connects to the MITM, sees 
the rsassaPss certificate, and decides that it cannot do channel binding 
because the server doesn't support it. It sends 'n' gs2-cbind-flag, and 
the server happily accepts that.

Currently, such a downgrade attack is not possible. If client and server 
both support channel binding, it will be used. If a MITM tries to modify 
the SASL mechanism list, leaving out SCRAM-SHA-256-PLUS, the client will 
use gs2-cbind-flag='y' which notifies the server that the mechanism list 
was modified, and the server will fail the authentication. And if the 
MITM doesn't change the negotiation, then channel binding will be used, 
and the authentication will fail because the client and server will 
compute a different server certificate hash.

There is no way for the client to tell the server "I support channel 
binding in general, but not with this server certificate".

- Heikki




On Thu, Feb 9, 2023 at 2:46 PM Michael Paquier <michael@paquier.xyz> wrote:
> I could get that some users would want to be able to use such certs,
> though.  At least we have one such user as of this thread.

My take on this bug is that Gunnar doesn't need to solve the general
"undef" case. They've got a certificate that's supposed to be using
SHA512.

It looks like OpenSSL 1.1.1 gives us a better API for grabbing that;
see attached draft (not mesonified, needs independent verification),
which fixes Heikki's certificate case at least. It's unfortunate that
it doesn't reach back to 1.0.2, or to LibreSSL, but that doesn't
appear to be a problem for Gunnar's situation, and you'd mentioned
wanting to drop 1.0.2 support in HEAD soon anyway.
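
The core of the idea is roughly this (a sketch, not the attached draft
itself):

#include <openssl/evp.h>
#include <openssl/x509.h>

/*
 * With OpenSSL 1.1.1+, X509_get_signature_info() resolves the digest
 * even for RSASSA-PSS, by looking into the RSASSA-PSS-params field.
 */
static const EVP_MD *
cert_signature_digest(X509 *server_cert)
{
    int mdnid = NID_undef;

    /* pknid, secbits and flags are not needed here, so pass NULL */
    if (!X509_get_signature_info(server_cert, &mdnid, NULL, NULL, NULL))
        return NULL;            /* signature scheme not supported */
    if (mdnid == NID_undef)
        return NULL;            /* digest still ambiguous for this cert */
    return EVP_get_digestbynid(mdnid);
}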

Maybe someone really wants to use EdDSA certs, which aren't handled by
that API. But this stopgap would buy us some time for the
cryptographers to settle on things -- or at least to ask them? And if
people want new crypto they're going to need to upgrade eventually.
(Maybe by that point we'll know that X509_digest_sig() is in fact
correct for bindings.)

<strong opinions>
- Our current departures from the spec (e.g. no tls-unique) mean that
we already can't interoperate with standard SASL libraries. I've been
trying to realign them, slowly.
- Coming up with our own binding type takes time and resources away
from other things this thread highlighted (the ability to force the
use of a binding at the server side; better behavior when we can't
actually compute a binding value; tls-exporter support...).
- Worst case, using SHA256 for a future certificate type might be
*catastrophically* wrong, but no one's going to warn us, and it's not
going to be obvious to the DBA or their users that our nonstandard
binding is in effect.
- Best case, we choose exactly right, but if/when tls-server-end-point
gets updated for EdDSA, the world will move on and we'll still have to
support that vestigial binding type for the next X years.
</strong opinions>

--Jacob

On 11/02/2023 02:13, Jacob Champion wrote:
> On Thu, Feb 9, 2023 at 2:46 PM Michael Paquier <michael@paquier.xyz> wrote:
>> I could get that some users would want to be able to use such certs,
>> though.  At least we have one such user as of this thread.
> 
> My take on this bug is that Gunnar doesn't need to solve the general
> "undef" case. They've got a certificate that's supposed to be using
> SHA512.
> 
> It looks like OpenSSL 1.1.1 gives us a better API for grabbing that;
> see attached draft (not mesonified, needs independent verification),
> which fixes Heikki's certificate case at least. It's unfortunate that
> it doesn't reach back to 1.0.2, or to LibreSSL, but that doesn't
> appear to be a problem for Gunnar's situation, and you'd mentioned
> wanting to drop 1.0.2 support in HEAD soon anyway.

A-ha! I browsed around OpenSSL APIs and the source code, but missed 
X509_get_signature_info(). Good find!

X509_get_signature_info() calls X509_check_purpose(), which calls 
internal function ossl_x509v3_cache_extensions(), which extracts and 
caches quite a lot of information from the certificate. It calculates 
and caches its SHA1 hash, for example. That seems acceptable: the 
overhead is negligible, and I don't see any scenario where 
X509_get_signature_nid() would succeed but X509_get_signature_info() 
would fail.

+1 on your patch. I think the only thing it's missing is changes in 
meson.build and Solution.pm to match the configure.ac changes.

> Maybe someone really wants to use EdDSA certs, which aren't handled by
> that API. But this stopgap would buy us some time for the
> cryptographers to settle on things -- or at least to ask them? And if
> people want new crypto they're going to need to upgrade eventually.
> (Maybe by that point we'll know that X509_digest_sig() is in fact
> correct for bindings.)

Agreed, if we have an easy solution for RSA-PSS, that's good enough for now.

> <strong opinions>
> - Our current departures from the spec (e.g. no tls-unique) mean that
> we already can't interoperate with standard SASL libraries. I've been
> trying to realign them, slowly.
> - Coming up with our own binding type takes time and resources away
> from other things this thread highlighted (the ability to force the
> use of a binding at the server side; better behavior when we can't
> actually compute a binding value; tls-exporter support...).
> - Worst case, using SHA256 for a future certificate type might be
> *catastrophically* wrong, but no one's going to warn us, and it's not
> going to be obvious to the DBA or their users that our nonstandard
> binding is in effect.
> - Best case, we choose exactly right, but if/when tls-server-end-point
> gets updated for EdDSA, the world will move on and we'll still have to
> support that vestigial binding type for the next X years.
> </strong opinions>

Fair points.

- Heikki




On Sat, Feb 11, 2023 at 12:58:02PM +0200, Heikki Linnakangas wrote:
> X509_get_signature_info() calls X509_check_purpose(), which calls internal
> function ossl_x509v3_cache_extensions(), which extracts and caches quite a
> lot of information from the certificate. It calculates and caches its SHA1
> hash, for example. That seems acceptable: the overhead is negligible, and I
> don't see any scenario where X509_get_signature_nid() would succeed but
> X509_get_signature_info() would fail.

Excellent find!  On 1.1.1, that's x509v3_cache_extensions, it seems.
The extra overhead at authentication does not worry me, FWIW.  I am
wondering why the OpenSSL folks have done nothing for
X509_get_signature_nid() in 1.1.1~ in this case, but whatever.

From my notes, I was also wondering whether we should improve the
situation with the following two things:
- Addition of a new GUC called, say, ssl_check_cert_hash to check at
TLS init time in the backend if a certificate's hash can be found or
not.
- Improvement of error messages in this area when a hash function
cannot be found.

However, this proves to be unnecessary once we use
X509_get_signature_info() as loading an RSA-PSS cert with OpenSSL
1.0.2 or 1.1.0 leads to this error, simply:
FATAL:  could not load server certificate file "server.crt": unsupported algorithm

Would more facilities make sense for other cert types that OpenSSL
cannot map yet, though?  I am not really convinced that we need to
burn more resources on this until we get a complaint, and we can adapt
again then; it may be that OpenSSL also improves in the meantime for
such cases.

The SSL tests need to be patched so that they adapt on the fly depending
on whether RSA-PSS is supported, of course, and we cannot do a
switch_server_cert() for versions older than 1.1.1.

> +1 on your patch. I think the only thing it's missing is changes in
> meson.build and Solution.pm to match the configure.ac changes.

Done.

> Agreed, if we have an easy solution for RSA-PSS, that's good enough
> for now.

Indeed.

With all that in mind, I am finishing with the attached version, with the
tests, the meson tweaks and the MSVC tweaks.  I have tested it on
HEAD, with OpenSSL down to 1.0.1 which is the minimum version of this
branch.  Making sure that this mostly works with 1.0.0 and 0.9.8 on
older branches would not be an issue here.
--
Michael


Re: BUG #17760: SCRAM authentication fails with "modern" (rsassaPss signature) server certificate

From: Gunnar "Nick" Bluth
On 13.02.23 at 02:58, Michael Paquier wrote:
[...]

>> Agreed, if we have an easy solution for RSA-PSS, that's good enough
>> for now.
> 
> Indeed.
> 
> With all that in mind, I am finishing with the attached version, with the
> tests, the meson tweaks and the MSVC tweaks.  I have tested it on
> HEAD, with OpenSSL down to 1.0.1 which is the minimum version of this
> branch.  Making sure that this mostly works with 1.0.0 and 0.9.8 on
> older branches would not be an issue here.

FWIW:
I only understand a fraction of the underlying crypto, but the patch LGTM!

Much appreciated, thanks for the effort (and sorry for causing it ;-)!
-- 
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar.bluth@pro-open.de
__________________________________________________________________________
"Ceterum censeo SystemD esse delendam" - Cato


On Mon, Feb 13, 2023 at 4:54 AM Gunnar "Nick" Bluth
<gunnar.bluth@pro-open.de> wrote:
> FWIW:
> I only understand a fraction of the underlying crypto, but the patch LGTM!

LGTM too, thanks Michael! I tested against LibreSSL 3.5.3 to
double-check the fallback.

--Jacob



On Mon, Feb 13, 2023 at 09:44:03AM -0800, Jacob Champion wrote:
> LGTM too, thanks Michael! I tested against LibreSSL 3.5.3 to
> double-check the fallback.

Thanks for checking with this one; I don't have LibreSSL in my
environment, at least not now.  Perhaps I should.  So, I have spent a
couple of hours on that and backpatched the fix down to 11.  There
were different conflicts for each branch.

The new tests have been added in 15~, where the generation of the cert
and key files is more straightforward than in ~14.  Actually, make
sslfiles fails on those older branches when using OpenSSL 1.1.1~.  Perhaps
that may be worth addressing, but the existing tests pass anyway when
relying on X509_get_signature_info(), as much as they pass with older
versions of OpenSSL.  I have done some manual checks with RSA-PSS
certs and keys to make sure that channel binding works correctly for
these versions (one can just reuse the ones generated on HEAD or
REL_15_STABLE in src/test/ssl/ for that).
--
Michael
