Thread: Re: Modern SHA2- based password hashes for pgcrypto

Re: Modern SHA2- based password hashes for pgcrypto

From
Bernd Helmle
Date:
Am Freitag, dem 24.01.2025 um 19:06 +0100 schrieb Alvaro Herrera:
> > So we behave exactly the same way as px_crypt_md5(): It stops after
> > the
> > first '$' after the magic byte preamble. For shacrypt, this could
> > be
> > the next '$' after the closing one of the non-mandatory 'rounds'
> > option, but with your example this doesn't happen since it gets
> > never
> > parsed. The salt length will be set to 0.
>
> IMO silently using no salt or 0 iterations because the input is
> somewhat
> broken is bad security and should be rejected.  If we did so in the
> past
> without noticing, that's bad already, but we should not replicate
> that
> behavior any further.

Following Japin's example we parse the broken input after the magic
byte into

$6$rounds=10000$/Zg436s2vmTwsoSz

So this is what gets passed over to the next steps after extracting the
magic byte.

We then try to parse the optional rounds option, which is expected to
be at the very first characters at this stage, which isn't true. In
that case we fall back to default 5000 iterations. So no problem with
iterations here. Note that we don't output the rounds option in that
case. I adapted this from Drepper's code to be compatible with the
tests (it suppresses the rounds option in case the default is used). I
forgot to adjust the docs about this change.

But we don't extract the salt because of the first '$', which confused
the input handling that there isn't anything interesting for it. And
that get unnoticed currently, which might be a problem for the caller.
Well, you see that something is wrong when looking at the result, but
if that happens programmatically it might be hidden.

I've looked what others do:

Drepper's code with the input above produces the following:

$5$=10000$cWSkAaJa1mL912.HQqjIhODJ9b3S7hmgsb/k9Efp7.7

It parses the salt from the input as "=10000".

Python's passlib is very strict when it comes to supported characters
within a salt string. It rejects everything thats not matching '[./0-
9A-Za-z]'. So when you provide the example above you get

  File "/usr/lib/python3.13/site-packages/passlib/utils/handlers.py",
line 1459, in _norm_salt
    raise ValueError("invalid characters in %s salt" % cls.name)
ValueError: invalid characters in sha256_crypt salt

openssl-passwd does this:

echo test | openssl passwd -5 -stdin -salt
'$6$rounds=10000$/Zg436s2vmTwsoSz'

$5$$6$rounds=10000$$BFtTxJrvpb/ra8SnnXkiNCJ1HGZ3OOmwvyQ2TJZx44B

It absorbs everything up to 16 bytes and uses that as the salt string.

Interestingly, python-passlib and Drepper's code accept both empty
salt, openssl-passwd seems to be buggy:

echo test | openssl passwd -5 -stdin -salt ''
<NULL>

So at least, they do *something* with the input and don't silently omit
salting the input passphrase, but the results aren't the same hash.

So i think we have two options: adapt Drepper's code and throw away
everything or be very strict like python-passlib. I myself can find
support for both, so not sure.


    Bernd






Re: Modern SHA2- based password hashes for pgcrypto

From
Alvaro Herrera
Date:
On 2025-Jan-28, Bernd Helmle wrote:

> Python's passlib is very strict when it comes to supported characters
> within a salt string. It rejects everything thats not matching '[./0-
> 9A-Za-z]'. So when you provide the example above you get

The reason it uses these chars is that in their scheme the salt bytes
are base64-encoded.

The passlib docs has this page about the "modular crypt format":
https://passlib.readthedocs.io/en/stable/modular_crypt_format.html

and they point this other page as a "modern, non-ambiguous standard":
https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md
About the salt, this last document says:

  The role of salts is to achieve uniqueness. A random salt is fine for
  that as long as its length is sufficient; a 16-byte salt would work
  well (by definition, UUID are very good salts, and they encode over
  exactly 16 bytes). 16 bytes encode as 22 characters in B64. Functions
  should disallow salt values that are too small for security (4 bytes
  should be viewed as an absolute minimum).

This "Password Hashing Competition" organization hardly seems an
authority though.  It'd be great to have an IETF standard about this ...

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Modern SHA2- based password hashes for pgcrypto

From
Japin Li
Date:
On Thu, 06 Feb 2025 at 11:20, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2025-Jan-28, Bernd Helmle wrote:
>
>> Python's passlib is very strict when it comes to supported characters
>> within a salt string. It rejects everything thats not matching '[./0-
>> 9A-Za-z]'. So when you provide the example above you get
>
> The reason it uses these chars is that in their scheme the salt bytes
> are base64-encoded.
>
> The passlib docs has this page about the "modular crypt format":
> https://passlib.readthedocs.io/en/stable/modular_crypt_format.html
>
> and they point this other page as a "modern, non-ambiguous standard":
> https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md
> About the salt, this last document says:
>
>   The role of salts is to achieve uniqueness. A random salt is fine for
>   that as long as its length is sufficient; a 16-byte salt would work
>   well (by definition, UUID are very good salts, and they encode over
>   exactly 16 bytes). 16 bytes encode as 22 characters in B64. Functions
>   should disallow salt values that are too small for security (4 bytes
>   should be viewed as an absolute minimum).
>
> This "Password Hashing Competition" organization hardly seems an
> authority though.  It'd be great to have an IETF standard about this ...

Yeah.

Since there is no standard, how do we handle this?  I prefer to use the strict
mode like passlib.

-- 
Regrads,
Japin Li



Re: Modern SHA2- based password hashes for pgcrypto

From
Alvaro Herrera
Date:
On 2025-Feb-07, Japin Li wrote:

> Since there is no standard, how do we handle this?  I prefer to use
> the strict mode like passlib.

I definitely like that passlib have documented their thought process
thoroughly.

I think using their strict mode is good on principle, but if we're going
to do that, then the salt string should not be used verbatim, but
instead base64-decoded first to get the actual salt bytes, like they do.
Does this break compabitibility with other systems?  Are
passlib-generated password hashes incompatible with, say, "openssl
passwd" which you (Bernd) mentioned at the beginning of the thread?
Maybe if the password hashes are no longer compatible, then we should
ditch the idea of restricting salts to base64 chars and accept the whole
range of bytes, like Drepper.

But in any case ISTM we should reject, as they suggest, the use of less
than 4 bytes of salt (and should perhaps settle for a default of 16, as
passlib suggests).  I suppose this is why passlib returns NULL with
empty salt.  What we should do in that case IMO is ereport(ERROR).  

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Porque Kim no hacía nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)



Re: Modern SHA2- based password hashes for pgcrypto

From
Bernd Helmle
Date:
Hi,

> I definitely like that passlib have documented their thought process
> thoroughly.
>

Please find attached v4 of this patch. I added the following changes:

- Check for non-supported characters in the salt like passlib does.
- Check for reserved tokens when parsing the salt string (i find this
very strict, but it covers the cases Japin Li pointed out).

I didn't implement being more strict when it comes to the salt length:
the salt might be provided from external sources and the password hash
is just used for validating within the database.

When hashing the password within postgres, users always should use
gen_salt(), which generates a reasonable salt string.

Maybe, in case of empty salts, we should issue a WARNING instead of
erroring out and put additional documentation on how to use it right.

> I think using their strict mode is good on principle, but if we're
> going
> to do that, then the salt string should not be used verbatim, but
> instead base64-decoded first to get the actual salt bytes, like they
> do.
> Does this break compabitibility with other systems?  Are
> passlib-generated password hashes incompatible with, say, "openssl
> passwd" which you (Bernd) mentioned at the beginning of the thread?
> Maybe if the password hashes are no longer compatible, then we should
> ditch the idea of restricting salts to base64 chars and accept the
> whole
> range of bytes, like Drepper.
>
> But in any case ISTM we should reject, as they suggest, the use of
> less
> than 4 bytes of salt (and should perhaps settle for a default of 16,
> as
> passlib suggests).  I suppose this is why passlib returns NULL with
> empty salt.  What we should do in that case IMO is ereport(ERROR).
>


Hmm, i didn't understand that passlib does decode them first, i thought
they use it encoded... at least, in our current form we're pretty much
compatible with Drepper, passlib and OpenSSL, as far as i tested:

Empty salt
----------

# Drepper reference code

shacrypt password "" sha512crypt
$6$$bLTg4cpho8PIUrjfsE7qlU08Qx2UEfw..xOc6I1wpGVtyVYToGrr7BzRdAAnEr5lYFr
1Z9WcCf1xNZ1HG9qFW1

# Patch

SELECT crypt('password', '$6$$');
                                           crypt
───────────────────────────────────────────────────────────────────────
─────────────────────
$6$$bLTg4cpho8PIUrjfsE7qlU08Qx2UEfw..xOc6I1wpGVtyVYToGrr7BzRdAAnEr5lYFr1
Z9WcCf1xNZ1HG9qFW1
(1 row)

# Passlib

from passlib.hash import sha256_crypt,sha512_crypt
hash = sha512_crypt.using(salt='').using(rounds=5000).hash("password")
print(hash)
$6$$bLTg4cpho8PIUrjfsE7qlU08Qx2UEfw..xOc6I1wpGVtyVYToGrr7BzRdAAnEr5lYFr
1Z9WcCf1xNZ1HG9qFW1

# OpenSSL

echo password | openssl passwd -6 -stdin -salt ''
<NULL>

Short salt
----------

# Drepper reference code

./shacrypt password abcdef sha512crypt
$6$abcdef$2yYFwtar3.rlLovG0bXbxXMjZe7Z/1EhQ0gEs0nXalsCW04p2iy6unkvbrBFH
1SXexabbdNXXOO11F7sdyurk/

# Patch

SELECT crypt('password', '$6$abcdef$');
                                              crypt
───────────────────────────────────────────────────────────────────────
───────────────────────────
$6$abcdef$2yYFwtar3.rlLovG0bXbxXMjZe7Z/1EhQ0gEs0nXalsCW04p2iy6unkvbrBFH1
SXexabbdNXXOO11F7sdyurk/
(1 row)

# Passlib

from passlib.hash import sha256_crypt,sha512_crypt
hash =
sha512_crypt.using(salt='abcdef').using(rounds=5000).hash("password")
print(hash)
$6$abcdef$2yYFwtar3.rlLovG0bXbxXMjZe7Z/1EhQ0gEs0nXalsCW04p2iy6unkvbrBFH
1SXexabbdNXXOO11F7sdyurk/

# OpenSSL

echo password | openssl passwd -6 -stdin -salt 'abcdef'
$6$abcdef$2yYFwtar3.rlLovG0bXbxXMjZe7Z/1EhQ0gEs0nXalsCW04p2iy6unkvbrBFH
1SXexabbdNXXOO11F7sdyurk/

Salt > MAXLEN(16 Bytes)
-----------------------

# Drepper reference code
(truncates salt)

shacrypt password abcdefghijklmnopqrstuvwxyz sha512crypt
$6$abcdefghijklmnop$0aenUFHf897F9u0tURIHOeACWajSuVGa7jgJGyq.DKZm/WXl/IZ
FvPbneFydBjomEOgM.Sh1m0L3KsS1.H5b//

# Patch
(truncates salt)

SELECT crypt('password', '$6$abcdefghijklmnopqrstuvwxyz$');
                                                   crypt
───────────────────────────────────────────────────────────────────────
─────────────────────────────────────
$6$abcdefghijklmnop$0aenUFHf897F9u0tURIHOeACWajSuVGa7jgJGyq.DKZm/WXl/IZF
vPbneFydBjomEOgM.Sh1m0L3KsS1.H5b//
(1 row)

# Passlib

Errors out with

ValueError: salt too large (sha512_crypt requires <= 16 chars)

# OpenSSL
(truncates salt)

echo password | openssl passwd -6 -stdin -salt
'abcdefghijklmnopqrstuvwxyz'
$6$abcdefghijklmnop$0aenUFHf897F9u0tURIHOeACWajSuVGa7jgJGyq.DKZm/WXl/IZ
FvPbneFydBjomEOgM.Sh1m0L3KsS1.H5b//

Salt with "invalid" character
-----------------------------

# Drepper reference code

./shacrypt password abcdefghi%jklmno sha512crypt
$6$abcdefghi%jklmno$LMN/V1pW97IoK0rWSDVqCo9EYd6zpqP0TdTX9.cxFAFqsdSMWQM
jehkmMtDzL36VBKeG6dg.kFAQKoFvZpK0G.

# Patch
current version V4 (attached)

Errors out:

SELECT crypt('password', '$6$abcdefghi%jklmno$');
FEHLER:  invalid character in salt string: "%"
Time: 0,217 ms

# Passlib

Errors out

ValueError: invalid characters in sha512_crypt salt

# OpenSSL

echo password | openssl passwd -6 -stdin -salt 'abcdefghi%jklmno'
$6$abcdefghi%jklmno$LMN/V1pW97IoK0rWSDVqCo9EYd6zpqP0TdTX9.cxFAFqsdSMWQM
jehkmMtDzL36VBKeG6dg.kFAQKoFvZpK0G.


So the hashes we produce are identical, but with being more strict we
differ in the handling of the provided salt.

    Bernd



Attachment

Re: Modern SHA2- based password hashes for pgcrypto

From
Alvaro Herrera
Date:
On 2025-Mar-11, Bernd Helmle wrote:

> Please find attached v4 of this patch. I added the following changes:
> 
> - Check for non-supported characters in the salt like passlib does.
> - Check for reserved tokens when parsing the salt string (i find this
> very strict, but it covers the cases Japin Li pointed out).
> 
> I didn't implement being more strict when it comes to the salt length:
> the salt might be provided from external sources and the password hash
> is just used for validating within the database.
> 
> When hashing the password within postgres, users always should use
> gen_salt(), which generates a reasonable salt string. 

This sounds reasonable to me.

> Maybe, in case of empty salts, we should issue a WARNING instead of
> erroring out and put additional documentation on how to use it right.

I don't know, that doesn't seem ideal to me, because it's very easy to
run stuff and never see the warnings.  If we find that people are
desperate to use empty salts, we can relax that later (turn the error to
a warning), but I'd rather not have it in the first cut.

> Hmm, i didn't understand that passlib does decode them first, i thought
> they use it encoded... at least, in our current form we're pretty much
> compatible with Drepper, passlib and OpenSSL, as far as i tested:

I am ready to believe that I misinterpreted what I read.

> So the hashes we produce are identical, but with being more strict we
> differ in the handling of the provided salt.

Okay.


I can offer a few cosmetic changes.  0001 is a pgindent run, and 0002 is
some manual adjustments after that.  There are only two nontrivial
changes

1. the calculation for rounds was using type long, which is confusing
because the range is different according to the platform.  Since it's
limited by the macro definitions to no more than 999999999, we can make
it an int32 instead.  So we use strtoint() instead of strtoul() to parse
the value, and remove the "l" suffixes from the macros that define the
limits and default, which were bugging me a bit when used in the
gen_list struct.

2. I changed "if (block % 3)" to "if ((block % 3) != 0)".

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"El destino baraja y nosotros jugamos" (A. Schopenhauer)

Attachment