Thread: Password identifiers, protocol aging and SCRAM protocol

Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

23 February 2016, 07:17:51

Hi all

As a continuation of the thread firstly dedicated to SCRAM:
http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
Here is a new thread aimed at gathering all the ideas of this previous
thread and aimed at clarifying a bit what has been discussed until now
regarding password protocols, verifiers, and SCRAM itself.

Attached is a set of patches implementing a couple of things that have
been discussed, so let's roll in. There are a couple of concepts that
are introduced in this set of patches, and those patches are aimed at
resolving the following things:
- Introduce in Postgres an extensible password aging facility, by
having a new concept of 1 user/multiple password verifier, one
password verifier per protocol.
- Give to system administrators tools to decide unsupported protocols,
and have pg_upgrade use that
- Introduce new password protocols for Postgres, aimed at replacing
existing, say limited ones.
Note that here is not discussed the point of password verifier
rolling, which is the possibility to have multiple verifiers of the
same protocol for the same user (this maps with the fact that
valid_until is still part of pg_authid here, but in order to support
authentication rolling it would be necessary to move it to
pg_auth_verifiers).

Here is a short description of each patch and what they do:
1) 0001, removing the password column from pg_authid and putting it
into a new catalog called pg_auth_verifiers that has the following
format:
- Role OID
- Password protocol
- Password verifier
The protocols proposed in this patch are "plain" and "md5", which map
to the current things that Postgres has, so there is nothing new. What
is new is the new clause PASSWORD VERIFIERS usable by CREATE/ALTER
USER, like that:
ALTER ROLE foo PASSWORD VERIFIERS (md5 = 'foo', plain = 'foo');
This is easily extensible as new protocols can be added on top of
that. This has been discussed in the previous thread.
As discussed as well previously, password_encryption is switched from
a boolean switch to a list of protocols, which is md5 by default in
this patch.
Also, as discussed in 6174.1455501497@sss.pgh.pa.us, pg_shadow has
been changed so as the password value is replaced by '*****'.
This patch adds docs, regression tests, pg_dump support, etc.

2) 0002, introduction of a new GUC parameter password_protocols
(superuser-only) aimed at controlling the password verifiers of
protocols that can be created. This is quite simple: all the protocols
specified in this list define what are the protocols allowed when
creating password verifiers using CREATE/ALTER ROLE. By default, and
in this patch, this is set to 'plain,md5', which is the current
default in Postgres, though a system admin could set it to 'md5', to
forbid the creation of unencrypted passwords for example. Docs and
regressions are added on the stack, the regression tests taking
advantage of the fact that this is a superuser parameter.
This patch answers remarks made in the last thread about there being
no way for a system to control which password verifier types get
created. Protocol aging starts to make sense with this patch and
0003...
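
To make the 0002 behaviour concrete, here is a minimal Python sketch of
the allow-list check described above. It is only a model, not the
patch's C code: the function names are hypothetical, and the error text
is borrowed from the regression output quoted later in this thread.

```python
def parse_protocol_list(value):
    """Split a comma-separated, GUC-style value such as 'plain,md5'."""
    return [item.strip() for item in value.split(",") if item.strip()]


def check_verifiers_allowed(requested, password_protocols):
    """Raise if any requested verifier protocol is not in the allow-list.

    `requested` models PASSWORD VERIFIERS (md5 = 'foo', plain = 'foo')
    as a dict mapping protocol name to password (hypothetical shape).
    """
    allowed = set(parse_protocol_list(password_protocols))
    for protocol in requested:
        if protocol not in allowed:
            raise ValueError(
                "specified password protocol not allowed: " + protocol)


# With the default described for 0002, both protocols may be created.
check_verifiers_allowed({"md5": "foo", "plain": "foo"}, "plain,md5")

# An admin restricting the list to 'md5' forbids new plain verifiers.
try:
    check_verifiers_allowed({"plain": "foo"}, "md5")
    rejected = False
except ValueError:
    rejected = True
```

In this model the list only gates creation of new verifiers; entries
that already exist are left alone until something like 0003 removes
them.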

3) 0003, Introduction of a system function, that I called
pg_auth_verifiers_sanitize, which is superuser-only, aimed at cleaning
up password verifiers in pg_auth_verifiers depending on what the user
has defined in password_protocols. This basically does a heap scan of
pg_auth_verifiers, and deletes the tuple entries that are of protocols
not listed in password_protocols. I have hesitated to put that in
pg_upgrade_support.c, perhaps it would make more sense to have it
there, but feedback is welcome. I have in mind that it is actually
useful for users to have this function at hand to do post-upgrade
cleanup operations. Regression tests cannot be added for this one, I
guess the reason to not have them is obvious when considering
installcheck...
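
As a rough illustration of what the sanitize step in 0003 does, the
following Python sketch filters an in-memory stand-in for
pg_auth_verifiers. The row layout (roleid, protocol, verifier) follows
the catalog description in 0001; the function name, the sample OIDs and
the placeholder verifier strings are assumptions made for this example.

```python
def sanitize_verifiers(rows, password_protocols):
    """Drop verifier rows whose protocol is not listed in the setting.

    Returns the surviving rows and the number of rows removed, which
    mimics a scan that deletes entries of unlisted protocols.
    """
    allowed = {p.strip() for p in password_protocols.split(",") if p.strip()}
    kept = [row for row in rows if row["protocol"] in allowed]
    return kept, len(rows) - len(kept)


# Hypothetical catalog content: one role has two verifiers.
catalog = [
    {"roleid": 16384, "protocol": "plain", "verifier": "foo"},
    {"roleid": 16384, "protocol": "md5", "verifier": "md5<hash>"},
    {"roleid": 16385, "protocol": "md5", "verifier": "md5<hash>"},
]

# Retire 'plain' by leaving it out of the allowed list.
kept, removed = sanitize_verifiers(catalog, "md5")
```

Note that a role whose only verifier uses a retired protocol ends up
with no verifier at all in this model, which is why the default value
of the list matters so much for pg_upgrade.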

4) 0004, Have pg_upgrade make use of the system function introduced by
0003. This is quite simple, and this allows pg_upgrade to remove
entries of outdated protocols.

Those 4 patches are aimed at putting in-core basics for the concept I
call password protocol aging, which is a way to allow multiple
password protocols to be defined in Postgres, and aimed at easing
administration as well as retirement of outdated protocols, which is
something that is not doable now in Postgres.

The second set of patches, 0005~0009, introduces a new protocol, SCRAM.
This is a brushed up, rebased version of the previous patches, and is
divided as follows:
5) 0005, Move of SHA1 routines of pgcrypto to src/common to allow
frontend authentication code path to use SHA1.
6) 0006 is a refactoring of sendAuthRequest that taken independently
makes sense.
7) 0007 is a small refactoring of RandomSalt(), to allow this function
to handle salt values of different lengths
8) 0008 is another refactoring, moving a set of encoding routines from
the backend's encode.c to src/common, escape, base64 and hex are moved
as such, though SCRAM uses only base64. For consistency moving all the
set made more sense to me.
9) 0009 is the SCRAM authentication itself....

The first 4 patches obviously are the core portion that I would like
to discuss about in this CF, as they put in the base for the rest, and
will surely help Postgres long-term. 0005~0008 are just refactoring
patches, so they are quite simple. 0009 though is quite difficult, and
needs careful review because it manipulates areas of the code where it
is not necessary to be an authenticated user, so if there are bugs in
it, it would be possible for example to crash Postgres just by
sending authentication requests.
Regards,
--
Michael

On 1 March 2016 at 06:34, Michael Paquier <michael.paquier@gmail.com> wrote:

On Mon, Feb 29, 2016 at 8:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch

Thanks for the input!

> 0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing
> whitespace.
> warning: 1 line adds whitespace errors.
> 0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
> if (!superuser())
> warning: 1 line adds whitespace errors.

Argh, yes. Those two have slipped through my successive rebases I
think. Will fix in my tree, I don't think that it is worth sending
the whole series again just for that though.
--
Michael


Hi, Michael

A few questions about the documentation.

config.sgml:1200

> <listitem>
>  <para>
>   Specifies a comma-separated list of supported password formats by
>   the server. Supported formats are currently <literal>plain</> and
>   <literal>md5</>.
>  </para>
>  <para>
>   When a password is specified in <xref linkend="sql-createuser"> or
>   <xref linkend="sql-alterrole">, this parameter determines if the
>   password specified is authorized to be stored or not, returning
>   an error message to caller if it is not.
>  </para>
>  <para>
>   The default is <literal>plain,md5,scram</>, meaning that MD5-encrypted
>   passwords, plain passwords, and SCRAM-encrypted passwords are accepted.
>  </para>
> </listitem>

The default value contains "scram". Shouldn't it be listed here as well:

>   Specifies a comma-separated list of supported password formats by
>   the server. Supported formats are currently <literal>plain</>,
>   <literal>md5</> and <literal>scram</>.

Or I missed something?

And one more:

config.sgml:1284

> <para>
>  <varname>db_user_namespace</> causes the client's and
>  server's user name representation to differ.
>  Authentication checks are always done with the server's user name
>  so authentication methods must be configured for the
>  server's user name, not the client's. Because
>  <literal>md5</> uses the user name as salt on both the
>  client and server, <literal>md5</> cannot be used with
>  <varname>db_user_namespace</>.
> </para>

Looks like the same (please correct me if I'm wrong) applies to "scram", as I see from the code below. Shouldn't "scram" be mentioned here as well? Here's the code:

> diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
> index 28f9fb5..df0cc1d 100644
> --- a/src/backend/libpq/hba.c
> +++ b/src/backend/libpq/hba.c
> @@ -1184,6 +1184,19 @@ parse_hba_line(List *line, int line_num, char *raw_line)
>                 }
>                 parsedline->auth_method = uaMD5;
>         }
> +       else if (strcmp(token->string, "scram") == 0)
> +       {
> +               if (Db_user_namespace)
> +               {
> +                       ereport(LOG,
> +                                       (errcode(ERRCODE_CONFIG_FILE_ERROR),
> +                                        errmsg("SCRAM authentication is not supported when \"db_user_namespace\" is enabled"),
> +                                        errcontext("line %d of configuration file \"%s\"",
> +                                                               line_num, HbaFileName)));
> +                       return NULL;
> +               }
> +               parsedline->auth_method = uaSASL;
> +       }
>         else if (strcmp(token->string, "pam") == 0)
> #ifdef USE_PAM
>                 parsedline->auth_method = uaPAM;

Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

02 March 2016, 06:52:24

On Wed, Mar 2, 2016 at 4:05 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> [...]

Thanks for the review.

> The default value contains "scram". Shouldn't it be listed here as well:
>
>>        Specifies a comma-separated list of supported password formats by
>>        the server. Supported formats are currently <literal>plain</>,
>>        <literal>md5</> and <literal>scram</>.
>
> Or I missed something?

Ah, I see. That's in the documentation of password_protocols. Yes
scram should be listed there as well. That should be fixed in 0009.

>>       <para>
>>        <varname>db_user_namespace</> causes the client's and
>>        server's user name representation to differ.
>>        Authentication checks are always done with the server's user name
>>        so authentication methods must be configured for the
>>        server's user name, not the client's.  Because
>>        <literal>md5</> uses the user name as salt on both the
>>        client and server, <literal>md5</> cannot be used with
>>        <varname>db_user_namespace</>.
>>       </para>
>
> Looks like the same (please correct me if I'm wrong) applies to "scram",
> as I see from the code below. Shouldn't "scram" be mentioned here as well?

Oops. Good catch. Yes it should be mentioned as part of the SCRAM patch (0009).
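
For readers wondering why the user name matters for md5, here is a
small Python sketch. It relies on the stored md5 verifier being the
string "md5" followed by the hex MD5 of the password concatenated with
the user name. The role and database names are made up, and the extra
per-connection salt applied on the wire is left out.

```python
import hashlib


def md5_verifier(password, username):
    """Build an md5-style verifier: 'md5' + hex(md5(password + username))."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


# With db_user_namespace the server-side role name carries a database
# suffix, while the client only knows the short name (names are
# hypothetical).
server_side = md5_verifier("secret", "joe@salesdb")
client_side = md5_verifier("secret", "joe")

# Same password, different user name used as salt: the values differ.
mismatch = server_side != client_side
```

Because both sides must hash with the same user name, the two
representations can never agree, which is the limitation the quoted
documentation paragraph describes for md5.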
-- 
Michael

Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From

Valery Popov

Date:

02 March 2016, 08:43:31

>>        <para>
>>         <varname>db_user_namespace</> causes the client's and
>>         server's user name representation to differ.
>>         Authentication checks are always done with the server's user name
>>         so authentication methods must be configured for the
>>         server's user name, not the client's.  Because
>>         <literal>md5</> uses the user name as salt on both the
>>         client and server, <literal>md5</> cannot be used with
>>         <varname>db_user_namespace</>.
>>        </para>
Also in doc/src/sgml/ref/create_role.sgml, instead of
     <term>PASSWORD VERIFIERS ( <replaceable
class="PARAMETER">verifier_type</replaceable> = '<replaceable
class="PARAMETER">password</replaceable>'</term>
it should be like this:
     <term><literal>PASSWORD VERIFIERS</> ( <replaceable
class="PARAMETER">verifier_type</replaceable> = '<replaceable
class="PARAMETER">password</replaceable>'</term>

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company

Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

02 March 2016, 11:55:22

On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
>
>>>        <para>
>>>         <varname>db_user_namespace</> causes the client's and
>>>         server's user name representation to differ.
>>>         Authentication checks are always done with the server's user name
>>>         so authentication methods must be configured for the
>>>         server's user name, not the client's.  Because
>>>         <literal>md5</> uses the user name as salt on both the
>>>         client and server, <literal>md5</> cannot be used with
>>>         <varname>db_user_namespace</>.
>>>        </para>
>
> Also in doc/src/sgml/ref/create_role.sgml is should be instead of
>       <term>PASSWORD VERIFIERS ( <replaceable
> class="PARAMETER">verifier_type</replaceable> = '<replaceable
> class="PARAMETER">password</replaceable>'</term>
> like this
>       <term><literal>PASSWORD VERIFIERS</> ( <replaceable
> class="PARAMETER">verifier_type</replaceable> = '<replaceable
> class="PARAMETER">password</replaceable>'</term>

So the <literal> markup is missing. Thanks. I am taking note of it.
-- 
Michael

Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From

Valery Popov

Date:

03 March 2016, 10:32:43

This is a review of "Password identifiers, protocol aging and SCRAM
protocol" patches
http://www.postgresql.org/message-id/CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com

Contents & Purpose
--------------------------
There was a discussion dedicated to SCRAM:
http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi

This set of patches implements the following:
- Introduce in Postgres an extensible password aging facility, by having
a new concept of 1 user/multiple password verifier, one password
verifier per protocol.
- Give to system administrators tools to decide unsupported protocols,
and have pg_upgrade use that
- Introduce new password protocols for Postgres, aimed at replacing
existing, say limited ones.

This set consists of 9 separate patches.
Each patch is well described in the initial thread email and in
comments.
The first set of patches, 0001-0004, adds a facility to store multiple
password verifiers: CREATE ROLE and ALTER ROLE are extended with
PASSWORD VERIFIERS, a new superuser GUC parameter specifies the list of
password protocols supported by the Postgres backend, a
pg_auth_verifiers_sanitize function is added, pg_upgrade removes
password verifiers for unsupported protocols, and more.
The second set of patches, 0005~0008, prepares the ground for a new
protocol, SCRAM, and 0009 is SCRAM itself.

Initial Run
-------------
Included in the patches are:
- source code
- regression tests
- documentation
The source code is well commented.
The patches are in context diff format and were applied correctly to
HEAD (there were 2 warnings, which were fixed by the author).
There were several markup warnings, which should be fixed by the author.
Regression tests pass successfully, without errors. It seems that the
patches work as expected.
The patch 0009 depends on all previous patches 0001-0008: first we need
to apply patches 0001-0008, then 0009.

Performance
-----------
I have not tested possible performance issues yet.

Conclusion
--------------
I think the introduced features are useful and I vote +1 for commit.

On 03/02/2016 02:55 PM, Michael Paquier wrote:
> On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> So the <literal> markup is missing. Thanks. I am taking note of it.

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

David Steele

Date:

14 March 2016, 15:32:38

On 2/23/16 2:17 AM, Michael Paquier wrote:

> As a continuation of the thread firstly dedicated to SCRAM:
> http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
> Here is a new thread aimed at gathering all the ideas of this previous
> thread and aimed at clarifying a bit what has been discussed until now
> regarding password protocols, verifiers, and SCRAM itself.

It looks like this patch set is a bit out of date.

When applying 0004:

$ git apply 
../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch
error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply

Then I tried to build with just 0001-0003:

cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
3318
3319
3320
3321
3322
make[3]: *** [postgres.bki] Error 1

Could you provide an updated set of patches for review?  Meanwhile I am 
marking this as "waiting for author".

Thanks,
-- 
-David
david@pgmasters.net

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

14 March 2016, 16:06:13

On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
> On 2/23/16 2:17 AM, Michael Paquier wrote:
>
>> As a continuation of the thread firstly dedicated to SCRAM:
>> http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
>> Here is a new thread aimed at gathering all the ideas of this previous
>> thread and aimed at clarifying a bit what has been discussed until now
>> regarding password protocols, verifiers, and SCRAM itself.
>
>
> It looks like this patch set is a bit out of date.
>
> When applying 0004:
>
> $ git apply
> ../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch
> error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
> error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply
>
> Then I tried to build with just 0001-0003:
>
> cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
> 3318
> 3319
> 3320
> 3321
> 3322
> make[3]: *** [postgres.bki] Error 1
>
> Could you provide an updated set of patches for review?  Meanwhile I am
> marking this as "waiting for author".

Sure. I'll provide them shortly with all the comments addressed. Up to
now I just had a couple of comments about docs and whitespaces, so I
didn't really bother sending a new set, but this merits a rebase.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

14 March 2016, 23:08:05

On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>> Could you provide an updated set of patches for review?  Meanwhile I am
>> marking this as "waiting for author".
>
> Sure. I'll provide them shortly with all the comments addressed. Up to
> now I just had a couple of comments about docs and whitespaces, so I
> didn't really bother sending a new set, but this merits a rebase.

And here they are. I have addressed the documentation and the
whitespaces reported up to now at the same time.
--
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Valery Popov

Date:

15 March 2016, 14:47:12

Hi, All

On 03/15/2016 02:07 AM, Michael Paquier wrote:
> Sure. I'll provide them shortly with all the comments addressed. Up to
> now I just had a couple of comments about docs and whitespaces, so I
> didn't really bother sending a new set, but this merits a rebase.
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.
I've applied all of the 0001-0009 patches from the new set to today's
master branch without any warnings.
Then compiled with  configure options:
./configure --enable-debug --enable-nls --enable-cassert
--enable-tap-tests --with-perl
All regression tests passed successfully.
make check-world passed successfully.
make installcheck-world failed on several contrib modules:
dblink, file_fdw, hstore, pgcrypto, pgstattuple, postgres_fdw,
tablefunc. The test results are attached.
Documentation looks good.
Where might the problem with the make check-world and make
installcheck-world results be?

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company

On Fri, Mar 18, 2016 at 3:16 AM, David Steele <david@pgmasters.net> wrote:
> Here's my full review of this patch set.

Thanks!

> First let me thank you for submitting this patch for the current CF.  I
> feel a bit guilty that I requested it and am only now posting a full
> review.  In my defense I can only say that being CFM has been rather
> more work than I was expecting, but I'm sure you know the feeling.

I get the idea. That's a very draining activity and I can see what you
are doing. That's impressive. Really.

> * [PATCH 1/9] Add facility to store multiple password verifiers
>
> This is a pretty big patch but I went through it carefully and found
> nothing to complain about.  Your attention to detail is impressive as
> always.
>
> Be sure to update the column names for pg_auth_verifiers as we discussed
> in [1].

Done. I have added as well the block of 0009 you pointed out into this
patch for clarity.

> * [PATCH 2/9] Introduce password_protocols
>
> diff --git a/src/test/regress/expected/password.out
> b/src/test/regress/expected/password.out
> +SET password_protocols = 'plain';
> +ALTER ROLE role_passwd5 PASSWORD VERIFIERS (plain = 'foo'); -- ok
> +ALTER ROLE role_passwd5 PASSWORD VERIFIERS (md5 = 'foo'); -- error
> +ERROR:  specified password protocol not allowed
> +DETAIL:  List of authorized protocols is specified by password_protocols.
>
> So that makes sense but you get the same result if you do:
>
> postgres=# alter user role_passwd5 password 'foo';
> ERROR:  specified password protocol not allowed
> DETAIL:  List of authorized protocols is specified by password_protocols.
>
> I don't think this makes sense - if I have explicitly set
> password_protocols to 'plain' and I don't specify a verifier for alter
> user then it seems like it should work.  If nothing else the error
> message lacks information needed to identify the problem.

Hm. The problem here is the interaction between the new
password_protocols and the existing password_encryption.
password_protocols implies that password_encryption should not
contain elements not listed in it, in short password_protocols @>
password_encryption. So I think that the GUC callbacks checking the
validity of those parameter values should each check that the other
parameter is not set to an incompatible value. One way to simplify
those validity checks would be to make password_protocols
PGC_POSTMASTER, i.e. it needs a restart to be updated. This sacrifices
a large portion of the regression tests though... Do others have
thoughts to share? I have not updated the patch yet, and I would
personally leave both parameters as they are now, i.e.
password_protocols as PGC_SUSET and password_encryption as
PGC_USERSET, and check their validity when they are updated, but I am
not alone here (hopefully).
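
The containment rule password_protocols @> password_encryption can be
modelled in a few lines of Python. This is only a sketch of the
invariant, not of the GUC check hooks themselves; the helper names are
hypothetical, and the failing case assumes password_encryption is still
at the md5 default mentioned for 0001.

```python
def as_set(value):
    """Turn a comma-separated setting such as 'plain,md5' into a set."""
    return {item.strip() for item in value.split(",") if item.strip()}


def encryption_is_consistent(password_protocols, password_encryption):
    """True when every protocol used for encryption is also allowed."""
    return as_set(password_encryption) <= as_set(password_protocols)


# Defaults described in this thread: allowed 'plain,md5', encryption 'md5'.
defaults_ok = encryption_is_consistent("plain,md5", "md5")

# The reported case: password_protocols = 'plain' while
# password_encryption is still 'md5', so a bare
# ALTER USER ... PASSWORD 'foo' asks for a verifier that is not allowed.
reported_case_ok = encryption_is_consistent("plain", "md5")
```

Checking this when either setting changes would let the error point at
the conflicting parameters instead of surfacing later at ALTER USER
time.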

> * [PATCH 3/9] Add pg_auth_verifiers_sanitize
>
> This function is just a little scary but since password_protocols
> defaults to 'plain,md5' I can live with it.

Another thing that I thought about was to integrate it as part of
pg_upgrade_support. It's no big deal to do it that way as well,
though I thought that it could be useful for admins. So extra ideas
are welcome. It's superuser-only anyway... And a critical part of
managing old protocol deprecation.

> * [PATCH 4/9] Remove password verifiers for unsupported protocols in
> pg_upgrade
>
> Same as above - it will always be important for password_protocols to
> default to *all* protocols to avoid data being dropped during the
> pg_upgrade by accident.  You've done that here (and later in the SCRAM
> patch) so I'm satisfied but it bears watching.

We could have an extra keyword like "all" mapping to all the
existing protocols, but I find listing the protocols explicitly a more
explicit and simpler concept, that's why I chose that.

> What I would do is add some extra comments in the GUC code to make it
> clear to always update the default when adding new verifiers.

Good idea.

> * [PATCH 5/9] Move sha1.c to src/common
>
> This looks fine to me and is a good reuse of code.

Yes.

> * [PATCH 6/9] Refactor sendAuthRequest
>
> I tested this across different client versions and it seems to work fine.

OK, cool!

> * [PATCH 7/9] Refactor RandomSalt to handle salts of different lengths
>
> A simple enough refactor.

That's something we should do as an independent change I think.

> * [PATCH 8/9] Move encoding routines to src/common/
>
> A bit surprising that these functions were never used by any front end code.

Perhaps there are some client tools that copy-paste it. I cannot be
sure. At least it seems to me that this is useful enough as an
independent change.

> * Subject: [PATCH 9/9] SCRAM authentication
>
> diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
> @@ -1616,18 +1619,34 @@ FlattenPasswordIdentifiers(List *verifiers, char
> *rolname)
>                  * instances of Postgres, an md5 hash passed as a plain verifier
>                  * should still be treated as an MD5 entry.
>                  */
> -               if (spec->veriftype == AUTH_VERIFIER_MD5 &&
> -                       !isMD5(spec->value))
> +               switch (spec->veriftype)
>                 {
> -                       char encrypted_passwd[MD5_PASSWD_LEN + 1];
> -                       if (!pg_md5_encrypt(spec->value, rolname, strlen(rolname),
> -                                                               encrypted_passwd))
> -                               elog(ERROR, "password encryption failed");
> -                       spec->value = pstrdup(encrypted_passwd);
> +                       case AUTH_VERIFIER_MD5:
>
> It seems like this case statement should have been introduced in patch
> 0001.  Were you just trying to avoid churn in the code unless SCRAM is
> committed?

Yeah, right. I have now plugged this portion into 0001.

> diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
> +
> +static char *
> +read_attr_value(char **input, char attr)
> +{
>
> Numerous functions like the above in auth-scram.c do not have comments.

Noted. I have done nothing on that yet though :) And I am lowering the
priority for 0009 in this CF to keep focus on the core machinery
instead, as well as other patches that need feedback.

> diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
> +       else if (strcmp(token->string, "scram") == 0)
> +       {
> +               if (Db_user_namespace)
> +               {
> +                       ereport(LOG,
> +                                       (errcode(ERRCODE_CONFIG_FILE_ERROR),
> +                                        errmsg("SCRAM authentication is not supported when
> \"db_user_namespace\" is enabled"),
> +                                        errcontext("line %d of configuration file \"%s\"",
> +                                                               line_num, HbaFileName)));
> +                       return NULL;
> +               }
> +               parsedline->auth_method = uaSASL;
> +       }
>
> Why is that?  Is it because gss auth should be expected in this case or
> some limitation of SCRAM?  Anyway, it wasn't clear to me why this would
> be true so some comments here would be good.

The username is part of the identifier used as part of the protocol,
so we cannot rely on mappings of db_user_namespace.

> diff --git a/src/common/scram-common.c b/src/common/scram-common.c
> +void
> +scram_HMAC_update(scram_HMAC_ctx *ctx, const char *str, int slen)
> +{
> +       SHA1Update(&ctx->sha1ctx, (const uint8 *) str, slen);
> +}
>
> Same in scram-common.c WRT comments.

OK, noted. I have not updated those comments yet though. At this stage
of the game considering 0009 for integration is a rather difficult
task, and I expect enough work on the underlying patches. For 9.6,
I would be happy enough if we got the basic infra in core.

> diff --git a/src/include/common/scram-common.h
> b/src/include/common/scram-common.h
> +extern void scram_ClientOrServerKey(const char *password, const char
> *salt, int saltlen, int iterations, const char *keystr, uint8 *result);
>
> My, that's a very long line!

Oops. Sorry.
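
For context on what a function with the signature of
scram_ClientOrServerKey is expected to compute, here is a Python sketch
based on RFC 5802 with SHA-1, which is what the quoted scram-common.c
appears to use. It assumes the patch follows the RFC derivation; the
Python function name and the sample salt are made up, and SASLprep
normalization of the password is omitted.

```python
import hashlib
import hmac


def scram_client_or_server_key(password, salt, iterations, keystr):
    """Derive a SCRAM key as HMAC(SaltedPassword, keystr).

    SaltedPassword is Hi(password, salt, iterations) from RFC 5802,
    which is PBKDF2 with HMAC-SHA-1 and a 20-byte output.
    """
    salted = hashlib.pbkdf2_hmac("sha1", password, salt, iterations)
    return hmac.new(salted, keystr, hashlib.sha1).digest()


salt = b"0123456789"  # hypothetical 10-byte salt
client_key = scram_client_or_server_key(b"pencil", salt, 4096, b"Client Key")
server_key = scram_client_or_server_key(b"pencil", salt, 4096, b"Server Key")

# Per RFC 5802 the server stores H(ClientKey), not the client key itself.
stored_key = hashlib.sha1(client_key).digest()
```

The same routine yields either key depending on the constant string
passed in, which matches the keystr argument in the quoted prototype.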

> * A few general things:
>
> Most of the new scram modules are seriously in need of better comments -
> I pointed out a few but all the new files suffer from this lack.

Indeed. Honestly, as you say, time flies, and by the time of the
feature freeze I am thinking that the only sane target for the CF
would be to focus on 0001~0004. That's the basic infrastructure I
think we need anyway. 0005~0008 are things that I think are useful
taken independently and are simple refactoring, so they could be
considered with the time frame we have. 0009 is a bit too complex. I
expect enough comments on the first patches to keep me busy until
the end of this CF without that, though it is still useful for
testing by the way.

> The strings "plain", "md5", and "scram" are used often enough that I
> think it would be nice if they were constants.

This makes sense. So I switched the code this way. Note that for md5 I
think that it makes sense to use a #define variable when referring to
the verifier method, not when referring to the prefix of a md5
verifier. Those full names are added in pg_auth_verifiers.h.

> I feel the same way
> about verifier methods 'm', 'p', 's' -- perhaps more so because they
> aren't very verbose.

I am thinking of the verifier abbreviations in the system catalog in a
way similar to pg_class' relkind, which explains the one-character
identifier, so I would like to leave them as-is.

> It looks like this will need a bit of work if the GSSAPI patch goes in
> (and vice versa).  Not a problem but you'll need to be prepared to do
> that quickly in the event - time is flying.

Rebasing this set of patches is not an issue for me. The only
conflicts that I anticipate are on 0009, but I don't have high hopes
of getting this portion integrated into core for 9.6, the rest of the
patches is complicated enough, and everyone's bandwidth is limited.
--
Michael

On Tue, Mar 22, 2016 at 2:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote:

On Mon, Mar 21, 2016 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Well, I said before and I'll say again that I don't like the idea of
> multiple password verifiers. I think that's an accident waiting to
> happen, and I'm not prepared to put in the amount of time and energy
> that it would take to get that feature committed despite not wanting
> it myself, or for being responsible for it afterwards. I'd prefer we
> didn't do it at all, although I'm not going to dig in my heels. I
> might be willing to deal with SCRAM itself, but this whole area is not
> my strongest suit. So ideally some other committer would be willing
> to pick this up.

I won't bet my hand on that.

In principle I'd be happy to look at it, but I doubt that I will have enough time to get it done within this CF unfortunately. Thus I'd rather not commit to doing it. It kind of fell off my radar too long ago, as I was originally planning to look at it back in the autumn, but failed.

So basically, if somebody else has the cycles to do it in time for 9.6, please do.

I have marked the patch as returned with feedback.

Yeah, unfortunately I think that's probably right. Let's focus on things that have a better chance of making it.

Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Re: Password identifiers, protocol aging and SCRAM protocol

From

Julian Markwort

Date:

29 March 2016, 16:44:13

----[This is a rather informal user-review]----

Here are some thoughts and experiences on using the new features. I
focused on testing the basic functionality of setting password_encryption
to scram and then generating some users with passwords. After that, I
took a look at the documentation, specifically all those parts that
mention "md5" but not SCRAM, so I took some time to write those down
and add my thoughts on them.

We're quite keen on seeing these features in a future release, so I
suggest that we add these patches to the next commitfest asap in order
to keep the discussion on this topic flowing.

For those of you who like to put the authentication method itself up for
discussion, I'd like to add that it seems fairly simple to insert code
for new authentication mechanisms.
In conclusion I think these patches are very useful.

My remarks follow below.

Kind regards,
Julian Markwort
julian.markwort@uni-muenster.de

Things I noticed:
1.
    when using either
        CREATE ROLE
        ALTER ROLE
    with the parameter
        ENCRYPTED
    md5 encryption is always assumed (I've come to realize that UNENCRYPTED
    always equals plain and, in the past, ENCRYPTED equaled md5 since there
    were no other options)
    I don't know if this is intended behaviour. Maybe this option should be
    omitted (or marked as deprecated in the documentation) from the
    CREATE/ALTER functions (since without this Option, the
    password_encryption from pg_conf.hba is used)
    or maybe it should have it's own parameter like
        CREATE ROLE testuser WITH LOGIN ENCRYPTED 'SCRAM' PASSWORD 'test';
    so that the desired encryption is used.
    From my point of view, this would be the sensible thing to do,
    especially if different verifiers should be allowed (as proposed by
    these patches).
    In either case, a bit of text explaining the (UN)ENCRYPTED option should
    be added to the documentation of the CREATE/ALTER ROLE functions.

2.
    Documentation
    III.
        17. Server Setup and Operation
            17.2. Creating a Database Cluster: maybe list SCRAM as a
            possible method for securing the db-admin
        19. Client Authentication
            19.1. The pg_hba.conf File: SCRAM is not listed in the list of
            available auth_methods to be specified in pg_conf.hba
            19.3 Authentication Methods
                19.3.2 Password Authentication: SCRAM would belong to the
                same category as md5 and password, as they are all
                password-based.
        20. Database Roles
            20.2. Role Attributes: password : list SCRAM as authentication
            method as well
    VI.
        ALTER ROLE: is SCRAM also dependent on the role name for salting? if
        so, add warning.
            (it doesn't seem that way, however I'm curious as to why the
            function FlattenPasswordIdentifiers in
            src/backend/commands/user.c called by AlterRole passes rolname
            to scram_build_verifier(), when that function does absolutely
            nothing with this argument?)
        CREATE ROLE: can SCRAM also be used in the list of PASSWORD
        VERIFIERS?
    VII.
        49. System Catalogs:
            49.9 pg_auth_verifiers: Column names and types are mixed up
                in description for column vervalue: explain some basic
                stuff about md5 maybe as well?
                remark: the statements about the composition of the string
                that is md5-hashed are contradictory. (concatenating "bar"
                to "foo" results in foobar, not the other way round, as it
                is implied in the explanation of the md5 hashing), this
                however, is not really linked to the changes introduced
                with these patches.
                remark: naming inconsistency: md5 vervalues are stored
                "md5*" why don't we take the same approach and use it on
                SCRAM hashes (i.e. "scram*" ). (if this is a general
                convention thing, please ignore this comment, however I
                couldn't find anything in the relevant RFC's while skimming
                through them).
        50. Frontend/Backend Protocol
            50.2.1 Start-up: add explanation for
            "AuthenticationSCRAMPassword" authentication request message. (?)
            50.5 message formats see 50.2.1

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

30 March 2016, 13:46:50

On Wed, Mar 30, 2016 at 1:44 AM, Julian Markwort
<julian.markwort@uni-muenster.de> wrote:
> ----[This is a rather informal user-review]----
>
> Here are some thoughts and experiences on using the new features. I focused
> on testing the basic functionality of setting password_encryption to scram
> and then generating some users with passwords. After that, I took a look at
> the documentation, specifically all those parts that mentioned "md5", but
> not SCRAM, so I took some time to write those down and add my thoughts on
> them.
>
> We're quite keen on seeing these features in a future release, so I suggest
> that we add these patches to the next commitfest asap in order to keep the
> discussion on this topic flowing.
>
> For those of you who like to put the authentication method itself up for
> discussion, I'd like to add that it seems fairly simple to insert code for
> new authentication mechanisms.
> In conclusion I think these patches are very useful.

The reception of the concept of multiple password verifiers for a
single role was rather... cold. So unless a committer pushes hard
for it, it is never going to show up. There is clear consensus that SCRAM
is something needed though, so we may as well just focus on that.

> Things I noticed:
> 1.
>     when using either
>         CREATE ROLE
>         ALTER ROLE
>     with the parameter
>         ENCRYPTED
>     md5 encryption is always assumed (I've come to realize that UNENCRYPTED
> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
> no other options)

Yes, that's to match the current behavior, and make something fully
backward-compatible. Switching to md5 + scram may have made sense as
well though.

>     I don't know if this is intended behaviour.

This is an intended behavior.

> Maybe this option should be
> omitted (or marked as deprecated in the documentation) from the CREATE/ALTER
> functions (since without this Option, the password_encryption from
> pg_conf.hba is used)
>     or maybe it should have it's own parameter like
>         CREATE ROLE testuser WITH LOGIN ENCRYPTED 'SCRAM' PASSWORD 'test';
>     so that the desired encryption is used.
>     From my point of view, this would be the sensible thing to do,
> especially if different verifiers should be allowed (as proposed by these
> patches).

The extension PASSWORD VERIFIERS is aimed at covering this need. The
grammar of those queries is not a fixed thing though.

>     In either case, a bit of text explaining the (UN)ENCRYPTED option should
> be added to the documentation of the CREATE/ALTER ROLE functions.

It is specified here;
http://www.postgresql.org/docs/devel/static/sql-createrole.html
And the patch does not ignore that.

> 2.
>     Documentation
>     III.
>         17. Server Setup and Operation
>             17.2. Creating a Database Cluster: maybe list SCRAM as a
> possible method for securing the db-admin

Indeed.

>         19. Client Authentication
>             19.1. The pg_hba.conf File: SCRAM is not listed in the list of
> available auth_methods to be specified in pg_conf.hba
>             19.3 Authentication Methods
>                 19.3.2 Password Authentication: SCRAM would belong to the
> same category as md5 and password, as they are all password-based.
>
>         20. Database Roles
>             20.2. Role Attributes: password : list SCRAM as authentication
> method as well

Indeed.

>     VI.
>         ALTER ROLE: is SCRAM also dependent on the role name for salting? if
> so, add warning.

No.

>                     (it doesn't seem that way, however I'm curious as to why
> the function FlattenPasswordIdentifiers in src/backend/commands/user.c
> called by AlterRole passes rolname to scram_build_verifier(), when that
> function does absolutely nothing with this argument?)

Yeah, this argument could be removed.

>         CREATE ROLE: can SCRAM also be used in the list of PASSWORD
> VERIFIERS?

Yes.

>     VII.
>         49. System Catalogs:
>             49.9 pg_auth_verifiers: Column names and types are mixed up
>                                     in description for column vervalue:

Yes, things are messed up a bit there. Thanks for noticing.

>                                     remark: naming inconsistency: md5
> vervalues are stored "md5*" why don't we take the same approach and use it
> on SCRAM hashes (i.e. "scram*" ).

Perhaps this makes sense if there is no pg_auth_verifiers.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

30 March 2016, 16:14:09

On Wed, Mar 30, 2016 at 9:46 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> Things I noticed:
>> 1.
>>     when using either
>>         CREATE ROLE
>>         ALTER ROLE
>>     with the parameter
>>         ENCRYPTED
>>     md5 encryption is always assumed (I've come to realize that UNENCRYPTED
>> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
>> no other options)
>
> Yes, that's to match the current behavior, and make something fully
> backward-compatible. Switching to md5 + scram may have made sense as
> well though.

I think we're not going to have much luck getting people to switch
over to SCRAM if the default remains MD5.  Perhaps there should be a
GUC for this - and we can initially set that GUC to md5, allowing
people who are ready to adopt SCRAM to change it.  And then in a later
release we can change the default, once we're pretty confident that
most connectors have added support for the new authentication method.
This is going to take a long time to roll out.  Alternatively, we
could control it strictly through DDL.

Note that the existing behavior is pretty wonky:

alter user rhaas unencrypted password 'foo'; -> rolpassword foo
alter user rhaas encrypted password 'foo'; -> rolpassword
md5e748797a605a1c95f3d6b5f140b2d528
alter user rhaas encrypted password
'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword
md5e748797a605a1c95f3d6b5f140b2d528
alter user rhaas unencrypted password
'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword
md5e748797a605a1c95f3d6b5f140b2d528

So basically the use of the ENCRYPTED keyword means "if it does
already seem to be the sort of MD5 blob we're expecting, turn it into
that".  And we just rely on the format to distinguish between an MD5
verifier and an unencrypted password.  Personally, I think a good
start here, and I think you may have something like this in the patch
already, would be to split rolpassword into two columns, say
rolencryption and rolpassword.  rolencryption says how the password
verifier is encrypted and rolpassword contains the verifier itself.
Initially, rolencryption will be 'plain' or 'md5', but later we can
add 'scram' as another choice, or maybe it'll be more specific like
'scram-hmac-doodad'.  And then maybe introduce syntax like this:

alter user rhaas set password 'raw-unencrypted-passwordt' using
'verifier-method';
alter user rhaas set password verifier 'verifier-goes-here' using
'verifier-method';

That might require making verifier a key word, which would be good to
avoid.  Perhaps we could use "password validator" instead?
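To make the format-guessing behaviour described above concrete, here is a minimal Python sketch. It is not the server code. It assumes the usual MD5 verifier layout ("md5" followed by the 32-hex-digit MD5 of the password concatenated with the role name) and a simplified shape check based on prefix and length only; all function names are hypothetical.

```python
import hashlib

MD5_VERIFIER_LEN = 3 + 32  # "md5" prefix plus 32 hex digits


def looks_like_md5_verifier(value: str) -> bool:
    # Simplified guess based only on the shape of the string (assumption).
    return value.startswith("md5") and len(value) == MD5_VERIFIER_LEN


def md5_verifier(password: str, rolename: str) -> str:
    # Assumed layout: "md5" || hex(md5(password || rolename)).
    digest = hashlib.md5((password + rolename).encode("utf-8")).hexdigest()
    return "md5" + digest


def stored_value(value: str, rolename: str, encrypted: bool) -> str:
    # Anything already shaped like an MD5 verifier is stored untouched,
    # whichever keyword was used; otherwise ENCRYPTED hashes the input
    # and UNENCRYPTED stores it as given.
    if looks_like_md5_verifier(value):
        return value
    return md5_verifier(value, rolename) if encrypted else value
```

Under these assumptions the four ALTER USER cases collapse into two outcomes, and a plain-text password that happens to be 35 characters long and start with "md5" cannot be told apart from a verifier. A separate column naming the method would remove that guess.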

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

José Luis Tallón

Date:

30 March 2016, 16:31:22

On 03/30/2016 06:14 PM, Robert Haas wrote:
> So basically the use of the ENCRYPTED keyword means "if it does 
> already seem to be the sort of MD5 blob we're expecting, turn it into 
> that". 

If it does NOT already seem to be... I guess?

> And we just rely on the format to distinguish between an MD5 verifier 
> and an unencrypted password. Personally, I think a good start here, 
> and I think you may have something like this in the patch already, 
> would be to split rolpassword into two columns, say rolencryption and 
> rolpassword. 

This inches closer to Michael's suggestion to have multiple verifiers 
per pg_authid user ...

> rolencryption says how the password verifier is encrypted and 
> rolpassword contains the verifier itself. Initially, rolencryption 
> will be 'plain' or 'md5', but later we can add 'scram' as another 
> choice, or maybe it'll be more specific like 'scram-hmac-doodad'.

May I suggest using  "{" <scheme>["."<encoding>] "}" just like Dovecot does?

e.g. "{md5.hex}e748797a605a1c95f3d6b5f140b2d528"

where no "{ ... }" prefix means just fallback to the old method of 
trying to guess what the blob contains?
This would invalidate PLAIN passwords beginning with "{", though,
so some measures would be needed.
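As an illustration of the prefix idea above, a small Python sketch of how a stored value of the form "{scheme.encoding}blob" might be split. The `Verifier` type, the accepted character set for scheme names and the fallback behaviour are assumptions made for this example, not something taken from Dovecot or from the patch.

```python
import re
from typing import NamedTuple, Optional


class Verifier(NamedTuple):
    scheme: Optional[str]    # None means "no prefix, fall back to guessing"
    encoding: Optional[str]  # None when the prefix carries no encoding
    blob: str


# "{" <scheme> ["." <encoding>] "}" followed by the verifier itself.
_PREFIX = re.compile(r"\{([A-Za-z0-9-]+)(?:\.([A-Za-z0-9]+))?\}(.*)", re.DOTALL)


def parse_verifier(stored: str) -> Verifier:
    match = _PREFIX.fullmatch(stored)
    if match is None:
        # No well-formed prefix: the caller has to guess from the format.
        return Verifier(None, None, stored)
    scheme, encoding, blob = match.groups()
    return Verifier(scheme.lower(), encoding, blob)
```

For example, `parse_verifier("{md5.hex}e748797a605a1c95f3d6b5f140b2d528")` yields scheme `md5`, encoding `hex` and the bare hash. The ambiguity mentioned above shows up directly: a plain password that itself looks like "{md5.hex}..." would be parsed as a prefixed verifier, so one possible measure (an assumption here) would be to always write an explicit "{plain}" prefix for new plain-text entries.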

> And then maybe introduce syntax like this: alter user rhaas set 
> password 'raw-unencrypted-passwordt' using 'verifier-method'; alter 
> user rhaas set password verifier 'verifier-goes-here' using 
> 'verifier-method'; That might require making verifier a key word, 
> which would be good to avoid. Perhaps we could use "password 
> validator" instead? 

I'd like USING best ... though by prepending the scheme for ENCRYPTED, 
the required information is already conveyed within the verifier, so no 
need to specify it again :)

Just my .02€

    / J.L.

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

30 March 2016, 20:34:30

On Wed, Mar 30, 2016 at 12:31 PM, José Luis Tallón
<jltallon@adv-solutions.net> wrote:
> On 03/30/2016 06:14 PM, Robert Haas wrote:
>> So basically the use of the ENCRYPTED keyword means "if it does already
>> seem to be the sort of MD5 blob we're expecting, turn it into that".
>
> If it does NOT already seem to be... I guess?

Yes, that's what I meant.  Sorry.

>> rolencryption says how the password verifier is encrypted and rolpassword
>> contains the verifier itself. Initially, rolencryption will be 'plain' or
>> 'md5', but later we can add 'scram' as another choice, or maybe it'll be
>> more specific like 'scram-hmac-doodad'.
>
> May I suggest using  "{" <scheme>["."<encoding>] "}" just like Dovecot does?

Doesn't seem very SQL-ish to me...  I think we should normalize.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

31 March 2016, 02:31:44

On Thu, Mar 31, 2016 at 1:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 30, 2016 at 9:46 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>> Things I noticed:
>>> 1.
>>>     when using either
>>>         CREATE ROLE
>>>         ALTER ROLE
>>>     with the parameter
>>>         ENCRYPTED
>>>     md5 encryption is always assumed (I've come to realize that UNENCRYPTED
>>> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
>>> no other options)
>>
>> Yes, that's to match the current behavior, and make something fully
>> backward-compatible. Switching to md5 + scram may have made sense as
>> well though.
>
> I think we're not going to have much luck getting people to switch
> over to SCRAM if the default remains MD5. Perhaps there should be a
> GUC for this - and we can initially set that GUC to md5, allowing
> people who are ready to adopt SCRAM to change it.  And then in a later
> release we can change the default, once we're pretty confident that
> most connectors have added support for the new authentication method.
> This is going to take a long time to roll out.
> Alternatively, we could control it strictly through DDL.

This overlaps quite a lot with the existing password_encryption, so adding
a GUC that only controls the protocol used for ENCRYPTED, say
password_encryption_encrypted, would be confusing. I'd rather keep
ENCRYPTED to md5 as default when password_encryption is 'on', switch
to scram a couple of releases later, and extend the DDL grammar with
something like PROTOCOL {'md5' | 'plain' | 'scram'}, which can be used
instead of UNENCRYPTED | ENCRYPTED as an additional keyword. Smooth
transition to a more-extensive system.

> Note that the existing behavior is pretty wonky:
> alter user rhaas unencrypted password 'foo'; -> rolpassword foo
> alter user rhaas encrypted password 'foo'; -> rolpassword
> md5e748797a605a1c95f3d6b5f140b2d528
> alter user rhaas encrypted password
> 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword
> md5e748797a605a1c95f3d6b5f140b2d528
> alter user rhaas unencrypted password
> 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword
> md5e748797a605a1c95f3d6b5f140b2d528

I actually wrote some regression tests for that. Those are upthread as
part of 0001, have for example a look at password.sql.

> So basically the use of the ENCRYPTED keyword means "if it does
> already seem to be the sort of MD5 blob we're expecting, turn it into
> that".  And we just rely on the format to distinguish between an MD5
> verifier and an unencrypted password.  Personally, I think a good
> start here, and I think you may have something like this in the patch
> already, would be to split rolpassword into two columns, say
> rolencryption and rolpassword.  rolencryption says how the password
> verifier is encrypted and rolpassword contains the verifier itself.

The patch has something like that. And doing this split is not that
complicated to be honest. Surely that would be clearer than relying on
the prefix of the identifier to see if it is md5 or not.

> Initially, rolencryption will be 'plain' or 'md5', but later we can
> add 'scram' as another choice, or maybe it'll be more specific like
> 'scram-hmac-doodad'.  And then maybe introduce syntax like this:
>
> alter user rhaas set password 'raw-unencrypted-passwordt' using
> 'verifier-method';
> alter user rhaas set password verifier 'verifier-goes-here' using
> 'verifier-method';
>
> That might require making verifier a key word, which would be good to
> avoid.  Perhaps we could use "password validator" instead?

Yes, that matches what I wrote above. At this point, putting that back
on the table and discussing it openly at PGCon is the best course of action
IMO.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

02 July 2016, 19:54:20

So, the consensus so far seems to be: We don't want the support for 
multiple password verifiers per user. At least not yet. Let's get SCRAM 
working first, in a way that a user can only have SCRAM or an MD5 hash 
stored in the database, not both. We can add support for multiple 
verifiers per user, password aging, etc. later. Hopefully we'll make 
some progress on those before 9.7 is released, too, but let's treat them 
as separate issues and focus on SCRAM.

I took a quick look at the patch set now again, and except that it needs 
to have the multiple password verifier support refactored out, I think 
it's in a pretty good shape. I don't like the pg_upgrade changes and its 
support function, that also seems like an orthogonal or add-on feature 
that would be better discussed separately. I think pg_upgrade should 
just do the upgrade with as little change to the system as possible, and 
let the admin reset/rehash/deprecate the passwords separately, when she 
wants to switch all users to SCRAM. So I suggest that we rip out those 
changes from the patch set as well.

In related news, RFC 7677 that describes a new SCRAM-SHA-256 
authentication mechanism, was published in November 2015. It's identical 
to SCRAM-SHA-1, which is what this patch set implements, except that 
SHA-1 has been replaced with SHA-256. Perhaps we should forget about 
SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.
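A rough Python sketch of why the switch is expected to be small: the key derivation steps of SCRAM (as described in RFC 5802) can be written once and parameterized by the hash name. This is only an illustration under simplifying assumptions: password normalization (SASLprep) is skipped, and the function name `scram_keys` is made up for the example rather than taken from the patch.

```python
import hashlib
import hmac


def scram_keys(password: bytes, salt: bytes, iterations: int, hash_name: str):
    # SaltedPassword := Hi(password, salt, i), which is PBKDF2 using HMAC
    # with an output length equal to the digest size of the chosen hash.
    salted = hashlib.pbkdf2_hmac(hash_name, password, salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key")
    client_key = hmac.new(salted, b"Client Key", hash_name).digest()
    # StoredKey := H(ClientKey); this is what the server would keep.
    stored_key = hashlib.new(hash_name, client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key")
    server_key = hmac.new(salted, b"Server Key", hash_name).digest()
    return stored_key, server_key
```

Calling it with `"sha1"` or `"sha256"` exercises the same code path; only the digest size of the resulting keys changes (20 versus 32 bytes).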

RFC 7677 also adds some verbiage, in response to vulnerabilities that 
have been found with the "tls-unique" channel binding mechanism:

>    To be secure, either SCRAM-SHA-256-PLUS and SCRAM-SHA-1-PLUS MUST be
>    used over a TLS channel that has had the session hash extension
>    [RFC7627] negotiated, or session resumption MUST NOT have been used.

So that doesn't affect details of the protocol per se, but once we 
implement channel binding, we need to check for those conditions somehow 
(or make sure that OpenSSL checks for them).

Michael, do you plan to submit a new version of this patch set for the 
next commitfest? I'd like to get this committed early in the 9.7 release 
cycle, so that we have time to work on all the add-on stuff before the 
release.

- Heikki

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

02 July 2016, 22:32:54

On Sun, Jul 3, 2016 at 4:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I took a quick look at the patch set now again, and except that it needs to
> have the multiple password verifier support refactored out, I think it's in
> a pretty good shape. I don't like the pg_upgrade changes and its support
> function, that also seems like an orthogonal or add-on feature that would be
> better discussed separately. I think pg_upgrade should just do the upgrade
> with as little change to the system as possible, and let the admin
> reset/rehash/deprecate the passwords separately, when she wants to switch
> all users to SCRAM. So I suggest that we rip out those changes from the
> patch set as well.

That's also what I recall from the consensus at PGCon: only focus
on the protocol addition and storage of the scram verifier. It was not
mentioned directly but that's what I guess should be done. So no
complaints here.

> In related news, RFC 7677 that describes a new SCRAM-SHA-256 authentication
> mechanism, was published in November 2015. It's identical to SCRAM-SHA-1,
> which is what this patch set implements, except that SHA-1 has been replaced
> with SHA-256. Perhaps we should forget about SCRAM-SHA-1 and jump straight
> to SCRAM-SHA-256.

That's worth considering. I don't think switching to that is very complicated.

> RFC 7677 also adds some verbiage, in response to vulnerabilities that have
> been found with the "tls-unique" channel binding mechanism:
>
>>    To be secure, either SCRAM-SHA-256-PLUS and SCRAM-SHA-1-PLUS MUST be
>>    used over a TLS channel that has had the session hash extension
>>    [RFC7627] negotiated, or session resumption MUST NOT have been used.
>
> So that doesn't affect details of the protocol per se, but once we implement
> channel binding, we need to check for those conditions somehow (or make sure
> that OpenSSL checks for them).

Yes.

> Michael, do you plan to submit a new version of this patch set for the next
> commitfest? I'd like to get this committed early in the 9.7 release cycle,
> so that we have time to work on all the add-on stuff before the release.

Thanks. That's good news! Yes, I am still on track to submit a patch for CF1.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

David Steele

Date:

03 July 2016, 03:06:48

On 7/2/16 6:32 PM, Michael Paquier wrote:
> On Sun, Jul 3, 2016 at 4:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>> Michael, do you plan to submit a new version of this patch set for the next
>> commitfest? I'd like to get this committed early in the 9.7 release cycle,
>> so that we have time to work on all the add-on stuff before the release.
> 
> Thanks. That's good news! Yes, I am still on track to submit a patch for CF1.

And I'm on board for reviews, testing, and whatever else I can help with.

-- 
-David
david@pgmasters.net

Re: Password identifiers, protocol aging and SCRAM protocol

From

Peter Eisentraut

Date:

03 July 2016, 21:34:14

On 7/2/16 3:54 PM, Heikki Linnakangas wrote:
> In related news, RFC 7677 that describes a new SCRAM-SHA-256
> authentication mechanism, was published in November 2015. It's identical
> to SCRAM-SHA-1, which is what this patch set implements, except that
> SHA-1 has been replaced with SHA-256. Perhaps we should forget about
> SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.

I think a global change from SHA-1 to SHA-256 is in the air already, so 
if we're going to release something brand new in 2017 or so, it should 
be SHA-256.

I suspect this would be a relatively simple change, so I wouldn't mind 
seeing a SHA-1-based variant in CF1 to get things rolling.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

04 July 2016, 03:54:41

On Mon, Jul 4, 2016 at 6:34 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 7/2/16 3:54 PM, Heikki Linnakangas wrote:
>>
>> In related news, RFC 7677 that describes a new SCRAM-SHA-256
>> authentication mechanism, was published in November 2015. It's identical
>> to SCRAM-SHA-1, which is what this patch set implements, except that
>> SHA-1 has been replaced with SHA-256. Perhaps we should forget about
>> SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.
>
> I think a global change from SHA-1 to SHA-256 is in the air already, so if
> we're going to release something brand new in 2017 or so, it should be
> SHA-256.
>
> I suspect this would be a relatively simple change, so I wouldn't mind
> seeing a SHA-1-based variant in CF1 to get things rolling.

I'd just move this thing to SHA256, we are likely going to use that at the end.

As I am coming back into that, I would also suggest doing the
following, which the current set of patches is clearly missing:
- Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
is a bit of a shame to not reuse what is currently available, so I
would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
- Move *all* the SHA-related things of pgcrypto to src/common,
including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
top of memset, we should clean up that first.
Any other things to consider that I am forgetting?
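For readers wondering why the HMAC and SHA pieces travel together: HMAC is a thin construction on top of a hash function, so sharing the hash routines makes sharing HMAC nearly free. Below is a minimal Python sketch of the standard HMAC construction (RFC 2104) built only from a hash primitive. It is illustrative and unrelated to the actual pgcrypto code; `hmac_digest` is a hypothetical name.

```python
import hashlib


def hmac_digest(key: bytes, message: bytes, hash_name: str = "sha256") -> bytes:
    block_size = hashlib.new(hash_name).block_size
    # Keys longer than one block are hashed first; shorter ones are
    # padded with zero bytes up to the block size.
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    # HMAC(K, m) = H((K ^ opad) || H((K ^ ipad) || m))
    inner = hashlib.new(hash_name, inner_pad + message).digest()
    return hashlib.new(hash_name, outer_pad + inner).digest()
```

The result can be compared against Python's own `hmac` module for any of the SHA variants mentioned above.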
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

05 July 2016, 08:06:18

On Mon, Jul 4, 2016 at 12:54 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> As I am coming back into that, I would as well suggest do the
> following, that the current set of patches is clearly missing:
> - Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
> is a bit a shame to not reuse what is currently available, then I
> would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
> - Move *all* the SHA-related things of pgcrypto to src/common,
> including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
> top of memset, we should clean up that first.
> Any other things to consider that I am forgetting?

After looking more into that, I have come up with PG-like equivalents
of things in openssl/sha.h:
pg_shaXX_init(pg_shaXX_ctx *ctx);
pg_shaXX_update(pg_shaXX_ctx *ctx, uint8 *data, size_t len);
pg_shaXX_final(uint8 *dest, pg_shaXX_ctx *ctx);
Then think about shaXX as 1, 224, 256, 384 and 512.

Hence all those functions, moved to src/common, finish with the
following shape, take an init() one:
#ifdef USE_SSL
#include <openssl/sha.h>
#endif
void
pg_shaXX_init(pg_shaXX_ctx *ctx)
{
#ifdef USE_SSL
    SHAXX_Init((SHAXX_CTX *) ctx);
#else
    /* Here goes the OpenBSD stuff, now part of pgcrypto */
#endif
}

And that's really ugly, all the OpenBSD things that are used by
pgcrypto when the code is not built with --with-openssl gather into a
single place with parts wrapped around USE_SSL. A less ugly solution
would be to split that into two files, and one or the other gets
included in OBJS depending on if the build is done with or without
OpenSSL. We do a rather similar thing with fe/be-secure-openssl.c.
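The init/update/final calling convention discussed above can be illustrated with a short Python sketch. The wrapper class is hypothetical and only mirrors the shape of the proposed routines; which backend computes the hash is hidden behind it, which is the point of keeping callers independent of the OpenSSL-or-builtin choice.

```python
import hashlib


class Sha256Context:
    """Hypothetical wrapper mirroring an init/update/final interface."""

    def __init__(self) -> None:
        # "init": the backend is chosen here and nowhere else.
        self._state = hashlib.sha256()

    def update(self, data: bytes) -> None:
        # "update": feed data incrementally, in as many calls as needed.
        self._state.update(data)

    def final(self) -> bytes:
        # "final": return the digest of everything fed so far.
        return self._state.digest()
```

Feeding `b"hello "` and then `b"world"` through `update()` gives the same digest as hashing `b"hello world"` in one shot, so callers do not need to know how the input was chunked or which implementation sits underneath.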

Another possibility is that we could say that SCRAM is designed to
work with TLS, as mentioned a bit upthread via the RFC, so we would
not support it in builds compiled without OpenSSL. I think that would
be a shame, but it would simplify all this refactoring juggling.

So, 3 possibilities here:
1) Use a single file src/common/sha.c that includes a set of functions
using USE_SSL
2) Have two files in src/common, one when build is used with OpenSSL,
and the second one when built-in methods are used
3) Disable the use of SCRAM when OpenSSL is not present in the build.

Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
appealing in terms of flexibility.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Magnus Hagander

Date:

05 July 2016, 08:50:18

On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:

> On Mon, Jul 4, 2016 at 12:54 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> As I am coming back into that, I would as well suggest do the
>> following, that the current set of patches is clearly missing:
>> - Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
>> is a bit a shame to not reuse what is currently available, then I
>> would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
>> - Move *all* the SHA-related things of pgcrypto to src/common,
>> including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
>> top of memset, we should clean up that first.
>> Any other things to consider that I am forgetting?
>
> After looking more into that, I have come up with PG-like equivalents
> of things in openssl/sha.h:
> pg_shaXX_init(pg_shaXX_ctx *ctx);
> pg_shaXX_update(pg_shaXX_ctx *ctx, uint8 *data, size_t len);
> pg_shaXX_final(uint8 *dest, pg_shaXX_ctx *ctx);
> Then think about shaXX as 1, 224, 256, 384 and 512.
>
> Hence all those functions, moved to src/common, finish with the
> following shape, take an init() one:
> #ifdef USE_SSL
> #include <openssl/sha.h>
> #endif
> void
> pg_shaXX_init(pg_shaXX_ctx *ctx)
> {
> #ifdef USE_SSL
>     SHAXX_Init((SHAXX_CTX *) ctx);
> #else
>     /* Here goes the OpenBSD stuff, now part of pgcrypto */
> #endif
> }
>
> And that's really ugly, all the OpenBSD things that are used by
> pgcrypto when the code is not built with --with-openssl gather into a
> single place with parts wrapped around USE_SSL. A less ugly solution
> would be to split that into two files, and one or the other gets
> included in OBJS depending on if the build is done with or without
> OpenSSL. We do a rather similar thing with fe/be-secure-openssl.c.

FWIW, the main reason for be-secure-openssl.c is that we could have support for another external SSL library. The idea was never to have a builtin replacement for it :)

However, is there something that's fundamentally better with the OpenSSL implementation? Or should we just keep *just* the #else branch in the code, the part we've imported from OpenBSD?

TLS is complex, we don't want to do that in that case. But just the sha functions isn't *that* complex, is it?

> Another possibility is that we could say that SCRAM is designed to
> work with TLS, as mentioned a bit upthread via the RFC, so we would
> not support it in builds compiled without OpenSSL. I think that would
> be a shame, but it would simplify all this refactoring juggling.
>
> So, 3 possibilities here:
> 1) Use a single file src/common/sha.c that includes a set of functions
> using USE_SSL
> 2) Have two files in src/common, one when build is used with OpenSSL,
> and the second one when built-in methods are used
> 3) Disable the use of SCRAM when OpenSSL is not present in the build.
>
> Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
> appealing in terms of flexibility.

I really dislike #3 - we want everybody to start using this...

I'm not sure how common a build without openssl is in the real world though. RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably don't want to make it mandatory, no...

Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

06 July 2016, 07:18:15

On Tue, Jul 5, 2016 at 5:50 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> However, is there something that's fundamentally better with the OpenSSL
> implementation? Or should we just keep *just* the #else branch in the code,
> the part we've imported from OpenBSD?

Good question. I think that we want both, giving priority to OpenSSL
if it is there. Usually their things prove to have more entropy, but I
didn't look at their code to be honest. If we only use the OpenBSD
stuff, it would be a good idea to refresh the in-core code. This is
from OpenBSD of 2002.

> TLS is complex, we don't want to do that in that case. But just the sha
> functions isn't *that* complex, is it?

No, they are not.

>> Another possibility is that we could say that SCRAM is designed to
>> work with TLS, as mentioned a bit upthread via the RFC, so we would
>> not support it in builds compiled without OpenSSL. I think that would
>> be a shame, but it would simplify all this refactoring juggling.
>>
>> So, 3 possibilities here:
>> 1) Use a single file src/common/sha.c that includes a set of functions
>> using USE_SSL
>> 2) Have two files in src/common, one when build is used with OpenSSL,
>> and the second one when built-in methods are used
>> 3) Disable the use of SCRAM when OpenSSL is not present in the build.
>>
>> Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
>> appealing in terms of flexibility.
>
> I really dislike #3 - we want everybody to start using this...

OK, after hacking on that for a bit I have ended up with option 2 and
a PG-like set of routines. The use of USE_SSL in the file
containing all the SHA functions of OpenBSD has proved to be really
ugly, but with a split things are really clear to the eye. The stuff I
got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
libpgcommon.a, so I am making it compile directly with the source
files, as it is doing on HEAD.

> I'm not sure how common a build without openssl is in the real world though.
> RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
> don't want to make it mandatory, no...

I don't think that it is that common to have an enterprise-class
build of Postgres without SSL, but each company always has its own
reasons, so such builds could exist.

And I continue to move on... Thanks for the feedback.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

06 July 2016, 07:32:48

On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> OK, after hacking that for a bit I have finished with option 2 and the
> set of PG-like set of routines, the use of USE_SSL in the file
> containing all the SHA functions of OpenBSD has proved to be really
> ugly, but with a split things are really clear to the eye. The stuff I
> got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> libpgcommon.a, so I am making it compile directly with the source
> files, as it is doing on HEAD.

Btw, attached is the patch I did for this part if there is any interest in it.

Also, while working on the rest, I am not adding a new column to
pg_authid to identify the password verifier type. That's just to keep
the patch at a bare minimum size. Are there issues with that?
--
Michael

Attachment

0001-Refactor-SHA-functions-and-move-them-to-src-common.patch

Re: Password identifiers, protocol aging and SCRAM protocol

From

Stephen Frost

Date:

06 July 2016, 22:51:51

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Tue, Jul 5, 2016 at 5:50 PM, Magnus Hagander <magnus@hagander.net> wrote:
> > On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> > However, is there something that's fundamentally better with the OpenSSL
> > implementation? Or should we just keep *just* the #else branch in the code,
> > the part we've imported from OpenBSD?
>
> Good question. I think that we want both, giving priority to OpenSSL
> if it is there. Usually their things prove to have more entropy, but I
> didn't look at their code to be honest. If we only use the OpenBSD
> stuff, it would be a good idea to refresh the in-core code. This is
> from OpenBSD of 2002.

I agree that we definitely want to use the OpenSSL functions when they
are available.

> > I'm not sure how common a build without openssl is in the real world though.
> > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
> > don't want to make it mandatory, no...
>
> I don't think that it is this much common to have an enterprise-class
> build of Postgres without SSL, but each company has always its own
> reasons, so things could exist.

I agree that it's useful to have the support if PG isn't built with
OpenSSL for some reason.

Thanks!

Stephen

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

15 July 2016, 13:30:24

On Thu, Jul 7, 2016 at 7:51 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> > I'm not sure how common a build without openssl is in the real world though.
>> > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
>> > don't want to make it mandatory, no...
>>
>> I don't think that it is this much common to have an enterprise-class
>> build of Postgres without SSL, but each company has always its own
>> reasons, so things could exist.
>
> I agree that it's useful to have the support if PG isn't built with
> OpenSSL for some reason.

OK, I am doing that at the end.

And also while moving on...

On another topic, here are some ideas to extend CREATE/ALTER ROLE to
support SCRAM password directly:
1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
CREATE ROLE foorole SCRAM PASSWORD value;
2) PASSWORD (protocol) value.
3) Just add SCRAM PASSWORD
My mind is thinking about 1) as being the cleanest solution as this
does not touch the defaults, which may change a couple of releases
later. Other opinions?

Note that I am also switching password_encryption to an enum, able to
use as values on, off, md5, plain, scram. Of course, on => md5, off =>
plain to preserve the default.
Other things that I am making conservative:
- ENCRYPTED PASSWORD still implies MD5-encrypted password
- UNENCRYPTED PASSWORD still implies plain text password
- PASSWORD used alone depends on the value of password_encryption
So it would be possible to move to scram by default by setting
password_encryption to 'scram'.
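
The defaults listed above can be sketched as a small lookup. This is only an illustration of the rules as stated in this message, not PostgreSQL code; the function names and the string representation of the clause are assumptions made for the example.

```python
# Hypothetical sketch of the proposed password_encryption semantics.
# "on" and "off" are kept as aliases so existing configurations behave as before.
ALIASES = {"on": "md5", "off": "plain"}
VALID_METHODS = {"md5", "plain", "scram"}


def normalize_password_encryption(value):
    """Map a password_encryption setting to md5, plain or scram."""
    v = value.strip().lower()
    v = ALIASES.get(v, v)
    if v not in VALID_METHODS:
        raise ValueError("invalid password_encryption value: %r" % value)
    return v


def resolve_method(clause, password_encryption):
    """clause is "ENCRYPTED", "UNENCRYPTED", or None when PASSWORD is used alone."""
    if clause == "ENCRYPTED":
        return "md5"      # ENCRYPTED PASSWORD still implies MD5
    if clause == "UNENCRYPTED":
        return "plain"    # UNENCRYPTED PASSWORD still implies plain text
    if clause is None:
        return normalize_password_encryption(password_encryption)
    raise ValueError("unknown clause: %r" % clause)
```

With these rules, only a bare PASSWORD clause is affected by setting password_encryption to 'scram'; the two explicit clauses keep their historical meaning.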

Objections are welcome, I am moving into something respecting the
default behavior as much as possible.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

20 July 2016, 15:15:11

On Fri, Jul 15, 2016 at 9:30 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> OK, I am doing that at the end.
>
> And also while moving on...
>
> On another topic, here are some ideas to extend CREATE/ALTER ROLE to
> support SCRAM password directly:
> 1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
> CREATE ROLE foorole SCRAM PASSWORD value;
> 2) PASSWORD (protocol) value.
> 3) Just add SCRAM PASSWORD
> My mind is thinking about 1) as being the cleanest solution as this
> does not touch the defaults, which may change a couple of releases
> later. Other opinions?

I can't really understand what you are saying here, but I'm going to
be -1 on adding SCRAM as a parser keyword.  Let's pick a syntax like
"PASSWORD SConst USING SConst" or "PASSWORD SConst ENCRYPTED WITH
SConst".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

Alvaro Herrera

Date:

20 July 2016, 19:32:11

Michael Paquier wrote:
> On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > OK, after hacking that for a bit I have finished with option 2 and the
> > set of PG-like set of routines, the use of USE_SSL in the file
> > containing all the SHA functions of OpenBSD has proved to be really
> > ugly, but with a split things are really clear to the eye. The stuff I
> > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> > libpgcommon.a, so I am making it compile directly with the source
> > files, as it is doing on HEAD.
> 
> Btw, attached is the patch I did for this part if there is any interest in it.

After quickly eyeballing your patch, I agree with the decision of going
with (2), even if my gut initially told me that (1) would be better
because it'd require less makefile trickery.

I'm surprised that you say pgcrypto cannot link libpgcommon directly.
Is there some insurmountable problem there?  I notice your MSVC patch
uses libpgcommon while the Makefile symlinks the files.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Password identifiers, protocol aging and SCRAM protocol

From

David Fetter

Date:

20 July 2016, 20:25:34

On Wed, Jul 20, 2016 at 02:12:57PM -0400, Alvaro Herrera wrote:
> Michael Paquier wrote:
> > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> > > OK, after hacking that for a bit I have finished with option 2 and the
> > > set of PG-like set of routines, the use of USE_SSL in the file
> > > containing all the SHA functions of OpenBSD has proved to be really
> > > ugly, but with a split things are really clear to the eye. The stuff I
> > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> > > libpgcommon.a, so I am making it compile directly with the source
> > > files, as it is doing on HEAD.
> > 
> > Btw, attached is the patch I did for this part if there is any interest in it.
> 
> After quickly eyeballing your patch, I agree with the decision of going
> with (2), even if my gut initially told me that (1) would be better
> because it'd require less makefile trickery.
> 
> I'm surprised that you say pgcrypto cannot link libpgcommon directly.
> Is there some insurmountable problem there?  I notice your MSVC patch
> uses libpgcommon while the Makefile symlinks the files.

People have, in the past, expressed concerns about linking in
pgcrypto.  Apparently, in some countries, it's a legal problem.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

20 July 2016, 23:39:07

On Thu, Jul 21, 2016 at 12:15 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 15, 2016 at 9:30 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> OK, I am doing that at the end.
>>
>> And also while moving on...
>>
>> On another topic, here are some ideas to extend CREATE/ALTER ROLE to
>> support SCRAM password directly:
>> 1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
>> CREATE ROLE foorole SCRAM PASSWORD value;
>> 2) PASSWORD (protocol) value.
>> 3) Just add SCRAM PASSWORD
>> My mind is thinking about 1) as being the cleanest solution as this
>> does not touch the defaults, which may change a couple of releases
>> later. Other opinions?
>
> I can't really understand what you are saying here, but I'm going to
> be -1 on adding SCRAM as a parser keyword.  Let's pick a syntax like
> "PASSWORD SConst USING SConst" or "PASSWORD SConst ENCRYPTED WITH
> SConst".

No, I do not mean to make SCRAM or MD5 keywords. While hacking that, I
got at some point in the mood of using "PASSWORD Sconst Sconst" but
that's ugly. Sticking a keyword in between makes more sense, and USING
is a good idea. I haven't thought of this one.

By the way, the core patch does not have any grammar extension. The
grammar extension will be on top of it and the core patch can just
activate scram passwords using password_encryption. That's user
unfriendly, but as the patch is large I try to cut it in as many
pieces as necessary.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

20 July 2016, 23:42:56

On Thu, Jul 21, 2016 at 5:25 AM, David Fetter <david@fetter.org> wrote:
> On Wed, Jul 20, 2016 at 02:12:57PM -0400, Alvaro Herrera wrote:
>> Michael Paquier wrote:
>> > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
>> > <michael.paquier@gmail.com> wrote:
>> > > OK, after hacking that for a bit I have finished with option 2 and the
>> > > set of PG-like set of routines, the use of USE_SSL in the file
>> > > containing all the SHA functions of OpenBSD has proved to be really
>> > > ugly, but with a split things are really clear to the eye. The stuff I
>> > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
>> > > libpgcommon.a, so I am making it compile directly with the source
>> > > files, as it is doing on HEAD.
>> >
>> > Btw, attached is the patch I did for this part if there is any interest in it.
>>
>> After quickly eyeballing your patch, I agree with the decision of going
>> with (2), even if my gut initially told me that (1) would be better
>> because it'd require less makefile trickery.

Yeah, I thought the same thing as well when putting my hands in the
dirt... But in the end (2) is really less ugly.

>> I'm surprised that you say pgcrypto cannot link libpgcommon directly.
>> Is there some insurmountable problem there?  I notice your MSVC patch
>> uses libpgcommon while the Makefile symlinks the files.

I am running into some weird things when linking both on OSX... But I
am not done with it completely yet. I'll adjust that a bit more when
producing the set of patches that will be published. So let's see.

> People have, in the past, expressed concerns about linking in
> pgcrypto.  Apparently, in some countries, it's a legal problem.

Do you have any references? I don't see that as a problem.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

21 July 2016, 16:19:35

On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> People have, in the past, expressed concerns about linking in
>> pgcrypto.  Apparently, in some countries, it's a legal problem.
>
> Do you have any references? I don't see that as a problem.

I don't have a link to previous discussion handy, but I definitely
recall that it's been discussed.  I don't think that would mean that
libpgcrypto couldn't depend on libpgcommon, but the reverse direction
would make libpgcrypto essentially mandatory which I don't think is a
direction we want to go for both technical and legal reasons.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

David Steele

Date:

21 July 2016, 16:59:57

On 7/21/16 12:19 PM, Robert Haas wrote:
> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>> People have, in the past, expressed concerns about linking in
>>> pgcrypto.  Apparently, in some countries, it's a legal problem.
>>
>> Do you have any references? I don't see that as a problem.
> 
> I don't have a link to previous discussion handy, but I definitely
> recall that it's been discussed.  I don't think that would mean that
> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
> would make libpgcrypto essentially mandatory which I don't think is a
> direction we want to go for both technical and legal reasons.

I searched a few different ways and finally came up with this post from Tom:

https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us

It's the only thing I could find, but thought it might jog something
loose for somebody else.

I know that export controls have been an issue for crypto in the past
but have no idea what the current state of that is.

-- 
-David
david@pgmasters.net

Re: Password identifiers, protocol aging and SCRAM protocol

From

Tom Lane

Date:

21 July 2016, 17:31:59

David Steele <david@pgmasters.net> writes:
> On 7/21/16 12:19 PM, Robert Haas wrote:
>> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>>> People have, in the past, expressed concerns about linking in
>>>> pgcrypto.  Apparently, in some countries, it's a legal problem.

>>> Do you have any references? I don't see that as a problem.

>> I don't have a link to previous discussion handy, but I definitely
>> recall that it's been discussed.  I don't think that would mean that
>> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
>> would make libpgcrypto essentially mandatory which I don't think is a
>> direction we want to go for both technical and legal reasons.

> I searched a few different ways and finally came up with this post from Tom:
> https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us
> It's the only thing I could find, but thought it might jog something
> loose for somebody else.

Way back when, like fifteen years ago, there absolutely were US export
control restrictions on software containing crypto.  I believe the US has
figured out that that was silly, but I'm not sure everyplace else has.
(And if you've been reading the news you will notice that legal
restrictions on crypto are back in vogue, so it would not be wise to
assume that the question is dead and buried.)  So our project policy
since at least the turn of the century has been that any crypto facility
has to be in a separable extension, where it would be fairly easy for
a packager to delete it if they need to ship a crypto-free version.

Note that "crypto" for this purpose generally means reversible encryption;
I've never heard that one-way hashes are illegal anywhere.  So password
hashing such as md5 is fine in core, and a stronger hash would be too.
But pulling in pgcrypto lock, stock, and barrel is not OK.
        regards, tom lane

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

21 July 2016, 23:28:06

On Fri, Jul 22, 2016 at 2:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Way back when, like fifteen years ago, there absolutely were US export
> control restrictions on software containing crypto.  I believe the US has
> figured out that that was silly, but I'm not sure everyplace else has.

England is these days legally running a battle against data
encryption. I have not heard how this is evolving these days.

> (And if you've been reading the news you will notice that legal
> restrictions on crypto are back in vogue, so it would not be wise to
> assume that the question is dead and buried.)  So our project policy
> since at least the turn of the century has been that any crypto facility
> has to be in a separable extension, where it would be fairly easy for
> a packager to delete it if they need to ship a crypto-free version.
> Note that "crypto" for this purpose generally means reversible encryption;
> I've never heard that one-way hashes are illegal anywhere.  So password
> hashing such as md5 is fine in core, and a stronger hash would be too.
> But pulling in pgcrypto lock, stock, and barrel is not OK.

So it would be an issue if pgcrypto.so links directly to libpgcommon?
Because that's not what I am doing now, perhaps fortunately. I moved
the sha functions to src/common. But actually, thinking more about
that, I don't need to do so because the routines of SCRAM shared
between the frontend and the backend just need to be part of libpq so
they could just be part of backend/libpq like md5.

Tom, if I get it correctly, it would not be an issue if the SHA
functions are directly part of the compiled backend like md5, right?
Because I would like to just change my set of patches to have the SHA
and the encoding functions in src/backend/libpq instead of src/common,
and then have pgcrypto be compiled with a link to those files. That's
a cleaner design btw, more in line with what is done for md5..
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Tom Lane

Date:

21 July 2016, 23:49:14

Michael Paquier <michael.paquier@gmail.com> writes:
> On Fri, Jul 22, 2016 at 2:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Note that "crypto" for this purpose generally means reversible encryption;
>> I've never heard that one-way hashes are illegal anywhere.  So password
>> hashing such as md5 is fine in core, and a stronger hash would be too.
>> But pulling in pgcrypto lock, stock, and barrel is not OK.

> So it would be an issue if pgcrypto.so links directly to libpgcommon?

No, I don't see why that'd be an issue.  What we can't do is have
libpgcommon depending on pgcrypto.so, or containing anything more than
one-way-hash functionality itself.

> Because I would like to just change my set of patches to have the SHA
> and the encoding functions in src/backend/libpq instead of src/common,
> and then have pgcrypto be compiled with a link to those files. That's
> a cleaner design btw, more in line with what is done for md5..

I'm confused.  We need that code in both libpq and backend, no?
src/common is the place for stuff of that description.
        regards, tom lane

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

21 July 2016, 23:57:17

On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> Because I would like to just change my set of patches to have the SHA
>> and the encoding functions in src/backend/libpq instead of src/common,
>> and then have pgcrypto be compiled with a link to those files. That's
>> a cleaner design btw, more in line with what is done for md5..
>
> I'm confused.  We need that code in both libpq and backend, no?
> src/common is the place for stuff of that description.

Not necessarily. src/interfaces/libpq/Makefile uses a set of files
like md5.c which is located in the backend code and directly compiles
libpq.so with them, so one possibility would be to do the same for
sha.c: locate the file in src/backend/libpq/ and then fetch the file
directly when compiling libpq's shared library.

One thing about my current set of patches is that I have begun adding
files from src/common/ to libpq's list of files. As that would be new
I am wondering if I should avoid doing so. Here is what I mean:
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -43,6 +43,14 @@ OBJS += $(filter crypt.o getaddrinfo.o getpeereid.o inet_aton.o open.o system.o
 OBJS += ip.o md5.o
 # utils/mb
 OBJS += encnames.o wchar.o
+# common/
+OBJS += encode.o scram-common.o
+
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Tom Lane

Date:

22 July 2016, 00:02:52

Michael Paquier <michael.paquier@gmail.com> writes:
> On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm confused.  We need that code in both libpq and backend, no?
>> src/common is the place for stuff of that description.

> Not necessarily. src/interfaces/libpq/Makefile uses a set of files
> like md5.c which is located in the backend code and directly compiles
> libpq.so with them, so one possibility would be to do the same for
> sha.c: locate the file in src/backend/libpq/ and then fetch the file
> directly when compiling libpq's shared library.

Meh.  That seems like a hack left over from before we had src/common.

Having said that, src/interfaces/libpq/ does have some special
requirements, because it needs the code compiled with -fpic (on most
hardware), which means it can't just use the client-side libpgcommon.a
builds.  So maybe it's not worth improving this.

> One thing about my current set of patches is that I have begun adding
> files from src/common/ to libpq's list of files. As that would be new
> I am wondering if I should avoid doing so.

Well, it could link source files from there just as easily as from the
backend.  Not object files, though.
        regards, tom lane

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

22 July 2016, 00:06:20

On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> One thing about my current set of patches is that I have begun adding
>> files from src/common/ to libpq's list of files. As that would be new
>> I am wondering if I should avoid doing so.
>
> Well, it could link source files from there just as easily as from the
> backend.  Not object files, though.

OK. I'll just keep things the current way then :)
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Craig Ringer

Date:

22 July 2016, 05:19:40

On 22 July 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Steele <david@pgmasters.net> writes:
> On 7/21/16 12:19 PM, Robert Haas wrote:
>> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>>> People have, in the past, expressed concerns about linking in
>>>> pgcrypto. Apparently, in some countries, it's a legal problem.

>>> Do you have any references? I don't see that as a problem.

>> I don't have a link to previous discussion handy, but I definitely
>> recall that it's been discussed. I don't think that would mean that
>> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
>> would make libpgcrypto essentially mandatory which I don't think is a
>> direction we want to go for both technical and legal reasons.

> I searched a few different ways and finally came up with this post from Tom:
> https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us
> It's the only thing I could find, but thought it might jog something
> loose for somebody else.

Way back when, like fifteen years ago, there absolutely were US export
control restrictions on software containing crypto. I believe the US has
figured out that that was silly, but I'm not sure everyplace else has.

Australia has recently enacted laws that are reminiscent of the US's defunct crypto export control laws, but they add penalties for *teaching* encryption too. Yup, you can be charged for talking about it. Of course they'll only actually USE those new powers to Stop The Terrorist Threat, they promise...

http://www.defence.gov.au/deco/DTC.asp

Unless recently amended, they even failed to exclude academic institutions. I haven't been following it closely because, frankly, it's too ridiculous to pay much attention to, and I don't work directly with crypto anyway. But it's far from the only such colossally ignorant and idiotic law floating around.

Despite the technical frustrations involved, we should keep crypto implementations in a separate library. I agree with Tom that one-way hashes are not a practical concern, even if the laws are probably written too poorly to draw a distinction.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

22 July 2016, 06:43:36

On Fri, Jul 22, 2016 at 9:06 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Michael Paquier <michael.paquier@gmail.com> writes:
>>> One thing about my current set of patches is that I have begun adding
>>> files from src/common/ to libpq's list of files. As that would be new
>>> I am wondering if I should avoid doing so.
>>
>> Well, it could link source files from there just as easily as from the
>> backend.  Not object files, though.
>
> OK. I'll just keep things the current way then :)

Note: I have put more energy into that and I think that I will be able
to publish a new patch set pretty soon, like at the beginning of next
week.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

25 July 2016, 08:04:30

On Fri, Jul 22, 2016 at 3:43 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jul 22, 2016 at 9:06 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Michael Paquier <michael.paquier@gmail.com> writes:
>>>> One thing about my current set of patches is that I have begun adding
>>>> files from src/common/ to libpq's list of files. As that would be new
>>>> I am wondering if I should avoid doing so.
>>>
>>> Well, it could link source files from there just as easily as from the
>>> backend.  Not object files, though.
>>
>> OK. I'll just keep things the current way then :)
>
> Note: I have put more energy into that and I think that I will be able
> to publish a new patch set pretty soon, like at the beginning of next
> week.

Ok, here is the real deal. As discussed at PGcon, I have shaved off
from the set of patches the following things:
- No separate catalog pg_auth_verifier
- No additional column in pg_authid to determine the password type.
All the logic checks whether the password string has the expected
format. We do that for MD5 now, this set does it for SCRAM.
- Removal of the pg_upgrade stuff.
- Removal of password_protocols, so we don't care anymore about protocol aging.
In short, the SCRAM verifiers get stored in rolpassword.
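
Deciding the password type from the stored string alone could look like the sketch below. The MD5 layout ("md5" followed by 32 hex digits) is the existing one; the SCRAM layout used here is the one documented for later PostgreSQL releases and is only an assumption with respect to this patch set, as are the function and pattern names.

```python
import re

# "md5" followed by the 32 hex digits of the hash (35 characters in total).
MD5_RE = re.compile(r"md5[0-9a-f]{32}")

# Assumed layout: SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>,
# with salt and keys base64-encoded. The patch discussed here may differ.
B64 = r"[A-Za-z0-9+/]+={0,2}"
SCRAM_RE = re.compile(r"SCRAM-SHA-256\$\d+:" + B64 + r"\$" + B64 + ":" + B64)


def verifier_type(rolpassword):
    """Classify a stored rolpassword value by its format alone."""
    if rolpassword is None:
        return None          # role has no password
    if MD5_RE.fullmatch(rolpassword):
        return "md5"
    if SCRAM_RE.fullmatch(rolpassword):
        return "scram"
    return "plain"           # anything else is treated as plain text
```

One consequence of this design is that a plain-text password that happens to match one of the patterns would be classified as a verifier, which is the same trade-off already accepted for MD5.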

And here is what this set of patches does:
- Implementation of SCRAM-SHA-256, and not SHA1. I have moved to the
one that makes the most sense considering the current situation based
on RFC 5802 and 7677.
- No channel binding support. I guess that this could be added later on.
- password_encryption is now an enum, and gains three values: md5,
plain and scram. true => md5, false => plain for backward
compatibility
- Grammar of CREATE/ALTER ROLE is extended with PASSWORD val USING
protocol, that's a separate patch applying on top of the core patch
for SASL.
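
For readers unfamiliar with the mechanism, the key derivation and proof check defined by RFC 5802 (with SHA-256 as in RFC 7677) can be sketched as follows. This is an illustration built on the Python standard library, not the code of the patch; the function names are made up for the example, SASLprep normalization of the password is skipped, and the message exchange that produces the AuthMessage is not shown.

```python
import hashlib
import hmac


def hmac_sha256(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()


def salted_password(password, salt, iterations):
    # Hi() in RFC 5802 is PBKDF2 with HMAC as the PRF and a hash-sized output.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def build_verifier(password, salt, iterations):
    """Return (StoredKey, ServerKey), the values a server keeps instead of the password."""
    sp = salted_password(password, salt, iterations)
    client_key = hmac_sha256(sp, b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac_sha256(sp, b"Server Key")
    return stored_key, server_key


def client_proof(password, salt, iterations, auth_message):
    """ClientProof = ClientKey XOR HMAC(StoredKey, AuthMessage)."""
    sp = salted_password(password, salt, iterations)
    client_key = hmac_sha256(sp, b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    signature = hmac_sha256(stored_key, auth_message)
    return bytes(a ^ b for a, b in zip(client_key, signature))


def verify_proof(stored_key, proof, auth_message):
    """Server side: recover ClientKey from the proof and compare its hash to StoredKey."""
    signature = hmac_sha256(stored_key, auth_message)
    recovered = bytes(a ^ b for a, b in zip(proof, signature))
    return hmac.compare_digest(hashlib.sha256(recovered).digest(), stored_key)
```

The server never needs the password or the ClientKey at rest: StoredKey is enough to check a proof, and ServerKey lets it sign its own reply so that the client can authenticate the server in turn.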

I have noticed as well a couple of bugs in the previous set(s) of patches:
- valid_until was not checked for SCRAM
- When using ENCRYPTED or UNENCRYPTED, already encrypted password
should be used as-is. The same is applied to PASSWORD USING protocol
to ease dump and reload. That's actually what is used for MD5.
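
The "used as-is" rule can be illustrated for the MD5 case, where the stored value is "md5" followed by the MD5 hash of the password concatenated with the role name. The helper names below are hypothetical and the check on the input format is stricter than strictly needed; it is a sketch of the behavior, not of the patch.

```python
import hashlib
import re

MD5_RE = re.compile(r"md5[0-9a-f]{32}")


def pg_md5_verifier(role, password):
    """Build the MD5 verifier: "md5" + md5(password || role name)."""
    digest = hashlib.md5((password + role).encode("utf-8")).hexdigest()
    return "md5" + digest


def store_md5_password(role, password):
    """Hash a plain password, but keep an already-hashed value untouched."""
    if MD5_RE.fullmatch(password):
        return password      # already a verifier, e.g. coming from a dump
    return pg_md5_verifier(role, password)
```

Because the second call is a no-op on a value produced by the first, a dump that contains the verifier can be reloaded without hashing it a second time.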

And here is a detail of the patches:
- 0001, refactoring of SHA functions into src/common.
- 0002, refactoring for sendAuthRequest
- 0003, Refactoring for RandomSalt to accommodate the salt used by
scram (length of 10 bytes, md5 is 4).
- 0004, move encoding routines to src/common/
- 0005, make password_encryption an enum
- 0006, refactor some code in CREATE/ALTER role code paths related to the
use of password_encryption
- 0007, refactor some code to have a single routine to fetch password
and valid_until from pg_authid
- 0008, The core implementation of SCRAM-SHA-256, with the SASL
communication protocol. If you want to use SCRAM with that, set
password_encryption = 'scram'.
- 0009, addition of PASSWORD val USING protocol
- 0010, regression tests for passwords. Not sure how useful they would
be. But they helped me a bit.

I am adding an entry in the next CF. Comments are welcome.
--
Michael

On Fri, Aug 19, 2016 at 1:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 08/18/2016 03:45 PM, Michael Paquier wrote:
>>
>> On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi>
>> wrote:
>> For the current ip.c, I don't have a better idea than putting in
>> src/common/ip.c the set of routines used by both the frontend and
>> backend, and have fe_ip.c the new file that has the frontend-only
>> things. Need a patch?
>
>
> Yes, please. I don't think there's anything there that's needed by only the
> frontend, but some of the functions are needed by only the backend. So I
> think we'll end up with src/common/ip.c, and src/backend/libpq/be-ip.c. (Not
> sure about those names, pick something that makes sense, given what's left
> in the files.)

OK, so let's do that first correctly. Attached are two patches:
- 0001 moves md5 to src/common
- 0002 that does the same for ip.c.
By the way, it seems to me that having be-ip.c is not really worth
it. I am noticing that only pg_range_sockaddr could be marked as
backend-only. pg_foreach_ifaddr is being used as well by
tools/ifaddrs/, and this one calls as well pg_sockaddr_cidr_mask. Or
is there still some utility in having src/tools/ifaddrs? If not we
could move pg_sockaddr_cidr_mask and pg_foreach_ifaddr to be
backend-only. With pg_range_sockaddr that would make half the routines
backend-only.

I have not rebased the whole SCRAM series yet... I'll do that after
we agree on those two patches, with the two commits you have already
done cleaned up of course (thanks btw for those ones!).
--
Michael

On Fri, Sep 2, 2016 at 10:23 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Sep 2, 2016 at 7:57 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I decided to split ip.c anyway. I'd like to keep the files in
>> src/common/ip.c as small as possible, so I think it makes sense to be quite
>> surgical when moving things there. I kept the pg_foreach_ifaddr() function
>> in src/backend/libpq/ifaddr.c (I renamed the file to avoid confusion with
>> the ip.c that got moved), even though it means that test_ifaddr will have to
>> continue to copy the file directly from src/backend/libpq. I'm OK with that,
>> because test_ifaddrs is just a little test program that mimics the backend's
>> behaviour of enumerating interfaces. I don't consider it to be a "real"
>> frontend application.
>>
>> Pushed, after splitting. Thanks! Now let's move on to the more substantial
>> patches.

Thanks for the push.

> Before I send a new series of patches... There is one thing that I am
> still troubled with: the compilation of pgcrypto. First from
> contrib/pgcrypto/Makefile I am noticing the following issue with this
> block:
> CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS))
> CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS))
> CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST))
> How is that correct if src/Makefile.global is not loaded first?
> Variables like with_openssl are still not loaded at that point.
>
> Then, as per patch 0001 there are two files holding the SHA routines:
> sha.c with the interface taken from OpenBSD, and sha_openssl.c that
> uses the interface of OpenSSL. And when compiling pgcrypto, the choice
> of file is made depending on the value of $(with_openssl).

So I have solved my identity crisis here by just using INT_SRCS and
OSSL_SRCS to list the correct files holding the SHA files. Thanks Tom
for the hint. I need to study my Makefile-fu more.

Attached is a new series:
- 0001, refactoring of SHA functions into src/common.
- 0002, move encoding routines to src/common/
- 0003, make password_encryption an enum
- 0004, refactor some code in CREATE/ALTER role code paths related to the
use of password_encryption
- 0005, refactor some code to have a single routine to fetch password
and valid_until from pg_authid
- 0006, The core implementation of SCRAM-SHA-256, with the SASL
communication protocol. If you want to use SCRAM with that, things go
with password_encryption = 'scram'. I have spotted here a bug with the
MSVC build on the way.
- 0007, addition of PASSWORD val USING protocol
- 0008, regression tests for passwords. Those do not trigger the
internal sha routines, which lead to inconsistent results.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From

David Steele

Date:

25 September 2016, 17:15:38

On 9/3/16 8:36 AM, Michael Paquier wrote:
>
> Attached is a new series:

* [PATCH 1/8] Refactor SHA functions and move them to src/common/

I'd like to see more code comments in sha.c (though I realize this was
copied directly from pgcrypto.)

I tested by building with and without --with-openssl and running make
check for the project as a whole and the pgcrypto extension.

I notice that the copyright from pgcrypto/sha1.c was carried over but
not the copyright from pgcrypto/sha2.c.  I'm no expert on how this
works, but I believe the copyright from sha2.c must be copied over.

Also, are there any plans to expose these functions directly to the user
without loading pgcrypto?  Now that the functionality is in core it
seems that would be useful.  In addition, it would make this patch stand
on its own rather than just being a building block

* [PATCH 2/8] Move encoding routines to src/common/

I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
they should be renamed to make them distinct?

* [PATCH 3/8] Switch password_encryption to a enum

Does not apply on HEAD (98c2d3332):

error: patch failed: src/backend/commands/user.c:139
error: src/backend/commands/user.c: patch does not apply
error: patch failed: src/include/commands/user.h:15
error: src/include/commands/user.h: patch does not apply

From here on I used 39b691f251 for review and testing.

It seems you are keeping on/off for backwards compatibility, shouldn't
the default now be "md5"?

-#password_encryption = on
+#password_encryption = on        # on, off, md5 or plain

* [PATCH 4/8] Refactor decision-making of password encryption into a
single routine

+++ b/src/backend/commands/user.c
+        new_record[Anum_pg_authid_rolpassword - 1] =
+            CStringGetTextDatum(encrypted_passwd);

pfree(encrypted_passwd) here or let it get freed with the context?

* [PATCH 5/8] Create generic routine to fetch password and valid until
values for a role

Couldn't md5_crypt_verify() be made more general and take the hash type?
For instance, password_crypt_verify() with the last param as the new
password type enum.

* [PATCH 6/8] Support for SCRAM-SHA-256 authentication

+++ b/contrib/passwordcheck/passwordcheck.c
+        case PASSWORD_TYPE_SCRAM:
+            /* unfortunately not much can be done here */
+            break;

Why can't we at least do the same check as md5 to make sure the username
was not used as the password?

+++ b/src/backend/libpq/auth.c
+     * without relying on the length word, but we hardly care about protocol
+     * version or older anymore.)

Do you mean protocol version 2 or older?

+++ b/src/backend/libpq/crypt.c
        return STATUS_ERROR;    /* empty password */
+

Looks like a stray LF.

+++ b/src/backend/parser/gram.y
+    SAVEPOINT SCHEMA SCRAM SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE

Doesn't this belong in patch 7?  Even in patch 7 it doesn't appear that
SCRAM is a keyword since the protocol specified after USING is quoted.

I tested this patch using both md5 and scram and was able to get both of
them working separately.

However, it doesn't look like they can be used in conjunction since the
pg_hba.conf entry must specify either md5 or scram (though the database
can easily contain a mixture).  This would probably make a migration
very unpleasant.

Is there any chance of a mixed mode that will allow new passwords to be
set as scram while still honoring the old md5 passwords? Or does that
cause too many complications with the protocol?

* [PATCH 7/8] Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE

+++ b/doc/src/sgml/ref/create_role.sgml
+        Sets the role's password using the wanted protocol.

How about "Sets the role's password using the requested protocol."

+        an unencrypted password.   If the presented password string is already
+        in MD5-encrypted or SCRAM-encrypted format, then it is stored encrypted
+        as-is.

How about, "If the password string is..."

* [PATCH 8/8] Add regression tests for passwords

OK.

On the whole I find this patch set easier to digest than what was
submitted for 9.6.  It is more targeted but still provides very valuable
functionality.

I'm a bit concerned that a mixture of md5/scram could cause confusion
and think this may warrant discussion somewhere in the documentation
since the idea is for users to migrate from md5 to scram.

-- 
-David
david@pgmasters.net

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

26 September 2016, 06:02:39

On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
> On 9/3/16 8:36 AM, Michael Paquier wrote:
>>
>> Attached is a new series:

Thanks for the review and the comments!

> * [PATCH 1/8] Refactor SHA functions and move them to src/common/
>
> I'd like to see more code comments in sha.c (though I realize this was
> copied directly from pgcrypto.)

OK... I have added some comments for the user-facing routines, as well
as the private routines that are doing step-by-step random
calculations.

> I notice that the copyright from pgcrypto/sha1.c was carried over but
> not the copyright from pgcrypto/sha2.c.  I'm no expert on how this
> works, but I believe the copyright from sha2.c must be copied over.

Right, those copyright bits are missing:
- * AUTHOR: Aaron D. Gifford <me@aarongifford.com>
[...]
- * Copyright (c) 2000-2001, Aaron D. Gifford
The license block being the same, it seems to me that there is no need
to copy it over. The copyright should be enough.

> Also, are there any plans to expose these functions directly to the user
> without loading pgcrypto?  Now that the functionality is in core it
> seems that would be useful.  In addition, it would make this patch stand
> on its own rather than just being a building block.

There have been discussions about avoiding enabling those functions by
default in the distribution. We'd rather not do that...

> * [PATCH 2/8] Move encoding routines to src/common/
>
> I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
> they should be renamed to make them distinct?

Yes it may be a good idea to rename that, like encode_utils.[c|h] for
the new files.

> * [PATCH 3/8] Switch password_encryption to a enum
>
> Does not apply on HEAD (98c2d3332):

Interesting, it works for me on da6c4f6.

> From here on I used 39b691f251 for review and testing.
> It seems you are keeping on/off for backwards compatibility, shouldn't
> the default now be "md5"?
>
> -#password_encryption = on
> +#password_encryption = on              # on, off, md5 or plain

That sounds like a good idea, so switched this way.
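
To make the backwards-compatibility point concrete, here is a minimal standalone sketch of how the accepted strings of the setting could map onto an enum. The type and function names are hypothetical and not taken from the patch, and it assumes that "on" behaves like "md5" and "off" like "plain". Matching is case-sensitive here only to keep the sketch short.

```c
#include <string.h>

/* Hypothetical names for illustration; not taken from the patch. */
typedef enum
{
    SKETCH_PASSWORD_PLAIN,
    SKETCH_PASSWORD_MD5,
    SKETCH_PASSWORD_INVALID
} SketchPasswordType;

/*
 * Map a password_encryption value to an enum member.  "on" and "off" are
 * kept as aliases so that existing configuration files keep working
 * (assumption: on == md5, off == plain).
 */
SketchPasswordType
sketch_parse_password_encryption(const char *value)
{
    if (strcmp(value, "md5") == 0 || strcmp(value, "on") == 0)
        return SKETCH_PASSWORD_MD5;
    if (strcmp(value, "plain") == 0 || strcmp(value, "off") == 0)
        return SKETCH_PASSWORD_PLAIN;
    return SKETCH_PASSWORD_INVALID;     /* unknown value: caller rejects it */
}
```

With a mapping like this, adding a new value such as 'scram' later only needs one more enum member and one more comparison.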

> * [PATCH 4/8] Refactor decision-making of password encryption into a
> single routine
>
> +++ b/src/backend/commands/user.c
> +               new_record[Anum_pg_authid_rolpassword - 1] =
> +                       CStringGetTextDatum(encrypted_passwd);
>
> pfree(encrypted_passwd) here or let it get freed with the context?

Calling encrypt_password did not guarantee that the result needs to be
free'd, so I guess that when I coded that I just relied on
the context. But rereading it now, let's do this cleanly and have
encrypt_password always return a palloc'ed string. That's more consistent.

> * [PATCH 5/8] Create generic routine to fetch password and valid until
> values for a role
>
> Couldn't md5_crypt_verify() be made more general and take the hash type?
>  For instance, password_crypt_verify() with the last param as the new
> password type enum.

This would mean incorporating the whole SASL message exchange into
this routine because the password string is part of the scram
initialization context, and it seems to me that it is better to look
up the entry in pg_authid just once. So we'd end up with more
confusing code, I am afraid. At least that's the conclusion I came to
when doing that.. md5_crypt_verify only does the work on a
received password.

> * [PATCH 6/8] Support for SCRAM-SHA-256 authentication
>
> +++ b/contrib/passwordcheck/passwordcheck.c
> +               case PASSWORD_TYPE_SCRAM:
> +                       /* unfortunately not much can be done here */
> +                       break;
>
> Why can't we at least do the same check as md5 to make sure the username
> was not used as the password?

You are right. We could at least check that, so changed the way you suggest.

> +++ b/src/backend/libpq/auth.c
> +        * without relying on the length word, but we hardly care about protocol
> +        * version or older anymore.)
>
> Do you mean protocol version 2 or older?
>
> +++ b/src/backend/libpq/crypt.c
>                 return STATUS_ERROR;    /* empty password */
> +
>
> Looks like a stray LF.

Fixed.

> +++ b/src/backend/parser/gram.y
> +       SAVEPOINT SCHEMA SCRAM SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
>
> Doesn't this belong in patch 7?  Even in patch 7 it doesn't appear that
> SCRAM is a keyword since the protocol specified after USING is quoted.

This is some garbage from a past version. Fixed.

> However, it doesn't look like they can be used in conjunction since the
> pg_hba.conf entry must specify either md5 or scram (though the database
> can easily contain a mixture).  This would probably make a migration
> very unpleasant.

Yep, it uses a given auth-method once user and database match. This is
partially related to the problem of supporting multiple password
verifiers per user, which was submitted last CF but got rejected
because of a lack of interest, and removed to simplify this patch. You
also need to think about other things like password and protocol
aging. But well, it is a problem that we don't have to tackle with
this patch...

> Is there any chance of a mixed mode that will allow new passwords to be
> set as scram while still honoring the old md5 passwords? Or does that
> cause too many complications with the protocol?

Hm. That looks complicated to me. This sounds to me like retry logic
for multiple authentication methods, and a different feature. What
you'd be looking for here is a connection parameter to specify a list
of protocols and try them all, no?

And that:
+    * multiple messags sent in both directions. First message is always from

> * [PATCH 7/8] Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE
>
> +++ b/doc/src/sgml/ref/create_role.sgml
> +        Sets the role's password using the wanted protocol.
>
> How about "Sets the role's password using the requested protocol."

Done.

> +        an unencrypted password.   If the presented password string is
> already
> +        in MD5-encrypted or SCRAM-encrypted format, then it is stored
> encrypted
> +        as-is.
>
> How about, "If the password string is..."

OK.

> On the whole I find this patch set easier to digest than what was
> submitted for 9.6.  It is more targeted but still provides very valuable
> functionality.

Thanks.

> I'm a bit concerned that a mixture of md5/scram could cause confusion
> and think this may warrant discussion somewhere in the documentation
> since the idea is for users to migrate from md5 to scram.

We could finish with a red warning in the docs to say that users are
recommended to use SCRAM instead of MD5. Just an idea, perhaps that's
not mandatory for the first shot though.
--
Michael

On 09/26/2016 09:02 AM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
>> On 9/3/16 8:36 AM, Michael Paquier wrote:
>>>
>>> Attached is a new series:
>
> Thanks for the review and the comments!

I read-through this again, and did a bunch of little fixes:

* Added error-handling for OOM and other errors in libpq
* In libpq, added check that the server sent back the same client-nonce
* Turned ERRORs into COMMERRORs and removed DEBUG4 lines (they could
reveal useful information to an attacker)
* Improved comments
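
For readers following along, the client-nonce check mentioned above can be sketched roughly as below. This is a standalone illustration, not the code from the patch: the function name is hypothetical, and it relies on the RFC 5802 rule that the nonce in the server's first message must start with the nonce the client sent, with the server's own part appended after it.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the nonce received from the server begins with the nonce
 * the client originally sent and carries at least one extra character
 * contributed by the server.  (Hypothetical helper for illustration.)
 */
bool
sketch_nonce_has_client_prefix(const char *client_nonce,
                               const char *server_nonce)
{
    size_t      client_len = strlen(client_nonce);

    if (client_len == 0)
        return false;           /* the client must have sent something */
    if (strncmp(server_nonce, client_nonce, client_len) != 0)
        return false;           /* server did not echo our nonce */
    return server_nonce[client_len] != '\0';
}
```

If the check fails, the client should treat the exchange as failed and close the connection rather than continue with the proof.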

Some things that need to be resolved (I also added FIXME comments for
some of this):

* A source of random values. This currently uses PostmasterRandom()
similarly to how the MD5 salt is generated, in the server, but plain old
random() in the client. If built with OpenSSL, we should probably use
RAND_bytes(). But what if the client is built without OpenSSL? I believe
the protocol doesn't require cryptographically strong randomness for the
nonces, i.e. it's OK if they're predictable, but they should be
different for each session.

* Nonce and salt lengths. The patch currently uses 10 bytes for both,
but I think I just pulled that number out of thin air. The spec doesn't
say anything about nonce and salt lengths AFAICS. What do other
implementations use? Is 10 bytes enough?

* The spec defines a final "server-error" message that the server sends
on authentication failure, or e.g. if a required extension is not
supported. The patch just uses FATAL for those. Should we try to send a
server-error message instead, or before, the elog(FATAL) ?
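
As a rough illustration of the last point: in RFC 5802 the error form of the server's final message is just "e=" followed by an error value such as invalid-proof, unknown-user or other-error. A minimal sketch of building such a message could look like this; the function name and buffer handling are assumptions made for the example, not what the patch does.

```c
#include <stdio.h>

/*
 * Format an error-carrying server-final-message, "e=<value>", into buf.
 * Returns the message length, or -1 if the buffer is too small.
 * (Hypothetical helper for illustration.)
 */
int
sketch_build_server_error(char *buf, size_t buflen, const char *error_value)
{
    int         n = snprintf(buf, buflen, "e=%s", error_value);

    if (n < 0 || (size_t) n >= buflen)
        return -1;              /* encoding error or truncated output */
    return n;
}
```

Whether such a message is sent instead of, or just before, the elog(FATAL) is exactly the open question above.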

I'll continue hacking this later, but need a little break for now.

>> I'm a bit concerned that a mixture of md5/scram could cause confusion
>> and think this may warrant discussion somewhere in the documentation
>> since the idea is for users to migrate from md5 to scram.
>
> We could finish with a red warning in the docs to say that users are
> recommended to use SCRAM instead of MD5. Just an idea, perhaps that's
> not mandatory for the first shot though.

Some sort of Migration Guide would certainly be in order. There isn't
any easy migration path with this patch series alone, so perhaps that
should be part of the follow-up patches that add the "MD5 or SCRAM"
authentication method to pg_hba.conf, or support for having both
verifiers for the same user in pg_authid.

- Heikki

On Thu, Sep 29, 2016 at 12:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Sep 28, 2016 at 8:55 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>> Our b64_encode routine does use whitespace, so we can't use it as is for
>>> SCRAM. As the patch stands, we might never output anything long enough to
>>> create linefeeds, but let's be tidy. The base64 implementation is about 100
>>> lines of code, so perhaps we should just leave src/backend/utils/encode.c
>>> alone, and make a new copy of the base64 routines in src/common.
>>
>> OK, I'll refresh that tomorrow with the rest. Thanks for the commit to
>> extend password_encryption.
>
> OK, so after more chatting with Heikki, here is a list of TODO items
> and a summary of the state of things:
> - base64 encoding routines should drop whitespace (' ', \r, \t), and
> it would be better to just copy those from the backend's encode.c to
> src/common/. No need to move escape and binary things, nor touch
> backend's base64 routines.
> - No need to move sha1.c to src/common/. Better to just get sha2.c
> into src/common/ as we aim at SCRAM-SHA-256.
> - random() called in the client is no good. We need something better here.
> - The error handling needs to be reworked and should follow the
> protocol presented by RFC5802, by sending back e= messages. This needs
> a bit of work, not much I think though as the infra is in place in the
> core patch.
> - Let's discard the md5-or-scram optional thing in pg_hba.conf. This
> complicates the error handling protocol.
>
> I am marking this patch as returned with feedback for current CF and
> will post a new set soon, moving it to the next CF once I have the new
> set of patches ready for posting.

And so we are back on that, with a new set:
- 0001, introducing pg_strong_random() in src/port/ to have the
backend portion of SCRAM use it instead of random(). This patch is
from Magnus who has kindly sent it to me, so the authorship goes to
him. This patch replaces at the same time PostmasterRandom() with it,
this way once SCRAM gets integrated both the frontend and the backend
finish using the same facility. I think that's good for consistency.
Compared to the version Magnus has sent me, I have changed two things:
-- Reading from /dev/urandom and /dev/random is not influenced by
EINTR. read() handling is also made better in case of partial reads
from a given source.
-- Win32 Crypto routines use MS_DEF_PROV instead of NULL. I think
it is a better idea not to leave the user the choice of the encryption
source here.
- 0002, moving all the SHA2 functions to src/common/. As mentioned
upthread, this keeps the amount of code moved to src/common/ to a
minimum. I have been careful to get the header files and copyright
mentions into a correct shape at the same time. I have moved a couple
of code blocks in a shape that make a bit more sense, not sure how you
feel about that, Heikki.
- 0003, creating a set of base64 routines without whitespace handling.
That's more or less a copy of what is in encode.c, simplified for
SCRAM. At the same time I have prefixed the routines with pg_ to make
a difference with what is in encode.c.
- 0004 does some refactoring regarding encrypted passwords in user.c
- 0005 creates a generic routine to fetch password and valid until
values for a role
- 0006 adds support for SCRAM-SHA-256. I have not yet addressed the
concerns regarding the handling of e= messages. I have fixed the
nonce generation with random() though.
- 0007 adds the extension for CREATE ROLE .. PASSWORD foo USING protocol
- 0008 is a basic set of regression tests to test passwords.
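
To illustrate what "base64 routines without whitespace handling" means in practice, here is a minimal standalone sketch. It is not the code from 0003: the sketch_ names are hypothetical, and returning -1 from the decoder on bad input (rather than raising an error) is an assumption chosen so that the caller can do its own error handling.

```c
#include <stddef.h>

static const char sketch_b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes of src into dst without emitting any line breaks.
 * dst needs room for 4 * ((len + 2) / 3) bytes; no NUL is appended.
 * Returns the number of bytes written.
 */
int
sketch_b64_encode(const unsigned char *src, size_t len, char *dst)
{
    char       *p = dst;
    size_t      i;

    for (i = 0; i + 2 < len; i += 3)
    {
        *p++ = sketch_b64_alphabet[src[i] >> 2];
        *p++ = sketch_b64_alphabet[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        *p++ = sketch_b64_alphabet[((src[i + 1] & 0x0f) << 2) | (src[i + 2] >> 6)];
        *p++ = sketch_b64_alphabet[src[i + 2] & 0x3f];
    }
    if (i < len)
    {
        /* one or two trailing bytes: pad the last group with '=' */
        *p++ = sketch_b64_alphabet[src[i] >> 2];
        if (i + 1 < len)
        {
            *p++ = sketch_b64_alphabet[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
            *p++ = sketch_b64_alphabet[(src[i + 1] & 0x0f) << 2];
        }
        else
        {
            *p++ = sketch_b64_alphabet[(src[i] & 0x03) << 4];
            *p++ = '=';
        }
        *p++ = '=';
    }
    return (int) (p - dst);
}

/* Value of one base64 character, or -1 if it is outside the alphabet. */
static int
sketch_b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/*
 * Decode len bytes of base64 text into dst.  Whitespace is NOT skipped:
 * any character outside the alphabet, or misplaced padding, yields -1.
 * Returns the number of decoded bytes otherwise.
 */
int
sketch_b64_decode(const char *src, size_t len, unsigned char *dst)
{
    unsigned char *p = dst;
    size_t      i;

    if (len % 4 != 0)
        return -1;
    for (i = 0; i < len; i += 4)
    {
        int         v[4];
        int         pad = 0;
        int         j;

        for (j = 0; j < 4; j++)
        {
            char        c = src[i + j];

            if (c == '=')
            {
                /* padding is only valid in the last two slots of the input */
                if (i + 4 != len || j < 2)
                    return -1;
                v[j] = 0;
                pad++;
            }
            else if (pad > 0 || (v[j] = sketch_b64_value(c)) < 0)
                return -1;
        }
        *p++ = (unsigned char) ((v[0] << 2) | (v[1] >> 4));
        if (pad < 2)
            *p++ = (unsigned char) (((v[1] & 0x0f) << 4) | (v[2] >> 2));
        if (pad < 1)
            *p++ = (unsigned char) (((v[2] & 0x03) << 6) | v[3]);
    }
    return (int) (p - dst);
}
```

The point of being strict in the decoder is that a SCRAM message containing stray spaces or newlines is rejected outright instead of being silently accepted.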

To be honest, I have now put some love into 0001~0004, but less in the
rest. The first refactoring patches are going to be subject to enough
comments I guess :) I'll put more love into 0005~ in the next couple
of days though while reworking the message interface.

Thanks,
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

14 October 2016, 12:09:07

On 10/12/2016 11:11 AM, Michael Paquier wrote:
> And so we are back on that, with a new set:

Great! I'm looking at this first one for now:

> - 0001, introducing pg_strong_random() in src/port/ to have the
> backend portion of SCRAM use it instead of random(). This patch is
> from Magnus who has kindly sent it to me, so the authorship goes to
> him. This patch replaces at the same time PostmasterRandom() with it,
> this way once SCRAM gets integrated both the frontend and the backend
> finish using the same facility. I think that's good for consistency.
> Compared to the version Magnus has sent me, I have changed two things:
> -- Reading from /dev/urandom and /dev/random is not influenced by
> EINTR. read() handling is also made better in case of partial reads
> from a given source.
> -- Win32 Crypto routines use MS_DEF_PROV instead of NULL. I think
> it is a better idea not to leave the user the choice of the encryption
> source here.

I spent some time whacking that around:

* Renamed the file to src/port/pg_strong_random.c. "pgsrandom" makes me
think of srandom(), which this isn't.

* Changed pg_strong_random() to return false on error, and let the 
callers handle errors. That's more error-prone than throwing an error in 
the function itself, as it's an easy mistake to forget to check for the 
return value, but we can't just "exit(1)" if called in the frontend. If 
it gets called from libpq during authentication, as it will with SCRAM, 
we want to close the connection and report an error, not exit the whole 
user application. Likewise, in postmaster, if we fail to generate a 
query cancel key when forking a backend, we don't want to FATAL and shut 
down the whole postmaster.

* There used to be this:

>         /*
> -        * Precompute password salt values to use for this connection. It's
> -        * slightly annoying to do this long in advance of knowing whether we'll
> -        * need 'em or not, but we must do the random() calls before we fork, not
> -        * after.  Else the postmaster's random sequence won't get advanced, and
> -        * all backends would end up using the same salt...
> -        */
> -       RandomSalt(port->md5Salt, sizeof(port->md5Salt));

But that whole business of advancing postmaster's random sequence is 
moot now. So I moved the generation of md5 salt from postmaster to where 
MD5 authentication is performed.

* This comment in postmaster.c was wrong:

> @@ -581,7 +571,7 @@ PostmasterMain(int argc, char *argv[])
>       * Note: the seed is pretty predictable from externally-visible facts such
>       * as postmaster start time, so avoid using random() for security-critical
>       * random values during postmaster startup.  At the time of first
> -     * connection, PostmasterRandom will select a hopefully-more-random seed.
> +     * connection, pg_strong_random will select a hopefully-more-random seed.
>       */
>      srandom((unsigned int) (MyProcPid ^ MyStartTime));

We don't use pg_strong_random() for that, the same PID+timestamp method 
is still used as before. Adjusted the comment to reflect reality.

* Added "#include <Wincrypt.h>", for the CryptAcquireContext and
CryptGenRandom functions. It compiled OK without that, so I guess it got
pulled in via some other header file, but seems more clear and 
future-proof to #include it directly.

* random comment kibitzing (no pun intended).
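
For anyone who has not opened the patch, the return-false-on-error interface combined with the EINTR and partial-read handling discussed above can be sketched roughly as follows. This is a simplified standalone stand-in, not the patch's pg_strong_random(): the function name is hypothetical, and it assumes a Unix-like system where /dev/urandom exists.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Fill buf with len random bytes read from /dev/urandom.  Returns false on
 * any failure instead of exiting, so that the caller decides how to react
 * (close the connection, log and carry on, and so on).
 */
bool
sketch_strong_random(void *buf, size_t len)
{
    char       *p = buf;
    int         fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0)
        return false;
    while (len > 0)
    {
        ssize_t     n = read(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: just retry */
        if (n <= 0)
        {
            close(fd);
            return false;       /* real error or unexpected EOF */
        }
        p += n;                 /* partial read: ask for the remainder */
        len -= (size_t) n;
    }
    close(fd);
    return true;
}
```

A caller would then write something like "if (!sketch_strong_random(&key, sizeof(key))) { report the problem and bail out of this one connection }", which is the easy-to-forget check mentioned above.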

This is pretty much ready for commit now, IMO, but please do review one 
more time. And I do have some small questions still:

* We now open and close /dev/(u)random on every pg_strong_random() call. 
Should we be worried about performance of that?

* Now that we don't call random() in postmaster anymore, is there any 
point in calling srandom() there (i.e. where the above incorrect comment 
was)? Should we remove it? random() might be used by pre-loaded 
extensions, though. (Hopefully not for cryptographic purposes.)

* Should we backport this? Sorry if we discussed that already, but I 
don't remember.

- Heikki

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

14 October 2016, 12:10:50

On 10/14/2016 03:08 PM, Heikki Linnakangas wrote:
> I spent some time whacking that around:

Sigh, forgot attachment. Here you go.

- Heikki

Attachment

0001-Replace-PostmasterRandom-with-a-stronger-way-of-gene.patch

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

15 October 2016, 13:26:49

On Fri, Oct 14, 2016 at 9:08 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 10/12/2016 11:11 AM, Michael Paquier wrote:
> * Changed pg_strong_random() to return false on error, and let the callers
> handle errors. That's more error-prone than throwing an error in the
> function itself, as it's an easy mistake to forget to check for the return
> value, but we can't just "exit(1)" if called in the frontend. If it gets
> called from libpq during authentication, as it will with SCRAM, we want to
> close the connection and report an error, not exit the whole user
> application. Likewise, in postmaster, if we fail to generate a query cancel
> key when forking a backend, we don't want to FATAL and shut down the whole
> postmaster.

Okay for this one. Indeed that's a cleaner interface.

> This is pretty much ready for commit now, IMO, but please do review one more
> time.

OK, I had another look and the patch looks to be in pretty good shape
from here.

-   MyCancelKey = PostmasterRandom();
+   if (!pg_strong_random(&MyCancelKey, sizeof(MyCancelKey)))
+   {
+       rw->rw_crashed_at = GetCurrentTimestamp();
+       return false;
+   }
It would be nice to LOG an entry here for bgworkers.

+               /*
+                * fork failed, fall through to report -- actual error message was
+                * logged by StartAutoVacWorker
+                */
Since you created a new block, the first line gets longer than 80 characters.

> * We now open and close /dev/(u)random on every pg_strong_random() call.
> Should we be worried about performance of that?

Actually I have hacked up a small program that can be used to compare
using /dev/urandom with random() calls (this emulates RandomSalt), and
opening/closing /dev/urandom causes a performance hit, but the
difference becomes noticeable with loop calls higher than 10k on my
Linux laptop. I recall that /dev/urandom is quite slow on Linux
compared to other platforms still... So for a single call per
connection attempt we won't actually notice it much. I am just
attaching that if you want to play with it, and you can use it as
follows:
./calc [dev|random] nbytes loops
That's really a quick hack but it does the job if you worry about the
performance.

> * Now that we don't call random() in postmaster anymore, is there any point
> in calling srandom() there (i.e. where the above incorrect comment was)?
> Should we remove it? random() might be used by pre-loaded extensions,
> though. (Hopefully not for cryptographic purposes.)

That's the business of the maintainers of such modules, so my heart is
telling me to rip it off, but my mind tells me that there is no point
in making them unhappy either if they rely on it. I'd trust my mind on
this one, other opinions are welcome.

> * Should we backport this? Sorry if we discussed that already, but I don't
> remember.

I think that we quickly discussed this point at last PGCon during the
SCRAM-committee-unofficial meeting, and that we talked about doing
that only for HEAD.
--
Michael

Attachment

calculate_random.c

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

17 October 2016, 08:55:30

On 10/15/2016 04:26 PM, Michael Paquier wrote:
>> * Now that we don't call random() in postmaster anymore, is there any point
>> in calling srandom() there (i.e. where the above incorrect comment was)?
>> Should we remove it? random() might be used by pre-loaded extensions,
>> though. (Hopefully not for cryptographic purposes.)
>
> That's the business of the maintainers of such modules, so my heart is
> telling me to rip it off, but my mind tells me that there is no point
> in making them unhappy either if they rely on it. I'd trust my mind on
> this one, other opinions are welcome.

I kept it for now. Doesn't do any harm either, even if it's unnecessary.

>> * Should we backport this? Sorry if we discussed that already, but I don't
>> remember.
>
> I think that we discussed quickly the point at last PGCon during the
> SCRAM-committee-unofficial meeting, and that we talked about doing
> that only for HEAD.

Ok, committed to HEAD.

Thanks!

- Heikki

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

17 October 2016, 09:18:58

On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 10/15/2016 04:26 PM, Michael Paquier wrote:
>>>
>>> * Now that we don't call random() in postmaster anymore, is there any
>>> point
>>> in calling srandom() there (i.e. where the above incorrect comment was)?
>>> Should we remove it? random() might be used by pre-loaded extensions,
>>> though. (Hopefully not for cryptographic purposes.)
>>
>>
>> That's the business of the maintainers of such modules, so my heart is
>> telling me to rip it off, but my mind tells me that there is no point
>> in making them unhappy either if they rely on it. I'd trust my mind on
>> this one, other opinions are welcome.
>
>
> I kept it for now. Doesn't do any harm either, even if it's unnecessary.
>
>>> * Should we backport this? Sorry if we discussed that already, but I
>>> don't
>>> remember.
>>
>>
>> I think that we discussed quickly the point at last PGCon during the
>> SCRAM-committee-unofficial meeting, and that we talked about doing
>> that only for HEAD.
>
>
> Ok, committed to HEAD.

You removed the part of pgcrypto in charge of randomness, nice move. I
was wondering about what to do with the perfc and the unix_std at some
point, and ripping them off as you did is fine for me.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

17 October 2016, 09:27:30

On 10/17/2016 12:18 PM, Michael Paquier wrote:
> You removed the part of pgcrypto in charge of randomness, nice move. I
> was wondering about how to do with the perfc and the unix_std at some
> point, and ripping them off as you did is fine for me.

Yeah. I didn't understand the need for the perfc stuff. Are there 
Windows systems that don't have the Crypto APIs? I doubt it, but the 
buildfarm will tell us in a moment if there are.

And if we don't have a good source of randomness like /dev/random, I 
think it's better to fail, than try to collect entropy ourselves (which 
is what unix_std did). If there's a platform where that doesn't work, 
someone will hopefully send us a patch, rather than silently fall back 
to an iffy implementation.

- Heikki

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

17 October 2016, 14:41:20

On 10/17/2016 12:27 PM, Heikki Linnakangas wrote:
> On 10/17/2016 12:18 PM, Michael Paquier wrote:
>> You removed the part of pgcrypto in charge of randomness, nice move. I
>> was wondering about how to do with the perfc and the unix_std at some
>> point, and ripping them off as you did is fine for me.
>
> Yeah. I didn't understand the need for the perfc stuff. Are there
> Windows systems that don't have the Crypto APIs? I doubt it, but the
> buildfarm will tell us in a moment if there are.
>
> And if we don't have a good source of randomness like /dev/random, I
> think it's better to fail, than try to collect entropy ourselves (which
> is what unix_std did). If there's a platform where that doesn't work,
> someone will hopefully send us a patch, rather than silently fall back
> to an iffy implementation.

Looks like Tom's old HP-UX box, pademelon, is not happy about this. Does 
(that version of) HP-UX not have /dev/urandom?

I think we're going to need a bit more logging if no randomness source 
is available. What we have now is just "could not generate random query 
cancel key", which isn't very informative. Perhaps we should also call 
pg_strong_random() once at postmaster startup, to check that it works, 
instead of starting up but not accepting any connections.

- Heikki

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

18 October 2016, 07:35:40

On Mon, Oct 17, 2016 at 6:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Ok, committed to HEAD.

Attached is a rebased patch set for SCRAM, with the following things:
- 0001, moving all the SHA2 functions to src/common/ and introducing a
PG-like interface. No actual changes here.
- 0002, creating a set of base64 routines without whitespace handling.
Previous version sent had a bug: I missed the point that the backend
version of base64 was adding a newline every 76 characters. So this is
removed to make the encoding not using any whitespace. Also the
routines are reworked so as they return -1 in the event of an error
instead of generating an elog by themselves. That will be useful for
SCRAM that needs to do its own error handling with the e= messages
from the server. I think that's cleaner this way. Encoding does not
have any error code paths, but decoding has, so one possible
improvement would be to add in arguments a string to store an error
message to make things easier for callers to debug.
- 0003 does some refactoring regarding encrypted passwords in user.c.
I am pretty happy with this one as well.
- 0004 adds the extension for CREATE ROLE .. PASSWORD foo USING
protocol. I found a bug in this one: CREATE|ALTER ROLE .. PASSWORD
failed to update the given password correctly using
password_encryption. I am happy with this one. Even if it depends
on 0005 in this patch set it is possible to make it independent of it
to introduce the grammar just for 'plain' and 'md5' first. In previous
sets it was located after SCRAM, but it looks cleaner to get that
first. I don't think I am going to change that much more now.
- 0005 adds support for SCRAM-SHA-256. There is still some work to do
here, particularly the error handling that requires to be extended
with the e= messages sent back to the client before moving to a
PG-like error code path. Those need to be set in the context of the
SASL message exchange. I noticed as well that this is missing a hell
lot of error checks when building the exchange messages, and when
doing encoding and decoding of base64 strings. I'll address that in
the next couple of days.
- 0006 is the basic set of regression tests for passwords. Nothing new
here, they are useful as basic tests when checking the patch. I don't
think that they are worth having committed at the end.
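
As an illustration of what 0002 describes (base64 without any whitespace, and -1 returned on error instead of raising an elog), here is a minimal sketch. It is not the code from the patch: the function names b64_encode_nows() and b64_decode_nows() and their signatures are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes of src into dst, with no newlines and no terminator.
 * Returns the number of characters written, or -1 if dst is too small.
 */
int
b64_encode_nows(const uint8_t *src, size_t len, char *dst, size_t dstlen)
{
    size_t      i, o = 0;

    if (dstlen < ((len + 2) / 3) * 4)
        return -1;
    for (i = 0; i + 2 < len; i += 3)        /* full 3-byte groups */
    {
        uint32_t    v = ((uint32_t) src[i] << 16) |
                        ((uint32_t) src[i + 1] << 8) | src[i + 2];

        dst[o++] = b64_alphabet[(v >> 18) & 0x3F];
        dst[o++] = b64_alphabet[(v >> 12) & 0x3F];
        dst[o++] = b64_alphabet[(v >> 6) & 0x3F];
        dst[o++] = b64_alphabet[v & 0x3F];
    }
    if (i < len)                            /* 1 or 2 trailing bytes */
    {
        uint32_t    v = (uint32_t) src[i] << 16;

        if (i + 1 < len)
            v |= (uint32_t) src[i + 1] << 8;
        dst[o++] = b64_alphabet[(v >> 18) & 0x3F];
        dst[o++] = b64_alphabet[(v >> 12) & 0x3F];
        dst[o++] = (i + 1 < len) ? b64_alphabet[(v >> 6) & 0x3F] : '=';
        dst[o++] = '=';
    }
    return (int) o;
}

/*
 * Decode len characters of src into dst.  Any character outside the
 * base64 alphabet (whitespace included) is an error.  Returns the number
 * of bytes written, or -1 on invalid input or if dst is too small.
 */
int
b64_decode_nows(const char *src, size_t len, uint8_t *dst, size_t dstlen)
{
    uint32_t    acc = 0;
    size_t      i, o = 0, pad = 0;

    if (len % 4 != 0)
        return -1;
    for (i = 0; i < len; i++)
    {
        const char *pos;
        uint32_t    v = 0;

        if (src[i] == '=')
        {
            if (i < len - 2)
                return -1;      /* padding only allowed at the very end */
            pad++;
        }
        else
        {
            if (pad > 0 || src[i] == '\0' ||
                (pos = strchr(b64_alphabet, src[i])) == NULL)
                return -1;
            v = (uint32_t) (pos - b64_alphabet);
        }
        acc = (acc << 6) | v;
        if (i % 4 == 3)                     /* a full 4-character group */
        {
            size_t      nbytes = 3 - pad;

            if (o + nbytes > dstlen)
                return -1;
            dst[o++] = (acc >> 16) & 0xFF;
            if (nbytes > 1)
                dst[o++] = (acc >> 8) & 0xFF;
            if (nbytes > 2)
                dst[o++] = acc & 0xFF;
            acc = 0;
        }
    }
    return (int) o;
}
```

With this shape the caller decides how to report a failure, which is what SCRAM needs in order to answer with an e= message instead of a plain error.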
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

20 October 2016, 05:14:23

On Tue, Oct 18, 2016 at 4:35 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Oct 17, 2016 at 6:18 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Ok, committed to HEAD.
>
> Attached is a rebased patch set for SCRAM, with the following things:
> [...]

And as the PostmasterRandom() patch has been reverted, here is once
again a new set:
- 0001, moving all the SHA2 functions to src/common/ and introducing a
PG-like interface. No actual changes here.
- 0002, replacing PostmasterRandom by pg_strong_random(), with a fix
for the cancel key problem.
- 0003, adding for pg_strong_random() a fallback for any *nix platform
not having /dev/random. This should be grouped with 0002, but I split
it for clarity.
- 0004, Add encoding routines for base64 without whitespace in
src/common/. I improved the error handling here by making them return
-1 in case of error and let the caller handle the error.
- 0005, Refactor decision-making of password encryption into a single routine.
- 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.
- 0007, the SCRAM implementation. I have reworked the error handling
on both the frontend and the backend. In the frontend, there were many
code paths that did not bother much about many sanity checks like
OOMs, so I addressed that as a whole thing. For the backend, in the
event of an error, the backend sends back to the client a e= message
with an error string corresponding to what happened per RFC5802.
Sanity checks of the user data on the server (get the SCRAM verifier,
its validuntil, empty password and the user name itself), are made
part of the message exchange as in case of errors we need to return
errors like e=unknown-user, e=other-error and stuff similar to that.
This makes the code in auth.c slightly cleaner btw.
- 0008 is a set of regression tests.
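
To illustrate the e= error reporting mentioned for 0007: a minimal sketch, assuming a hypothetical enum and helper names that are not taken from the patch. The error strings themselves are server-error values defined by RFC 5802; only three of them are shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of the RFC 5802 server-error values. */
typedef enum
{
    SCRAM_ERR_INVALID_PROOF,
    SCRAM_ERR_UNKNOWN_USER,
    SCRAM_ERR_OTHER
} scram_error;

/* Map an error code to its RFC 5802 server-error-value string. */
const char *
scram_error_value(scram_error err)
{
    switch (err)
    {
        case SCRAM_ERR_INVALID_PROOF:
            return "invalid-proof";
        case SCRAM_ERR_UNKNOWN_USER:
            return "unknown-user";
        case SCRAM_ERR_OTHER:
            break;
    }
    return "other-error";
}

/*
 * Build the "e=..." attribute sent to the client in place of the
 * verifier.  Returns the message length, or -1 if buf is too small.
 */
int
build_scram_error_message(scram_error err, char *buf, size_t buflen)
{
    int         n = snprintf(buf, buflen, "e=%s", scram_error_value(err));

    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```

The point of keeping this inside the message exchange is that the client receives a protocol-level reason before the server moves on to its regular error path.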

The PostmasterRandom() patch sent in this set contains the fix for
cancel keys that were previously broken. I have also implemented a
fallback method in 0003 inspired by pgcrypto's try_unix_std. It simply
uses gettimeofday() (should be put in the upper loop actually now that
I think about it!), getpid() and random() to generate some randomness,
and then processes the whole through a SHA-256 hash, generating chunks
of random data worth of SHA256_DIGEST_LENGTH bytes. I have not added a
./configure switch for it, but there were voices in favor of that. And
this is not available on Windows (no need to care anyway as there are
crypto APIs). A requirement of this patch is to have the SHA-256
routines in src/common/ first, and this will allow any platform
without /dev/random to generate random numbers like pademelon.

The fallback method for the pg_strong_random() is clearly not ready
for commit, one reason is that libpgport should stand at a level lower
than libpgcommon as far as I understand. But this patch makes
pg_strong_random() in src/port depend on the SHA2 routines in
src/common so it would make more sense if pg_strong_random() is moved
as well to src/common instead of src/port. Honestly I think that we'd
get away better with something like that than trying for example to
reimplement a dependency with PRNG knowing that OpenSSL does it
already, and perhaps better than we could do it.

Thoughts welcome. A lot of bits are independent of that part in the
patch set anyway.
--
Michael

On Sat, Nov 5, 2016 at 9:36 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> The organization of these patches makes sense to me.
>>
>> On 10/20/16 1:14 AM, Michael Paquier wrote:
>>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>>> PG-like interface. No actual changes here.
>>
>> That's probably alright, although the patch contains a lot more changes
>> than I would imagine for a simple file move.  I'll still have to review
>> that in detail.
>
> The main point is to know if people are happy of having an interface
> of the type pg_sha256_[init|update|finish] to tackle the fact that
> core code contains a set of routines that map with some of the OpenSSL
> APIs...

Or in short that:
+extern void pg_sha256_init(pg_sha256_ctx *ctx);
+extern void pg_sha256_update(pg_sha256_ctx *ctx,
+                       const uint8 *input0, size_t len);
+extern void pg_sha256_final(pg_sha256_ctx *ctx, uint8 *dest);

>>> - 0005, Refactor decision-making of password encryption into a single routine.
>>
>> It makes sense to factor this out.  We probably don't need the pstrdup
>> if we just keep the string as is.  (You could make an argument for it if
>> the input values were const char *.)  We probably also don't need the
>> pfree.  The Assert(0) can probably be done better.  We usually use
>> elog() in such cases.
>
> Hm, OK. Agreed with that.

I have replaced the Assert(0) with an elog(ERROR). OK for the
additional palloc and pfree calls. I had added them for consistency in
the routine across all the password types, but changed it your way.

>>> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.
>>
>> "protocol" is a weird choice here.  Maybe something like "method" is
>> better.  The way the USING clause is placed can be confusing.  It's not
>> clear that it belongs to PASSWORD.  If someone wants to augment another
>> clause in CREATE ROLE with a secondary argument, then it could get
>> really confusing.  I'd suggest something to group things together, like
>> PASSWORD (val USING method).  The method could be an identifier instead
>> of a string.
>
> Why not.

Done.

>> Please add an example to the documentation and explain better how this
>> interacts with the existing ENCRYPTED PASSWORD clause.
>
> Sure.

Done.

>>> - 0007, the SCRAM implementation.
>>
>> No documentation about pg_hba.conf changes, so I don't know how to use
>> this. ;-)
>
> Oops. I have focused on the code a lot during last rewrite of the
> patch and forgot that. I'll think about something.
>
>> This implements SASL and SCRAM and SHA256.  We need to be clear about
>> which term we advertise to users.  An explanation in the missing
>> documentation would probably be a good start.
>
> pg_hba.conf uses "scram" as keyword, but scram refers to a family of
> authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256
> (what this patch does). Hence wouldn't it make sense to use
> scram_sha256 in pg_hba.conf instead? If for example in the future
> there is a SHA-512 version of SCRAM we could switch easily to that and
> define scram_sha512.

OK, I have added more docs regarding the use of scram in pg_hba.conf,
particularly in client-auth.sgml to describe how scram is better than
md5 in terms of protection, and also completed the description of
pg_hba.conf with the new keyword used in it.

>> I would also like to see a test suite that covers the authentication
>> specifically.
>
> What you have in mind is a TAP test with a couple of roles and
> pg_hba.conf getting rewritten then reloaded? Adding it in
> src/test/recovery/ is the first place that comes in mind but that's
> not really something related to recovery... Any ideas?

OK, hearing no complaints I have done exactly that and added a test in
src/test/recovery/ with patch 0009. This place may not be the best fit
though, but it looks like an overkill to add a new module in
src/test/modules just for that and that's a pretty compact test.

On Wed, Nov 9, 2016 at 3:13 PM, Victor Wagner <vitus@wagner.pp.ru> wrote:
> On Tue, 18 Oct 2016 16:35:27 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>> Attached is a rebased patch set for SCRAM, with the following things:
>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>> PG-like interface. No actual changes here.
>
> It seems, that client nonce generation in this patch is not
> RFC-compliant.
>
> RFC 5802 states that SCRAM nonce should be
>
> a sequence of random printable ASCII
>       characters excluding ','
>
> while this patch uses sequence of random bytes from pg_strong_random
> function with zero byte appended.

Right, I have fixed that in 0007 with a solution less exotic than what
you suggested upthread by scanning the ASCII characters between '!'
and '~', ignoring comma if selected.
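
For illustration, the approach can be sketched as below. This is not the code in 0007: generate_nonce() is a hypothetical name, and the sketch reads /dev/urandom directly where the patch would call pg_strong_random(). Random bytes are reduced to 7 bits and rejected unless they fall between '!' and '~' and are not a comma, so every accepted character is equally likely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (illustration only): write len random printable
 * ASCII characters, excluding ',', into nonce and NUL-terminate it.
 * nonce must have room for len + 1 bytes.  Returns false if the
 * randomness source cannot be read.
 */
bool
generate_nonce(char *nonce, size_t len)
{
    FILE       *f = fopen("/dev/urandom", "rb");
    size_t      n = 0;

    if (f == NULL)
        return false;

    while (n < len)
    {
        int         c = fgetc(f);

        if (c == EOF)
        {
            fclose(f);
            return false;
        }
        c &= 0x7F;              /* keep 7 bits, still uniform over 0..127 */
        if (c < '!' || c > '~' || c == ',')
            continue;           /* not allowed by RFC 5802, draw again */
        nonce[n++] = (char) c;
    }
    nonce[len] = '\0';
    fclose(f);
    return true;
}
```

Rejecting unwanted bytes, instead of mapping them onto allowed ones, avoids biasing the distribution of nonce characters.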
--
Michael

On Wed, Nov 16, 2016 at 4:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 5:12 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier
>>> <michael.paquier@gmail.com> wrote:
>>>> How do you plug in that with OpenSSL? Are you suggesting to use a set
>>>> of undef definitions in the new header in the same way as pgcrypto is
>>>> doing, which is rather ugly? Because that's what the deal is about in
>>>> this patch.
>>>
>>> Perhaps that justifies renaming them -- although I would think the
>>> fact that they are static would prevent conflicts -- but why reorder
>>> them and change variable names?
>>
>> Yeah... Perhaps I should not have done that, which was just for
>> consistency's sake, and even if the new reordering makes more sense
>> actually...
>
> Yeah, I don't see a point to that.

OK, by doing so here is what I have. The patch generated by
format-patch, as well as diffs generated by git diff -M are reduced
and the patch gets half in size. They could be reduced more by adding
at the top of sha2.c a couple of defines to map the old SHAXXX_YYY
variables with their PG_ equivalents, but that does not seem worth it
to me, and diffs are listed line by line.
--
Michael

Attachment

0001-Refactor-SHA2-functions-and-move-them-to-src-common.patch

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

16 November 2016, 19:24:59

On Wed, Nov 16, 2016 at 1:53 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> Yeah, I don't see a point to that.
>
> OK, by doing so here is what I have. The patch generated by
> format-patch, as well as diffs generated by git diff -M are reduced
> and the patch gets half in size. They could be reduced more by adding
> at the top of sha2.c a couple of defines to map the old SHAXXX_YYY
> variables with their PG_ equivalents, but that does not seem worth it
> to me, and diffs are listed line by line.

All right, this version is much easier to review.  I am a bit puzzled,
though.  It looks like src/common will include sha2.o if built without
OpenSSL and sha2_openssl.o if built with OpenSSL.  So far, so good.
One would think, then, that pgcrypto would not need to worry about
these functions any more because libpgcommon_srv.a is linked into the
server, so any references to those symbols would presumably just work.
However, that's not what you did.  On Windows, you added a dependency
on libpgcommon which I think is unnecessary because that stuff is
already linked into the server.  On non-Windows systems, however, you
have instead taught pgcrypto to copy the source file it needs from
src/common and recompile it.  I don't understand why you need to do
any of that, or why it should be different on Windows vs. non-Windows.
So I think that the changes for the pgcrypto Makefile could just look
like this:

diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
index 805db76..ddb0183 100644
--- a/contrib/pgcrypto/Makefile
+++ b/contrib/pgcrypto/Makefile
@@ -1,6 +1,6 @@
 # contrib/pgcrypto/Makefile

-INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
+INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
         fortuna.c random.c pgp-mpi-internal.c imath.c
 INT_TESTS = sha2

And for Mkvcbuild.pm I think you could just do this:

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index de764dd..1993764 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -114,6 +114,15 @@ sub mkvcbuild
       md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
       string.c username.c wait_error.c);

+    if ($solution->{options}->{openssl})
+    {
+        push(@pgcommonallfiles, 'sha2_openssl.c');
+    }
+    else
+    {
+        push(@pgcommonallfiles, 'sha2.c');
+    }
+
     our @pgcommonfrontendfiles = (
         @pgcommonallfiles, qw(fe_memutils.c file_utils.c
           restricted_token.c));
@@ -422,7 +431,7 @@ sub mkvcbuild
     {
         $pgcrypto->AddFiles(
             'contrib/pgcrypto',   'md5.c',
-            'sha1.c',             'sha2.c',
+            'sha1.c',
             'internal.c',         'internal-sha2.c',
             'blf.c',              'rijndael.c',
             'fortuna.c',          'random.c',

Is there some reason that won't work?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

16 November 2016, 23:56:19

On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
> index 805db76..ddb0183 100644
> --- a/contrib/pgcrypto/Makefile
> +++ b/contrib/pgcrypto/Makefile
> @@ -1,6 +1,6 @@
>  # contrib/pgcrypto/Makefile
>
> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
>          fortuna.c random.c pgp-mpi-internal.c imath.c
>  INT_TESTS = sha2

I would like to do so. And while Linux is happy with that, macOS is
not, this results in linking resolution errors when compiling the
library.

> And for Mkvcbuild.pm I think you could just do this:
>
> diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
> index de764dd..1993764 100644
> --- a/src/tools/msvc/Mkvcbuild.pm
> +++ b/src/tools/msvc/Mkvcbuild.pm
> @@ -114,6 +114,15 @@ sub mkvcbuild
>        md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
>        string.c username.c wait_error.c);
>
> +    if ($solution->{options}->{openssl})
> +    {
> +        push(@pgcommonallfiles, 'sha2_openssl.c');
> +    }
> +    else
> +    {
> +        push(@pgcommonallfiles, 'sha2.c');
> +    }
> +
>      our @pgcommonfrontendfiles = (
>          @pgcommonallfiles, qw(fe_memutils.c file_utils.c
>            restricted_token.c));
> @@ -422,7 +431,7 @@ sub mkvcbuild
>      {
>          $pgcrypto->AddFiles(
>              'contrib/pgcrypto',   'md5.c',
> -            'sha1.c',             'sha2.c',
> +            'sha1.c',
>              'internal.c',         'internal-sha2.c',
>              'blf.c',              'rijndael.c',
>              'fortuna.c',          'random.c',
>
> Is there some reason that won't work?

Yes we could do that for consistency with the other *nix platforms. But
is that really necessary as libpgcommon already has those objects?
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

17 November 2016, 00:29:48

On Wed, Nov 16, 2016 at 6:56 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
>> index 805db76..ddb0183 100644
>> --- a/contrib/pgcrypto/Makefile
>> +++ b/contrib/pgcrypto/Makefile
>> @@ -1,6 +1,6 @@
>>  # contrib/pgcrypto/Makefile
>>
>> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
>> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
>>          fortuna.c random.c pgp-mpi-internal.c imath.c
>>  INT_TESTS = sha2
>
> I would like to do so. And while Linux is happy with that, macOS is
> not, this results in linking resolution errors when compiling the
> library.

Well, I'm running macOS and it worked for me.  TBH, I don't even quite
understand how it could NOT work.  What makes the symbols provided by
libpgcommon any different from any other symbols that are part of the
binary?  How could one set work and the other set fail?  I can
understand how there might be some problem if the backend were
dynamically linked libpgcommon, but it's not.  It's doing this:

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -O2 -Wall -Werror -L../../src/port -L../../src/common
-Wl,-dead_strip_dylibs  -Wall -Werror   access/brin/brin.o [many more
.o files omitted for brevity] utils/fmgrtab.o
../../src/timezone/localtime.o ../../src/timezone/strftime.o
../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a
../../src/common/libpgcommon_srv.a -lm -o postgres

As I understand it, listing the .a file on the linker command line
like that is exactly equivalent to listing out each individual .o file
that is part of that static library.  There shouldn't be any
difference in how a symbol that's provided by one of the .o files
looks vs. how a symbol that's provided by one of the .a files looks.
Let's test it.

[rhaas pgsql]$ nm src/backend/postgres | grep -E 'GetUserIdAndContext|psprintf'
00000001003d71d0 T _GetUserIdAndContext
000000010040f160 T _psprintf

So... how would the dynamic loader know that it was supposed to find
the first one and fail to find the second one?  More to the point,
it's clear that it DOES find the second one on every platform in the
buildfarm, because adminpack, dblink, pageinspect, and pgstattuple all
use psprintf without the push-ups you are proposing to undertake here.
pg_md5_encrypt is used by passwordcheck, and forkname_to_number is
used by pageinspect and pg_prewarm.  It all just works.  No special
magic required.

> Yes we could do that for consistency with the other *nix platforms. But
> is that really necessary as libpgcommon already has those objects?

The point is that *postgres* already has those objects.  You don't
need to include them twice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

Andres Freund

Date:

17 November 2016, 00:36:35

Hi,

On 2016-11-16 19:29:41 -0500, Robert Haas wrote:
> On Wed, Nov 16, 2016 at 6:56 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
> >> index 805db76..ddb0183 100644
> >> --- a/contrib/pgcrypto/Makefile
> >> +++ b/contrib/pgcrypto/Makefile
> >> @@ -1,6 +1,6 @@
> >>  # contrib/pgcrypto/Makefile
> >>
> >> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
> >> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
> >>          fortuna.c random.c pgp-mpi-internal.c imath.c
> >>  INT_TESTS = sha2
> >
> > I would like to do so. And while Linux is happy with that, macOS is
> > not, this results in linking resolution errors when compiling the
> > library.
>
> Well, I'm running macOS and it worked for me.  TBH, I don't even quite
> understand how it could NOT work.  What makes the symbols provided by
> libpgcommon any different from any other symbols that are part of the
> binary?  How could one set work and the other set fail?  I can
> understand how there might be some problem if the backend were
> dynamically linked libpgcommon, but it's not.  It's doing this:

With -Wl,--as-needed the linker will dismiss unused symbols found in a
static library. Maybe that's the difference?

Andres

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

17 November 2016, 02:51:50

On Wed, Nov 16, 2016 at 7:36 PM, Andres Freund <andres@anarazel.de> wrote:
> With -Wl,--as-needed the linker will dismiss unused symbols found in a
> static library. Maybe that's the difference?

The man page entry for --as-needed says that it modifies the behavior
of dynamic libraries, not static ones.  If there is any such effect,
it is undocumented.  Here is the text:

LD> This option affects ELF DT_NEEDED tags for dynamic libraries mentioned
LD> on the command line after the --as-needed option. Normally the linker will
LD> add a DT_NEEDED tag for each dynamic library mentioned on the
LD> command line, regardless of whether the library is actually needed or not.
LD> --as-needed causes a DT_NEEDED tag to only be emitted for a library
LD> that at that point in the link satisfies a non-weak undefined symbol reference
LD> from a regular object file or, if the library is not found in the DT_NEEDED
LD> lists of other needed libraries, a non-weak undefined symbol reference
LD> from another needed dynamic library. Object files or libraries appearing
LD> on the command line after the library in question do not affect whether the
LD> library is seen as needed. This is similar to the rules for extraction of object
LD> files from archives. --no-as-needed restores the default behaviour.

Some experimentation on my Mac reveals that my previous statement
about how this works was incorrect.  See attached patch for what I
tried.  What I find is:

1. If I create an additional source file in src/common containing a
completely unused symbol (wunk) it appears in the nm output for
libpgcommon_srv.a but not in the nm output for the postgres binary.

2. If I add an additional function to an existing source file in
src/common containing a completely unused symbol (quux) it appears in
the nm output for both libpgcommon_srv.a and also in the nm output for
the postgres binary.

3. If I create an additional source file in src/backend containing a
completely unused symbol (blarfle) it appears in the nm output for the
postgres binary.

So, it seems that the linker is willing to drop archive members if the
entire .o file is unused, but not individual symbols.  That explains why
Michael thinks we need to do something special here, because with his
0001 patch, nothing in the new sha2(_openssl).c file would immediately
be used in the backend.  And indeed I see now that my earlier testing
was done incorrectly, and pgcrypto does in fact fail to build under my
proposal.  Oops.

But I think that's a temporary thing.  As soon as the backend is using
the sha2 routines for anything (which is the point, right?) the build
changes become unnecessary.  For example, if I apply this patch:

--- a/src/backend/lib/binaryheap.c
+++ b/src/backend/lib/binaryheap.c
@@ -305,3 +305,7 @@ sift_down(binaryheap *heap, int node_off)
                node_off = swap_off;
        }
 }
+
+#include "common/sha2.h"
+extern void ugh(void);
+void ugh(void) { pg_sha224_init(NULL); }

...then the backend ends up sucking in everything in sha2.c and the
pgcrypto build works again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

wunk.patch

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

17 November 2016, 04:04:46

On Wed, Nov 16, 2016 at 6:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, it seems that the linker is willing to drop archive members if the
> entire .o file is unused, but not individual symbols.  That explains why
> Michael thinks we need to do something special here, because with his
> 0001 patch, nothing in the new sha2(_openssl).c file would immediately
> be used in the backend.  And indeed I see now that my earlier testing
> was done incorrectly, and pgcrypto does in fact fail to build under my
> proposal.  Oops.

Ah, thanks! I did not notice that before in configure.in:
if test "$PORTNAME" = "darwin"; then
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,-dead_strip_dylibs], $link_test_func)
elif test "$PORTNAME" = "openbsd"; then
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,-Bdynamic], $link_test_func)
else
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,--as-needed], $link_test_func)
fi

In the current set of patches, the sha2 functions would not get used
until the main patch for SCRAM gets committed so that's a couple of
steps and many months ahead.. And --as-needed/--no-as-needed are not
supported in macos. So I would believe that the best route is just to
use this patch with the way it does things, and once SCRAM gets in we
could switch the build into more appropriate linking. At least that's
far less ugly than having fake objects in the backend code. Of course
a comment in pgcrypto's Makefile would be appropriate.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

17 November 2016, 04:28:58

On Wed, Nov 16, 2016 at 8:04 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> In the current set of patches, the sha2 functions would not get used
> until the main patch for SCRAM gets committed so that's a couple of
> steps and many months ahead.. And --as-needed/--no-as-needed are not
> supported in macos. So I would believe that the best route is just to
> use this patch with the way it does things, and once SCRAM gets in we
> could switch the build into more appropriate linking. At least that's
> far less ugly than having fake objects in the backend code. Of course
> a comment in pgcrypto's Makefile would be appropriate.

Or a comment with a "ifeq ($(PORTNAME), darwin)" containing the
additional objects to make clear that this is proper to only OSX.
Other ideas are welcome.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Robert Haas

Date:

17 November 2016, 16:12:40

On Wed, Nov 16, 2016 at 11:28 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Nov 16, 2016 at 8:04 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> In the current set of patches, the sha2 functions would not get used
>> until the main patch for SCRAM gets committed so that's a couple of
>> steps and many months ahead.. And --as-needed/--no-as-needed are not
>> supported in macos. So I would believe that the best route is just to
>> use this patch with the way it does things, and once SCRAM gets in we
>> could switch the build into more appropriate linking. At least that's
>> far less ugly than having fake objects in the backend code. Of course
>> a comment in pgcrypto's Makefile would be appropriate.
>
> Or a comment with a "ifeq ($(PORTNAME), darwin)" containing the
> additional objects to make clear that this is proper to only OSX.
> Other ideas are welcome.

So, the problem isn't Darwin-specific.  I experimented with this on
Linux and found Linux does the same thing with libpgcommon_srv.a that
macOS does: a file in the archive that is totally unused is omitted
from the postgres binary.  In Linux, however, that doesn't prevent
pgcrypto from compiling anyway.  It does, however, prevent it from
working.  Instead of failing at compile time with a complaint about
missing symbols, it fails at load time.  I think that's because macOS
has -bundle-loader and we use it; without that, I think we'd get the
same behavior on macOS that we get on Windows.

The fundamental problem here is that the archive-member-dropping
behavior that we're getting here is not really what we want, and I
think that's going to happen on most or all architectures.  For GNU
ld, we could add -Wl,--whole-archive, and macOS has -all_load, but I
think that this is just a nest of portability problems waiting to happen.  I
think there are two things we can do here that are far simpler:

1. Rejigger things so that we don't build libpgcommon_srv.a in the
first place, and instead add $(top_builddir)/src/common to
src/backend/Makefile's value of SUBDIRS.  With appropriate adjustments
to src/common/Makefile, this should allow us to include all of the
object files on the linker command line individually instead of
building an archive library that is then used only for the postgres
binary itself anyway.  Then, things wouldn't get dropped.

2. Just postpone committing this patch until we're ready to use the
new code in the backend someplace (or add a dummy reference to it
someplace).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

17 November 2016, 17:51:56

On Thu, Nov 17, 2016 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, the problem isn't Darwin-specific.  I experimented with this on
> Linux and found Linux does the same thing with libpgcommon_srv.a that
> macOS does: a file in the archive that is totally unused is omitted
> from the postgres binary.  In Linux, however, that doesn't prevent
> pgcrypto from compiling anyway.  It does, however, prevent it from
> working.  Instead of failing at compile time with a complaint about
> missing symbols, it fails at load time.  I think that's because macOS
> has -bundle-loader and we use it; without that, I think we'd get the
> same behavior on macOS that we get on Windows.

Yes, right. I recall seeing the regression tests failing with pgcrypto
when doing that, though I could not recall whether this was specific to
macOS or Linux when I looked again at this patch yesterday. When testing
again yesterday I was able to get the pgcrypto tests to pass, but
perhaps my build was not in a clean state...

> 1. Rejigger things so that we don't build libpgcommon_srv.a in the
> first place, and instead add $(top_builddir)/src/common to
> src/backend/Makefile's value of SUBDIRS.  With appropriate adjustments
> to src/common/Makefile, this should allow us to include all of the
> object files on the linker command line individually instead of
> building an archive library that is then used only for the postgres
> binary itself anyway.  Then, things wouldn't get dropped.
>
> 2. Just postpone committing this patch until we're ready to use the
> new code in the backend someplace (or add a dummy reference to it
> someplace).

In the end this refactoring makes sense because it will be used in the
backend with the SCRAM engine, so we could just wait for 2 instead of
adding workarounds. This defers the problem to later, and there
will already be a lot of work for the SCRAM core part, though I don't
think that the SHA2 refactoring will change much going forward.

Option 3 would be to do things the way the patch does, that is, just
compiling pgcrypto using the source files directly and adding a comment
to revert that once the APIs are used in the backend. I can guess that
you don't like that.
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

29 November 2016, 04:36:27

On Fri, Nov 18, 2016 at 2:51 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Nov 17, 2016 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> So, the problem isn't Darwin-specific.  I experimented with this on
>> Linux and found Linux does the same thing with libpgcommon_srv.a that
>> macOS does: a file in the archive that is totally unused is omitted
>> from the postgres binary.  In Linux, however, that doesn't prevent
>> pgcrypto from compiling anyway.  It does, however, prevent it from
>> working.  Instead of failing at compile time with a complaint about
>> missing symbols, it fails at load time.  I think that's because macOS
>> has -bundle-loader and we use it; without that, I think we'd get the
>> same behavior on macOS that we get on Windows.
>
> Yes, right. I recall seeing the regression tests failing with pgcrypto
> when doing that, though I could not recall whether this was specific to
> macOS or Linux when I looked again at this patch yesterday. When testing
> again yesterday I was able to get the pgcrypto tests to pass, but
> perhaps my build was not in a clean state...
>
>> 1. Rejigger things so that we don't build libpgcommon_srv.a in the
>> first place, and instead add $(top_builddir)/src/common to
>> src/backend/Makefile's value of SUBDIRS.  With appropriate adjustments
>> to src/common/Makefile, this should allow us to include all of the
>> object files on the linker command line individually instead of
>> building an archive library that is then used only for the postgres
>> binary itself anyway.  Then, things wouldn't get dropped.
>>
>> 2. Just postpone committing this patch until we're ready to use the
>> new code in the backend someplace (or add a dummy reference to it
>> someplace).
>
> In the end this refactoring makes sense because it will be used in the
> backend with the SCRAM engine, so we could just wait for 2 instead of
> adding workarounds. This defers the problem to later, and there
> will already be a lot of work for the SCRAM core part, though I don't
> think that the SHA2 refactoring will change much going forward.
>
> Option 3 would be to do things the way the patch does, that is, just
> compiling pgcrypto using the source files directly and adding a comment
> to revert that once the APIs are used in the backend. I can guess that
> you don't like that.

Nothing more will likely happen in this CF, so I have moved it to
2017-01 with the same status of "Needs Review".
-- 
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

07 December 2016, 06:39:32

On Tue, Nov 29, 2016 at 1:36 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Nothing more will likely happen in this CF, so I have moved it to
> 2017-01 with the same status of "Needs Review".

Attached is a new set of patches using the new routines
pg_backend_random() and pg_strong_random() to handle the randomness in
SCRAM:
- 0001 refactors the SHA2 routines. pgcrypto uses raw files from
src/common when compiling with this patch. That works on any platform,
and this is the simplified version of what was proposed upthread.
- 0002 adds base64 routines to src/common.
- 0003 does some refactoring regarding the password encryption in
ALTER/CREATE USER queries.
- 0004 adds the clause PASSWORD (val USING method) in CREATE/ALTER USER.
- 0005 is the code patch for SCRAM. Note that this switches pgcrypto
to link to libpgcommon as SHA2 routines are used by the backend.
- 0006 adds some regression tests for passwords.
- 0007 adds some TAP tests for authentication.
This is added to the upcoming CF.

Thanks,
--
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

07 December 2016, 20:55:08

On 12/07/2016 08:39 AM, Michael Paquier wrote:
> On Tue, Nov 29, 2016 at 1:36 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Nothing more will likely happen in this CF, so I have moved it to
>> 2017-01 with the same status of "Needs Review".
>
> Attached is a new set of patches using the new routines
> pg_backend_random() and pg_strong_random() to handle the randomness in
> SCRAM:
> - 0001 refactors the SHA2 routines. pgcrypto uses raw files from
> src/common when compiling with this patch. That works on any platform,
> and this is the simplified version of what was proposed upthread.
> - 0002 adds base64 routines to src/common.
> - 0003 does some refactoring regarding the password encryption in
> ALTER/CREATE USER queries.
> - 0004 adds the clause PASSWORD (val USING method) in CREATE/ALTER USER.
> - 0005 is the code patch for SCRAM. Note that this switches pgcrypto
> to link to libpgcommon as SHA2 routines are used by the backend.
> - 0006 adds some regression tests for passwords.
> - 0007 adds some TAP tests for authentication.
> This is added to the upcoming CF.

I spent a little time reading through this once again. Steady progress,
did some small fixes:

* Rewrote the nonce generation. On the server side, it first generated a
string of ASCII-printable characters, then base64-encoded them, which is
superfluous. Also, avoid calling pg_strong_random() one byte at a time,
for performance reasons.

* Added a more sophisticated fallback implementation in libpq, for the
--disable-strong-random cases, similar to pg_backend_random().

* No need to disallow SCRAM with db_user_namespace. It doesn't include
the username in the salt like MD5 does.
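
To illustrate the first item above, here is a minimal standalone sketch of
building a nonce by fetching all the raw random bytes in one call and
base64-encoding them once. It is not the code from the patch: random_fn,
RAW_NONCE_LEN, b64_encode() and make_nonce() are hypothetical names, and the
callback merely stands in for pg_strong_random().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type standing in for pg_strong_random(). */
typedef bool (*random_fn) (void *buf, size_t len);

#define RAW_NONCE_LEN 18        /* assumed length; multiple of 3, so no '=' padding */
#define B64_LEN(n) ((((n) + 2) / 3) * 4)

/* Plain RFC 4648 base64; dst must hold B64_LEN(len) + 1 bytes. */
size_t
b64_encode(const uint8_t *src, size_t len, char *dst)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t      i;
    size_t      o = 0;

    for (i = 0; i < len; i += 3)
    {
        /* Pack up to three input bytes into a 24-bit group. */
        uint32_t    v = (uint32_t) src[i] << 16;

        if (i + 1 < len)
            v |= (uint32_t) src[i + 1] << 8;
        if (i + 2 < len)
            v |= src[i + 2];

        dst[o++] = tbl[(v >> 18) & 0x3F];
        dst[o++] = tbl[(v >> 12) & 0x3F];
        dst[o++] = (i + 1 < len) ? tbl[(v >> 6) & 0x3F] : '=';
        dst[o++] = (i + 2 < len) ? tbl[v & 0x3F] : '=';
    }
    dst[o] = '\0';
    return o;
}

/* One call for all the random bytes, then a single encoding pass. */
bool
make_nonce(random_fn get_random, char *out, size_t outlen)
{
    uint8_t     raw[RAW_NONCE_LEN];

    assert(get_random != NULL && out != NULL);

    if (outlen < B64_LEN(RAW_NONCE_LEN) + 1)
        return false;           /* output buffer too small */
    if (!get_random(raw, sizeof(raw)))
        return false;           /* random source failed */
    b64_encode(raw, sizeof(raw), out);
    return true;
}
```

Since base64 output never contains a comma, the result can be used as a SCRAM
nonce without any further filtering.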

Attached those here, as add-on patches to your latest patch set. I'll
continue reviewing, but a couple of things caught my eye that you may
want to jump on, in the meanwhile:

On error messages, the spec says:

> o  e: This attribute specifies an error that occurred during
>       authentication exchange.  It is sent by the server in its final
>       message and can help diagnose the reason for the authentication
>       exchange failure.  On failed authentication, the entire server-
>       final-message is OPTIONAL; specifically, a server implementation
>       MAY conclude the SASL exchange with a failure without sending the
>       server-final-message.  This results in an application-level error
>       response without an extra round-trip.  If the server-final-message
>       is sent on authentication failure, then the "e" attribute MUST be
>       included.

Note that it says that the server can send the error message with the e=
attribute, in the *final message*. It's not a valid response in the
earlier state, before sending server-first-message. I think we need to
change the INIT state handling in pg_be_scram_exchange() to not send e=
messages to the client. On an error at that state, it needs to just bail
out without a message. The spec allows that. We can always log the
detailed reason in the server log, anyway.
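
A rough sketch of that rule, under stated assumptions: the state names and
build_error_reply() below are hypothetical and are not taken from
pg_be_scram_exchange(). The point is only that no "e=" message is produced
before server-first-message has been sent, while a failure reported in
server-final-message carries the "e" attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical exchange states; the names are assumptions. */
typedef enum
{
    EXCHANGE_INIT,              /* server-first-message not sent yet */
    EXCHANGE_SALT_SENT          /* next server message is server-final-message */
} exchange_state;

/*
 * Build the reply for a failed exchange.  Returns the number of bytes
 * written to 'out', or 0 when nothing should be sent to the client.
 */
size_t
build_error_reply(exchange_state state, const char *reason,
                  char *out, size_t outlen)
{
    int         n;

    assert(reason != NULL && out != NULL);

    /* Before server-first-message, "e=" is not valid: bail out silently. */
    if (state == EXCHANGE_INIT)
        return 0;

    /* In server-final-message, a reported failure carries the "e" attribute. */
    n = snprintf(out, outlen, "e=%s", reason);
    if (n < 0 || (size_t) n >= outlen)
        return 0;               /* does not fit: send nothing */
    return (size_t) n;
}
```

In both cases the detailed reason can still go to the server log; only what is
sent to the client differs.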

As Peter E pointed out earlier, the documentation is lacking, on how to
configure MD5 and/or SCRAM. If you put "scram" as the authentication
method in pg_hba.conf, what does it mean? If you have a line for both
"scram" and "md5" in pg_hba.conf, with the same database/user/hostname
combo, what does that mean? Answer: The first one takes effect, the
second one has no effect. Yet the example in the docs now has that,
which is nonsense :-). Hopefully we'll have some kind of a "both"
option, before the release, but in the meanwhile, we need to describe how
this works now in the docs.

- Heikki

On 12/09/2016 05:58 AM, Michael Paquier wrote:
>
> One thing is: when do we look up at pg_authid? After receiving the
> first message from client or before beginning the exchange? As the
> first message from client has the user name, it would make sense to do
> the lookup after receiving it, but from PG's perspective it would just
> make sense to use the data already present in the startup packet. The
> current patch does the latter. What do you think?

While hacking on this, I came up with the attached refactoring, against 
current master. I think it makes the current code more readable, anyway, 
and it provides a get_role_password() function that SCRAM can use, to 
look up the stored password. (This is essentially the same refactoring 
that was included in the SCRAM patch set, that introduced the 
get_role_details() function.)

Barring objections, I'll go ahead and commit this first.

- Heikki

Attachment

0001-Refactor-the-code-for-verifying-user-s-password.patch

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

09 December 2016, 14:10:41

On Fri, Dec 09, 2016 at 11:51:45AM +0200, Heikki Linnakangas wrote:
> On 12/09/2016 05:58 AM, Michael Paquier wrote:
> >
> > One thing is: when do we look up at pg_authid? After receiving the
> > first message from client or before beginning the exchange? As the
> > first message from client has the user name, it would make sense to do
> > the lookup after receiving it, but from PG's perspective it would just
> > make sense to use the data already present in the startup packet. The
> > current patch does the latter. What do you think?
>
> While hacking on this, I came up with the attached refactoring, against
> current master. I think it makes the current code more readable, anyway, and
> it provides a get_role_password() function that SCRAM can use, to look up
> the stored password. (This is essentially the same refactoring that was
> included in the SCRAM patch set, that introduced the get_role_details()
> function.)
>
> Barring objections, I'll go ahead and commit this first.

Here are some comments.

> @@ -720,12 +721,16 @@ CheckMD5Auth(Port *port, char **logdetail)
>      sendAuthRequest(port, AUTH_REQ_MD5, md5Salt, 4);
>
>      passwd = recv_password_packet(port);
> -
>      if (passwd == NULL)
>          return STATUS_EOF;        /* client wouldn't send password */

This looks like useless noise.

> -    shadow_pass = TextDatumGetCString(datum);
> +    *shadow_pass = TextDatumGetCString(datum);
>
>      datum = SysCacheGetAttr(AUTHNAME, roleTup,
>                              Anum_pg_authid_rolvaliduntil, &isnull);
> @@ -83,100 +83,146 @@ md5_crypt_verify(const char *role, char *client_pass,
>      {
>          *logdetail = psprintf(_("User \"%s\" has an empty password."),
>                                role);
> +        *shadow_pass = NULL;
>          return STATUS_ERROR;    /* empty password */
>      }

Here the password is allocated by text_to_cstring(). That's only 1 byte,
but it should be pfree()'d.
--
Michael

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

12 December 2016, 13:52:21

On 12/09/2016 01:10 PM, Michael Paquier wrote:
> On Fri, Dec 09, 2016 at 11:51:45AM +0200, Heikki Linnakangas wrote:
>> On 12/09/2016 05:58 AM, Michael Paquier wrote:
>>>
>>> One thing is: when do we look up at pg_authid? After receiving the
>>> first message from client or before beginning the exchange? As the
>>> first message from client has the user name, it would make sense to do
>>> the lookup after receiving it, but from PG's perspective it would just
>>> make sense to use the data already present in the startup packet. The
>>> current patch does the latter. What do you think?
>>
>> While hacking on this, I came up with the attached refactoring, against
>> current master. I think it makes the current code more readable, anyway, and
>> it provides a get_role_password() function that SCRAM can use, to look up
>> the stored password. (This is essentially the same refactoring that was
>> included in the SCRAM patch set, that introduced the get_role_details()
>> function.)
>>
>> Barring objections, I'll go ahead and commit this first.

Ok, committed.

>> -    shadow_pass = TextDatumGetCString(datum);
>> +    *shadow_pass = TextDatumGetCString(datum);
>>
>>      datum = SysCacheGetAttr(AUTHNAME, roleTup,
>>                              Anum_pg_authid_rolvaliduntil, &isnull);
>> @@ -83,100 +83,146 @@ md5_crypt_verify(const char *role, char *client_pass,
>>      {
>>          *logdetail = psprintf(_("User \"%s\" has an empty password."),
>>                                role);
>> +        *shadow_pass = NULL;
>>          return STATUS_ERROR;    /* empty password */
>>      }
>
> Here the password is allocated by text_to_cstring(). That's only 1 byte,
> but it should be pfree()'d.

Fixed. Thanks, good catch! It doesn't matter in practice as we'll 
disconnect shortly afterwards anyway, but given that the callers pfree() 
other things on error, let's be tidy.

- Heikki

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

12 December 2016, 17:39:55

A couple more things that caught my eye while hacking on this:

1. We don't use SASLPrep to scrub usernames and passwords. That's by
choice, for usernames, because historically in PostgreSQL usernames can 
be stored in any encoding, but SASLPrep assumes UTF-8. We dodge that by 
passing an empty username in the authentication exchange anyway, because 
we always use the username we got from the startup packet. But for 
passwords, I think we need to fix that. The spec is very clear on that:

> Note that implementations MUST either implement SASLprep or disallow
> use of non US-ASCII Unicode codepoints in "str".


2. I think we should check nonces, etc. more carefully, to not contain 
invalid characters. For example, in the server, we use the 
read_attr_value() function to read the client's nonce. Per the spec, the 
nonce should consist of ASCII printable characters, but we will accept 
anything except the comma. That's no trouble to the server, but let's be 
strict.
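
A strict check along those lines could look like the sketch below. It is a
standalone illustration, not the branch's code: nonce_is_printable() is a
hypothetical name, and rejecting an empty nonce is an assumption. The accepted
range follows the "printable" production in RFC 5802 (%x21-2B / %x2D-7E).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if 'p' consists only of printable ASCII characters other
 * than space and comma, i.e. %x21-2B / %x2D-7E.
 */
bool
nonce_is_printable(const char *p)
{
    assert(p != NULL);

    if (*p == '\0')
        return false;           /* empty nonce rejected (assumption) */

    for (; *p != '\0'; p++)
    {
        unsigned char c = (unsigned char) *p;

        /* 0x2C is the comma, which separates attributes in SCRAM messages. */
        if (c < 0x21 || c > 0x7E || c == 0x2C)
            return false;
    }
    return true;
}
```

The check would run on the value returned by read_attr_value() before the
nonce is used any further.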


To summarize, here's the overall TODO list so far:

* Use SASLPrep for passwords.

* Check nonces, etc. to not contain invalid characters.

* Derive mock SCRAM verifier for non-existent users deterministically 
from username.

* Allow plain 'password' authentication for users with a SCRAM verifier 
in rolpassword.

* Throw an error if an "authorization identity" is given. ATM, we just 
ignore it, but seems better to reject the attempt than do something that 
might not be what the client expects.

* Add "scram-sha-256" prefix to SCRAM verifiers stored in 
pg_authid.rolpassword.

Anything else I'm missing?

I've created a wiki page, mostly to host that TODO list, while we hack 
this to completion: 
https://wiki.postgresql.org/wiki/SCRAM_authentication. Feel free to add 
stuff that comes to mind, and remove stuff as you push patches to the 
branch on github.

- Heikki

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Craig Ringer

Date:

13 December 2016, 04:35:59

On 12 December 2016 at 22:39, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> * Throw an error if an "authorization identity" is given. ATM, we just
> ignore it, but seems better to reject the attempt than do something that
> might not be what the client expects.

Yeah. That might be an opportunity to make admins' and connection
poolers' lives much happier down the track, but first we'd need a way
of specifying a mapping for the other users a given user is permitted
to masquerade as (like we have for roles and role membership). We have
SET SESSION AUTHORIZATION already, which has all the same benefits and
security problems as allowing connect-time selection of authorization
identity without such a framework. And we have SET ROLE.

ERRORing is the right thing to do here, so we can safely use this
protocol functionality later if we want to allow user masquerading.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

13 December 2016, 04:43:22

On Mon, Dec 12, 2016 at 11:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> A couple more things that caught my eye while hacking on this:
>
> 1. We don't use SASLPrep to scrub usernames and passwords. That's by
> choice, for usernames, because historically in PostgreSQL usernames can be
> stored in any encoding, but SASLPrep assumes UTF-8. We dodge that by passing
> an empty username in the authentication exchange anyway, because we always
> use the username we got from the startup packet. But for passwords, I think
> we need to fix that. The spec is very clear on that:
>
>> Note that implementations MUST either implement SASLprep or disallow
>> use of non US-ASCII Unicode codepoints in "str".
>
> 2. I think we should check nonces, etc. more carefully, to not contain
> invalid characters. For example, in the server, we use the read_attr_value()
> function to read the client's nonce. Per the spec, the nonce should consist
> of ASCII printable characters, but we will accept anything except the comma.
> That's no trouble to the server, but let's be strict.
>
> To summarize, here's the overall TODO list so far:
>
> * Use SASLPrep for passwords.
>
> * Check nonces, etc. to not contain invalid characters.
>
> * Derive mock SCRAM verifier for non-existent users deterministically from
> username.
>
> * Allow plain 'password' authentication for users with a SCRAM verifier in
> rolpassword.
>
> * Throw an error if an "authorization identity" is given. ATM, we just
> ignore it, but seems better to reject the attempt than do something that
> might not be what the client expects.
>
> * Add "scram-sha-256" prefix to SCRAM verifiers stored in
> pg_authid.rolpassword.
>
> Anything else I'm missing?
>
> I've created a wiki page, mostly to host that TODO list, while we hack this
> to completion: https://wiki.postgresql.org/wiki/SCRAM_authentication. Feel
> free to add stuff that comes to mind, and remove stuff as you push patches
> to the branch on github.

Based on the current code, I think you have the whole list. I'll try
to look once again at the code to see if anything else comes to mind.
Improving the TAP regression tests is also an item, to cover SCRAM
authentication when a plain password is stored.
-- 
Michael

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

13 December 2016, 08:44:07

On Tue, Dec 13, 2016 at 10:43 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Dec 12, 2016 at 11:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> A couple more things that caught my eye while hacking on this:

Looking at what we have now, in the branch...

>> * Use SASLPrep for passwords.

SASLPrep is defined here:
https://tools.ietf.org/html/rfc4013
And stringprep is here:
https://tools.ietf.org/html/rfc3454
So that's roughly applying a conversion from the mapping tables, taking
into account prohibited, bidirectional, and mapped characters, etc. The
spec says that the password should be in Unicode. But we cannot be
sure of that, right? Those mapping tables should likely be a separate
thing (Perl has Unicode::Stringprep::Mapping, for example).

>> * Check nonces, etc. to not contain invalid characters.

Fixed this one.

>> * Derive mock SCRAM verifier for non-existent users deterministically from
>> username.

You have put in place the facility to allow that. The only thing that
comes to mind to generate something per-cluster is to have
BootStrapXLOG() generate an "authentication secret identifier" as a
uint64 and add that to the control file. Using pg_backend_random()
would be a good idea here.

>> * Allow plain 'password' authentication for users with a SCRAM verifier in
>> rolpassword.

Done.

>> * Throw an error if an "authorization identity" is given. ATM, we just
>> ignore it, but seems better to reject the attempt than do something that
>> might not be what the client expects.

Done.

>> * Add "scram-sha-256" prefix to SCRAM verifiers stored in
>> pg_authid.rolpassword.

You did it.
-- 
Michael

pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

14 December 2016, 11:51:55

On 12/09/2016 10:19 AM, Michael Paquier wrote:
> On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Couple of things I should write down before I forget:
>>
>> 1. It's a bit cumbersome that the scram verifiers stored in
>> pg_authid.rolpassword don't have any clear indication that they're scram
>> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
>> we should use a "scram-sha-256:" for scram verifiers.
>
> scram-sha-256 would make the most sense to me.
>
>> Actually, I think it'd be awfully nice to also prefix plaintext passwords
>> with "plain:", but I'm not sure it's worth breaking the compatibility, if
>> there are tools out there that peek into rolpassword. Thoughts?
>
> pgbouncer is the only thing that comes to mind. It looks at pg_shadow
> for password values. pg_dump'ing data from pre-10 instances will also
> need to adapt. Compatibility with the existing CREATE USER PASSWORD
> command looks tricky though, so I am wondering if that's worth the
> complication.
>
>> 2. It's currently not possible to use the plaintext "password"
>> authentication method, for a user that has a SCRAM verifier in rolpassword.
>> That seems like an oversight. We can't do MD5 authentication with a SCRAM
>> verifier, but "password" we could.
>
> Yeah, that should be possible...

The tip of the work branch can now do SCRAM authentication, when a user 
has a plaintext password in pg_authid.rolpassword. The reverse doesn't 
work, however: you cannot do plain "password" authentication, when the 
user has a SCRAM verifier in pg_authid.rolpassword. It gets worse: plain 
"password" authentication doesn't check if the string stored in 
pg_authid.rolpassword is a SCRAM authenticator, and treats it as a 
plaintext password, so you can do this:

PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user

I think we're going to have more bugs like this, if we don't start to
explicitly label plaintext passwords as such.

So, let's add "plain:" prefix to plaintext passwords, in 
pg_authid.rolpassword. With that, these would be valid values in 
pg_authid.rolpassword:

plain:foo
md55a962ce7a24371a10e85627a484cac28
scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f

But anything that doesn't begin with "plain:", "md5", or 
"scram-sha-256:" would be invalid. You shouldn't have invalid values in 
the column, but if you do, all the authentication mechanisms would 
reject it.
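
As a sketch of that rule (an illustration only, not code from the work
branch): the stored_type enum, has_prefix() and classify_stored_password() are
hypothetical names. It checks prefixes only; validating the length of an MD5
hash or the fields of a SCRAM verifier is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of a pg_authid.rolpassword value. */
typedef enum
{
    STORED_PLAIN,
    STORED_MD5,
    STORED_SCRAM,
    STORED_INVALID
} stored_type;

/* True if 'str' begins with 'prefix'. */
bool
has_prefix(const char *str, const char *prefix)
{
    return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Classify by prefix alone; anything unlabeled is invalid. */
stored_type
classify_stored_password(const char *stored)
{
    assert(stored != NULL);

    if (has_prefix(stored, "plain:"))
        return STORED_PLAIN;
    if (has_prefix(stored, "scram-sha-256:"))
        return STORED_SCRAM;
    if (has_prefix(stored, "md5"))
        return STORED_MD5;
    return STORED_INVALID;      /* every auth method would reject this */
}
```

With explicit labels, a SCRAM verifier can no longer be mistaken for a
plaintext password, because a plaintext value is only recognized behind its
own "plain:" prefix.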

It would be nice to also change the format of MD5 passwords to have a 
colon, as in "md5:<hash>", but that's probably not worth breaking 
compatibility for. Almost no-one stores passwords in plaintext, so 
changing the format of that wouldn't affect many people, but there might 
well be tools out there that peek into MD5 hashes.

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

14 December 2016, 13:15:28

On Wed, Dec 14, 2016 at 5:51 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The tip of the work branch can now do SCRAM authentication, when a user has
> a plaintext password in pg_authid.rolpassword. The reverse doesn't work,
> however: you cannot do plain "password" authentication, when the user has a
> SCRAM verifier in pg_authid.rolpassword. It gets worse: plain "password"
> authentication doesn't check if the string stored in pg_authid.rolpassword
> is a SCRAM authenticator, and treats it as a plaintext password, so you can
> do this:
>
> PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user

This one's fun.

> I think we're going to have more bugs like this, if we don't start to
> explicitly label plaintext passwords as such.
>
> So, let's add "plain:" prefix to plaintext passwords, in
> pg_authid.rolpassword. With that, these would be valid values in
> pg_authid.rolpassword:
>
> [...]
>
> But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:"
> would be invalid. You shouldn't have invalid values in the column, but if
> you do, all the authentication mechanisms would reject it.

I would be tempted to suggest adding the verifier type as a new column
of pg_authid, but as CREATE USER PASSWORD has accepted strings with the
md5 prefix as-is for ages, using the "plain:" prefix is definitely a
better plan. My opinion on the matter has changed compared to a couple
of months back.

> It would be nice to also change the format of MD5 passwords to have a colon,
> as in "md5:<hash>", but that's probably not worth breaking compatibility
> for. Almost no-one stores passwords in plaintext, so changing the format of
> that wouldn't affect many people, but there might well be tools out there
> that peek into MD5 hashes.

Yes, let's not take this road.

This work is definitely something that should be done before anything
else. Need a patch or are you on it?
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Magnus Hagander

Date:

14 December 2016, 13:27:15

On Wed, Dec 14, 2016 at 9:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> On 12/09/2016 10:19 AM, Michael Paquier wrote:
>> On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Couple of things I should write down before I forget:
>>>
>>> 1. It's a bit cumbersome that the scram verifiers stored in
>>> pg_authid.rolpassword don't have any clear indication that they're scram
>>> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
>>> we should use a "scram-sha-256:" for scram verifiers.
>>
>> scram-sha-256 would make the most sense to me.
>>
>>> Actually, I think it'd be awfully nice to also prefix plaintext passwords
>>> with "plain:", but I'm not sure it's worth breaking the compatibility, if
>>> there are tools out there that peek into rolpassword. Thoughts?
>>
>> pgbouncer is the only thing that comes to mind. It looks at pg_shadow
>> for password values. pg_dump'ing data from pre-10 instances will also
>> need to adapt. Compatibility with the existing CREATE USER PASSWORD
>> command looks tricky though, so I am wondering if that's worth the
>> complication.
>>
>>> 2. It's currently not possible to use the plaintext "password"
>>> authentication method, for a user that has a SCRAM verifier in rolpassword.
>>> That seems like an oversight. We can't do MD5 authentication with a SCRAM
>>> verifier, but "password" we could.
>>
>> Yeah, that should be possible...
>
> The tip of the work branch can now do SCRAM authentication, when a user has
> a plaintext password in pg_authid.rolpassword. The reverse doesn't work,
> however: you cannot do plain "password" authentication, when the user has a
> SCRAM verifier in pg_authid.rolpassword. It gets worse: plain "password"
> authentication doesn't check if the string stored in pg_authid.rolpassword
> is a SCRAM authenticator, and treats it as a plaintext password, so you can
> do this:
>
> PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user
>
> I think we're going to have more bugs like this, if we don't start to
> explicitly label plaintext passwords as such.
>
> So, let's add "plain:" prefix to plaintext passwords, in
> pg_authid.rolpassword. With that, these would be valid values in
> pg_authid.rolpassword:
>
> plain:foo
> md55a962ce7a24371a10e85627a484cac28
> scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f

I would so like to just drop support for plain passwords completely :) But
there's a backwards compatibility issue to think about of course.

But -- is there any actual use case for them anymore?

If not, another option could be to just specifically check that it's *not*
"md5<something>" or "scram-<something>:<something>". That would invalidate
plaintext passwords that have those texts in them of course, but what's the
likelihood of that in reality?

Though I guess that might at least in theory be more bug-prone, so going with
a "plain:" prefix seems like a good idea as well.

> But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:"
> would be invalid. You shouldn't have invalid values in the column, but if
> you do, all the authentication mechanisms would reject it.
>
> It would be nice to also change the format of MD5 passwords to have a colon,
> as in "md5:<hash>", but that's probably not worth breaking compatibility
> for. Almost no-one stores passwords in plaintext, so changing the format of
> that wouldn't affect many people, but there might well be tools out there
> that peek into MD5 hashes.

There are definitely tools that do that, so +1 on leaving that alone.

Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

14 December 2016, 13:32:20

On 12/14/2016 12:15 PM, Michael Paquier wrote:
> This work is definitely something that should be done before anything
> else. Need a patch or are you on it?

I'm on it..

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

14 December 2016, 14:33:10

On 12/14/2016 12:27 PM, Magnus Hagander wrote:
> I would so like to just drop support for plain passwords completely :) But
> there's a backwards compatibility issue to think about of course.
>
> But -- is there any actual use case for them anymore?

Hmm. At the moment, I don't think there is.

But, a password stored in plaintext works with either MD5 or SCRAM, or 
any future authentication mechanism. So as soon as we have SCRAM 
authentication, it becomes somewhat useful again.

In a nutshell:

auth / stored   MD5     SCRAM   plaintext
-----------------------------------------
password        Y       Y       Y
md5             Y       N       Y
scram           N       Y       Y

If a password is stored in plaintext, it can be used with any 
authentication mechanism. And the plaintext 'password' authentication 
mechanism works with any kind of a stored password. But an MD5 hash 
cannot be used with SCRAM authentication, or vice versa.
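
The matrix above can be restated as a small lookup. This is only a sketch, written in Python for brevity; the names COMPAT and works are hypothetical and do not come from any patch in this thread.

```python
# Rows: authentication method requested (as in pg_hba.conf).
# Columns: how the password is stored in pg_authid.rolpassword.
# Values follow the Y/N matrix in the message above.
COMPAT = {
    "password": {"md5": True,  "scram": True,  "plaintext": True},
    "md5":      {"md5": True,  "scram": False, "plaintext": True},
    "scram":    {"md5": False, "scram": True,  "plaintext": True},
}


def works(auth_method: str, stored_as: str) -> bool:
    """Return True if the auth method can verify a password stored that way."""
    return COMPAT[auth_method][stored_as]
```

For example, works("md5", "scram") is False, while every method works against a plaintext entry.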

I just noticed that the manual for CREATE ROLE says:

> Note that older clients might lack support for the MD5 authentication
> mechanism that is needed to work with passwords that are stored
> encrypted.

That is incorrect. The alternative to MD5 authentication is plain
'password' authentication, and that works just fine with MD5-hashed 
passwords. I think that sentence is a leftover from when we still 
supported "crypt" authentication (so I actually get to blame you for 
that ;-), commit 53a5026b). Back then, it was true that if an MD5 hash 
was stored in pg_authid, you couldn't do "crypt" authentication. That 
might have left old clients out in the cold.

Now that we're getting SCRAM authentication, we'll need a similar notice 
there again, for the incompatibility of a SCRAM verifier with MD5
authentication and vice versa.

> If not, another option could be to just specifically check that it's *not*
> "md5<something>" or "scram-<something>:<something>". That would invalidate
> plaintext passwords that have those texts in them of course, but what's the
> likelihood of that in reality?

Hmm, we have dismissed that risk for the MD5 hashes (and we also have a 
length check for them), but as we get new hash formats, the risk 
increases. Someone might well want to use "plain:of:jars" as password. 
Perhaps we should use a more complicated pattern.

I googled around for how others store SCRAM and other password hashes. 
Many other systems seem to have similar naming schemes. The closest 
thing to a standard I could find was:

https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md

Perhaps we should also use something like "$plain$<password>" or 
"$scram-sha-256$<iterations>$<salt>$<key>$"?
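
To make the '$'-delimited idea concrete, here is a rough sketch of how such a string could be taken apart. The field layout is only the one floated above, not an agreed format, and parse_dollar_verifier is a hypothetical name.

```python
def parse_dollar_verifier(value: str):
    """Split '$type$field1$field2...$' into (type, [fields]).

    Returns None if the string does not start with '$' or has no type.
    """
    if not value.startswith("$"):
        return None
    parts = value[1:].split("$")
    # A trailing '$' produces an empty last element; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    if not parts or parts[0] == "":
        return None
    return parts[0], parts[1:]
```

Note that a plaintext password containing '$' would be split into several fields by this naive approach, so a real format would need a rule for that case (for example, treating everything after "$plain$" as the password).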

There's also https://tools.ietf.org/html/rfc5803, which specifies how to 
store SCRAM verifiers in LDAP. I don't understand enough of LDAP to 
understand what those actually look like, though, and there were no 
examples in the RFC.

I wonder if we should also worry about storing multiple verifiers in 
rolpassword? We don't support that now, but we might in the future. It 
might come in handy, if you could easily store multiple hashes in a single
string, separated by commas for example.

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Peter Eisentraut

Date:

14 December 2016, 17:52:48

On 12/14/16 5:15 AM, Michael Paquier wrote:
> I would be tempted to suggest adding the verifier type as a new column
> of pg_authid

Yes please.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

14 December 2016, 17:57:23

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 12/14/16 5:15 AM, Michael Paquier wrote:
> > I would be tempted to suggest adding the verifier type as a new column
> > of pg_authid
>
> Yes please.

This discussion seems to continue to come up and I don't entirely
understand why we keep trying to shove more things into pg_authid, or
worse, into rolpassword.

We should have an independent table for the verifiers, which has a
different column for the verifier type, and either starts off supporting
multiple verifiers per role or at least gives us the ability to add that
easily later.  We should also move rolvaliduntil to that new table.

No, I am specifically *not* concerned with "backwards compatibility" of
that table- we continually add to it and change it and applications
which are so closely tied to PG that they look at pg_authid need to be
updated with nearly every release anyway.  What we *do* need to make
sure we get correct is what pg_dump/pg_upgrade do, but that's entirely
within our control to manage and shouldn't be that much of an issue to
implement.

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Bruce Momjian

Date:

14 December 2016, 21:12:05

On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
> I would so like to just drop support for plain passwords completely :) But
> there's a backwards compatibility issue to think about of course.
> 
> But -- is there any actual usecase for them anymore?

I thought we recommended 'password' for SSL connections because if you
use MD5 passwords the password text layout is known and that simplifies
cryptanalysis.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

14 December 2016, 22:34:55

On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
>On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
>> I would so like to just drop support for plain passwords completely
>:) But
>> there's a backwards compatibility issue to think about of course.
>> 
>> But -- is there any actual usecase for them anymore?
>
>I thought we recommended 'password' for SSL connections because if you
>use MD5 passwords the password text layout is known and that simplifies
>cryptanalysis.

No, that makes no sense. And whether you use 'password' or 'md5' authentication is a different question than whether
you store passwords in plaintext or as md5 hashes. Magnus was asking whether it ever makes sense to *store* passwords in
plaintext.

Since you brought it up, there is a legitimate argument to be made that 'password' authentication is more secure than
'md5', when SSL is used. Namely, if an attacker can acquire contents of pg_authid e.g. by stealing a backup tape, with
'md5' authentication he can log in as any user, using just the stolen hashes. But with 'password', he needs to reverse
the hash first. It's not a great difference, but it's something.
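
The reason the stolen hash is enough under 'md5' authentication can be shown in a few lines. This is a sketch of the MD5 exchange as described in the protocol documentation (the stored value is "md5" followed by the hex MD5 of password plus user name, and the response hashes that hex digest again with the server's 4-byte salt). The function names are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def stored_md5_verifier(password: str, username: str) -> str:
    """Value kept in pg_authid.rolpassword for an MD5-hashed password."""
    return "md5" + md5_hex((password + username).encode())


def md5_auth_response(stored: str, salt: bytes) -> str:
    """Response to an 'md5' challenge, computed from the stored hash alone."""
    inner = stored[len("md5"):]  # hex digest without the "md5" prefix
    return "md5" + md5_hex(inner.encode() + salt)


def client_response(password: str, username: str, salt: bytes) -> str:
    """What a legitimate client sends: the same value, derived from the password."""
    return md5_auth_response(stored_md5_verifier(password, username), salt)
```

Since md5_auth_response() never needs the password itself, anyone holding the stored hash can answer the challenge. With 'password' authentication the server hashes whatever the client sends, so the attacker has to recover the original password first.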

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

14 December 2016, 22:41:41

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
> >On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
> >> I would so like to just drop support for plain passwords completely
> >:) But
> >> there's a backwards compatibility issue to think about of course.
> >>
> >> But -- is there any actual usecase for them anymore?
> >
> >I thought we recommended 'password' for SSL connections because if you
> >use MD5 passwords the password text layout is known and that simplifies
> >cryptanalysis.
>
> No, that makes no sense. And whether you use 'password' or 'md5' authentication is a different question than whether
> you store passwords in plaintext or as md5 hashes. Magnus was asking whether it ever makes sense to *store* passwords in
> plaintext.

Right.

> Since you brought it up, there is a legitimate argument to be made that 'password' authentication is more secure than
> 'md5', when SSL is used. Namely, if an attacker can acquire contents of pg_authid e.g. by stealing a backup tape, with
> 'md5' authentication he can log in as any user, using just the stolen hashes. But with 'password', he needs to reverse
> the hash first. It's not a great difference, but it's something.

Tunnelled passwords which are stored as hashes are also well understood
and comparable to SSH with passwords in /etc/passwd.

Storing plaintext passwords has been bad form for just about forever and
I wouldn't be sad to see our support of it go.  At the least, as was
discussed somewhere, but I'm not sure where it ended up, we should give
administrators the ability to control what ways a password can be
stored.  In particular, once a user has migrated all of their users to
SCRAM, they should be able to say "don't let new passwords be in any
format other than SCRAM-SHA-256".

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

"Joshua D. Drake"

Date:

14 December 2016, 22:58:51

On 12/14/2016 11:41 AM, Stephen Frost wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>> On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
>>> On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:

> Storing plaintext passwords has been bad form for just about forever and
> I wouldn't be sad to see our support of it go.  At the least, as was
> discussed somewhere, but I'm not sure where it ended up, we should give
> administrators the ability to control what ways a password can be
> stored.  In particular, once a user has migrated all of their users to
> SCRAM, they should be able to say "don't let new passwords be in any
> format other than SCRAM-SHA-256".

It isn't as bad as it used to be. I remember when PASSWORD was the
default. I agree that we should be able to set a policy that says, "we 
only allow X for password storage".

JD


>
> Thanks!
>
> Stephen
>


-- 
Command Prompt, Inc.                  http://the.postgres.company/                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

15 December 2016, 04:00:23

On Wed, Dec 14, 2016 at 8:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> But, a password stored in plaintext works with either MD5 or SCRAM, or any
> future authentication mechanism. So as soon as we have SCRAM authentication,
> it becomes somewhat useful again.
>
> In a nutshell:
>
> auth / stored   MD5     SCRAM   plaintext
> -----------------------------------------
> password        Y       Y       Y
> md5             Y       N       Y
> scram           N       Y       Y
>
> If a password is stored in plaintext, it can be used with any authentication
> mechanism. And the plaintext 'password' authentication mechanism works with
> any kind of a stored password. But an MD5 hash cannot be used with SCRAM
> authentication, or vice versa.

So.. I have been thinking about this portion of the thread. And what I
find the most scary is not the fact that we use plain passwords for
SCRAM authentication, it is the fact that we would need to do a
catalog lookup earlier in the connection workflow to decide what is
the connection protocol to use depending on the username provided in
the startup packet if the pg_hba.conf entry matching the user and
database names uses "password".

And, honestly, why do we actually need a support table that broad?
SCRAM is designed to be secure, so it seems to me that it would on the
contrary be a bad idea to encourage the use of plain passwords if we
actually think that they should never be used (they are actually
useful for local development instances, not production ones). So what
I would suggest is a support table like this:
auth / stored   MD5     SCRAM   plaintext
-----------------------------------------
password        Y       Y       N
md5             Y       N       Y
scram           N       N       Y

So here is an idea for things to do now:
1) do not change the format of the existing passwords
2) do not change pg_authid
3) block access to instances if "password" or "md5" is used in
pg_hba.conf and the user has a SCRAM verifier.
4) block access if "scram" is used and if user has a plain or md5 verifier.
5) Allow access if "scram" is used and if user has a SCRAM verifier.
We had a similar discussion regarding verifier/password formats last
year but that did not end well. It would be sad to fall back again
into this discussion and get no result. If somebody wants to support
access to SCRAM with plain password entries, why not. But that would
gain a -1 from me regarding the earlier lookup of pg_authid needed to
do the decision making on the protocol to use. And I think that we
want SCRAM to be designed to be as stable and secure as possible.
-- 
Michael

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

15 December 2016, 09:17:57

On Tue, Dec 13, 2016 at 2:44 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> SASLPrep is defined here:
> https://tools.ietf.org/html/rfc4013
> And stringprep is here:
> https://tools.ietf.org/html/rfc3454
> So that's roughly applying a conversion from the mapping table, taking
> into account prohibited, bi-directional, mapping characters, etc. The
> spec says that the password should be in unicode. But we cannot be
> sure of that, right? Those mapping tables should be likely a separated
> thing.. (perl has Unicode::Stringprep::Mapping for example).

OK. I have looked at that and bumped into libidn, which offers a
couple of APIs that could be used directly for this purpose.
In particular, what caught my eye is stringprep_profile():
https://www.gnu.org/software/libidn/manual/html_node/Stringprep-Functions.html
res = stringprep_profile (input, output, "SASLprep", STRINGPREP_NO_UNASSIGNED);

libidn can be installed on Windows, and I have found packages for
cygwin, mingw, linux, freebsd and macos via brew. In the case where
libidn is not installed, I think that the safest path would be to
check if the input string has any high bits set (0x80) and bail out
because that would mean that it is a UTF-8 string that we cannot
change. Any thoughts about using libidn?

Also, after discussion with Heikki, here are the things that we need to do:
1) In libpq, we need to check if the string is valid UTF-8. If it is,
apply SASLprep. If not, copy the string as-is. We could error out as
well in this case... Perhaps a WARNING would be more appropriate.
That's the trickiest case, and if the client does not use UTF-8 that
may lead to unexpected behavior.
2) In server, when the password verifier is created. If
client_encoding is utf-8, but not server_encoding, convert the
password to utf-8 and build the verifier after applying SASLprep.

In the case where the binaries are *not* built with libidn, I think
that we had better reject valid UTF-8 strings directly and just allow
ASCII? SASLprep is a no-op on ASCII characters.
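
The decision flow described above can be sketched roughly as follows. This is an illustration only: the saslprep argument is a hypothetical callable standing in for whatever libidn would provide, and passing None models a build without libidn.

```python
def is_ascii(raw: bytes) -> bool:
    """True if no byte has its high bit (0x80) set."""
    return all(b < 0x80 for b in raw)


def is_valid_utf8(raw: bytes) -> bool:
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def prepare_password(raw: bytes, saslprep=None) -> bytes:
    """Return the bytes to feed into SCRAM.

    saslprep: optional callable str -> str (hypothetical stand-in).
    """
    if is_ascii(raw):
        # Treated as a no-op here, as the message above assumes.
        return raw
    if saslprep is None:
        # No SASLprep available: only ASCII passwords are accepted.
        raise ValueError("non-ASCII password but SASLprep is not available")
    if is_valid_utf8(raw):
        return saslprep(raw.decode("utf-8")).encode("utf-8")
    # Not valid UTF-8: copy the string as-is.
    return raw
```

Whether the last branch should copy the bytes, warn, or raise an error is exactly the open question in point 1.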

Thoughts about this approach?
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

15 December 2016, 12:57:41

On 12/14/2016 04:57 PM, Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 12/14/16 5:15 AM, Michael Paquier wrote:
>>> I would be tempted to suggest adding the verifier type as a new column
>>> of pg_authid
>>
>> Yes please.
>
> This discussion seems to continue to come up and I don't entirely
> understand why we keep trying to shove more things into pg_authid, or
> worse, into rolpassword.

I understand the relational beauty of having a separate column for the 
verifier type, but I don't think it would be practical. For starters, 
we'd still like to have a self-identifying string format like 
"scram-sha-256:<stuff>", so that you can conveniently pass the verifier 
as a string to CREATE USER. I think it'll be much better to stick to one 
format, than try to split the verifier into type and the string, when it 
enters the catalog table.

> We should have an independent table for the verifiers, which has a
> different column for the verifier type, and either starts off supporting
> multiple verifiers per role or at least gives us the ability to add that
> easily later.  We should also move rolvaliduntil to that new table.

I agree we'll probably need a new table for verifiers. Or turn 
rolpassword into an array or something. We discussed that before, 
however, and it didn't really go anywhere, so right now I'd like to get 
SCRAM in with minimal changes to the rest of the system. There is a lot 
of room for improvement once it's in.

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

15 December 2016, 15:48:50

On 12/15/2016 03:00 AM, Michael Paquier wrote:
> On Wed, Dec 14, 2016 at 8:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> But, a password stored in plaintext works with either MD5 or SCRAM, or any
>> future authentication mechanism. So as soon as we have SCRAM authentication,
>> it becomes somewhat useful again.
>>
>> In a nutshell:
>>
>> auth / stored   MD5     SCRAM   plaintext
>> -----------------------------------------
>> password        Y       Y       Y
>> md5             Y       N       Y
>> scram           N       Y       Y
>>
>> If a password is stored in plaintext, it can be used with any authentication
>> mechanism. And the plaintext 'password' authentication mechanism works with
>> any kind of a stored password. But an MD5 hash cannot be used with SCRAM
>> authentication, or vice versa.
>
> So.. I have been thinking about this portion of the thread. And what I
> find the most scary is not the fact that we use plain passwords for
> SCRAM authentication, it is the fact that we would need to do a
> catalog lookup earlier in the connection workflow to decide what is
> the connection protocol to use depending on the username provided in
> the startup packet if the pg_hba.conf entry matching the user and
> database names uses "password".

I don't see why we would need to do a catalog lookup any earlier. With 
"password" authentication, the server can simply request the client to 
send its password. When it receives it, it performs the catalog lookup 
to get pg_authid.rolpassword. If it's in plaintext, just compare it, if 
it's an MD5 hash, hash the client's password and compare, and if it's a 
SCRAM verifier, build a verifier with the same salt and iteration count 
and compare.
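
As a rough sketch of that check, assuming the kind of stored password has already been determined: the StoredKey derivation follows RFC 5802, while the kind labels and the tuple layout used for the SCRAM verifier are assumptions made for this illustration, not the catalog format. SASLprep normalization is left out.

```python
import hashlib
import hmac


def scram_stored_key(password: str, salt: bytes, iterations: int) -> bytes:
    """StoredKey as defined by RFC 5802, using SHA-256 (no SASLprep here)."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    return hashlib.sha256(client_key).digest()


def check_password(kind: str, stored, username: str, supplied: str) -> bool:
    """Verify a password sent in the clear against what is stored.

    kind and the layout of `stored` are illustrative only.
    """
    if kind == "plaintext":
        return hmac.compare_digest(stored.encode(), supplied.encode())
    if kind == "md5":
        digest = hashlib.md5((supplied + username).encode()).hexdigest()
        return hmac.compare_digest(stored.encode(), ("md5" + digest).encode())
    if kind == "scram":
        # Rebuild the verifier with the stored salt and iteration count.
        salt, iterations, stored_key = stored
        candidate = scram_stored_key(supplied, salt, iterations)
        return hmac.compare_digest(stored_key, candidate)
    return False
```

In every branch the catalog lookup happens after the client has sent its password, which is the point being made: nothing about the stored format has to be known before the password is requested.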

> And, honestly, why do we actually need a support table that broad?
> SCRAM is designed to be secure, so it seems to me that it would on the
> contrary be a bad idea to encourage the use of plain passwords if we
> actually think that they should never be used (they are actually
> useful for local development instances, not production ones).

I agree we should not encourage bad password practices. But as long as 
we support passwords to be stored in plaintext at all, it makes no sense 
to not allow them to be used with SCRAM. The fact that you can use a 
password stored in plaintext with both MD5 and SCRAM is literally the 
only reason you would store a password in plaintext, so if we don't want 
to allow that, we should disallow storing passwords in plaintext altogether.

> So what I would suggest would be to have a support table like
> that:
> auth / stored   MD5     SCRAM   plaintext
> -----------------------------------------
> password        Y       Y       N
> md5             Y       N       Y
> scram           N       N       Y

I was using 'Y' to indicate that the combination works, and 'N' to 
indicate that it does not. Assuming you're using the same notation, the 
above doesn't make any sense.

> So here is an idea for things to do now:
> 1) do not change the format of the existing passwords
> 2) do not change pg_authid
> 3) block access to instances if "password" or "md5" is used in
> pg_hba.conf and the user has a SCRAM verifier.
> 4) block access if "scram" is used and if user has a plain or md5 verifier.
> 5) Allow access if "scram" is used and if user has a SCRAM verifier.
> We had a similar discussion regarding verifier/password formats last
> year but that did not end well. It would be sad to fall back again
> into this discussion and get no result. If somebody wants to support
> access to SCRAM with plain password entries, why not. But that would
> gain a -1 from me regarding the earlier lookup of pg_authid needed to
> do the decision making on the protocol to use. And I think that we
> want SCRAM to be designed to be as stable and secure as possible.

The bottom line is that at the moment, when plaintext passwords are 
stored as is, without any indicator that it's a plaintext password, it's 
ambiguous whether a password is a SCRAM verifier, or if it's a plaintext 
password that just happens to begin with the word "scram:". That is 
completely unrelated to which combinations of stored passwords and 
authentication mechanisms we actually support or allow to work.

The only way to distinguish, is to know about every verifier kind there 
is, and check whether rolpassword looks valid as anything else than a 
plaintext password. And we already got tripped by a bug-of-omission on 
that once. If we add more verifier formats in the future, it's bound to 
happen again. Let's nip that source of bugs in the bud. Attached is a 
patch to implement what I have in mind.
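
The ambiguity can be illustrated with two classifiers, one guessing by elimination and one relying on a "plain:" prefix. This is a sketch only and not the attached patch: the function names are hypothetical, and the MD5 pattern (prefix plus 32 hex digits) is an assumption about what a length-checked MD5 test might look like.

```python
import re

MD5_RE = re.compile(r"md5[0-9a-f]{32}")


def classify_legacy(rolpassword: str) -> str:
    """Guess the kind when plaintext passwords carry no marker."""
    if MD5_RE.fullmatch(rolpassword):
        return "md5"
    if rolpassword.startswith("scram-sha-256:"):
        return "scram"
    return "plaintext"  # everything else, by elimination


def classify_prefixed(rolpassword: str) -> str:
    """Classify when plaintext passwords are stored as 'plain:<password>'."""
    if rolpassword.startswith("plain:"):
        return "plaintext"
    if MD5_RE.fullmatch(rolpassword):
        return "md5"
    if rolpassword.startswith("scram-sha-256:"):
        return "scram"
    return "invalid"
```

With the first function, a user whose plaintext password happens to be "scram-sha-256:hunter2" is misread as having a SCRAM verifier, and every new verifier format adds another such case. With the second, the same password is stored as "plain:scram-sha-256:hunter2" and is classified correctly.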

Alternatively, you could argue that we should forbid storing passwords 
in plaintext altogether. I'm OK with that, too, if that's what people 
prefer. Then you cannot have a user that can log in with both MD5 and 
SCRAM authentication, but it's certainly more secure, and it's easier to 
document.

- Heikki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

0001-Use-plain-prefix-for-plaintext-passwords-stored-in-p.patch

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

15 December 2016, 16:40:30

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 12/14/2016 04:57 PM, Stephen Frost wrote:
> >* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> >>On 12/14/16 5:15 AM, Michael Paquier wrote:
> >>>I would be tempted to suggest adding the verifier type as a new column
> >>>of pg_authid
> >>
> >>Yes please.
> >
> >This discussion seems to continue to come up and I don't entirely
> >understand why we keep trying to shove more things into pg_authid, or
> >worse, into rolpassword.
>
> I understand the relational beauty of having a separate column for
> the verifier type, but I don't think it would be practical.

I disagree.

> For
> starters, we'd still like to have a self-identifying string format
> like "scram-sha-256:<stuff>", so that you can conveniently pass the
> verifier as a string to CREATE USER.

I don't follow why we can't change the syntax for CREATE USER to allow
specifying the verifier type independently.  Generally speaking, I don't
expect *users* to be providing actual encoded *verifiers* very often, so
it seems like a bit of extra syntax that pg_dump has to use isn't that
big of a deal.

> I think it'll be much better to
> stick to one format, than try to split the verifier into type and
> the string, when it enters the catalog table.

Apparently, multiple people disagree with this approach.  I don't think
history is really on your side here either.

> >We should have an independent table for the verifiers, which has a
> >different column for the verifier type, and either starts off supporting
> >multiple verifiers per role or at least gives us the ability to add that
> >easily later.  We should also move rolvaliduntil to that new table.
>
> I agree we'll probably need a new table for verifiers. Or turn
> rolpassword into an array or something. We discussed that before,
> however, and it didn't really go anywhere, so right now I'd like to
> get SCRAM in with minimal changes to the rest of the system. There
> is a lot of room for improvement once it's in.

Using an array strikes me as an absolutely terrible idea- how are you
going to handle having different valid_until times then?

I do agree with trying to get SCRAM in without changing too much of the
rest of the system, but I wanted to make it clear that it's the only
point that I agree with for continuing down this path and that we should
absolutely be looking to change the CREATE USER syntax to specify the
verifier independently, plan to use a different table for the verifiers
with an independent column for the verifier type, support multiple
verifiers per role, etc, in the (hopefully very near...) future.

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

16 December 2016, 04:31:14


On Thu, Dec 15, 2016 at 9:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The only way to distinguish, is to know about every verifier kind there is,
> and check whether rolpassword looks valid as anything else than a plaintext
> password. And we already got tripped by a bug-of-omission on that once. If
> we add more verifier formats in the future, it's bound to happen again.
> Let's nip that source of bugs in the bud. Attached is a patch to implement
> what I have in mind.

OK, I had a look at the patch proposed.

-    if (!pg_md5_encrypt(username, username, namelen, encrypted))
-        elog(ERROR, "password encryption failed");
-    if (strcmp(password, encrypted) == 0)
-        ereport(ERROR,
-                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-                 errmsg("password must not contain user name")));

This patch removes the only check on MD5 hashes that has ever been done
in passwordcheck. It may be fine to remove it, but I would think that it
is a good example, though a limited one, of what can be done with MD5
hashes. So it seems to me that this check should still run pg_md5_encrypt
on the username and compare the result with the MD5 hash given by the
caller. The new code is careful to pass down a plain password, but it is
possible to load MD5 hashes directly as well, as happens with pg_dumpall
output.
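
For reference, the check being removed amounts to the following, sketched here in Python and not in the C of passwordcheck. The helper names are hypothetical; the hash layout ("md5" plus the hex MD5 of password concatenated with user name) is what pg_md5_encrypt produces when the user name is passed as the salt.

```python
import hashlib


def md5_verifier(password: str, username: str) -> str:
    """'md5' + hex MD5 of the password concatenated with the user name."""
    return "md5" + hashlib.md5((password + username).encode()).hexdigest()


def hash_is_username(md5_hash: str, username: str) -> bool:
    """True if the supplied MD5 hash was built from a password equal to the user name."""
    return md5_hash == md5_verifier(username, username)
```

This is about the only thing that can be tested on an already-hashed password, since the plaintext is not available to the check.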

A simple ALTER USER role PASSWORD 'foo' causes a crash:
#0  0x00000000004764d7 in heap_compute_data_size (tupleDesc=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:106
106                VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
(gdb) bt
#0  0x00000000004764d7 in heap_compute_data_size (tupleDesc=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:106
#1  0x00000000004781e9 in heap_form_tuple (tupleDescriptor=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:736
#2  0x00000000004784d0 in heap_modify_tuple (tuple=0x277adc8, tupleDesc=0x277f090, replValues=0x7fff1369d030, replIsnull=0x7fff1369d020 "", doReplace=0x7fff1369d010 "") at heaptuple.c:833
#3  0x0000000000673788 in AlterRole (stmt=0x27a4f78) at user.c:845
#4  0x000000000082aa49 in standard_ProcessUtility (parsetree=0x27a4f78, queryString=0x27a43e8 "alter role ioltas password 'toto';", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x27a5300, completionTag=0x7fff1369d5b0 "") at utility.c:711
 

+        case PASSWORD_TYPE_PLAINTEXT:
+            shadow_pass = &shadow_pass[strlen("plain:")];
+            break;
It would be a good idea to have a generic routine to get the plain
password value. In short, I think that we should reduce the number of
places where the "plain:" prefix is hardcoded.
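
Something along these lines, shown in Python only to keep it short. The names PLAIN_PREFIX and get_plain_password are hypothetical and not taken from the patch.

```python
PLAIN_PREFIX = "plain:"  # defined once, so callers never spell out the prefix


def get_plain_password(shadow_pass: str):
    """Return the password without its prefix, or None if it is not a plaintext entry."""
    if shadow_pass.startswith(PLAIN_PREFIX):
        return shadow_pass[len(PLAIN_PREFIX):]
    return None
```

Callers would then go through this one routine in place of indexing with strlen("plain:") in several places.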

> Alternatively, you could argue that we should forbid storing passwords in
> plaintext altogether. I'm OK with that, too, if that's what people prefer.
> Then you cannot have a user that can log in with both MD5 and SCRAM
> authentication, but it's certainly more secure, and it's easier to document.

In the end this may prove to be a bad idea for some developers. In local
deployments when working on a backend application with Postgres as backend,
it is actually useful to have plain passwords. At least I have found that
useful in some stuff I did many years ago.
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Robert Haas

Date:

16 December 2016, 18:48:20

On Thu, Dec 15, 2016 at 8:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>> On 12/14/2016 04:57 PM, Stephen Frost wrote:
>> >* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> >>On 12/14/16 5:15 AM, Michael Paquier wrote:
>> >>>I would be tempted to suggest adding the verifier type as a new column
>> >>>of pg_authid
>> >>
>> >>Yes please.
>> >
>> >This discussion seems to continue to come up and I don't entirely
>> >understand why we keep trying to shove more things into pg_authid, or
>> >worse, into rolpassword.
>>
>> I understand the relational beauty of having a separate column for
>> the verifier type, but I don't think it would be practical.
>
> I disagree.

Me, too.  I think the idea of moving everything into a separate table
that allows multiple verifiers is probably not a good thing to do just
right now, because that introduces a bunch of additional issues above
and beyond what we need to do to get SCRAM implemented.  There are
administration and policy decisions to be made there that we should
not conflate with SCRAM proper.

However, Heikki's proposal seems to be that it's reasonable to force
rolpassword to be of the form 'type:verifier' in all cases but not
reasonable to have separate columns for type and verifier.  Eh?

>> For
>> starters, we'd still like to have a self-identifying string format
>> like "scram-sha-256:<stuff>", so that you can conveniently pass the
>> verifier as a string to CREATE USER.
>
> I don't follow why we can't change the syntax for CREATE USER to allow
> specifying the verifier type independently.  Generally speaking, I don't
> expect *users* to be providing actual encoded *verifiers* very often, so
> it seems like a bit of extra syntax that pg_dump has to use isn't that
> big of a deal.

We don't have to change the CREATE USER syntax at all.  It could just
split on the first colon and put the two halves of the string in
different places.  Of course, changing the syntax might be a good idea
anyway -- or not --- but the point is, right now, when you look at
rolpassword, there's not a clear rule for what kind of thing you've
got in there.  That's absolutely terrible design and has got to be
fixed.  Heikki's proposal of prefixing every entry with a type and a
':' will solve that problem and I'm not going to roll over in my grave
if we do it that way, but there is such a thing as normalization and
that technique could be applied here.
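
The "split on the first colon" idea is small enough to sketch. The function name is hypothetical, and this ignores existing MD5 hashes, which have no colon and would need their own case.

```python
def split_verifier(value: str):
    """Split 'type:verifier' on the first colon only; None if there is no colon."""
    kind, sep, rest = value.partition(":")
    if not sep:
        return None
    return kind, rest
```

Splitting only on the first colon keeps a password such as "plain:of:jars" intact as ("plain", "of:jars"), so the two halves could go into separate columns without changing what CREATE USER accepts.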

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Peter Eisentraut

Date:

16 December 2016, 23:40:55

On 12/15/16 8:40 AM, Stephen Frost wrote:
> I don't follow why we can't change the syntax for CREATE USER to allow
> specifying the verifier type independently.

That's what the last patch set I looked at actually does.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

16 December 2016, 23:42:44

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 12/15/16 8:40 AM, Stephen Frost wrote:
> > I don't follow why we can't change the syntax for CREATE USER to allow
> > specifying the verifier type independently.
>
> That's what the last patch set I looked at actually does.

Well, same here, but it was quite a while ago and things have progressed
since then wrt SCRAM, as I understand it...

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

17 December 2016, 01:30:30

On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 12/15/16 8:40 AM, Stephen Frost wrote:
>> > I don't follow why we can't change the syntax for CREATE USER to allow
>> > specifying the verifier type independently.
>>
>> That's what the last patch set I looked at actually does.
>
> Well, same here, but it was quite a while ago and things have progressed
> since then wrt SCRAM, as I understand it...

From the discussions of last year on -hackers, it was decided to *not*
have an additional column, per complaints from a couple of hackers
(Robert, you were in this set at this point), and the same thing was
concluded during the informal lunch meeting at PGCon. The point is,
the existing SCRAM patch set can survive without touching the format
of pg_authid at *all*. We could block SCRAM authentication when
"password" is used in pg_hba.conf, as well as when "scram" is used
with a plain password stored in pg_authid. Or we could look at the
format of the string in the catalog if "password" is defined and
decide the authentication protocol to follow based on that.
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

17 December 2016, 04:23:22

Michael,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> >> On 12/15/16 8:40 AM, Stephen Frost wrote:
> >> > I don't follow why we can't change the syntax for CREATE USER to allow
> >> > specifying the verifier type independently.
> >>
> >> That's what the last patch set I looked at actually does.
> >
> > Well, same here, but it was quite a while ago and things have progressed
> > since then wrt SCRAM, as I understand it...
>
> From the discussions of last year on -hackers, it was decided to *not*
> have an additional column per complaints from a couple of hackers

It seems that, at best, we didn't have consensus on it.  Hopefully we
are moving in a direction of consensus.

> (Robert you were in this set at this point), and the same thing was
> concluded during the informal lunch meeting at PGcon. The point is,
> the existing SCRAM patch set can survive without touching at *all* the
> format of pg_authid. We could block SCRAM authentication when
> "password" is used in pg_hba.conf and as well as when "scram" is used
> with a plain password stored in pg_authid. Or look at the format of
> the string in the catalog if "password" is defined and decide the
> authentication protocol to follow based on that.

As I mentioned up-thread, moving forward with minimal changes to get
SCRAM in certainly makes sense, but I do think we should be open to
(and, ideally, encouraging people to work towards) having a separate
table for verifiers with independent columns for type and verifier.

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

17 December 2016, 06:48:30

On Sat, Dec 17, 2016 at 10:23 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> (Robert you were in this set at this point), and the same thing was
>> concluded during the informal lunch meeting at PGcon. The point is,
>> the existing SCRAM patch set can survive without touching at *all* the
>> format of pg_authid. We could block SCRAM authentication when
>> "password" is used in pg_hba.conf, as well as when "scram" is used
>> with a plain password stored in pg_authid. Or look at the format of
>> the string in the catalog if "password" is defined and decide the
>> authentication protocol to follow based on that.
>
> As I mentioned up-thread, moving forward with minimal changes to get
> SCRAM in certainly makes sense, but I do think we should be open to
> (and, ideally, encouraging people to work towards) having a separate
> table for verifiers with independent columns for type and verifier.

Definitely, and you know my position on the matter or I would not have
written last year's patch series. Both things are just orthogonal IMO
at this point. And it would be good to focus just on one problem at
the moment to get it out.
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Robert Haas

Date:

17 December 2016, 21:59:41

On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>> On 12/15/16 8:40 AM, Stephen Frost wrote:
>>> > I don't follow why we can't change the syntax for CREATE USER to allow
>>> > specifying the verifier type independently.
>>>
>>> That's what the last patch set I looked at actually does.
>>
>> Well, same here, but it was quite a while ago and things have progressed
>> since then wrt SCRAM, as I understand it...
>
> From the discussions of last year on -hackers, it was decided to *not*
> have an additional column per complaints from a couple of hackers
> (Robert you were in this set at this point), ...

Hmm, I don't recall taking that position, but then there are a lot of
things that I ought to recall and don't.  (Ask my wife!)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

18 December 2016, 01:48:48

On Sun, Dec 18, 2016 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> From the discussions of last year on -hackers, it was decided to *not*
>> have an additional column per complaints from a couple of hackers
>> (Robert you were in this set at this point), ...
>
> Hmm, I don't recall taking that position, but then there are a lot of
> things that I ought to recall and don't.  (Ask my wife!)

[... digging objects of the past ...]
From the past thread:
https://www.postgresql.org/message-id/CA+TgmoY790rphHBogXMbTG6MzSeNdoxdBXebEkAet9ZpZ8gvtw@mail.gmail.com
The complaint is directed at multiple verifiers per user, though, not
at having the type in a separate column.
-- 
Michael

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

20 December 2016, 04:47:06

On Thu, Dec 15, 2016 at 3:17 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> In the case where the binaries are *not* built with libidn, I think
> that we had better reject valid UTF-8 string directly and just allow
> ASCII? SASLprep is a no-op on ASCII characters.
>
> Thoughts about this approach?

Heikki has mentioned to me that he'd prefer not having an extra
dependency for the normalization; libidn is LGPL-licensed, by the way.
So I have looked at the SASLprep business to see what should be done
to get a complete implementation in core, independent of any existing
library.

The first thing is to be able to tell, in the SCRAM code, whether a
string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
offers a set of routines exactly for this purpose; it is built with
libpq but is not available to src/common/. So instead of moving
the whole file, I'd like to create a new file, src/common/utf8.c, which
includes pg_utf_mblen() and pg_utf8_islegal(). On top of that, I think
that a routine able to check a full string would be useful for
many users, as pg_utf8_islegal() can only check one multibyte sequence
at a time. If the password string is found to be valid UTF-8, SASLprep
is applied. If not, the string is copied as-is, with perhaps unexpected
effects for the client. But the client is in trouble already if it is
not using UTF-8.
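
For illustration only, a full-string check could look like the sketch
below. It is self-contained rather than built on pg_utf_mblen() and
pg_utf8_islegal(), and the function name utf8_string_is_valid() is
invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustrative stand-in for a "check a full string" routine; the name is
 * made up.  It walks the buffer one multibyte sequence at a time and
 * applies the RFC 3629 rules: no overlong forms, no surrogates
 * (U+D800..U+DFFF), nothing above U+10FFFF.
 */
bool
utf8_string_is_valid(const unsigned char *s, size_t len)
{
	size_t		i = 0;

	while (i < len)
	{
		unsigned char c = s[i];
		size_t		n;			/* number of continuation bytes */
		unsigned long cp;		/* decoded code point */
		unsigned long min;		/* smallest code point for this length */

		if (c < 0x80)
		{
			i++;				/* plain ASCII */
			continue;
		}
		else if ((c & 0xE0) == 0xC0)
		{
			n = 1; cp = c & 0x1F; min = 0x80;
		}
		else if ((c & 0xF0) == 0xE0)
		{
			n = 2; cp = c & 0x0F; min = 0x800;
		}
		else if ((c & 0xF8) == 0xF0)
		{
			n = 3; cp = c & 0x07; min = 0x10000;
		}
		else
			return false;		/* stray continuation or invalid lead byte */

		if (len - i <= n)
			return false;		/* sequence truncated */

		for (size_t k = 1; k <= n; k++)
		{
			if ((s[i + k] & 0xC0) != 0x80)
				return false;	/* not a continuation byte */
			cp = (cp << 6) | (s[i + k] & 0x3F);
		}

		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return false;

		i += n + 1;
	}
	return true;
}
```

A caller would run this once on the password and apply SASLprep only
when it returns true, falling back to copying the bytes as-is otherwise.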

Then comes the real business... Note that this is my first time
touching encodings, particularly UTF-8, in depth, so please be nice. I
may write things that are incorrect, or that sound so, from here on :)

The second thing is the normalization itself. Per RFC4013, NFKC needs
to be applied to the string.  The operation is described completely in
[1], and it consists of 1) a compatibility decomposition of the
characters of the string, followed by 2) a canonical composition.

About 1). The compatibility decomposition is defined in [2] as
"recursively applying the canonical and compatibility mappings, then
applying the canonical reordering algorithm". The canonical and
compatibility mappings are data available in UnicodeData.txt, in the
6th column of the set defined in [3] to be precise. The meaning of the
decomposition mappings is defined in [2] as well. The canonical
decomposition basically consists of looking up a given UTF-8
character and replacing it with the sequence of characters given by
its mapping. The compatibility mapping should be applied as well, but
[5], a perl tool called charlint.pl that does this normalization work,
does not care about this phase... Do we?
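
To make the decomposition step concrete, here is a toy sketch. It
assumes the input has already been decoded to code points, uses a
three-row hand-written table instead of data generated from
UnicodeData.txt, and ignores the algorithmic Hangul decomposition. All
names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Tiny hand-written sample of decomposition data (invented layout). */
typedef struct
{
	uint32_t	cp;
	uint32_t	first;
	uint32_t	second;			/* 0 when the mapping has one code point */
} sample_mapping;

static const sample_mapping sample_map[] = {
	{0x00DC, 0x0055, 0x0308},	/* U with diaeresis */
	{0x01D5, 0x00DC, 0x0304},	/* U with diaeresis and macron */
	{0xFB01, 0x0066, 0x0069},	/* fi ligature, a <compat> mapping */
};

/* Combining classes of the few marks used in this sample. */
static int
sample_ccc(uint32_t cp)
{
	switch (cp)
	{
		case 0x0301: return 230;	/* acute */
		case 0x0304: return 230;	/* macron */
		case 0x0308: return 230;	/* diaeresis */
		case 0x0323: return 220;	/* dot below */
		default: return 0;			/* starter */
	}
}

/* Recursively expand one code point into out[]; returns the new length. */
static size_t
decompose_one(uint32_t cp, uint32_t *out, size_t n)
{
	for (size_t i = 0; i < sizeof(sample_map) / sizeof(sample_map[0]); i++)
	{
		if (sample_map[i].cp == cp)
		{
			n = decompose_one(sample_map[i].first, out, n);
			if (sample_map[i].second != 0)
				n = decompose_one(sample_map[i].second, out, n);
			return n;
		}
	}
	out[n++] = cp;				/* no mapping: keep as-is */
	return n;
}

/*
 * Compatibility decomposition of in[0..len-1] into out[], followed by
 * canonical reordering: a stable sort of each run of non-starters by
 * combining class.  The caller provides a large enough out[].
 */
size_t
decompose_string(const uint32_t *in, size_t len, uint32_t *out)
{
	size_t		n = 0;

	for (size_t i = 0; i < len; i++)
		n = decompose_one(in[i], out, n);

	/* insertion sort that never moves a mark across a starter (ccc 0) */
	for (size_t i = 1; i < n; i++)
	{
		uint32_t	cp = out[i];
		int			cc = sample_ccc(cp);
		size_t		j = i;

		if (cc == 0)
			continue;
		while (j > 0 && sample_ccc(out[j - 1]) > cc)
		{
			out[j] = out[j - 1];
			j--;
		}
		out[j] = cp;
	}
	return n;
}
```

With this table, U+01D5 expands in two recursive steps to U+0055 U+0308
U+0304, and the sequence U+0061 U+0301 U+0323 is reordered to U+0061
U+0323 U+0301 because class 220 sorts before class 230.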

About 2)... Once the decomposition has been applied, those characters
need to be recomposed using the Canonical_Combining_Class field of
UnicodeData.txt in [3], which is the 4th column of the set (field 3
when counting from 0). Its values are defined in [4]. Another
interesting thing: charlint.pl [5] does not care about this phase
either. I am wondering if we should not just drop this part as well...

Once 1) and 2) are done, NFKC is complete, and so is SASLprep.

So what we need on the Postgres side is a mapping table having the
following fields:
1) Hexadecimal sequence of the UTF-8 character.
2) Its canonical combining class.
3) The kind of decomposition mapping, if defined.
4) The decomposition mapping, in hexadecimal format.
Based on what I looked at, either perl or python could be used to
process UnicodeData.txt and to generate a header file that would be
included in the tree. There are 30k entries in UnicodeData.txt, and 5k
of them have a mapping, so that will result in many table entries. One
way to improve performance would be to store the length of the table in
a static variable, order the entries by their hexadecimal keys and do a
binary search (dichotomy) lookup to find an entry. We could as well
use fancier things, like a set of tables forming a radix tree keyed on
the decomposed bytes. In any case, we should end up doing just one
lookup of the table for each character.
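
A rough sketch of such a table and its dichotomy lookup follows. The
struct layout and names are assumptions made for this example, the rows
are typed by hand rather than generated, and the key is the code point
instead of the raw UTF-8 byte sequence to keep the comparison simple.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout of one generated entry; field names are invented. */
typedef struct
{
	uint32_t	codepoint;		/* key, table sorted ascending on it */
	uint8_t		comb_class;		/* Canonical_Combining_Class */
	uint8_t		is_compat;		/* 1 if the mapping is a compatibility one */
	uint8_t		dec_size;		/* number of code points in the mapping */
	uint32_t	decomp[2];		/* decomposition mapping, if any */
} decomp_entry;

/* A few hand-written rows standing in for the generated header. */
static const decomp_entry decomp_table[] = {
	{0x00C5, 0, 0, 2, {0x0041, 0x030A}},	/* A with ring above */
	{0x00E9, 0, 0, 2, {0x0065, 0x0301}},	/* e with acute */
	{0x0301, 230, 0, 0, {0, 0}},			/* combining acute accent */
	{0x2126, 0, 0, 1, {0x03A9, 0}},			/* ohm sign -> omega */
	{0xFB01, 0, 1, 2, {0x0066, 0x0069}},	/* fi ligature, <compat> */
};

/* bsearch() comparator: key is a code point, elem a table entry. */
static int
cmp_entry(const void *key, const void *elem)
{
	uint32_t	cp = *(const uint32_t *) key;
	const decomp_entry *e = (const decomp_entry *) elem;

	if (cp < e->codepoint)
		return -1;
	if (cp > e->codepoint)
		return 1;
	return 0;
}

/* One binary search per character; NULL means "no entry". */
const decomp_entry *
lookup_decomp(uint32_t cp)
{
	return bsearch(&cp, decomp_table,
				   sizeof(decomp_table) / sizeof(decomp_table[0]),
				   sizeof(decomp_entry), cmp_entry);
}
```

With roughly 5k mapped entries, a binary search needs about 13
comparisons per character, which is likely good enough for
password-length strings; a radix tree would mostly save on table size.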

In conclusion, at this point I am looking for feedback regarding the
following items:
1) Where to put the UTF8 check routines and what to move.
2) How to generate the mapping table using UnicodeData.txt. I'd think
that using perl would be better.
3) The shape of the mapping table, which depends on how many
operations we want to support in the normalization of the strings.
The decisions for those items will drive the implementation one way or
another.

[1]: http://www.unicode.org/reports/tr15/#Description_Norm
[2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings
[3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
[4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values
[5]: https://www.w3.org/International/charlint/

Heikki, others, thoughts?
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Robert Haas

Date:

20 December 2016, 06:47:20

On Sat, Dec 17, 2016 at 5:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sun, Dec 18, 2016 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> From the discussions of last year on -hackers, it was decided to *not*
>>> have an additional column per complaints from a couple of hackers
>>> (Robert you were in this set at this point), ...
>>
>> Hmm, I don't recall taking that position, but then there are a lot of
>> things that I ought to recall and don't.  (Ask my wife!)
>
> [... digging objects of the past ...]
> From the past thread:
> https://www.postgresql.org/message-id/CA+TgmoY790rphHBogXMbTG6MzSeNdoxdBXebEkAet9ZpZ8gvtw@mail.gmail.com
> The complaint is directed at multiple verifiers per user, though, not
> at having the type in a separate column.

Yes, I rather like the separate column.  But since Heikki is doing the
work (or if he is) I'm not going to gripe too much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

20 December 2016, 14:37:35

On 12/16/2016 05:48 PM, Robert Haas wrote:
> On Thu, Dec 15, 2016 at 8:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>>> On 12/14/2016 04:57 PM, Stephen Frost wrote:
>>>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>>>> On 12/14/16 5:15 AM, Michael Paquier wrote:
>>>>>> I would be tempted to suggest adding the verifier type as a new column
>>>>>> of pg_authid
>>>>>
>>>>> Yes please.
>>>>
>>>> This discussion seems to continue to come up and I don't entirely
>>>> understand why we keep trying to shove more things into pg_authid, or
>>>> worse, into rolpassword.
>>>
>>> I understand the relational beauty of having a separate column for
>>> the verifier type, but I don't think it would be practical.
>>
>> I disagree.
>
> Me, too.  I think the idea of moving everything into a separate table
> that allows multiple verifiers is probably not a good thing to do just
> right now, because that introduces a bunch of additional issues above
> and beyond what we need to do to get SCRAM implemented.  There are
> administration and policy decisions to be made there that we should
> not conflate with SCRAM proper.
>
> However, Heikki's proposal seems to be that it's reasonable to force
> rolpassword to be of the form 'type:verifier' in all cases but not
> reasonable to have separate columns for type and verifier.  Eh?

I fear we'll just have to agree to disagree here, but I'll try to 
explain myself one more time.

Even if you have a separate "verifier type" column, it's not fully 
normalized, because there's still a dependency between the verifier and 
verifier type columns. You will always need to look at the verifier type 
to make sense of the verifier itself.

It's more convenient to carry the type information with the verifier 
itself, in backend code, in pg_dump, etc. Sure, you could have a 
separate "transfer" text format that has the prefix, and strip it out 
when the datum enters the system. But it is even simpler to have only 
one format, with the prefix, and use that everywhere.

It might make sense to add a separate column, e.g. to make it easier to
query for users that have an MD5 verifier. You could do "WHERE
rolverifiertype = 'md5'", instead of "WHERE rolpassword LIKE 'md5%'". 
It's not a big difference, though. But even if we did that, I would 
still love to have the type information *also* included with the 
verifier itself, for convenience. And if we include it in the verifier 
itself, adding a separate type column seems more trouble than it's worth.

For comparison, imagine that we added a column to pg_authid for a 
picture of the user, stored as a bytea. The picture can be in JPEG or 
PNG format. Looking at the first few bytes of the image, you can tell 
which one it is. Would it make sense to add a separate "type" column, to 
tell what format the image is in? I think it would be more convenient 
and robust to rely on the first bytes of the image data instead.
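
To make the analogy concrete, this is roughly what such sniffing looks
like in C. The enum and function names are invented for the example;
the signatures themselves are the standard ones (the 8-byte PNG
signature, and the JPEG SOI marker FF D8 followed by FF).

```c
#include <stddef.h>
#include <string.h>

typedef enum
{
	IMG_UNKNOWN,
	IMG_JPEG,
	IMG_PNG
} image_type;

/* Guess the image format from its leading "magic" bytes. */
image_type
sniff_image_type(const unsigned char *data, size_t len)
{
	static const unsigned char png_sig[8] =
		{0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

	if (len >= sizeof(png_sig) &&
		memcmp(data, png_sig, sizeof(png_sig)) == 0)
		return IMG_PNG;

	/* JPEG streams start with the SOI marker FF D8, then another FF */
	if (len >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF)
		return IMG_JPEG;

	return IMG_UNKNOWN;
}
```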

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

20 December 2016, 15:23:51

On 12/16/2016 03:31 AM, Michael Paquier wrote:
> On Thu, Dec 15, 2016 at 9:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> The only way to distinguish, is to know about every verifier kind there is,
>> and check whether rolpassword looks valid as anything else than a plaintext
>> password. And we already got tripped by a bug-of-omission on that once. If
>> we add more verifier formats in the future, it's bound to happen again.
>> Let's nip that source of bugs in the bud. Attached is a patch to implement
>> what I have in mind.
>
> OK, I had a look at the patch proposed.
>
> -    if (!pg_md5_encrypt(username, username, namelen, encrypted))
> -        elog(ERROR, "password encryption failed");
> -    if (strcmp(password, encrypted) == 0)
> -        ereport(ERROR,
> -                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -                 errmsg("password must not contain user name")));
>
> This patch removes the only possible check for MD5 hashes that has
> ever been done in passwordcheck. It may be fine to remove it, but I would
> think that it is a good source of example regarding what could be done with
> MD5 hashes, though limited. So it seems to me that this check should involve
> as well pg_md5_encrypt on the username and compare it with the MD5 hash
> given by the caller.

Actually, it does still perform that check. There's a new function,
plain_crypt_verify(), that passwordcheck uses now. plain_crypt_verify() is
intended to work with any hash formats we might introduce in the
future (including SCRAM), so that passwordcheck doesn't need to know
about all the hash formats.

> A simple ALTER USER role PASSWORD 'foo' causes a crash:

Ah, fixed.

> +        case PASSWORD_TYPE_PLAINTEXT:
> +            shadow_pass = &shadow_pass[strlen("plain:")];
> +            break;
> It would be a good idea to have a generic routine able to get the plain
> password value. In short I think that we should reduce the amount of
> locations where "plain:" prefix is hardcoded.

There is such a function included in the patch, get_plain_password(char 
*shadow_pass), actually. Contrib/passwordcheck uses it. I figured that 
in crypt.c itself, it's OK to do the above directly, but 
get_plain_password() is intended to be used elsewhere.

Thanks for having a look! Attached is a new version, with that bug fixed.

- Heikki


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

0001-Use-plain-prefix-for-plaintext-passwords-stored-in-p-2.patch

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Robert Haas

Date:

20 December 2016, 15:56:43

On Tue, Dec 20, 2016 at 6:37 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> It's more convenient to carry the type information with the verifier itself,
> in backend code, in pg_dump, etc. Sure, you could have a separate "transfer"
> text format that has the prefix, and strip it out when the datum enters the
> system. But it is even simpler to have only one format, with the prefix, and
> use that everywhere.

I see your point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

20 December 2016, 16:34:19

Heikki,

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> Even if you have a separate "verifier type" column, it's not fully
> normalized, because there's still a dependency between the verifier
> and verifier type columns. You will always need to look at the
> verifier type to make sense of the verifier itself.

That's true- but you don't need to look at the verifier, or even have
*access* to the verifier, to look at the verifier type.  That is
actually very useful when you start thinking about the downstream side
of this- what about the monitoring tool which will want to check and
make sure there are only certain verifier types being used?  It'll have
to be a superuser, or have access to some superuser security definer
function, and that really sucks.  I'm not saying that we would
necessarily want the verifier type to be publicly visible, but being
able to see it without being a superuser would be good, imv.

> It's more convenient to carry the type information with the verifier
> itself, in backend code, in pg_dump, etc. Sure, you could have a
> separate "transfer" text format that has the prefix, and strip it
> out when the datum enters the system. But it is even simpler to have
> only one format, with the prefix, and use that everywhere.

It's more convenient when you need to look at both- it's not more
convenient when you only wish to look at the verifier type.  Further, it
means that we have to have a construct that assumes things about the
verifier type and verifier- what if a verifier type came along that used
a colon?  We'd have to do some special magic to handle that correctly,
and that just sucks, and anyone who is writing code to generically deal
with these fields will end up writing that same code (or forgetting to,
and not handling the case correctly).

> It might make sense to add a separate column, to e.g. make it easier
> to e.g. query for users that have an MD5 verifier. You could do
> "WHERE rolverifiertype = 'md5'", instead of "WHERE rolpassword LIKE
> 'md5%'". It's not a big difference, though. But even if we did that,
> I would still love to have the type information *also* included with
> the verifier itself, for convenience. And if we include it in the
> verifier itself, adding a separate type column seems more trouble
> than it's worth.

I don't agree that it's "not a big difference."  As I argue above- your
approach also assumes that anyone who would like to investigate the
verifier type should have access to the verifier itself, which I do not
agree with.  I also have a hard time buying the argument that it's
really so much more convenient to have the verifier type included in the
same string as the verifier that we should duplicate that information
and then run the risk that we end up with the two not matching or that
we won't ever run into complications down the road when our chosen
separator causes us difficulties.

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

David Fetter

Date:

20 December 2016, 19:08:01

On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> Heikki,
> 
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > Even if you have a separate "verifier type" column, it's not fully
> > normalized, because there's still a dependency between the
> > verifier and verifier type columns. You will always need to look
> > at the verifier type to make sense of the verifier itself.
> 
> That's true- but you don't need to look at the verifier, or even
> have *access* to the verifier, to look at the verifier type.

Would a view that shows only what's to the left of the first colon
suit this purpose?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

21 December 2016, 00:30:49

On Wed, Dec 21, 2016 at 1:08 AM, David Fetter <david@fetter.org> wrote:
> Would a view that shows only what's to the left of the first colon
> suit this purpose?

Of course it would; you would just need to make the routines that now
check the shape of MD5 and SCRAM identifiers available at the SQL level
and feed the strings into them. Now, I am not sure that it's worth
having a new superuser view for that. pg_roles and pg_shadow hide the
information about verifiers.
-- 
Michael

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

21 December 2016, 02:14:40

David,

* David Fetter (david@fetter.org) wrote:
> On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> > * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > > Even if you have a separate "verifier type" column, it's not fully
> > > normalized, because there's still a dependency between the
> > > verifier and verifier type columns. You will always need to look
> > > at the verifier type to make sense of the verifier itself.
> >
> > That's true- but you don't need to look at the verifier, or even
> > have *access* to the verifier, to look at the verifier type.
>
> Would a view that shows only what's to the left of the first colon
> suit this purpose?

Obviously a (security barrier...) view or a (security definer) function
could be used, but I don't believe either is actually a good idea.

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

David Fetter

Date:

21 December 2016, 02:29:12

On Tue, Dec 20, 2016 at 06:14:40PM -0500, Stephen Frost wrote:
> David,
>
> * David Fetter (david@fetter.org) wrote:
> > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > > > Even if you have a separate "verifier type" column, it's not fully
> > > > normalized, because there's still a dependency between the
> > > > verifier and verifier type columns. You will always need to look
> > > > at the verifier type to make sense of the verifier itself.
> > >
> > > That's true- but you don't need to look at the verifier, or even
> > > have *access* to the verifier, to look at the verifier type.
> >
> > Would a view that shows only what's to the left of the first colon
> > suit this purpose?
> 
> Obviously a (security barrier...) view or a (security definer) function
> could be used, but I don't believe either is actually a good idea.

Would you be so kind as to help me understand what's wrong with that idea?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Stephen Frost

Date:

21 December 2016, 03:54:52

David,

* David Fetter (david@fetter.org) wrote:
> On Tue, Dec 20, 2016 at 06:14:40PM -0500, Stephen Frost wrote:
> > * David Fetter (david@fetter.org) wrote:
> > > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> > > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > > > > Even if you have a separate "verifier type" column, it's not fully
> > > > > normalized, because there's still a dependency between the
> > > > > verifier and verifier type columns. You will always need to look
> > > > > at the verifier type to make sense of the verifier itself.
> > > >
> > > > That's true- but you don't need to look at the verifier, or even
> > > > have *access* to the verifier, to look at the verifier type.
> > >
> > > Would a view that shows only what's to the left of the first colon
> > > suit this purpose?
> >
> > Obviously a (security barrier...) view or a (security definer) function
> > could be used, but I don't believe either is actually a good idea.
>
> Would you be so kind as to help me understand what's wrong with that idea?

For starters, it doubles-down on the assumption that we'll always be
happy with that particular separator and implies to anyone watching that
they'll be able to trust it.  Further, it's additional complication
which, at least to my eyes, is entirely in the wrong direction.

We could push everything in pg_authid into a single colon-separated text
field and call it simpler because we don't have to deal with those silly
column things, and we'd have something a lot closer to a unix passwd
file too!, but it wouldn't make it a terribly smart thing to do.  We
aren't a bunch of individual C programs having to parse things out
of flat text files, after all.

Thanks!

Stephen

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Michael Paquier

Date:

21 December 2016, 05:09:59

On Tue, Dec 20, 2016 at 9:23 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/16/2016 03:31 AM, Michael Paquier wrote:
> Actually, it does still perform that check. There's a new function,
> plain_crypt_verify, that passwordcheck uses now. plain_crypt_verify() is
> intended to work with any future hash formats we might introduce in the
> future (including SCRAM), so that passwordcheck doesn't need to know about
> all the hash formats.

Bah. I have misread the first version of the patch, and it is indeed
keeping the username checks. With the patch, now that things don't
crash, passwordcheck behaves as expected:
=# load 'passwordcheck';
LOAD
=# alter role mpaquier password 'mpaquier';
ERROR:  22023: password must not contain user name
LOCATION:  check_password, passwordcheck.c:101
=# alter role mpaquier password 'md58349d3a1bc8f4f7399b1ff9dea493b15';
ERROR:  22023: password must not contain user name
LOCATION:  check_password, passwordcheck.c:82

>> +        case PASSWORD_TYPE_PLAINTEXT:
>> +            shadow_pass = &shadow_pass[strlen("plain:")];
>> +            break;
>> It would be a good idea to have a generic routine able to get the plain
>> password value. In short I think that we should reduce the amount of
>> locations where "plain:" prefix is hardcoded.
>
> There is such a function included in the patch, get_plain_password(char
> *shadow_pass), actually. Contrib/passwordcheck uses it. I figured that in
> crypt.c itself, it's OK to do the above directly, but get_plain_password()
> is intended to be used elsewhere.

The idea would be to have the function not return an allocated string,
just a pointer into the existing one. That would be useful in
plain_crypt_verify(), for example, for a total of 4 places, including
get_plain_password(), where the new string allocation is done. Well,
it's not like this prefix "plain:" would change in the future anyway,
nor is it going to spread much.
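
Something along these lines, as a sketch only; the function name is
invented here and this is not code from the patch:

```c
#include <stddef.h>
#include <string.h>

#define PLAIN_PREFIX "plain:"

/*
 * Hypothetical variant (the name is invented): return a pointer to the
 * password inside shadow_pass, just past the "plain:" prefix, without
 * allocating a copy.  Returns NULL if the prefix is absent.
 */
const char *
plain_password_position(const char *shadow_pass)
{
	size_t		prefix_len = strlen(PLAIN_PREFIX);

	if (strncmp(shadow_pass, PLAIN_PREFIX, prefix_len) != 0)
		return NULL;
	return shadow_pass + prefix_len;
}
```

The returned pointer shares the lifetime of shadow_pass, so a caller
that needs the password to outlive it would still have to copy.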

> Thanks for having a look! Attached is a new version, with that bug fixed.

I have been able to do more advanced testing without the crash, and
things seem to work properly. The attached set of tests also passes
for all the combinations of hba configurations and password formats.
And looking at the code, I don't have more comments.
-- 
Michael


Attachment

009_authentication.pl

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

03 January 2017, 15:11:20

On 12/14/2016 01:33 PM, Heikki Linnakangas wrote:
> I just noticed that the manual for CREATE ROLE says:
>
>> Note that older clients might lack support for the MD5 authentication
>> mechanism that is needed to work with passwords that are stored
>> encrypted.
>
> That's is incorrect. The alternative to MD5 authentication is plain
> 'password' authentication, and that works just fine with MD5-hashed
> passwords. I think that sentence is a leftover from when we still
> supported "crypt" authentication (so I actually get to blame you for
> that ;-), commit 53a5026b). Back then, it was true that if an MD5 hash
> was stored in pg_authid, you couldn't do "crypt" authentication. That
> might have left old clients out in the cold.
>
> Now that we're getting SCRAM authentication, we'll need a similar notice
> there again, for the incompatibility of a SCRAM verifier with MD5
> authentication and vice versa.

I went ahead and removed the current bogus notice from the docs. We 
might need to put back something like it, with the SCRAM patch, but it 
needs to be rewritten anyway.

- Heikki

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

03 January 2017, 17:09:34

On 12/21/2016 04:09 AM, Michael Paquier wrote:
>> Thanks for having a look! Attached is a new version, with that bug fixed.
>
> I have been able to do more advanced testing without the crash, and things
> seem to work properly. The attached set of tests is also able to pass
> for all the combinations of hba configurations and password formats.
> And looking at the code I don't have more comments.

Thanks!

Since not everyone agrees with this approach, I split this patch into 
two. The first patch refactors things, replacing the isMD5() function 
with get_password_type(), without changing the representation of 
pg_authid.rolpassword. That is hopefully uncontroversial. And the second 
patch adds the "plain:" prefix, which not everyone agrees on.
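
As an illustration of the refactoring described above, here is a rough sketch of what a get_password_type()-style classifier could look like while the stored representation stays unchanged. The Sketch* names, the length constant, and the exact check are assumptions made for this example, not the committed code:

```c
#include <string.h>

/* Hypothetical names; the real enum and constants live in the patch. */
typedef enum
{
    SKETCH_PASSWORD_TYPE_PLAINTEXT,
    SKETCH_PASSWORD_TYPE_MD5
} SketchPasswordType;

/* Assumed shape of an MD5 entry: "md5" followed by 32 hex digits. */
#define SKETCH_MD5_PASSWD_LEN 35

/*
 * Classify a stored password by its shape.  Anything that does not look
 * like an MD5 hash is treated as plaintext, which mirrors the old
 * isMD5() yes/no answer but leaves room for more types later.
 */
static SketchPasswordType
sketch_get_password_type(const char *shadow_pass)
{
    if (strncmp(shadow_pass, "md5", 3) == 0 &&
        strlen(shadow_pass) == SKETCH_MD5_PASSWD_LEN)
        return SKETCH_PASSWORD_TYPE_MD5;
    return SKETCH_PASSWORD_TYPE_PLAINTEXT;
}
```

Returning an enum instead of a boolean means a later SCRAM verifier type can be added as one more enum value and one more branch, without touching every caller's logic.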

Barring objections I'm going to at least commit the first patch. I think 
we should commit the second one too, but it's not as critical, and the 
first patch matters more for the SCRAM patch, too.

- Heikki

On 2 February 2017 at 00:13, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Ok, I'll drop the second patch for now. I committed the first patch after
> fixing the things you and Michael pointed out. Thanks for the review!

dbd69118 caused a small compiler warning for me.

The attached patch fixes it.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

encrypt_password_warning_fix.patch

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

From

Heikki Linnakangas

Date:

02 February 2017, 11:45:17

On 02/02/2017 05:50 AM, David Rowley wrote:
> On 2 February 2017 at 00:13, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Ok, I'll drop the second patch for now. I committed the first patch after
>> fixing the things you and Michael pointed out. Thanks for the review!
>
> dbd69118 caused a small compiler warning for me.
>
> The attached patch fixes it.

Fixed, thanks!

- Heikki

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Heikki Linnakangas

Date:

03 February 2017, 15:52:52

On 12/20/2016 03:47 AM, Michael Paquier wrote:
> The first thing is to be able to understand in the SCRAM code if a
> string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
> offers a set of routines exactly for this purpose, which is built with
> libpq but that's not available for src/common/. So instead of moving
> all the file, I'd like to create a new file in src/common/utf8.c which
> includes pg_utf_mblen() and pg_utf8_islegal().

Sounds reasonable. They're short functions, might also be ok to just 
copy-paste them to scram-common.c.

> On top of that I think that having a routine able to check a full
> string would be useful for many users, as pg_utf8_islegal() can only
> check one character sequence at a time. If the password string is
> found to be valid UTF-8, SASLprep is applied. If not, the string is
> copied as-is, with perhaps unexpected effects for the client. But the
> client is in trouble already if it is not using UTF-8.

Yeah.
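
A self-contained sketch of such a whole-string check follows. It does not use pg_utf_mblen() or pg_utf8_islegal(); the function names below are hypothetical and the validation rules are the standard UTF-8 well-formedness rules (no overlong forms, no surrogates, nothing above U+10FFFF):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Byte length of the sequence introduced by lead byte b, or 0 if b cannot
 * start a well-formed sequence (hypothetical helper for this sketch).
 */
static int
utf8_seq_len(unsigned char b)
{
    if (b < 0x80)
        return 1;
    if (b >= 0xC2 && b <= 0xDF)
        return 2;               /* 0xC0/0xC1 would be overlong */
    if (b >= 0xE0 && b <= 0xEF)
        return 3;
    if (b >= 0xF0 && b <= 0xF4)
        return 4;
    return 0;
}

/* Check that all len bytes of s form well-formed UTF-8. */
static bool
utf8_string_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        int         n = utf8_seq_len(s[i]);
        int         k;

        if (n == 0 || (size_t) n > len - i)
            return false;       /* bad lead byte or truncated sequence */
        for (k = 1; k < n; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
        if (n == 3)
        {
            if (s[i] == 0xE0 && s[i + 1] < 0xA0)
                return false;   /* overlong 3-byte form */
            if (s[i] == 0xED && s[i + 1] > 0x9F)
                return false;   /* UTF-16 surrogate range */
        }
        else if (n == 4)
        {
            if (s[i] == 0xF0 && s[i + 1] < 0x90)
                return false;   /* overlong 4-byte form */
            if (s[i] == 0xF4 && s[i + 1] > 0x8F)
                return false;   /* above U+10FFFF */
        }
        i += n;
    }
    return true;
}
```

A caller would run this over the password once and only apply SASLprep when it returns true, falling back to the raw bytes otherwise.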

> The second thing is the normalization itself. Per RFC4013, NFKC needs
> to be applied to the string.  The operation is described in [1]
> completely, and it is named as doing 1) a compatibility decomposition
> of the bytes of the string, followed by 2) a canonical composition.
>
> About 1). The compatibility decomposition is defined in [2], "by
> recursively applying the canonical and compatibility mappings, then
> applying the canonical reordering algorithm". Canonical and
> compatibility mapping are some data available in UnicodeData.txt, the
> 6th column of the set defined in [3] to be precise. The meaning of the
> decomposition mappings is defined in [2] as well. The canonical
> decomposition is basically to look for a given UTF-8 character, and
> then apply the multiple characters resulting in its new shape. The
> compatibility mapping should as well be applied, but [5], a perl tool
> called charlint.pl doing this normalization work, does not care about
> this phase... Do we?

Not sure. We need to do whatever the "right thing" is, according to the 
RFC. I would assume that the spec is not ambiguous about this, but I haven't 
looked into the details. If it's ambiguous, then I think we need to look 
at some popular implementations to see what they do.

> About 2)... Once the decomposition has been applied, those bytes need
> to be recomposed using the Canonical_Combining_Class field of
> UnicodeData.txt in [3], which is the 4th column of the set (field 3).
> Its values are defined in [4]. Another interesting thing: charlint.pl
> [5] does not care about this phase. I am wondering if we should not
> just drop this part as well...
>
> Once 1) and 2) are done, NFKC is complete, and so is SASLprep.

Ok.
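
To make the role of the combining class concrete, here is a small sketch of the canonical reordering step mentioned in 1): within each run of combining marks, code points are stably sorted by Canonical_Combining_Class. The three-entry lookup is a toy stand-in for a table generated from UnicodeData.txt, and the function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy combining-class lookup (assumption: a real implementation would use
 * a generated table).  Class 0 marks a "starter" that is never moved.
 */
static int
toy_combining_class(uint32_t cp)
{
    switch (cp)
    {
        case 0x0301:
            return 230;         /* COMBINING ACUTE ACCENT */
        case 0x0323:
            return 220;         /* COMBINING DOT BELOW */
        case 0x0327:
            return 202;         /* COMBINING CEDILLA */
        default:
            return 0;
    }
}

/*
 * Canonical reordering: stable insertion sort of each run of non-starters
 * by combining class.  Starters (class 0) act as barriers.
 */
static void
canonical_reorder(uint32_t *cps, size_t n)
{
    size_t      i;

    for (i = 1; i < n; i++)
    {
        uint32_t    cur = cps[i];
        int         cur_cc = toy_combining_class(cur);
        size_t      j = i;

        if (cur_cc == 0)
            continue;
        /* strict ">" keeps equal classes in their original order */
        while (j > 0 && toy_combining_class(cps[j - 1]) > cur_cc)
        {
            cps[j] = cps[j - 1];
            j--;
        }
        cps[j] = cur;
    }
}
```

For example, the sequence U+0061 U+0301 U+0323 is reordered to U+0061 U+0323 U+0301, because the dot below (class 220) sorts before the acute accent (class 230).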

> So what we need from Postgres side is a mapping table having the
> following fields:
> 1) Hexa sequence of UTF8 character.
> 2) Its canonical combining class.
> 3) The kind of decomposition mapping if defined.
> 4) The decomposition mapping, in hexadecimal format.
> Based on what I looked at, either perl or python could be used to
> process UnicodeData.txt and to generate a header file that would be
> included in the tree. There are 30k entries in UnicodeData.txt, 5k of
> them have a mapping, so that will result in many tables. One thing to
> improve performance would be to store the length of the table in a
> static variable, order the entries by their hexadecimal keys and do a
> binary search (dichotomy lookup) to find an entry. We could as well use
> fancier things like a set of tables forming a radix tree keyed on the
> bytes of the character. We should end up doing just one table lookup
> for each character anyway.

Ok. I'm not too worried about the performance of this. It's only used 
for passwords, which are not that long, and it's only done when 
connecting. I'm more worried about the disk/memory usage. How small can 
we pack the tables? 10kB? 100kB? Even a few MB would probably not be too 
bad in practice, but I'd hate to bloat up libpq just for this.
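
For a feel of the sorted-table approach and its footprint, here is a sketch of a table keyed by code point and searched with bsearch(). The struct layout, the names, and the five sample rows are assumptions for illustration; a real table would be generated from UnicodeData.txt and would need to handle mappings longer than two code points:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry layout: key, combining class, mapping kind, mapping. */
typedef struct
{
    uint32_t    codepoint;
    uint8_t     comb_class;
    uint8_t     is_compat;      /* 1 if the mapping carries a <tag> */
    uint8_t     dec_len;        /* number of code points in dec[] */
    uint32_t    dec[2];         /* toy limit: at most two code points */
} toy_decomp_entry;

/* Sample rows, sorted by codepoint so that binary search works. */
static const toy_decomp_entry toy_decomp_table[] = {
    {0x00A0, 0, 1, 1, {0x0020, 0}},         /* NO-BREAK SPACE: <noBreak> */
    {0x00C0, 0, 0, 2, {0x0041, 0x0300}},    /* A WITH GRAVE */
    {0x00E9, 0, 0, 2, {0x0065, 0x0301}},    /* e WITH ACUTE */
    {0x0301, 230, 0, 0, {0, 0}},            /* COMBINING ACUTE ACCENT */
    {0xFB01, 0, 1, 2, {0x0066, 0x0069}},    /* LIGATURE FI: <compat> */
};

/* bsearch() comparator: key is a uint32_t code point. */
static int
toy_decomp_cmp(const void *key, const void *elem)
{
    uint32_t    cp = *(const uint32_t *) key;
    const toy_decomp_entry *e = (const toy_decomp_entry *) elem;

    if (cp < e->codepoint)
        return -1;
    if (cp > e->codepoint)
        return 1;
    return 0;
}

/* Return the entry for cp, or NULL if cp has no entry in the table. */
static const toy_decomp_entry *
toy_decomp_lookup(uint32_t cp)
{
    return (const toy_decomp_entry *)
        bsearch(&cp, toy_decomp_table,
                sizeof(toy_decomp_table) / sizeof(toy_decomp_table[0]),
                sizeof(toy_decomp_table[0]), toy_decomp_cmp);
}
```

On common platforms this struct occupies about 16 bytes, so the roughly 5k mapped entries mentioned above would come to something on the order of 80 kB with this naive layout, before any packing and ignoring longer mappings. Storing mappings in a shared array with offsets is one way such a table could be shrunk.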

> In conclusion, at this point I am looking for feedback regarding the
> following items:
> 1) Where to put the UTF8 check routines and what to move.

Covered that above.

> 2) How to generate the mapping table using UnicodeData.txt. I'd think
> that using perl would be better.

Agreed, it needs to be in Perl. That's what we require to be present 
when building PostgreSQL, it's what we use for generating other tables 
and functions.

> 3) The shape of the mapping table, which depends on how many
> operations we want to support in the normalization of the strings.
> The decisions for those items will drive the implementation in one
> sense or another.

Let's aim for small disk/memory footprint.

- Heikki

> [1]: http://www.unicode.org/reports/tr15/#Description_Norm
> [2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings
> [3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
> [4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values
> [5]: https://www.w3.org/International/charlint/

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From

Michael Paquier

Date:

04 February 2017, 02:01:10

On Fri, Feb 3, 2017 at 9:52 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/20/2016 03:47 AM, Michael Paquier wrote:
>>
>> The first thing is to be able to understand in the SCRAM code if a
>> string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
>> offers a set of routines exactly for this purpose, which is built with
>> libpq but that's not available for src/common/. So instead of moving
>> all the file, I'd like to create a new file in src/common/utf8.c which
>> includes pg_utf_mblen() and pg_utf8_islegal().
>
> Sounds reasonable. They're short functions, might also be ok to just
> copy-paste them to scram-common.c.

Having a separate file makes the most sense to me, I think. If we can
avoid code duplication, that's better.

>> The second thing is the normalization itself. Per RFC4013, NFKC needs
>> to be applied to the string.  The operation is described in [1]
>> completely, and it is named as doing 1) a compatibility decomposition
>> of the bytes of the string, followed by 2) a canonical composition.
>>
>> About 1). The compatibility decomposition is defined in [2], "by
>> recursively applying the canonical and compatibility mappings, then
>> applying the canonical reordering algorithm". Canonical and
>> compatibility mapping are some data available in UnicodeData.txt, the
>> 6th column of the set defined in [3] to be precise. The meaning of the
>> decomposition mappings is defined in [2] as well. The canonical
>> decomposition is basically to look for a given UTF-8 character, and
>> then apply the multiple characters resulting in its new shape. The
>> compatibility mapping should as well be applied, but [5], a perl tool
>> called charlint.pl doing this normalization work, does not care about
>> this phase... Do we?
>
> Not sure. We need to do whatever the "right thing" is, according to the RFC.
> I would assume that the spec is not ambiguous about this, but I haven't looked
> into the details. If it's ambiguous, then I think we need to look at some
> popular implementations to see what they do.

The spec defines quite precisely what should be done. The
implementations are sometimes quite loose on some points though (see
charlint.pl).

>> So what we need from Postgres side is a mapping table having the
>> following fields:
>> 1) Hexa sequence of UTF8 character.
>> 2) Its canonical combining class.
>> 3) The kind of decomposition mapping if defined.
>> 4) The decomposition mapping, in hexadecimal format.
>> Based on what I looked at, either perl or python could be used to
>> process UnicodeData.txt and to generate a header file that would be
>> included in the tree. There are 30k entries in UnicodeData.txt, 5k of
>> them have a mapping, so that will result in many tables. One thing to
>> improve performance would be to store the length of the table in a
>> static variable, order the entries by their hexadecimal keys and do a
>> binary search (dichotomy lookup) to find an entry. We could as well use
>> fancier things like a set of tables forming a radix tree keyed on the
>> bytes of the character. We should end up doing just one table lookup
>> for each character anyway.
>
> Ok. I'm not too worried about the performance of this. It's only used for
> passwords, which are not that long, and it's only done when connecting. I'm
> more worried about the disk/memory usage. How small can we pack the tables?
> 10kB? 100kB? Even a few MB would probably not be too bad in practice, but
> I'd hate to bloat up libpq just for this.

Indeed. I think I'll first develop a small utility able to do this
operation. There is likely some knowledge in mb/Unicode that we can
use here. The radix tree patch would perhaps help?

>> 3) The shape of the mapping table, which depends on how many
>> operations we want to support in the normalization of the strings.
>> The decisions for those items will drive the implementation in one
>> sense or another.
>
> Let's aim for small disk/memory footprint.

OK, I'll try to give it a shot in a couple of days in the shape of an
extension or something like that. Thanks for the feedback.
-- 
Michael