Thread: Password identifiers, protocol aging and SCRAM protocol

Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
Hi all

As a continuation of the thread firstly dedicated to SCRAM:
http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
Here is a new thread aimed at gathering all the ideas of this previous
thread and aimed at clarifying a bit what has been discussed until now
regarding password protocols, verifiers, and SCRAM itself.

Attached is a set of patches implementing a couple of things that have
been discussed, so let's roll in. There are a couple of concepts that
are introduced in this set of patches, and those patches are aimed at
resolving the following things:
- Introduce in Postgres an extensible password aging facility, based on
a new concept of one user with multiple password verifiers, one
password verifier per protocol.
- Give system administrators tools to decide which protocols are
unsupported, and have pg_upgrade use that.
- Introduce new password protocols for Postgres, aimed at replacing
the existing, rather limited ones.
Note that password verifier rolling is not discussed here, that is,
the possibility of having multiple verifiers of the same protocol for
the same user (this is why valid_until is still part of pg_authid
here; supporting authentication rolling would require moving it to
pg_auth_verifiers).

Here is a short description of each patch and what they do:
1) 0001, removing the password column from pg_authid and putting it
into a new catalog called pg_auth_verifiers that has the following
format:
- Role OID
- Password protocol
- Password verifier
The protocols proposed in this patch are "plain" and "md5", which map
to what Postgres currently has, so there is nothing new. What is new
is the clause PASSWORD VERIFIERS, usable by CREATE/ALTER USER like
this:
ALTER ROLE foo PASSWORD VERIFIERS (md5 = 'foo', plain = 'foo');
This is easily extensible, as new protocols can be added on top of
that. This has been discussed in the previous thread.
As also discussed previously, password_encryption is switched from
a boolean switch to a list of protocols, which is md5 by default in
this patch.
Also, as discussed in 6174.1455501497@sss.pgh.pa.us, pg_shadow has
been changed so that the password value is replaced by '*****'.
This patch adds docs, regression tests, pg_dump support, etc.
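
To make the "one role, one verifier per protocol" idea in 0001 concrete, here is a minimal Python sketch of the rows such a clause might produce. It is only an illustration, not the patch's code: build_verifiers and the tuple layout are hypothetical names, and the protocol is spelled out in full rather than using whatever encoding the catalog actually stores. The md5 format ('md5' followed by md5 of the password concatenated with the user name) is the one Postgres already uses.

```python
import hashlib


def md5_verifier(password: str, username: str) -> str:
    """Build a Postgres-style md5 verifier: 'md5' + md5(password || username)."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


def build_verifiers(role_oid: int, username: str, passwords: dict) -> list:
    """Return hypothetical (role OID, protocol, verifier) rows, one per protocol.

    This only models the catalog layout described above; it is an assumption
    made for illustration, not the actual pg_auth_verifiers implementation.
    """
    rows = []
    for protocol, password in passwords.items():
        if protocol == "plain":
            verifier = password
        elif protocol == "md5":
            verifier = md5_verifier(password, username)
        else:
            raise ValueError(f"unknown protocol: {protocol}")
        rows.append((role_oid, protocol, verifier))
    return rows


# Rough equivalent of:
#   ALTER ROLE foo PASSWORD VERIFIERS (md5 = 'foo', plain = 'foo');
rows = build_verifiers(16387, "foo", {"md5": "foo", "plain": "foo"})
for row in rows:
    print(row)
```

Adding a new protocol in this model only means adding a branch that derives its verifier, which is the extensibility argument made above.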

2) 0002, introduction of a new GUC parameter password_protocols
(superuser-only) aimed at controlling the protocols for which
password verifiers can be created. This is quite simple: the protocols
specified in this list are the ones allowed when
creating password verifiers using CREATE/ALTER ROLE. By default, and
in this patch, this is set to 'plain,md5', which is the current
default in Postgres, though a system admin could set it to 'md5' to
forbid the creation of unencrypted passwords, for example. Docs and
regression tests are added as well, the regression tests taking
advantage of the fact that this is a superuser parameter.
This patch is an answer to remarks made in the last thread regarding
the fact that there is no way for a system to control which
password verifier types are created, and protocol aging makes sense
with this patch and 0003...
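
The check that password_protocols performs can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the backend code: parse_protocol_list and check_verifier_allowed are hypothetical helper names, and the error type and message are made up for the example.

```python
def parse_protocol_list(setting: str) -> set:
    """Split a comma-separated setting such as 'plain,md5' into a set of names."""
    return {item.strip().lower() for item in setting.split(",") if item.strip()}


def check_verifier_allowed(protocol: str, password_protocols: str) -> None:
    """Raise if a verifier of this protocol may not be created.

    Models the rule described above: only protocols listed in
    password_protocols are accepted by CREATE/ALTER ROLE.
    """
    allowed = parse_protocol_list(password_protocols)
    if protocol not in allowed:
        raise PermissionError(
            f'password protocol "{protocol}" is not allowed by password_protocols'
        )


# With the default of this patch, both protocols are accepted.
check_verifier_allowed("plain", "plain,md5")
check_verifier_allowed("md5", "plain,md5")

# An administrator who sets the list to 'md5' forbids plain verifiers.
try:
    check_verifier_allowed("plain", "md5")
    rejected = False
except PermissionError:
    rejected = True
print("plain rejected:", rejected)
```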

3) 0003, introduction of a system function, which I called
pg_auth_verifiers_sanitize, which is superuser-only and aimed at cleaning
up password verifiers in pg_auth_verifiers depending on what the user
has defined in password_protocols. This basically does a heap scan of
pg_auth_verifiers and deletes the tuples whose protocols are
not listed in password_protocols. I hesitated about putting that in
pg_upgrade_support.c; perhaps it would make more sense to have it
there, and feedback is welcome. I have in mind that it is actually
useful for users to have this function at hand for post-upgrade
cleanup operations. Regression tests cannot be added for this one; I
guess the reason for not having them is obvious when considering
installcheck...
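
The effect of the sanitize function can be modeled in a few lines. The Python sketch below stands in for the heap scan with a list of hypothetical (role OID, protocol, verifier) rows; sanitize_verifiers is an illustrative name, and returning the number of removed rows is an assumption, since the return value of the real function is not described here.

```python
def sanitize_verifiers(rows: list, password_protocols: str) -> tuple:
    """Drop verifier rows whose protocol is not listed in password_protocols.

    rows is a list of (role_oid, protocol, verifier) tuples standing in for
    pg_auth_verifiers. Returns the surviving rows and the number removed.
    """
    allowed = {p.strip() for p in password_protocols.split(",") if p.strip()}
    kept = [row for row in rows if row[1] in allowed]
    return kept, len(rows) - len(kept)


# Made-up catalog contents for two roles.
catalog = [
    (16387, "md5", "md5<hash>"),
    (16387, "plain", "test"),
    (16388, "plain", "testu"),
]

# After retiring the 'plain' protocol, only the md5 verifier survives.
catalog, removed = sanitize_verifiers(catalog, "md5")
print(catalog, removed)
```

Note that role 16388 ends up with no verifier at all in this model, which is the kind of consequence an administrator would want to check before retiring a protocol.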

4) 0004, Have pg_upgrade make use of the system function introduced by
0003. This is quite simple, and this allows pg_upgrade to remove
entries of outdated protocols.

Those 4 patches are aimed at putting in-core basics for the concept I
call password protocol aging, which is a way to allow multiple
password protocols to be defined in Postgres, and aimed at easing
administration as well as retirement of outdated protocols, which is
something that is not doable now in Postgres.

The second set of patches, 0005~0009, introduces a new protocol, SCRAM.
This is a brushed-up, rebased version of the previous patches, and is
divided as follows:
5) 0005, move of the SHA1 routines of pgcrypto to src/common to allow
the frontend authentication code path to use SHA1.
6) 0006 is a refactoring of sendAuthRequest that makes sense when
taken independently.
7) 0007 is a small refactoring of RandomSalt(), to allow this function
to handle salt values of different lengths.
8) 0008 is another refactoring, moving a set of encoding routines from
the backend's encode.c to src/common. escape, base64 and hex are moved
as such, though SCRAM uses only base64. For consistency, moving the
whole set made more sense to me.
9) 0009 is the SCRAM authentication itself....

The first 4 patches obviously are the core portion that I would like
to discuss in this CF, as they lay the groundwork for the rest and
will surely help Postgres long-term. 0005~0008 are just refactoring
patches, so they are quite simple. 0009 though is quite difficult and
needs careful review because it touches areas of the code that can be
reached without being an authenticated user, so if there are bugs in
it, it would be possible, for example, to crash Postgres just by
sending authentication requests.
Regards,
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:
Hi, Michael


23.02.2016 10:17, Michael Paquier wrote:
> Attached is a set of patches implementing a couple of things that have
> been discussed, so let's roll in.
>
> Those 4 patches are aimed at putting in-core basics for the concept I
> call password protocol aging, which is a way to allow multiple
> password protocols to be defined in Postgres, and aimed at easing
> administration as well as retirement of outdated protocols, which is
> something that is not doable now in Postgres.
>
> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
> 9) 0009 is the SCRAM authentication itself....
The topic of password checking interests me, and I can review some of
the features for the CF.
I think that reviewing all of the suggested features will take a lot of time.
Is it possible to make a subset of patches concerning only password
strength and its aging?
The patches you have attached are not independent. They should be
applied sequentially, one by one.
Thus patch 0009 cannot be applied without a git error before 0001.
Under these conditions all patches were applied and compiled successfully.
All tests passed.
> The first 4 patches obviously are the core portion that I would like
> to discuss about in this CF, as they put in the base for the rest, and
> will surely help Postgres long-term. 0005~0008 are just refactoring
> patches, so they are quite simple. 0009 though is quite difficult, and
> needs careful review because it manipulates areas of the code where it
> is not necessary to be an authenticated user, so if there are bugs in
> it it would be possible for example to crash down Postgres just by
> sending authentication requests.
>
-- 
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Feb 26, 2016 at 1:38 AM, Valery Popov <v.popov@postgrespro.ru> wrote:
> Hi, Michael
>
>
> 23.02.2016 10:17, Michael Paquier wrote:
>>
>> Attached is a set of patches implementing a couple of things that have
>> been discussed, so let's roll in.
>>
>> Those 4 patches are aimed at putting in-core basics for the concept I
>> call password protocol aging, which is a way to allow multiple
>> password protocols to be defined in Postgres, and aimed at easing
>> administration as well as retirement of outdated protocols, which is
>> something that is not doable now in Postgres.
>>
>> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
>> 9) 0009 is the SCRAM authentication itself....
>
> The theme with password checking is interesting for me, and I can give
> review for CF for some features.
> I think that review of all suggested features will require a lot of time.
> Is it possible to make subset of patches concerning only password strength
> and its aging?
> The patches you have applied are non-independent. They should be apply
> consequentially one by one.
> Thus the patch 0009 can't be applied without git error  before 0001.
> In this conditions all patches were successfully applied and compiled.
> All tests successfully passed.

If you want to focus on the password protocol aging, you could just
have a look at 0001~0004.
--
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:

26.02.2016 01:10, Michael Paquier wrote:
> On Fri, Feb 26, 2016 at 1:38 AM, Valery Popov <v.popov@postgrespro.ru> wrote:
>> Hi, Michael
>>
>>
>> 23.02.2016 10:17, Michael Paquier wrote:
>>> Attached is a set of patches implementing a couple of things that have
>>> been discussed, so let's roll in.
>>>
>>> Those 4 patches are aimed at putting in-core basics for the concept I
>>> call password protocol aging, which is a way to allow multiple
>>> password protocols to be defined in Postgres, and aimed at easing
>>> administration as well as retirement of outdated protocols, which is
>>> something that is not doable now in Postgres.
>>>
>>> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
>>> 9) 0009 is the SCRAM authentication itself....
>> The theme with password checking is interesting for me, and I can give
>> review for CF for some features.
>> I think that review of all suggested features will require a lot of time.
>> Is it possible to make subset of patches concerning only password strength
>> and its aging?
>> The patches you have applied are non-independent. They should be apply
>> consequentially one by one.
>> Thus the patch 0009 can't be applied without git error  before 0001.
>> In this conditions all patches were successfully applied and compiled.
>> All tests successfully passed.
> If you want to focus on the password protocol aging, you could just
> have a look at 0001~0004.
OK, I will review patches 0001-0004 to start with.

-- 
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company




Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:
Hi, Michael
>>>
>>>
>>> 23.02.2016 10:17, Michael Paquier wrote:
>>>> Attached is a set of patches implementing a couple of things that have
>>>> been discussed, so let's roll in.
>>>>
>>>> Those 4 patches are aimed at putting in-core basics for the concept I
>>>> call password protocol aging, which is a way to allow multiple
>>>> password protocols to be defined in Postgres, and aimed at easing
>>>> administration as well as retirement of outdated protocols, which is
>>>> something that is not doable now in Postgres.
>>>>
>>>> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
>>>> 9) 0009 is the SCRAM authentication itself....
>>> The theme with password checking is interesting for me, and I can give
>>> review for CF for some features.
>>> I think that review of all suggested features will require a lot of 
>>> time.
>>> Is it possible to make subset of patches concerning only password 
>>> strength
>>> and its aging?
>>> The patches you have applied are non-independent. They should be apply
>>> consequentially one by one.
>>> Thus the patch 0009 can't be applied without git error  before 0001.
>>> In this conditions all patches were successfully applied and compiled.
>>> All tests successfully passed.
>> If you want to focus on the password protocol aging, you could just
>> have a look at 0001~0004.
> OK, I will review patches 0001-0004, for starting.
>
Below are the results of compiling and testing.
============================
I fetched the latest version of the sources from
git://git.postgresql.org/git/postgresql.git.

vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch
* master

Then I applied patches 0001-0004, with two warnings:
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0001-Add-facility-to-store-multiple-password-verifiers.patch
0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing whitespace.
warning: 1 line adds whitespace errors.
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0002-Introduce-password_protocols.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0003-Add-pg_auth_verifiers_sanitize.patch
0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
    if (!superuser())
warning: 1 line adds whitespace errors.
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0004-Remove-password-verifiers-for-unsupported-protocols-.patch
Compilation with the options
./configure --enable-debug --enable-nls --enable-cassert --enable-tap-tests --with-perl
was successful.
Regression tests and all TAP tests also passed successfully.

I also applied patches 0005-0008 to a clean source directory, with no
warnings.
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0005-Move-sha1.c-to-src-common.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0006-Refactor-sendAuthRequest.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0007-Refactor-RandomSalt-to-handle-salts-of-different-len.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0008-Move-encoding-routines-to-src-common.patch
Compilation with the options
./configure --enable-debug --enable-nls --enable-cassert --enable-tap-tests --with-perl
was successful.
Regression tests and the TAP tests also passed successfully.

Patch 0009 depends on all previous patches 0001-0008: first we need
to apply patches 0001-0008, then 0009.
With that done, all patches compiled successfully.
All tests passed.

-- 
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company




Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Feb 29, 2016 at 8:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch

Thanks for the input!

> 0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing
> whitespace.
> warning: 1 line adds whitespace errors.
> 0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
>     if (!superuser())
> warning: 1 line adds whitespace errors.

Argh, yes. Those two have slipped through my successive rebases, I
think. Will fix in my tree; I don't think it is worth sending
the whole series again just for that, though.
-- 
Michael



Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Dmitry Dolgov
Date:
On 1 March 2016 at 06:34, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Feb 29, 2016 at 8:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
>> vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch
>
> Thanks for the input!
>
>> 0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing
>> whitespace.
>> warning: 1 line adds whitespace errors.
>> 0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
>>     if (!superuser())
>> warning: 1 line adds whitespace errors.
>
> Argh, yes. Those two ones have slipped though my successive rebases I
> think. Will fix in my tree, I don't think that it is worth sending
> again the whole series just for that though.
> --
> Michael


Hi, Michael

A few questions about the documentation.

config.sgml:1200

>      <listitem>
>       <para>
>        Specifies a comma-separated list of supported password formats by
>        the server. Supported formats are currently <literal>plain</> and
>        <literal>md5</>.
>       </para>
>
>       <para>
>        When a password is specified in <xref linkend="sql-createuser"> or
>        <xref linkend="sql-alterrole">, this parameter determines if the
>        password specified is authorized to be stored or not, returning
>        an error message to caller if it is not.
>       </para>
>
>       <para>
>        The default is <literal>plain,md5,scram</>, meaning that MD5-encrypted
>        passwords, plain passwords, and SCRAM-encrypted passwords are accepted.
>       </para>
>      </listitem>

The default value contains "scram". Shouldn't it be listed here as well:

>        Specifies a comma-separated list of supported password formats by
>        the server. Supported formats are currently <literal>plain</>,
>        <literal>md5</> and <literal>scram</>.

Or did I miss something?

And one more:

config.sgml:1284

>       <para>
>        <varname>db_user_namespace</> causes the client's and
>        server's user name representation to differ.
>        Authentication checks are always done with the server's user name
>        so authentication methods must be configured for the
>        server's user name, not the client's.  Because
>        <literal>md5</> uses the user name as salt on both the
>        client and server, <literal>md5</> cannot be used with
>        <varname>db_user_namespace</>.
>       </para>

It looks like the same restriction (please correct me if I'm wrong) applies to "scram", judging from the code below. Shouldn't "scram" be mentioned here as well? Here's the code:

> diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
> index 28f9fb5..df0cc1d 100644
> --- a/src/backend/libpq/hba.c
> +++ b/src/backend/libpq/hba.c
> @@ -1184,6 +1184,19 @@ parse_hba_line(List *line, int line_num, char *raw_line)
>          }
>          parsedline->auth_method = uaMD5;
>      }
> +    else if (strcmp(token->string, "scram") == 0)
> +    {
> +        if (Db_user_namespace)
> +        {
> +            ereport(LOG,
> +                    (errcode(ERRCODE_CONFIG_FILE_ERROR),
> +                     errmsg("SCRAM authentication is not supported when \"db_user_namespace\" is enabled"),
> +                     errcontext("line %d of configuration file \"%s\"",
> +                                line_num, HbaFileName)));
> +            return NULL;
> +        }
> +        parsedline->auth_method = uaSASL;
> +    }
>      else if (strcmp(token->string, "pam") == 0)
>  #ifdef USE_PAM
>          parsedline->auth_method = uaPAM;
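
For readers wondering why the quoted documentation says md5 cannot be used with db_user_namespace, here is a small Python sketch of the mismatch. It only covers the md5 case stated in the docs, not SCRAM. The role name, database name and password are made up; the one assumed fact is that the md5 verifier hashes the password concatenated with the user name, and that db_user_namespace makes the server look up "name@dbname" while the client only knows "name".

```python
import hashlib


def md5_inner_hash(password: str, username: str) -> str:
    """md5(password || username), the user-name-salted hash behind an md5 verifier."""
    return hashlib.md5((password + username).encode("utf-8")).hexdigest()


# Hypothetical names, for illustration only.
client_name = "joe"        # what the client passes at connection time
server_name = "joe@mydb"   # the database-specific name the server looks up
password = "secret"

stored = md5_inner_hash(password, server_name)    # derived from the server's user name
computed = md5_inner_hash(password, client_name)  # derived from the client's user name

# The two sides salt with different user names, so the hashes cannot match.
mismatch = stored != computed
print("hashes differ:", mismatch)
```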


Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Mar 2, 2016 at 4:05 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> [...]

Thanks for the review.

> The default value contains "scram". Shouldn't be here also:
>
>>        Specifies a comma-separated list of supported password formats by
>>        the server. Supported formats are currently <literal>plain</>,
>>        <literal>md5</> and <literal>scram</>.
>
> Or I missed something?

Ah, I see. That's in the documentation of password_protocols. Yes
scram should be listed there as well. That should be fixed in 0009.

>>       <para>
>>        <varname>db_user_namespace</> causes the client's and
>>        server's user name representation to differ.
>>        Authentication checks are always done with the server's user name
>>        so authentication methods must be configured for the
>>        server's user name, not the client's.  Because
>>        <literal>md5</> uses the user name as salt on both the
>>        client and server, <literal>md5</> cannot be used with
>>        <varname>db_user_namespace</>.
>>       </para>
>
> Looks like the same (pls, correct me if I'm wrong) is applicable for "scram"
> as I see from the code below. Shouldn't be "scram" mentioned here also?

Oops. Good catch. Yes it should be mentioned as part of the SCRAM patch (0009).
-- 
Michael



Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:
>>        <para>
>>         <varname>db_user_namespace</> causes the client's and
>>         server's user name representation to differ.
>>         Authentication checks are always done with the server's user name
>>         so authentication methods must be configured for the
>>         server's user name, not the client's.  Because
>>         <literal>md5</> uses the user name as salt on both the
>>         client and server, <literal>md5</> cannot be used with
>>         <varname>db_user_namespace</>.
>>        </para>
Also, in doc/src/sgml/ref/create_role.sgml, instead of
     <term>PASSWORD VERIFIERS ( <replaceable class="PARAMETER">verifier_type</replaceable> = '<replaceable class="PARAMETER">password</replaceable>'</term>
it should be like this:
     <term><literal>PASSWORD VERIFIERS</> ( <replaceable class="PARAMETER">verifier_type</replaceable> = '<replaceable class="PARAMETER">password</replaceable>'</term>

-- 
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company



Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
>
>>>        <para>
>>>         <varname>db_user_namespace</> causes the client's and
>>>         server's user name representation to differ.
>>>         Authentication checks are always done with the server's user name
>>>         so authentication methods must be configured for the
>>>         server's user name, not the client's.  Because
>>>         <literal>md5</> uses the user name as salt on both the
>>>         client and server, <literal>md5</> cannot be used with
>>>         <varname>db_user_namespace</>.
>>>        </para>
>
> Also in doc/src/sgml/ref/create_role.sgml is should be instead of
>       <term>PASSWORD VERIFIERS ( <replaceable
> class="PARAMETER">verifier_type</replaceable> = '<replaceable
> class="PARAMETER">password</replaceable>'</term>
> like this
>       <term><literal>PASSWORD VERIFIERS</> ( <replaceable
> class="PARAMETER">verifier_type</replaceable> = '<replaceable
> class="PARAMETER">password</replaceable>'</term>

So the <literal> markup is missing. Thanks. I am taking note of it.
-- 
Michael



Re: [REVIEW]: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:
This is a review of "Password identifiers, protocol aging and SCRAM 
protocol" patches
http://www.postgresql.org/message-id/CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com

Contents & Purpose
--------------------------
There was a discussion dedicated to SCRAM:
http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi

This set of patches implements the following:
- Introduce in Postgres an extensible password aging facility, by having 
a new concept of 1 user/multiple password verifier, one password 
verifier per protocol.
- Give to system administrators tools to decide unsupported protocols, 
and have pg_upgrade use that
- Introduce new password protocols for Postgres, aimed at replacing 
existing, say limited ones.

This set consists of 9 separate patches.
Each patch is well described in the initial email of the thread and
in its comments.
The first set of patches, 0001-0004, adds a facility to store multiple
password verifiers:
CREATE ROLE and ALTER ROLE are extended with PASSWORD VERIFIERS, a new
superuser GUC parameter specifies the list of supported password
protocols in the Postgres backend, the pg_auth_verifiers_sanitize
function is added, password verifiers for unsupported protocols are
removed in pg_upgrade, and more.
The second set of patches, 0005~0008, introduces a new protocol, SCRAM, and
0009 is SCRAM itself.

Initial Run
-------------
Included in the patches are:
- source code
- regression tests
- documentation
The source code is well commented.
The patches are in context diff format and applied correctly to
HEAD (there were 2 warnings, which were fixed by the author).
There were several markup issues, which should be fixed by the author.
Regression tests pass successfully, without errors. It seems that the 
patches work as expected.
The patch 0009 depends on all previous patches 0001-0008: first we need 
to apply patches 0001-0008, then 0009.

Performance
-----------
I have not tested possible performance issues yet.

Conclusion
--------------
I think the introduced features are useful, and I vote +1 for commit.

On 03/02/2016 02:55 PM, Michael Paquier wrote:
> On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> So the <literal> markup is missing. Thanks. I am taking note of it. 

-- 
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company




Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 2/23/16 2:17 AM, Michael Paquier wrote:

> As a continuation of the thread firstly dedicated to SCRAM:
> http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
> Here is a new thread aimed at gathering all the ideas of this previous
> thread and aimed at clarifying a bit what has been discussed until now
> regarding password protocols, verifiers, and SCRAM itself.

It looks like this patch set is a bit out of date.

When applying 0004:

$ git apply 
../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch
error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply

Then I tried to build with just 0001-0003:

cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
3318
3319
3320
3321
3322
make[3]: *** [postgres.bki] Error 1

Could you provide an updated set of patches for review?  Meanwhile I am 
marking this as "waiting for author".

Thanks,
-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
> On 2/23/16 2:17 AM, Michael Paquier wrote:
>
>> As a continuation of the thread firstly dedicated to SCRAM:
>> http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
>> Here is a new thread aimed at gathering all the ideas of this previous
>> thread and aimed at clarifying a bit what has been discussed until now
>> regarding password protocols, verifiers, and SCRAM itself.
>
>
> It looks like this patch set is a bit out of date.
>
> When applying 0004:
>
> $ git apply
> ../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch
> error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
> error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply
>
> Then I tried to build with just 0001-0003:
>
> cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
> 3318
> 3319
> 3320
> 3321
> 3322
> make[3]: *** [postgres.bki] Error 1
>
> Could you provide an updated set of patches for review?  Meanwhile I am
> marking this as "waiting for author".

Sure. I'll provide them shortly with all the comments addressed. Up to
now I just had a couple of comments about docs and whitespaces, so I
didn't really bother sending a new set, but this merits a rebase.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>> Could you provide an updated set of patches for review?  Meanwhile I am
>> marking this as "waiting for author".
>
> Sure. I'll provide them shortly with all the comments addressed. Up to
> now I just had a couple of comments about docs and whitespaces, so I
> didn't really bother sending a new set, but this meritates a rebase.

And here they are. I have addressed the documentation and the
whitespaces reported up to now at the same time.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:
Hi, All

On 03/15/2016 02:07 AM, Michael Paquier wrote:
> Sure. I'll provide them shortly with all the comments addressed. Up to
> now I just had a couple of comments about docs and whitespaces, so I
> didn't really bother sending a new set, but this meritates a rebase.
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.
I've applied all of the 0001-0009 patches from the new set to today's
master branch without any warnings.
Then I compiled with these configure options:
./configure --enable-debug --enable-nls --enable-cassert --enable-tap-tests --with-perl
All regression tests passed successfully.
make check-world passed successfully.
make installcheck-world failed on several contrib modules:
dblink, file_fdw, hstore, pgcrypto, pgstattuple, postgres_fdw,
tablefunc. The test results are attached.
Documentation looks good.
What might be causing the difference between the make check-world and
make installcheck-world results?

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Mar 15, 2016 at 3:46 PM, Valery Popov wrote:
> make installcheck-world failed on several contrib modules:
> dblink, file_fdw, hstore, pgcrypto, pgstattuple, postgres_fdw, tablefunc.
> The tests results are attached.
> Documentation looks good.
> Where may be a problem with make check-world and make installcheck-world
> results?

I cannot reproduce this, and my guess is that the binaries of those
contrib/ modules are not up to date for the installed instance of
Postgres you are running the tests on. Particularly I find this
portion doubtful:
 SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost

The set of patches I am proposing here does not go through those code
paths, and this is likely an aggregate failure.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
Hi Michael,

On 3/14/16 7:07 PM, Michael Paquier wrote:

> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
>
>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>>
>>> Could you provide an updated set of patches for review?  Meanwhile I am
>>> marking this as "waiting for author".
>>
>> Sure. I'll provide them shortly with all the comments addressed. Up to
>> now I just had a couple of comments about docs and whitespaces, so I
>> didn't really bother sending a new set, but this meritates a rebase.
> 
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.

For this first review I would like to focus on the user visible changes
introduced in 0001-0002.

First I created two new users with each type of supported verifier:

postgres=# create user test with password 'test';
CREATE ROLE
postgres=# create user testu with unencrypted password 'testu' valid until '2017-01-01';
CREATE ROLE

1) I see that rolvaliduntil is still in pg_authid:

postgres=# select oid, rolname, rolvaliduntil from pg_authid;
  oid  | rolname |     rolvaliduntil
-------+---------+------------------------
    10 | vagrant |
 16387 | test    |
 16388 | testu   | 2017-01-01 00:00:00+00

I think that's OK if we now define it to be "role validity" (it's still
password validity in the patched docs).  I would also like to see a
validuntil column in pg_auth_verifiers so we can track password
expiration for each verifier separately.  For now I think it's enough to
copy the same validity both places since there can only be one verifier.

2) I don't think the column naming in pg_auth_verifiers is consistent
with other catalogs:

postgres=# select * from pg_auth_verifiers;
 roleid | verimet |               verival
--------+---------+-------------------------------------
  16387 | m       | md505a671c66aefea124cc08b76ea6d30bb
  16388 | p       | testu

System catalogs generally use a 3 character prefix so I would expect the
columns to be (if we pick avr as a prefix):

avrrole
avrmethod
avrverifier
avrvaliduntil

I'm not a big fan of abbreviating too much so you can see I've expanded
the names a bit.

3) rolpassword is still in pg_shadow even though it is not useful anymore:

postgres=# select usename, passwd, valuntil from pg_shadow;
 usename |  passwd  |        valuntil
---------+----------+------------------------
 vagrant | ******** |
 test    | ******** |
 testu   | ******** | 2017-01-01 00:00:00+00

If anyone is actually using this column in a meaningful way they are in
for a nasty surprise when trying to use the value in passwd as a verifier.
I would prefer to drop the column entirely and produce a clear error.

Perhaps a better option would be to drop pg_shadow entirely since it
seems to have no further purpose in life.

Thanks,
-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Valery Popov
Date:
Hi!

On 03/15/2016 06:59 PM, Michael Paquier wrote:
> The set of patches I am proposing here does not go through those code 
> paths, and this is likely an aggregate failure. 
Michael, you were right. It was incorrect installation of contrib binaries.
Now all tests pass OK, both check-world and installcheck-world,
Thanks.

-- 
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Mar 15, 2016 at 6:38 PM, David Steele <david@pgmasters.net> wrote:
> Hi Michael,
>
> On 3/14/16 7:07 PM, Michael Paquier wrote:
>
>> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>>>
>>>> Could you provide an updated set of patches for review?  Meanwhile I am
>>>> marking this as "waiting for author".
>>>
>>> Sure. I'll provide them shortly with all the comments addressed. Up to
>>> now I just had a couple of comments about docs and whitespaces, so I
>>> didn't really bother sending a new set, but this merits a rebase.
>>
>> And here they are. I have addressed the documentation and the
>> whitespaces reported up to now at the same time.
>
> For this first review I would like to focus on the user visible changes
> introduced in 0001-0002.

Thanks for the input!

> 1) I see that rolvaliduntil is still in pg_authid:
> I think that's OK if we now define it to be "role validity" (it's still
> password validity in the patched docs).  I would also like to see a
> validuntil column in pg_auth_verifiers so we can track password
> expiration for each verifier separately.  For now I think it's enough to
> copy the same validity both places since there can only be one verifier.

FWIW, this is an intentional change, and my goal is to focus on only
the protocol aging for now. We will need to move rolvaliduntil to
pg_auth_verifiers if we want to allow rolling updates of password
verifiers for a given role, but that's a different patch, and we need
to think about the SQL interface carefully. This infrastructure makes
that move easier, by the way, and honestly I don't really see
what we gain now by copying the same value to two different system
catalogs.

> 2) I don't think the column naming in pg_auth_verifiers is consistent
> with other catalogs:
> postgres=# select * from pg_auth_verifiers;
>  roleid | verimet |               verival
> --------+---------+-------------------------------------
>   16387 | m       | md505a671c66aefea124cc08b76ea6d30bb
>   16388 | p       | testu
>
> System catalogs generally use a 3 character prefix so I would expect the
> columns to be (if we pick avr as a prefix):

OK, this makes sense.

> avrrole
> avrmethod
> avrverifier

Assuming "ver" is the prefix, we get: verroleid, vermethod, vervalue.
I kind of like those ones, more than with "avr" as prefix actually.
Other ideas are of course welcome.

> I'm not a big fan of abbreviating too much so you can see I've expanded
> the names a bit.

Sure.

> 3) rolpassword is still in pg_shadow even though it is not useful anymore:
> postgres=# select usename, passwd, valuntil from pg_shadow;
>
>  usename |  passwd  |        valuntil
> ---------+----------+------------------------
>  vagrant | ******** |
>  test    | ******** |
>  testu   | ******** | 2017-01-01 00:00:00+00
>
> If anyone is actually using this column in a meaningful way they are in
> for a nasty surprise when trying to use the value in passwd as a verifier.
>  I would prefer to drop the column entirely and produce a clear error.
>
> Perhaps a better option would be to drop pg_shadow entirely since it
> seems to have no further purpose in life.

We discussed that on the previous thread and the conclusion was to
keep pg_shadow, but to clobber the password value with "***",
explaining this choice:
http://www.postgresql.org/message-id/6174.1455501497@sss.pgh.pa.us
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 3/16/16 9:00 AM, Michael Paquier wrote:

> On Tue, Mar 15, 2016 at 6:38 PM, David Steele <david@pgmasters.net> wrote:
>
>> 1) I see that rolvaliduntil is still in pg_authid:
>> I think that's OK if we now define it to be "role validity" (it's still
>> password validity in the patched docs).  I would also like to see a
>> validuntil column in pg_auth_verifiers so we can track password
>> expiration for each verifier separately.  For now I think it's enough to
>> copy the same validity both places since there can only be one verifier.
> 
> FWIW, this is an intentional change, and my goal is to focus on only
> the protocol aging for now. We will need to move rolvaliduntil to
> pg_auth_verifiers if we want to allow rolling updates of password
> verifiers for a given role, but that's a different patch, and we need
> to think about the SQL interface carefully. This infrastructure makes
> that move easier, by the way, and honestly I don't really see
> what we gain now by copying the same value to two different system
> catalogs.

Here's my thinking.  If validuntil is moved to pg_auth_verifiers now
then people can start using it there.  That will make it less traumatic
when/if validuntil in pg_authid is removed later.  The field in
pg_authid could be deprecated in this release to let people know not to
use it.

Or, as I suggested it could be recast as role validity, which right now
happens to be the same as password validity.

>> 2) I don't think the column naming in pg_auth_verifiers is consistent
>> with other catalogs:
>> postgres=# select * from pg_auth_verifiers;
>>  roleid | verimet |               verival
>> --------+---------+-------------------------------------
>>   16387 | m       | md505a671c66aefea124cc08b76ea6d30bb
>>   16388 | p       | testu
>>
>> System catalogs generally use a 3 character prefix so I would expect the
>> columns to be (if we pick avr as a prefix):
> 
> OK, this makes sense.
> 
>> avrrole
>> avrmethod
>> avrverifier
> 
> Assuming "ver" is the prefix, we get: verroleid, vermethod, vervalue.
> I kind of like those ones, more than with "avr" as prefix actually.
> Other ideas are of course welcome.

ver is fine as a prefix.

>> 3) rolpassword is still in pg_shadow even though it is not useful anymore:
>> postgres=# select usename, passwd, valuntil from pg_shadow;
>>
>>  usename |  passwd  |        valuntil
>> ---------+----------+------------------------
>>  vagrant | ******** |
>>  test    | ******** |
>>  testu   | ******** | 2017-01-01 00:00:00+00
>>
>> If anyone is actually using this column in a meaningful way they are in
>> for a nasty surprise when trying to use the value in passwd as a verifier.
>>  I would prefer to drop the column entirely and produce a clear error.
>>
>> Perhaps a better option would be to drop pg_shadow entirely since it
>> seems to have no further purpose in life.
> 
> We discussed that on the previous thread and the conclusion was to
> keep pg_shadow, but to clobber the password value with "***",
> explaining this choice:
> http://www.postgresql.org/message-id/6174.1455501497@sss.pgh.pa.us

Ah, I missed that one.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
Hi Michael,

On 3/14/16 7:07 PM, Michael Paquier wrote:
> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>>> Could you provide an updated set of patches for review?  Meanwhile I am
>>> marking this as "waiting for author".
>>
>> Sure. I'll provide them shortly with all the comments addressed. Up to
>> now I just had a couple of comments about docs and whitespaces, so I
>> didn't really bother sending a new set, but this merits a rebase.
> 
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.

Here's my full review of this patch set.

First let me thank you for submitting this patch for the current CF.  I
feel a bit guilty that I requested it and am only now posting a full
review.  In my defense I can only say that being CFM has been rather
more work than I was expecting, but I'm sure you know the feeling.

* [PATCH 1/9] Add facility to store multiple password verifiers

This is a pretty big patch but I went through it carefully and found
nothing to complain about.  Your attention to detail is impressive as
always.

Be sure to update the column names for pg_auth_verifiers as we discussed
in [1].

* [PATCH 2/9] Introduce password_protocols

diff --git a/src/test/regress/expected/password.out
b/src/test/regress/expected/password.out
+SET password_protocols = 'plain';
+ALTER ROLE role_passwd5 PASSWORD VERIFIERS (plain = 'foo'); -- ok
+ALTER ROLE role_passwd5 PASSWORD VERIFIERS (md5 = 'foo'); -- error
+ERROR:  specified password protocol not allowed
+DETAIL:  List of authorized protocols is specified by password_protocols.

So that makes sense but you get the same result if you do:

postgres=# alter user role_passwd5 password 'foo';
ERROR:  specified password protocol not allowed
DETAIL:  List of authorized protocols is specified by password_protocols.

I don't think this makes sense - if I have explicitly set
password_protocols to 'plain' and I don't specify a verifier for alter
user then it seems like it should work.  If nothing else the error
message lacks information needed to identify the problem.

* [PATCH 3/9] Add pg_auth_verifiers_sanitize

This function is just a little scary but since password_protocols
defaults to 'plain,md5' I can live with it.

* [PATCH 4/9] Remove password verifiers for unsupported protocols in
pg_upgrade

Same as above - it will always be important for password_protocols to
default to *all* protocols to avoid data being dropped during the
pg_upgrade by accident.  You've done that here (and later in the SCRAM
patch) so I'm satisfied but it bears watching.

What I would do is add some extra comments in the GUC code to make it
clear to always update the default when adding new verifiers.

* [PATCH 5/9] Move sha1.c to src/common

This looks fine to me and is a good reuse of code.

* [PATCH 6/9] Refactor sendAuthRequest

I tested this across different client versions and it seems to work fine.

* [PATCH 7/9] Refactor RandomSalt to handle salts of different lengths

A simple enough refactor.

* [PATCH 8/9] Move encoding routines to src/common/

A bit surprising that these functions were never used by any front end code.

* Subject: [PATCH 9/9] SCRAM authentication

diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
@@ -1616,18 +1619,34 @@ FlattenPasswordIdentifiers(List *verifiers, char
*rolname)
         * instances of Postgres, an md5 hash passed as a plain verifier
         * should still be treated as an MD5 entry.
         */
-        if (spec->veriftype == AUTH_VERIFIER_MD5 &&
-            !isMD5(spec->value))
+        switch (spec->veriftype)
         {
-            char encrypted_passwd[MD5_PASSWD_LEN + 1];
-            if (!pg_md5_encrypt(spec->value, rolname, strlen(rolname),
-                                encrypted_passwd))
-                elog(ERROR, "password encryption failed");
-            spec->value = pstrdup(encrypted_passwd);
+            case AUTH_VERIFIER_MD5:

It seems like this case statement should have been introduced in patch
0001.  Were you just trying to avoid churn in the code unless SCRAM is
committed?
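
For readers following along, the hunk above decides whether a value is
already an MD5 verifier or still needs to be hashed. A minimal sketch of
that kind of shape check follows, assuming a verifier is the literal
prefix "md5" followed by 32 hex digits, like the
md505a671c66aefea124cc08b76ea6d30bb value shown earlier in the thread.
The name looks_like_md5_verifier and both macros are hypothetical, and
the hex-digit test is an extra assumption of this sketch; this is not
the isMD5 used by the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
#define MD5_VERIFIER_PREFIX     "md5"
#define MD5_VERIFIER_PREFIX_LEN 3
#define MD5_VERIFIER_LEN        35      /* "md5" + 32 hex digits */

/*
 * Return true if "value" has the shape of an MD5 verifier: the "md5"
 * prefix followed by exactly 32 hexadecimal digits.
 */
static bool
looks_like_md5_verifier(const char *value)
{
    size_t      i;

    assert(value != NULL);

    if (strlen(value) != MD5_VERIFIER_LEN)
        return false;
    if (strncmp(value, MD5_VERIFIER_PREFIX, MD5_VERIFIER_PREFIX_LEN) != 0)
        return false;

    /* everything after the prefix must be a hex digit */
    for (i = MD5_VERIFIER_PREFIX_LEN; i < MD5_VERIFIER_LEN; i++)
    {
        if (!isxdigit((unsigned char) value[i]))
            return false;
    }
    return true;
}
```

With a check like this, a value that already has the verifier shape
would be stored as-is, while anything else (for example 'foo') would be
hashed first.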

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
+
+static char *
+read_attr_value(char **input, char attr)
+{

Numerous functions like the above in auth-scram.c do not have comments.

diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
+    else if (strcmp(token->string, "scram") == 0)
+    {
+        if (Db_user_namespace)
+        {
+            ereport(LOG,
+                    (errcode(ERRCODE_CONFIG_FILE_ERROR),
+                     errmsg("SCRAM authentication is not supported when
\"db_user_namespace\" is enabled"),
+                     errcontext("line %d of configuration file \"%s\"",
+                                line_num, HbaFileName)));
+            return NULL;
+        }
+        parsedline->auth_method = uaSASL;
+    }

Why is that?  Is it because gss auth should be expected in this case or
some limitation of SCRAM?  Anyway, it wasn't clear to me why this would
be true so some comments here would be good.

diff --git a/src/common/scram-common.c b/src/common/scram-common.c
+void
+scram_HMAC_update(scram_HMAC_ctx *ctx, const char *str, int slen)
+{
+    SHA1Update(&ctx->sha1ctx, (const uint8 *) str, slen);
+}

Same in scram-common.c WRT comments.

diff --git a/src/include/common/scram-common.h
b/src/include/common/scram-common.h
+extern void scram_ClientOrServerKey(const char *password, const char
*salt, int saltlen, int iterations, const char *keystr, uint8 *result);

My, that's a very long line!

* A few general things:

Most of the new scram modules are seriously in need of better comments -
I pointed out a few but all the new files suffer from this lack.

The strings "plain", "md5", and "scram" are used often enough that I
think it would be nice if they were constants.  I feel the same way
about verifier methods 'm', 'p', 's' -- perhaps more so because they
aren't very verbose.

It looks like this will need a bit of work if the GSSAPI patch goes in
(and vice versa).  Not a problem but you'll need to be prepared to do
that quickly in the event - time is flying.

-- 
-David
david@pgmasters.net

[1]
http://www.postgresql.org/message-id/CAB7nPqSGm-9c4yFULt4GS9TzoSuz8XbO-K7TGGGw08sztfG2Uw@mail.gmail.com



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Mar 18, 2016 at 3:16 AM, David Steele <david@pgmasters.net> wrote:
> Here's my full review of this patch set.

Thanks!

> First let me thank you for submitting this patch for the current CF.  I
> feel a bit guilty that I requested it and am only now posting a full
> review.  In my defense I can only say that being CFM has been rather
> more work than I was expecting, but I'm sure you know the feeling.

I get the idea. That's a very draining activity and I can see what you
are doing. That's impressive. Really.

> * [PATCH 1/9] Add facility to store multiple password verifiers
>
> This is a pretty big patch but I went through it carefully and found
> nothing to complain about.  Your attention to detail is impressive as
> always.
>
> Be sure to update the column names for pg_auth_verifiers as we discussed
> in [1].

Done. I have added as well the block of 0009 you pointed out into this
patch for clarity.

> * [PATCH 2/9] Introduce password_protocols
>
> diff --git a/src/test/regress/expected/password.out
> b/src/test/regress/expected/password.out
> +SET password_protocols = 'plain';
> +ALTER ROLE role_passwd5 PASSWORD VERIFIERS (plain = 'foo'); -- ok
> +ALTER ROLE role_passwd5 PASSWORD VERIFIERS (md5 = 'foo'); -- error
> +ERROR:  specified password protocol not allowed
> +DETAIL:  List of authorized protocols is specified by password_protocols.
>
> So that makes sense but you get the same result if you do:
>
> postgres=# alter user role_passwd5 password 'foo';
> ERROR:  specified password protocol not allowed
> DETAIL:  List of authorized protocols is specified by password_protocols.
>
> I don't think this makes sense - if I have explicitly set
> password_protocols to 'plain' and I don't specify a verifier for alter
> user then it seems like it should work.  If nothing else the error
> message lacks information needed to identify the problem.

Hm. The problem here is the interaction between the new
password_protocols and the existing password_encryption.
password_protocols implies that password_encryption should not
contain elements not listed in it, in short password_protocols @>
password_encryption. So I think that the GUC callbacks checking the
validity of those parameter values should each check that the other
parameter is not set to an incompatible value. One thing to simplify
those validity checks would be to make password_protocols
PGC_POSTMASTER, aka it needs a restart to be updated. This sacrifices
a large portion of the regression tests though... Do others have
thoughts to share? I have not updated the patch yet, and I would
personally leave both parameters as they are now, aka
password_protocols as PGC_SUSET and password_encryption as
PGC_USERSET, and check their validity when they are updated, but I am
not alone here (hopefully).
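
To make the containment rule concrete, here is a minimal sketch of the
check that password_protocols @> password_encryption, assuming
password_protocols is a comma-separated list such as 'plain,md5' and
password_encryption names a single protocol. Both function names are
hypothetical and this is not code from the patch; a real GUC check hook
would also have to deal with the order in which the two parameters are
set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if "name" appears as a whole element of the
 * comma-separated list "list" (e.g. "plain,md5").  Spaces around
 * elements are ignored.  Hypothetical helper, not from the patch.
 */
static bool
protocol_list_contains(const char *list, const char *name)
{
    size_t      namelen;
    const char *p;

    assert(list != NULL && name != NULL);

    namelen = strlen(name);
    p = list;
    while (*p != '\0')
    {
        const char *end;
        size_t      len;

        /* skip leading spaces of this element */
        while (*p == ' ')
            p++;

        /* find the end of this element */
        end = strchr(p, ',');
        if (end == NULL)
            end = p + strlen(p);

        /* trim trailing spaces */
        len = (size_t) (end - p);
        while (len > 0 && p[len - 1] == ' ')
            len--;

        if (len == namelen && strncmp(p, name, len) == 0)
            return true;

        /* advance past the comma, if any */
        p = (*end == ',') ? end + 1 : end;
    }
    return false;
}

/*
 * Cross-check: a password_encryption value is acceptable only if it
 * is listed in password_protocols.
 */
static bool
encryption_is_allowed(const char *password_protocols,
                      const char *password_encryption)
{
    return protocol_list_contains(password_protocols, password_encryption);
}
```

Under these assumptions, encryption_is_allowed("plain", "md5") is false,
which is the situation David hit with a bare ALTER USER ... PASSWORD.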

> * [PATCH 3/9] Add pg_auth_verifiers_sanitize
>
> This function is just a little scary but since password_protocols
> defaults to 'plain,md5' I can live with it.

Another thing that I thought about was to integrate it as part of
pg_upgrade_support. That's no big deal to do it this way as well,
though I thought that it could be useful for admins. So extra ideas
are welcome. That's superuser-only anyway... And a critical part to
manage old protocol deprecation.
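
As a rough illustration of what such a sanitize pass does, the sketch
below drops every verifier whose method is not in an allowed set.
Everything here is an assumption for illustration: the VerifierEntry
struct, the sanitize_verifiers name, and the use of a string of
one-character method codes ('p', 'm') to express the allowed set. The
real function works on the pg_auth_verifiers catalog, not on an
in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory image of one pg_auth_verifiers row. */
typedef struct VerifierEntry
{
    unsigned int roleid;        /* role OID */
    char         method;        /* 'p' = plain, 'm' = md5 */
    const char  *value;         /* verifier value */
} VerifierEntry;

/*
 * Keep only the entries whose method code appears in allowed_methods
 * (e.g. "pm").  Entries are compacted in place and the number of
 * entries kept is returned.
 */
static size_t
sanitize_verifiers(VerifierEntry *entries, size_t nentries,
                   const char *allowed_methods)
{
    size_t      i;
    size_t      kept = 0;

    assert(entries != NULL && allowed_methods != NULL);

    for (i = 0; i < nentries; i++)
    {
        /* strchr() would match the terminator for '\0', so reject it */
        if (entries[i].method != '\0' &&
            strchr(allowed_methods, entries[i].method) != NULL)
            entries[kept++] = entries[i];
    }
    return kept;
}
```

The sketch also shows why the default matters: with an allowed set that
omits a method, every verifier of that method is silently discarded.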

> * [PATCH 4/9] Remove password verifiers for unsupported protocols in
> pg_upgrade
>
> Same as above - it will always be important for password_protocols to
> default to *all* protocols to avoid data being dropped during the
> pg_upgrade by accident.  You've done that here (and later in the SCRAM
> patch) so I'm satisfied but it bears watching.

We could have an extra keyword like "all" mapping to all the
existing protocols, but I find listing the protocols explicitly a more
verbose and simple concept, that's why I chose that.

> What I would do is add some extra comments in the GUC code to make it
> clear to always update the default when adding new verifiers.

Good idea.

> * [PATCH 5/9] Move sha1.c to src/common
>
> This looks fine to me and is a good reuse of code.

Yes.

> * [PATCH 6/9] Refactor sendAuthRequest
>
> I tested this across different client versions and it seems to work fine.

OK, cool!

> * [PATCH 7/9] Refactor RandomSalt to handle salts of different lengths
>
> A simple enough refactor.

That's something we should do as an independent change I think.

> * [PATCH 8/9] Move encoding routines to src/common/
>
> A bit surprising that these functions were never used by any front end code.

Perhaps there are some client tools that copy-paste it. I cannot be
sure. At least it seems to me that this is useful enough as an
independent change.

> * Subject: [PATCH 9/9] SCRAM authentication
>
> diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
> @@ -1616,18 +1619,34 @@ FlattenPasswordIdentifiers(List *verifiers, char
> *rolname)
>                  * instances of Postgres, an md5 hash passed as a plain verifier
>                  * should still be treated as an MD5 entry.
>                  */
> -               if (spec->veriftype == AUTH_VERIFIER_MD5 &&
> -                       !isMD5(spec->value))
> +               switch (spec->veriftype)
>                 {
> -                       char encrypted_passwd[MD5_PASSWD_LEN + 1];
> -                       if (!pg_md5_encrypt(spec->value, rolname, strlen(rolname),
> -                                                               encrypted_passwd))
> -                               elog(ERROR, "password encryption failed");
> -                       spec->value = pstrdup(encrypted_passwd);
> +                       case AUTH_VERIFIER_MD5:
>
> It seems like this case statement should have been introduced in patch
> 0001.  Were you just trying to avoid churn in the code unless SCRAM is
> committed?

Yeah, right. I have now plugged this portion into 0001.

> diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
> +
> +static char *
> +read_attr_value(char **input, char attr)
> +{
>
> Numerous functions like the above in auth-scram.c do not have comments.

Noted. I have done nothing on that yet though :) And I am lowering the
priority for 0009 in this CF to keep focus on the core machinery
instead, as well as other patches that need feedback.

> diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
> +       else if (strcmp(token->string, "scram") == 0)
> +       {
> +               if (Db_user_namespace)
> +               {
> +                       ereport(LOG,
> +                                       (errcode(ERRCODE_CONFIG_FILE_ERROR),
> +                                        errmsg("SCRAM authentication is not supported when
> \"db_user_namespace\" is enabled"),
> +                                        errcontext("line %d of configuration file \"%s\"",
> +                                                               line_num, HbaFileName)));
> +                       return NULL;
> +               }
> +               parsedline->auth_method = uaSASL;
> +       }
>
> Why is that?  Is it because gss auth should be expected in this case or
> some limitation of SCRAM?  Anyway, it wasn't clear to me why this would
> be true so some comments here would be good.

The username is part of the identifier used as part of the protocol,
so we cannot rely on mappings of db_user_namespace.

> diff --git a/src/common/scram-common.c b/src/common/scram-common.c
> +void
> +scram_HMAC_update(scram_HMAC_ctx *ctx, const char *str, int slen)
> +{
> +       SHA1Update(&ctx->sha1ctx, (const uint8 *) str, slen);
> +}
>
> Same in scram-common.c WRT comments.

OK, noted. I have not updated those comments yet though. At this stage
of the game considering 0009 for integration is a rather difficult
task, and I suspect there is enough work with the underlying patches.
For 9.6, I would be happy enough if we got the basic infra in core.

> diff --git a/src/include/common/scram-common.h
> b/src/include/common/scram-common.h
> +extern void scram_ClientOrServerKey(const char *password, const char
> *salt, int saltlen, int iterations, const char *keystr, uint8 *result);
>
> My, that's a very long line!

Oops. Sorry.

> * A few general things:
>
> Most of the new scram modules are seriously in need of better comments -
> I pointed out a few but all the new files suffer from this lack.

Indeed. Honestly, as you say, time flies, and by the time of the
feature freeze I am thinking that the only sane target for the CF
would be to focus on 0001~0004. That's the basic infrastructure I
think we need anyway. 0005~0008 are things that I think are useful
taken independently and are simple refactoring, so they could be
considered with the time frame we have. 0009 is a bit too complex. I
expect enough comments on the first patches to keep me busy until
the end of this CF even without it, though it is still useful for
testing by the way.

> The strings "plain", "md5", and "scram" are used often enough that I
> think it would be nice if they were constants.

This makes sense. So I switched the code this way. Note that for md5 I
think that it makes sense to use a #define variable when referring to
the verifier method, not when referring to the prefix of a md5
verifier. Those full names are added in pg_auth_verifiers.h.

> I feel the same way
> about verifier methods 'm', 'p', 's' -- perhaps more so because they
> aren't very verbose.

I am thinking of the verifier abbreviations in the system catalog in a
way similar to pg_class' relkind, explaining the one-character
identifier, so I would like to leave them as-is.
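
A minimal sketch of how the two sets of names could sit side by side,
assuming the one-character codes 'p', 'm' and 's' mentioned in the
review and the full names "plain", "md5" and "scram". All macro and
function names below are hypothetical and are not the ones added to
pg_auth_verifiers.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Full protocol names, as written in SQL and in password_protocols. */
#define VERIFIER_NAME_PLAIN     "plain"
#define VERIFIER_NAME_MD5       "md5"
#define VERIFIER_NAME_SCRAM     "scram"

/* One-character method codes stored in the catalog, relkind-style. */
#define VERIFIER_METHOD_PLAIN   'p'
#define VERIFIER_METHOD_MD5     'm'
#define VERIFIER_METHOD_SCRAM   's'

/* Map a protocol name to its catalog code; returns '\0' if unknown. */
static char
verifier_method_from_name(const char *name)
{
    assert(name != NULL);

    if (strcmp(name, VERIFIER_NAME_PLAIN) == 0)
        return VERIFIER_METHOD_PLAIN;
    if (strcmp(name, VERIFIER_NAME_MD5) == 0)
        return VERIFIER_METHOD_MD5;
    if (strcmp(name, VERIFIER_NAME_SCRAM) == 0)
        return VERIFIER_METHOD_SCRAM;
    return '\0';
}

/* Map a catalog code back to its protocol name; NULL if unknown. */
static const char *
verifier_name_from_method(char method)
{
    switch (method)
    {
        case VERIFIER_METHOD_PLAIN:
            return VERIFIER_NAME_PLAIN;
        case VERIFIER_METHOD_MD5:
            return VERIFIER_NAME_MD5;
        case VERIFIER_METHOD_SCRAM:
            return VERIFIER_NAME_SCRAM;
        default:
            return NULL;
    }
}
```

Keeping both mappings in one place means that adding a protocol touches
a single header rather than scattered string and character literals.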

> It looks like this will need a bit of work if the GSSAPI patch goes in
> (and vice versa).  Not a problem but you'll need to be prepared to do
> that quickly in the event - time is flying.

That's not an issue for me to rebase this set of patches. The only
conflicts that I anticipate are on 0009, but I don't have high hopes
to get this portion integrated into core for 9.6, the rest of the
patches is complicated enough, and everyone's bandwidth is limited.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> That's not an issue for me to rebase this set of patches. The only
> conflicts that I anticipate are on 0009, but I don't have high hopes
> to get this portion integrated into core for 9.6, the rest of the
> patches is complicated enough, and everyone's bandwidth is limited.

I really think we ought to consider pushing this whole thing out to
9.7.  I don't see how we're going to get all of this into 9.6, and
these are big, user-facing changes that I don't think we should rush
into under time pressure.  I think it'd be better to do this early in
the 9.7 cycle so that it has time to settle before the time crunch at
the end.  I predict this is going to have a lot of loose ends that are
going to take months to settle, and we don't have that time right now.
And I'd rather see all of the changes in one release than split them
across two releases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Sat, Mar 19, 2016 at 12:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> That's not an issue for me to rebase this set of patches. The only
>> conflicts that I anticipate are on 0009, but I don't have high hopes
>> to get this portion integrated into core for 9.6, the rest of the
>> patches is complicated enough, and everyone's bandwidth is limited.
>
> I really think we ought to consider pushing this whole thing out to
> 9.7.  I don't see how we're going to get all of this into 9.6, and
> these are big, user-facing changes that I don't think we should rush
> into under time pressure.  I think it'd be better to do this early in
> the 9.7 cycle so that it has time to settle before the time crunch at
> the end.  I predict this is going to have a lot of loose ends that are
> going to take months to settle, and we don't have that time right now.
> And I'd rather see all of the changes in one release than split them
> across two releases.

FWIW, the catalog separation is not that complicated a patch, and
it is really a change independent of SCRAM, the main matter being to
manage critical index and relation entries correctly. It does not
touch the authentication code, which is what IMO is the sensitive
part. The catalog separation opens the door as well to multiple
verifiers for the same protocol for a single role, facilitating
password rolling policies, which is a feature that has been requested
a lot. Nothing prevents moving validuntil into pg_auth_verifiers in
parallel with the SCRAM work during the 9.7 release cycle, though
having some basic infra in place would facilitate it. Just
my 2c.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Stephen Frost
Date:
Robert, all,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > That's not an issue for me to rebase this set of patches. The only
> > conflicts that I anticipate are on 0009, but I don't have high hopes
> > to get this portion integrated into core for 9.6, the rest of the
> > patches is complicated enough, and everyone's bandwidth is limited.
>
> I really think we ought to consider pushing this whole thing out to
> 9.7.  I don't see how we're going to get all of this into 9.6, and
> these are big, user-facing changes that I don't think we should rush
> into under time pressure.  I think it'd be better to do this early in
> the 9.7 cycle so that it has time to settle before the time crunch at
> the end.  I predict this is going to have a lot of loose ends that are
> going to take months to settle, and we don't have that time right now.

I'm not sure that I agree with the above.  This patch has been through
the wringer multiple times regarding the user-facing bits and, by and
large, the results appear reasonable.  Further, getting a better auth
method into PG is something which I do view as a priority considering
the concerns and complaints that have been, justifiably, raised against
our current password-based authentication support.

This isn't a new patch set either, it was submitted initially over the
summer after it was pointed out, over a year ago, that people actually
do care about the problems with our current implementation (amusingly, I
recall having pointed out the same 5+ years ago, but only did so to this
list).

I've been following along on this patch set and asked David to spend
time reviewing it as I feel that it's still got a chance for 9.6, since
it's been through multiple CF rounds and has had a fair bit of
discussion, review, and consideration.

> And I'd rather see all of the changes in one release than split them
> across two releases.

I agree with this.  If we aren't going to get SCRAM into 9.6 then the
rest is just breaking things with little benefit.  I'm optimistic that
we will be able to include SCRAM support in 9.6, but if that ends up not
being feasible then we need to push all of the changes to the next
release.

I do think that if we push this off to 9.7 then we're going to have
SCRAM *plus* a bunch of other changes around password policies in that
release, and it'd be better to introduce SCRAM independently of the
other changes.

All that said, this is just my voice from having followed this thread
and discussing it with David and I'm not trying to force anything.  It'd
certainly be nice to have and to be able to tell people that we do have
a strong and recognized approach to password-based authentication in PG,
but I've long been telling everyone that they should be using GSSAPI
and/or SSL and can continue to do so for another year if necessary.

Thanks!

Stephen

Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Fri, Mar 18, 2016 at 2:12 PM, Stephen Frost <sfrost@snowman.net> wrote:
> I'm not sure that I agree with the above.  This patch has been through
> the wringer multiple times regarding the user-facing bits and, by and
> large, the results appear reasonable.  Further, getting a better auth
> method into PG is something which I do view as a priority considering
> the concerns and complaints that have been, justifiably, raised against
> our current password-based authentication support.
>
> This isn't a new patch set either, it was submitted initially over the
> summer after it was pointed out, over a year ago, that people actually
> do care about the problems with our current implementation (amusingly, I
> recall having pointed out the same 5+ years ago, but only did so to this
> list).

I am not disputing the importance of the topic, and I do realize that
the patch has been around in some form since March.  However, I don't
think there's been a whole heck of a lot in terms of detailed
code-level review, and I think that's pretty important for something
that necessarily involves wire protocol changes.  Doing that with the
level of detail and care that it seems to me to require seems like an
almost-impossible task.  Most of the major features I've committed
this CommitFest are patches where I've personally done multiple rounds
of review on over the last several months, and in many cases, other
people have been doing code reviews for months before that.  I'm not
denying that this patch has prompted a good deal of discussion and
what I would call design review, but detailed code review?  I just
haven't seen much of that.

>> And I'd rather see all of the changes in one release than split them
>> across two releases.
>
> I agree with this.  If we aren't going to get SCRAM into 9.6 then the
> rest is just breaking things with little benefit.  I'm optimistic that
> we will be able to include SCRAM support in 9.6, but if that ends up not
> being feasible then we need to put all of the changes to the next
> release.

OK, glad we agree on that.

> I do think that if we push this off to 9.7 then we're going to have
> SCRAM *plus* a bunch of other changes around password policies in that
> release, and it'd be better to introduce SCRAM independently of the
> other changes.

Well, for my part, I'd be happy enough to do all of that in a release
cycle - maybe SCRAM at the beginning and those other changes a little
later on.  I don't see that as a real conflict, and in fact, sometimes
when you do several things like that in a single cycle, people start
to see whatever the common theme is - security, say - as part of the
message of that release a little more than they would if a feature
lands here and another there.  That's not all a bad thing.

> All that said, this is just my voice from having followed this thread
> and discussing it with David and I'm not trying to force anything.  It'd
> certainly be nice to have and to be able to tell people that we do have
> a strong and recognized approach to password-based authentication in PG,
> but I've long been telling everyone that they should be using GSSAPI
> and/or SSL and can continue to do so for another year if necessary.

I agree it's unfortunate, but IMHO that's kinda where we are at.  If
Heikki were still involved and had been working on this, I strongly
suspect it would have been committed already.  But he's not, and it's
not clear when or if he's coming back, and I cannot imagine how we are
going to begin and complete pushing in a feature of this magnitude in
the three weeks before feature freeze without a lot of collateral
damage.  That is an opinion, not a fact, but it's one I feel pretty
confident about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Sat, Mar 19, 2016 at 3:52 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Mar 18, 2016 at 2:12 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> I'm not sure that I agree with the above.  This patch has been through
>> the wringer multiple times regarding the user-facing bits and, by and
>> large, the results appear reasonable.  Further, getting a better auth
>> method into PG is something which I do view as a priority considering
>> the concerns and complaints that have been, justifiably, raised against
>> our current password-based authentication support.
>>
>> This isn't a new patch set either, it was submitted initially over the
>> summer after it was pointed out, over a year ago, that people actually
>> do care about the problems with our current implementation (amusingly, I
>> recall having pointed out the same 5+ years ago, but only did so to this
>> list).
>
> I am not disputing the importance of the topic, and I do realize that
> the patch has been around in some form since March.  However, I don't
> think there's been a whole heck of a lot in terms of detailed
> code-level review, and I think that's pretty important for something
> that necessarily involves wire protocol changes.

Yep, that's the desert here, though there are surely a lot of people
who would like a way to get out of md5 and into something more modern
(see STIG), and many companies want to get something, my company
included. This is really a complicated task though, and there are few
people who could really help out here I guess.

> Doing that with the
> level of detail and care that it seems to me to require seems like an
> almost-impossible task.  Most of the major features I've committed
> this CommitFest are patches where I've personally done multiple rounds
> of review on over the last several months, and in many cases, other
> people have been doing code reviews for months before that.  I'm not
> denying that this patch has prompted a good deal of discussion and
> what I would call design review, but detailed code review?  I just
> haven't seen much of that.

There has been none, as well as no real discussion regarding what we
want to do. The current result, particularly for the management of
protocol aging, is based on things I wrote by myself which negate the
many negative opinions received up to now for the past patches (mainly
the feedback was "I don't like that", without real output or fresh
ideas during discussion to explain why that's the case).

>>> And I'd rather see all of the changes in one release than split them
>>> across two releases.
>>
>> I agree with this.  If we aren't going to get SCRAM into 9.6 then the
>> rest is just breaking things with little benefit.  I'm optimistic that
>> we will be able to include SCRAM support in 9.6, but if that ends up not
>> being feasible then we need to put all of the changes to the next
>> release.
>
> OK, glad we agree on that.

Speaking as a co-author of the stuff of this thread, the two main
patches are 0001, introducing pg_auth_verifiers and 0009, adding
SCRAM-SHA1. The rest is just refactoring and addition of a couple of
utilities to manage the protocol aging, which are really
straight-forward, and all the user-visible changes are introduced by
0001. While I really like the shape of 0001, 0009 is not there yet and
really requires more than 3 weeks, which is more than I can do before
the 9.6 feature freeze. So if the conclusion is that, without SCRAM,
all the other changes don't make much sense, let's bump it to 9.7.
There is honestly still interest from here, and I would guess
that the only thing I could do on top of having patches for the first
CF of 9.7 is discussing the topic at the dev unconference of PGCon.

>> I do think that if we push this off to 9.7 then we're going to have
>> SCRAM *plus* a bunch of other changes around password policies in that
>> release, and it'd be better to introduce SCRAM independently of the
>> other changes.
>
> Well, for my part, I'd be happy enough to do all of that in a release
> cycle - maybe SCRAM at the beginning and those other changes a little
> later on.  I don't see that as a real conflict, and in fact, sometimes
> when you do several things like that in a single cycle, people start
> to see whatever the common theme is - security, say - as part of the
> message of that release a little more than they would if a feature
> lands here and another there.  That's not all a bad thing.

Having a centralized theme for a given release cycle is not a bad
thing, I agree. And I'd like to think that the same discussion is not
going to happen again in one year...
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Sat, Mar 19, 2016 at 8:30 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> Doing that with the
>> level of detail and care that it seems to me to require seems like an
>> almost-impossible task.  Most of the major features I've committed
>> this CommitFest are patches where I've personally done multiple rounds
>> of review on over the last several months, and in many cases, other
>> people have been doing code reviews for months before that.  I'm not
>> denying that this patch has prompted a good deal of discussion and
>> what I would call design review, but detailed code review?  I just
>> haven't seen much of that.
>
> There has been none, as well as no real discussion regarding what we
> want to do. The current result, particularly for the management of
> protocol aging, is based on things I wrote by myself which negate the
> many negative opinions received up to now for the past patches (mainly
> the feedback was "I don't like that", without real output or fresh
> ideas during discussion to explain why that's the case).

Well, I said before and I'll say again that I don't like the idea of
multiple password verifiers.  I think that's an accident waiting to
happen, and I'm not prepared to put in the amount of time and energy
that it would take to get that feature committed despite not wanting
it myself, or for being responsible for it afterwards.  I'd prefer we
didn't do it at all, although I'm not going to dig in my heels.  I
might be willing to deal with SCRAM itself, but this whole area is not
my strongest suit.  So ideally some other committer would be willing
to pick this up.

But the problem isn't even just that somebody has to hit the final
commit button - as we've both said, there's a woeful lack of any
meaningful review on this thread, and this sort of change really needs
quite a lot of review.  This has implications for
backward-compatibility, for connectors that don't use libpq, etc.
Really, I'm not even sure we have consensus on the direction.  I mean,
Heikki's proposal to adopt SCRAM sounds good enough at a broad level,
but I don't really know what the alternatives are, I'm mostly just
taking his word for it, and like you say, there's been a fair amount
of miscellaneous negativity floating around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Mar 21, 2016 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Well, I said before and I'll say again that I don't like the idea of
> multiple password verifiers.  I think that's an accident waiting to
> happen, and I'm not prepared to put in the amount of time and energy
> that it would take to get that feature committed despite not wanting
> it myself, or for being responsible for it afterwards.  I'd prefer we
> didn't do it at all, although I'm not going to dig in my heels.  I
> might be willing to deal with SCRAM itself, but this whole area is not
> my strongest suit.  So ideally some other committer would be willing
> to pick this up.

I won't bet my hand on that.

> But the problem isn't even just that somebody has to hit the final
> commit button - as we've both said, there's a woeful lack of any
> meaningful review on this thread, and this sort of change really needs
> quite a lot of review.

Yep.

> This has implications for
> backward-compatibility, for connectors that don't use libpq, etc.
> Really, I'm not even sure we have consensus on the direction.  I mean,
> Heikki's proposal to adopt SCRAM sounds good enough at a broad level,
> but I don't really know what the alternatives are, I'm mostly just
> taking his word for it, and like you say, there's been a fair amount
> of miscellaneous negativity floating around.

PAKE or J-PAKE are other alternatives I have in mind.

I have marked the patch as returned with feedback.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Magnus Hagander
Date:
On Tue, Mar 22, 2016 at 2:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Mar 21, 2016 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Well, I said before and I'll say again that I don't like the idea of
>> multiple password verifiers.  I think that's an accident waiting to
>> happen, and I'm not prepared to put in the amount of time and energy
>> that it would take to get that feature committed despite not wanting
>> it myself, or for being responsible for it afterwards.  I'd prefer we
>> didn't do it at all, although I'm not going to dig in my heels.  I
>> might be willing to deal with SCRAM itself, but this whole area is not
>> my strongest suit.  So ideally some other committer would be willing
>> to pick this up.
>
> I won't bet my hand on that.

In principle I'd be happy to look at it, but I doubt that I will have
enough time to get it done within this CF, unfortunately. Thus I'd rather
not commit to doing it. It kind of fell off my radar too long ago, as I
was originally planning to look at it back in the autumn, but failed.

So basically, if somebody else has the cycles to do it in time for 9.6, please do.


> I have marked the patch as returned with feedback.


Yeah, unfortunately I think that's probably right. Let's focus on things that have a better chance of making it. 

--

Re: Password identifiers, protocol aging and SCRAM protocol

From
Julian Markwort
Date:
----[This is a rather informal user-review]----

Here are some thoughts and experiences on using the new features. I
focused on testing the basic functionality of setting password_encryption
to scram and then generating some users with passwords. After that, I
took a look at the documentation, specifically all those parts that
mentioned "md5", but not SCRAM, so I took some time to write those down
and add my thoughts on them.

We're quite keen on seeing these features in a future release, so I 
suggest that we add these patches to the next commitfest asap in order 
to keep the discussion on this topic flowing.

For those of you who like to put the authentication method itself up for 
discussion, I'd like to add that it seems fairly simple to insert code 
for new authentication mechanisms.
In conclusion I think these patches are very useful.


My remarks follow below.

Kind regards,
Julian Markwort
julian.markwort@uni-muenster.de




Things I noticed:
1.
    When using either
        CREATE ROLE
        ALTER ROLE
    with the parameter
        ENCRYPTED
    md5 encryption is always assumed (I've come to realize that UNENCRYPTED
    always equals plain and, in the past, ENCRYPTED equaled md5 since there
    were no other options).
    I don't know if this is intended behaviour. Maybe this option should be
    omitted (or marked as deprecated in the documentation) from the
    CREATE/ALTER functions (since without this option, the
    password_encryption setting from postgresql.conf is used)
    or maybe it should have its own parameter like
        CREATE ROLE testuser WITH LOGIN ENCRYPTED 'SCRAM' PASSWORD 'test';
    so that the desired encryption is used.
    From my point of view, this would be the sensible thing to do,
    especially if different verifiers should be allowed (as proposed by
    these patches).
    In either case, a bit of text explaining the (UN)ENCRYPTED option
    should be added to the documentation of the CREATE/ALTER ROLE functions.

2.
    Documentation
    III.
        17. Server Setup and Operation
            17.2. Creating a Database Cluster: maybe list SCRAM as a
                possible method for securing the db-admin
        19. Client Authentication
            19.1. The pg_hba.conf File: SCRAM is not listed in the list of
                available auth_methods to be specified in pg_hba.conf
            19.3 Authentication Methods
                19.3.2 Password Authentication: SCRAM would belong to the
                    same category as md5 and password, as they are all
                    password-based.
        20. Database Roles
            20.2. Role Attributes: password : list SCRAM as authentication
                method as well
    VI.
        ALTER ROLE: is SCRAM also dependent on the role name for salting?
            if so, add warning.
            (it doesn't seem that way, however I'm curious as to why the
            function FlattenPasswordIdentifiers in
            src/backend/commands/user.c called by AlterRole passes rolname
            to scram_build_verifier(), when that function does absolutely
            nothing with this argument?)
        CREATE ROLE: can SCRAM also be used in the list of PASSWORD
            VERIFIERS?
    VII.
        49. System Catalogs:
            49.9 pg_auth_verifiers: Column names and types are mixed up
                in description for column vervalue:
                    explain some basic stuff about md5 maybe as well?
                    remark: the statements about the composition of the
                    string that is md5-hashed are contradictory.
                    (concatenating "bar" to "foo" results in foobar, not
                    the other way round, as it is implied in the
                    explanation of the md5 hashing), this however, is not
                    really linked to the changes introduced with these
                    patches.
                    remark: naming inconsistency: md5 vervalues are stored
                    "md5*" why don't we take the same approach and use it
                    on SCRAM hashes (i.e. "scram*" ).
                    (if this is a general convention thing, please ignore
                    this comment, however I couldn't find anything in the
                    relevant RFCs while skimming through them).
        50. Frontend/Backend Protocol
            50.2.1 Start-up: add explanation for
                "AuthenticationSCRAMPassword" authentication request
                message. (?)
            50.5 message formats: see 50.2.1




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Mar 30, 2016 at 1:44 AM, Julian Markwort
<julian.markwort@uni-muenster.de> wrote:
> ----[This is a rather informal user-review]----
>
> Here are some thoughts and experiences on using the new features. I focused
> on testing the basic functionality of setting password_encryption to scram
> and then generating some users with passwords. After that, I took a look at
> the documentation, specifically all those parts that mentioned "md5", but
> not SCRAM, so I took some time to write those down and add my thoughts on
> them.
>
> We're quite keen on seeing these features in a future release, so I suggest
> that we add these patches to the next commitfest asap in order to keep the
> discussion on this topic flowing.
>
> For those of you who like to put the authentication method itself up for
> discussion, I'd like to add that it seems fairly simple to insert code for
> new authentication mechanisms.
> In conclusion I think these patches are very useful.

The reception of the concept of multiple password verifiers for a
single role was rather... cold. So unless a committer pushes hard
for it, it is never going to show up. There is clear consensus that SCRAM
is something needed though, so we may as well just focus on that.

> Things I noticed:
> 1.
>     when using either
>         CREATE ROLE
>         ALTER ROLE
>     with the parameter
>         ENCRYPTED
>     md5 encryption is always assumed (I've come to realize that UNENCRYPTED
> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
> no other options)

Yes, that's to match the current behavior, and make something fully
backward-compatible. Switching to md5 + scram may have made sense as
well though.

>     I don't know if this is intended behaviour.

This is an intended behavior.

> Maybe this option should be
> omitted (or marked as deprecated in the documentation) from the CREATE/ALTER
> functions (since without this option, the password_encryption setting from
> postgresql.conf is used)
>     or maybe it should have its own parameter like
>         CREATE ROLE testuser WITH LOGIN ENCRYPTED 'SCRAM' PASSWORD 'test';
>     so that the desired encryption is used.
>     From my point of view, this would be the sensible thing to do,
> especially if different verifiers should be allowed (as proposed by these
> patches).

The extension PASSWORD VERIFIERS is aimed at covering this need. The
grammar of those queries is not a fixed thing though.

>     In either case, a bit of text explaining the (UN)ENCRYPTED option should
> be added to the documentation of the CREATE/ALTER ROLE functions.

It is specified here;
http://www.postgresql.org/docs/devel/static/sql-createrole.html
And the patch does not ignore that.

> 2.
>     Documentation
>     III.
>         17. Server Setup and Operation
>             17.2. Creating a Database Cluster: maybe list SCRAM as a
> possible method for securing the db-admin

Indeed.

>         19. Client Authentication
>             19.1. The pg_hba.conf File: SCRAM is not listed in the list of
> available auth_methods to be specified in pg_hba.conf
>             19.3 Authentication Methods
>                 19.3.2 Password Authentication: SCRAM would belong to the
> same category as md5 and password, as they are all password-based.
>
>         20. Database Roles
>             20.2. Role Attributes: password : list SCRAM as authentication
> method as well

Indeed.

>     VI.
>         ALTER ROLE: is SCRAM also dependent on the role name for salting? if
> so, add warning.

No.

>                     (it doesn't seem that way, however I'm curious as to why
> the function FlattenPasswordIdentifiers in src/backend/commands/user.c
> called by AlterRole passes rolname to scram_build_verifier(), when that
> function does absolutely nothing with this argument?)

Yeah, this argument could be removed.

>         CREATE ROLE: can SCRAM also be used in the list of PASSWORD
> VERIFIERS?

Yes.

>     VII.
>         49. System Catalogs:
>             49.9 pg_auth_verifiers: Column names and types are mixed up
>                                     in description for column vervalue:

Yes, things are messed up a bit there. Thanks for noticing.

>                                     remark: naming inconsistency: md5
> vervalues are stored "md5*" why don't we take the same approach and use it
> on SCRAM hashes (i.e. "scram*" ).

Perhaps this makes sense if there is no pg_auth_verifiers.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Mar 30, 2016 at 9:46 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> Things I noticed:
>> 1.
>>     when using either
>>         CREATE ROLE
>>         ALTER ROLE
>>     with the parameter
>>         ENCRYPTED
>>     md5 encryption is always assumed (I've come to realize that UNENCRYPTED
>> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
>> no other options)
>
> Yes, that's to match the current behavior, and make something fully
> backward-compatible. Switching to md5 + scram may have made sense as
> well though.

I think we're not going to have much luck getting people to switch
over to SCRAM if the default remains MD5.  Perhaps there should be a
GUC for this - and we can initially set that GUC to md5, allowing
people who are ready to adopt SCRAM to change it.  And then in a later
release we can change the default, once we're pretty confident that
most connectors have added support for the new authentication method.
This is going to take a long time to roll out.  Alternatively, we
could control it strictly through DDL.

Note that the existing behavior is pretty wonky:

alter user rhaas unencrypted password 'foo';
    -> rolpassword foo
alter user rhaas encrypted password 'foo';
    -> rolpassword md5e748797a605a1c95f3d6b5f140b2d528
alter user rhaas encrypted password 'md5e748797a605a1c95f3d6b5f140b2d528';
    -> rolpassword md5e748797a605a1c95f3d6b5f140b2d528
alter user rhaas unencrypted password 'md5e748797a605a1c95f3d6b5f140b2d528';
    -> rolpassword md5e748797a605a1c95f3d6b5f140b2d528

So basically the use of the ENCRYPTED keyword means "if it does
already seem to be the sort of MD5 blob we're expecting, turn it into
that".  And we just rely on the format to distinguish between an MD5
verifier and an unencrypted password.  Personally, I think a good
start here, and I think you may have something like this in the patch
already, would be to split rolpassword into two columns, say
rolencryption and rolpassword.  rolencryption says how the password
verifier is encrypted and rolpassword contains the verifier itself.
Initially, rolencryption will be 'plain' or 'md5', but later we can
add 'scram' as another choice, or maybe it'll be more specific like
'scram-hmac-doodad'.  And then maybe introduce syntax like this:

alter user rhaas set password 'raw-unencrypted-password' using
'verifier-method';
alter user rhaas set password verifier 'verifier-goes-here' using
'verifier-method';

That might require making verifier a key word, which would be good to
avoid.  Perhaps we could use "password validator" instead?
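
The format-sniffing behaviour described above can be sketched roughly
as follows. This is only an illustrative Python sketch, not PostgreSQL
source: the function names are made up for the example, and the
"looks like MD5" test is assumed to be a simple prefix-plus-length
check. The md5 verifier itself is "md5" followed by the hex MD5 of the
password concatenated with the role name.

```python
import hashlib

MD5_VERIFIER_LEN = 35  # "md5" prefix + 32 hex digits


def looks_like_md5_verifier(value: str) -> bool:
    """Format-only guess (assumed check: 'md5' prefix and total length 35)."""
    return value.startswith("md5") and len(value) == MD5_VERIFIER_LEN


def md5_verifier(password: str, rolname: str) -> str:
    """Build an md5 verifier: 'md5' + hex(md5(password || role name))."""
    digest = hashlib.md5((password + rolname).encode("utf-8")).hexdigest()
    return "md5" + digest


def store_password(value: str, rolname: str, encrypted: bool) -> str:
    """Hypothetical helper: what ends up in rolpassword for the four cases."""
    if looks_like_md5_verifier(value):
        return value  # already looks like a verifier: stored as-is either way
    if encrypted:
        return md5_verifier(value, rolname)  # ENCRYPTED: hash the plain text
    return value  # UNENCRYPTED: plain text stored verbatim


if __name__ == "__main__":
    v = store_password("foo", "rhaas", encrypted=True)
    print(v)
    # Feeding the verifier back in leaves it untouched, with either keyword.
    print(store_password(v, "rhaas", encrypted=False) == v)
```

Note that nothing in the stored value says which method produced it,
which is the motivation for a separate column naming the method.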

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
José Luis Tallón
Date:
On 03/30/2016 06:14 PM, Robert Haas wrote:
> So basically the use of the ENCRYPTED keyword means "if it does 
> already seem to be the sort of MD5 blob we're expecting, turn it into 
> that". 

If it does NOT already seem to be... I guess?

> And we just rely on the format to distinguish between an MD5 verifier 
> and an unencrypted password. Personally, I think a good start here, 
> and I think you may have something like this in the patch already, 
> would be to split rolpassword into two columns, say rolencryption and 
> rolpassword. 

This inches closer to Michael's suggestion to have multiple verifiers 
per pg_authid user ...

> rolencryption says how the password verifier is encrypted and 
> rolpassword contains the verifier itself. Initially, rolencryption 
> will be 'plain' or 'md5', but later we can add 'scram' as another 
> choice, or maybe it'll be more specific like 'scram-hmac-doodad'.

May I suggest using  "{" <scheme>["."<encoding>] "}" just like Dovecot does?

e.g. "{md5.hex}e748797a605a1c95f3d6b5f140b2d528"

where no "{ ... }" prefix means just fallback to the old method of
trying to guess what the blob contains?
This would invalidate PLAIN passwords beginning with "{", though,
so some measures would be needed.
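
A rough Python sketch of what parsing such a prefix could look like,
including the fallback and the ambiguity for plain passwords starting
with "{". The regular expression, the Verifier type and parse_verifier()
are assumptions made up for illustration; they are not Dovecot's or
PostgreSQL's actual parsing rules.

```python
import re
from typing import NamedTuple, Optional

# "{" <scheme> ["." <encoding>] "}" followed by the verifier itself.
_PREFIX = re.compile(r"^\{([A-Za-z0-9-]+)(?:\.([A-Za-z0-9]+))?\}(.*)$", re.DOTALL)


class Verifier(NamedTuple):
    scheme: Optional[str]    # None means "no prefix, guess from the blob"
    encoding: Optional[str]
    value: str


def parse_verifier(stored: str) -> Verifier:
    """Split a stored string into (scheme, encoding, value)."""
    m = _PREFIX.match(stored)
    if m is None:
        # No "{...}" prefix: fall back to the old format-guessing path.
        return Verifier(None, None, stored)
    scheme, encoding, value = m.groups()
    return Verifier(scheme.lower(), encoding.lower() if encoding else None, value)


if __name__ == "__main__":
    print(parse_verifier("{md5.hex}e748797a605a1c95f3d6b5f140b2d528"))
    print(parse_verifier("md5e748797a605a1c95f3d6b5f140b2d528"))
    # A plain password that happens to start with "{...}" is misread as a
    # scheme prefix, which is the problem mentioned above.
    print(parse_verifier("{oops}secret"))
```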

> And then maybe introduce syntax like this:
>
> alter user rhaas set password 'raw-unencrypted-password' using
> 'verifier-method';
> alter user rhaas set password verifier 'verifier-goes-here' using
> 'verifier-method';
>
> That might require making verifier a key word, which would be good to
> avoid. Perhaps we could use "password validator" instead?

I'd like USING best ... though by prepending the scheme for ENCRYPTED,
the required information is already conveyed within the verifier, so no 
need to specify it again :)


Just my .02€

    / J.L.




Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Mar 30, 2016 at 12:31 PM, José Luis Tallón
<jltallon@adv-solutions.net> wrote:
> On 03/30/2016 06:14 PM, Robert Haas wrote:
>> So basically the use of the ENCRYPTED keyword means "if it does already
>> seem to be the sort of MD5 blob we're expecting, turn it into that".
>
> If it does NOT already seem to be... I guess?

Yes, that's what I meant.  Sorry.

>> rolencryption says how the password verifier is encrypted and rolpassword
>> contains the verifier itself. Initially, rolencryption will be 'plain' or
>> 'md5', but later we can add 'scram' as another choice, or maybe it'll be
>> more specific like 'scram-hmac-doodad'.
>
> May I suggest using  "{" <scheme>["."<encoding>] "}" just like Dovecot does?

Doesn't seem very SQL-ish to me...  I think we should normalize.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Mar 31, 2016 at 1:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 30, 2016 at 9:46 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>> Things I noticed:
>>> 1.
>>>     when using either
>>>         CREATE ROLE
>>>         ALTER ROLE
>>>     with the parameter
>>>         ENCRYPTED
>>>     md5 encryption is always assumed (I've come to realize that UNENCRYPTED
>>> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
>>> no other options)
>>
>> Yes, that's to match the current behavior, and make something fully
>> backward-compatible. Switching to md5 + scram may have made sense as
>> well though.
>
> I think we're not going to have much luck getting people to switch
> over to SCRAM if the default remains MD5. Perhaps there should be a
> GUC for this - and we can initially set that GUC to md5, allowing
> people who are ready to adopt SCRAM to change it.  And then in a later
> release we can change the default, once we're pretty confident that
> most connectors have added support for the new authentication method.
> This is going to take a long time to roll out.
> Alternatively, we could control it strictly through DDL.

This overlaps quite a lot with the existing password_encryption, so
adding a second GUC that controls only the protocol used for ENCRYPTED
(say password_encryption_encrypted) would be confusing. I'd rather keep
ENCRYPTED to md5 as default when password_encryption is 'on', switch
to scram a couple of releases later, and extend the DDL grammar with
something like PROTOCOL {'md5' | 'plain' | 'scram'}, which can be used
instead of UNENCRYPTED | ENCRYPTED as an additional keyword. That gives
a smooth transition to a more extensible system.

> Note that the existing behavior is pretty wonky:
> alter user rhaas unencrypted password 'foo'; -> rolpassword foo
> alter user rhaas encrypted password 'foo'; -> rolpassword
> md5e748797a605a1c95f3d6b5f140b2d528
> alter user rhaas encrypted password
> 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword
> md5e748797a605a1c95f3d6b5f140b2d528
> alter user rhaas unencrypted password
> 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword
> md5e748797a605a1c95f3d6b5f140b2d528

I actually wrote some regression tests for that. Those are upthread as
part of 0001; have a look at password.sql, for example.

> So basically the use of the ENCRYPTED keyword means "if it does
> already seem to be the sort of MD5 blob we're expecting, turn it into
> that".  And we just rely on the format to distinguish between an MD5
> verifier and an unencrypted password.  Personally, I think a good
> start here, and I think you may have something like this in the patch
> already, would be to split rolpassword into two columns, say
> rolencryption and rolpassword.  rolencryption says how the password
> verifier is encrypted and rolpassword contains the verifier itself.

The patch has something like that. And doing this split is not that
complicated to be honest. Surely that would be clearer than relying on
the prefix of the identifier to see if it is md5 or not.

> Initially, rolencryption will be 'plain' or 'md5', but later we can
> add 'scram' as another choice, or maybe it'll be more specific like
> 'scram-hmac-doodad'.  And then maybe introduce syntax like this:
>
> alter user rhaas set password 'raw-unencrypted-password' using
> 'verifier-method';
> alter user rhaas set password verifier 'verifier-goes-here' using
> 'verifier-method';
>
> That might require making verifier a key word, which would be good to
> avoid.  Perhaps we could use "password validator" instead?

Yes, that matches what I wrote above. At this point putting that back
on board and discussing it openly at PGCon is the best course of action
IMO.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
So, the consensus so far seems to be: We don't want the support for 
multiple password verifiers per user. At least not yet. Let's get SCRAM 
working first, in a way that a user can only have SCRAM or an MD5 hash 
stored in the database, not both. We can add support for multiple 
verifiers per user, password aging, etc. later. Hopefully we'll make 
some progress on those before 9.7 is released, too, but let's treat them 
as separate issues and focus on SCRAM.

I took a quick look at the patch set now again, and except that it needs 
to have the multiple password verifier support refactored out, I think 
it's in a pretty good shape. I don't like the pg_upgrade changes and its 
support function, that also seems like an orthogonal or add-on feature 
that would be better discussed separately. I think pg_upgrade should 
just do the upgrade with as little change to the system as possible, and 
let the admin reset/rehash/deprecate the passwords separately, when she 
wants to switch all users to SCRAM. So I suggest that we rip out those 
changes from the patch set as well.

In related news, RFC 7677 that describes a new SCRAM-SHA-256 
authentication mechanism, was published in November 2015. It's identical 
to SCRAM-SHA-1, which is what this patch set implements, except that 
SHA-1 has been replaced with SHA-256. Perhaps we should forget about 
SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.

RFC 7677 also adds some verbiage, in response to vulnerabilities that 
have been found with the "tls-unique" channel binding mechanism:

>    To be secure, either SCRAM-SHA-256-PLUS and SCRAM-SHA-1-PLUS MUST be
>    used over a TLS channel that has had the session hash extension
>    [RFC7627] negotiated, or session resumption MUST NOT have been used.

So that doesn't affect details of the protocol per se, but once we 
implement channel binding, we need to check for those conditions somehow 
(or make sure that OpenSSL checks for them).
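
Read literally, the quoted requirement is a simple either/or condition.
A minimal sketch in C, assuming the two facts about the TLS session are
already known as booleans (the function name is hypothetical, and how
to obtain those facts from the TLS library is deliberately left out):

```c
#include <stdbool.h>

/*
 * Hypothetical check for the RFC 7677 wording: "tls-unique" channel
 * binding is acceptable if the session hash extension (RFC 7627) was
 * negotiated, or if session resumption was not used.
 */
static bool
tls_unique_binding_is_safe(bool session_hash_negotiated, bool session_resumed)
{
	return session_hash_negotiated || !session_resumed;
}
```

The only rejected combination is a resumed session without the
extension.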

Michael, do you plan to submit a new version of this patch set for the 
next commitfest? I'd like to get this committed early in the 9.7 release 
cycle, so that we have time to work on all the add-on stuff before the 
release.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Sun, Jul 3, 2016 at 4:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I took a quick look at the patch set now again, and except that it needs to
> have the multiple password verifier support refactored out, I think it's in
> a pretty good shape. I don't like the pg_upgrade changes and its support
> function, that also seems like an orthogonal or add-on feature that would be
> better discussed separately. I think pg_upgrade should just do the upgrade
> with as little change to the system as possible, and let the admin
> reset/rehash/deprecate the passwords separately, when she wants to switch
> all users to SCRAM. So I suggest that we rip out those changes from the
> patch set as well.

That is also what I recall from the consensus at PGCon: only focus
on the protocol addition and storage of the scram verifier. It was not
mentioned directly but that's what I guess should be done. So no
complaints here.

> In related news, RFC 7677 that describes a new SCRAM-SHA-256 authentication
> mechanism, was published in November 2015. It's identical to SCRAM-SHA-1,
> which is what this patch set implements, except that SHA-1 has been replaced
> with SHA-256. Perhaps we should forget about SCRAM-SHA-1 and jump straight
> to SCRAM-SHA-256.

That's worth considering. I don't think switching to that is very complicated.

> RFC 7677 also adds some verbiage, in response to vulnerabilities that have
> been found with the "tls-unique" channel binding mechanism:
>
>>    To be secure, either SCRAM-SHA-256-PLUS and SCRAM-SHA-1-PLUS MUST be
>>    used over a TLS channel that has had the session hash extension
>>    [RFC7627] negotiated, or session resumption MUST NOT have been used.
>
> So that doesn't affect details of the protocol per se, but once we implement
> channel binding, we need to check for those conditions somehow (or make sure
> that OpenSSL checks for them).

Yes.

> Michael, do you plan to submit a new version of this patch set for the next
> commitfest? I'd like to get this committed early in the 9.7 release cycle,
> so that we have time to work on all the add-on stuff before the release.

Thanks. That's good news! Yes, I am still on track to submit a patch for CF1.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 7/2/16 6:32 PM, Michael Paquier wrote:
> On Sun, Jul 3, 2016 at 4:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>> Michael, do you plan to submit a new version of this patch set for the next
>> commitfest? I'd like to get this committed early in the 9.7 release cycle,
>> so that we have time to work on all the add-on stuff before the release.
> 
> Thanks. That's good news! Yes, I am still on track to submit a patch for CF1.

And I'm on board for reviews, testing, and whatever else I can help with.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Peter Eisentraut
Date:
On 7/2/16 3:54 PM, Heikki Linnakangas wrote:
> In related news, RFC 7677 that describes a new SCRAM-SHA-256
> authentication mechanism, was published in November 2015. It's identical
> to SCRAM-SHA-1, which is what this patch set implements, except that
> SHA-1 has been replaced with SHA-256. Perhaps we should forget about
> SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.

I think a global change from SHA-1 to SHA-256 is in the air already, so 
if we're going to release something brand new in 2017 or so, it should 
be SHA-256.

I suspect this would be a relatively simple change, so I wouldn't mind 
seeing a SHA-1-based variant in CF1 to get things rolling.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Jul 4, 2016 at 6:34 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 7/2/16 3:54 PM, Heikki Linnakangas wrote:
>>
>> In related news, RFC 7677 that describes a new SCRAM-SHA-256
>> authentication mechanism, was published in November 2015. It's identical
>> to SCRAM-SHA-1, which is what this patch set implements, except that
>> SHA-1 has been replaced with SHA-256. Perhaps we should forget about
>> SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.
>
> I think a global change from SHA-1 to SHA-256 is in the air already, so if
> we're going to release something brand new in 2017 or so, it should be
> SHA-256.
>
> I suspect this would be a relatively simple change, so I wouldn't mind
> seeing a SHA-1-based variant in CF1 to get things rolling.

I'd just move this thing to SHA-256, as we are likely going to use that in the end.

As I am coming back to this, I would also suggest doing the
following, which the current set of patches is clearly missing:
- Put the HMAC infrastructure of pgcrypto into src/common/. It
is a bit of a shame not to reuse what is currently available, so I
would suggest reusing that with HMAC_SCRAM_SHAXXX as label.
- Move *all* the SHA-related things of pgcrypto to src/common,
including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
top of memset; we should clean that up first.
Any other things to consider that I am forgetting?
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Jul 4, 2016 at 12:54 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> As I am coming back into that, I would as well suggest do the
> following, that the current set of patches is clearly missing:
> - Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
> is a bit a shame to not reuse what is currently available, then I
> would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
> - Move *all* the SHA-related things of pgcrypto to src/common,
> including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
> top of memset, we should clean up that first.
> Any other things to consider that I am forgetting?

After looking more into that, I have come up with PG-like equivalents
of things in openssl/sha.h:
pg_shaXX_init(pg_shaXX_ctx *ctx);
pg_shaXX_update(pg_shaXX_ctx *ctx, uint8 *data, size_t len);
pg_shaXX_final(uint8 *dest, pg_shaXX_ctx *ctx);
Then think about shaXX as 1, 224, 256, 384 and 512.
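
For readers unfamiliar with this init/update/final shape, here is a
minimal sketch in C of the calling contract. It is an assumption-laden
illustration, not code from the patch: the toy_hash_* names are
hypothetical, and the body is a trivial 32-bit FNV-1a checksum used
purely as a stand-in, NOT a SHA implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in context; a real pg_shaXX_ctx would hold SHA state and a buffer. */
typedef struct toy_hash_ctx
{
	uint32_t	state;
} toy_hash_ctx;

/* Reset the context to its initial state. */
static void
toy_hash_init(toy_hash_ctx *ctx)
{
	ctx->state = 2166136261u;	/* FNV-1a 32-bit offset basis */
}

/* Feed more input; may be called any number of times. */
static void
toy_hash_update(toy_hash_ctx *ctx, const uint8_t *data, size_t len)
{
	size_t		i;

	for (i = 0; i < len; i++)
	{
		ctx->state ^= data[i];
		ctx->state *= 16777619u;	/* FNV-1a 32-bit prime */
	}
}

/* Write the 4-byte result, most significant byte first. */
static void
toy_hash_final(uint8_t *dest, toy_hash_ctx *ctx)
{
	dest[0] = (uint8_t) (ctx->state >> 24);
	dest[1] = (uint8_t) (ctx->state >> 16);
	dest[2] = (uint8_t) (ctx->state >> 8);
	dest[3] = (uint8_t) ctx->state;
}
```

The property callers rely on is that feeding "abc" in one update call
gives the same result as feeding "a" and then "bc", which is what lets
SCRAM code hash a message assembled from several pieces.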

Hence all those functions, moved to src/common, end up with the
following shape; take the init() one:
#ifdef USE_SSL
#include <openssl/sha.h>
#endif

void
pg_shaXX_init(pg_shaXX_ctx *ctx)
{
#ifdef USE_SSL
    SHAXX_Init((SHAXX_CTX *) ctx);
#else
    /* Here goes the OpenBSD stuff, now part of pgcrypto */
#endif
}

And that's really ugly, all the OpenBSD things that are used by
pgcrypto when the code is not built with --with-openssl gather into a
single place with parts wrapped around USE_SSL. A less ugly solution
would be to split that into two files, and one or the other gets
included in OBJS depending on if the build is done with or without
OpenSSL. We do a rather similar thing with fe/be-secure-openssl.c.

Another possibility is that we could say that SCRAM is designed to
work with TLS, as mentioned a bit upthread via the RFC, so we would
not support it in builds compiled without OpenSSL. I think that would
be a shame, but it would simplify all this refactoring juggling.

So, 3 possibilities here:
1) Use a single file src/common/sha.c that includes a set of functions
using USE_SSL
2) Have two files in src/common, one used when the build is done with
OpenSSL, and the second one when built-in methods are used
3) Disable the use of SCRAM when OpenSSL is not present in the build.

Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
appealing in terms of flexibility.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Magnus Hagander
Date:
On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Jul 4, 2016 at 12:54 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> As I am coming back into that, I would as well suggest do the
>> following, that the current set of patches is clearly missing:
>> - Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
>> is a bit a shame to not reuse what is currently available, then I
>> would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
>> - Move *all* the SHA-related things of pgcrypto to src/common,
>> including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
>> top of memset, we should clean up that first.
>> Any other things to consider that I am forgetting?
>
> After looking more into that, I have come up with PG-like equivalents
> of things in openssl/sha.h:
> pg_shaXX_init(pg_shaXX_ctx *ctx);
> pg_shaXX_update(pg_shaXX_ctx *ctx, uint8 *data, size_t len);
> pg_shaXX_final(uint8 *dest, pg_shaXX_ctx *ctx);
> Then think about shaXX as 1, 224, 256, 384 and 512.
>
> Hence all those functions, moved to src/common, finish with the
> following shape, take an init() one:
> #ifdef USE_SSL
> #include <openssl/sha.h>
> #endif
> void
> pg_shaXX_init(pg_shaXX_ctx *ctx)
> {
> #ifdef USE_SSL
>     SHAXX_Init((SHAXX_CTX *) ctx);
> #else
>     /* Here goes the OpenBSD stuff, now part of pgcrypto */
> #endif
> }
>
> And that's really ugly, all the OpenBSD things that are used by
> pgcrypto when the code is not built with --with-openssl gather into a
> single place with parts wrapped around USE_SSL. A less ugly solution
> would be to split that into two files, and one or the other gets
> included in OBJS depending on if the build is done with or without
> OpenSSL. We do a rather similar thing with fe/be-secure-openssl.c.

FWIW, the main reason for be-secure-openssl.c is that we could have support for another external SSL library. The idea was never to have a builtin replacement for it :)

However, is there something that's fundamentally better with the OpenSSL implementation? Or should we just keep *just* the #else branch in the code, the part we've imported from OpenBSD?

TLS is complex, we don't want to do that in that case. But just the sha functions isn't *that* complex, is it?

 
> Another possibility is that we could say that SCRAM is designed to
> work with TLS, as mentioned a bit upthread via the RFC, so we would
> not support it in builds compiled without OpenSSL. I think that would
> be a shame, but it would simplify all this refactoring juggling.
>
> So, 3 possibilities here:
> 1) Use a single file src/common/sha.c that includes a set of functions
> using USE_SSL
> 2) Have two files in src/common, one when build is used with OpenSSL,
> and the second one when built-in methods are used
> 3) Disable the use of SCRAM when OpenSSL is not present in the build.
>
> Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
> appealing in terms of flexibility.


I really dislike #3 - we want everybody to start using this...

I'm not sure how common a build without openssl is in the real world though. RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably don't want to make it mandatory, no...

--

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Jul 5, 2016 at 5:50 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> However, is there something that's fundamentally better with the OpenSSL
> implementation? Or should we just keep *just* the #else branch in the code,
> the part we've imported from OpenBSD?

Good question. I think that we want both, giving priority to OpenSSL
if it is there. Usually their things prove to have more entropy, but I
didn't look at their code to be honest. If we only use the OpenBSD
stuff, it would be a good idea to refresh the in-core code. This is
from OpenBSD of 2002.

> TLS is complex, we don't want to do that in that case. But just the sha
> functions isn't *that* complex, is it?

No, they are not.

>> Another possibility is that we could say that SCRAM is designed to
>> work with TLS, as mentioned a bit upthread via the RFC, so we would
>> not support it in builds compiled without OpenSSL. I think that would
>> be a shame, but it would simplify all this refactoring juggling.
>>
>> So, 3 possibilities here:
>> 1) Use a single file src/common/sha.c that includes a set of functions
>> using USE_SSL
>> 2) Have two files in src/common, one when build is used with OpenSSL,
>> and the second one when built-in methods are used
>> 3) Disable the use of SCRAM when OpenSSL is not present in the build.
>>
>> Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
>> appealing in terms of flexibility.
>
> I really dislike #3 - we want everybody to start using this...

OK, after hacking on that for a bit I have ended up with option 2 and
the set of PG-like routines. The use of USE_SSL in the file
containing all the SHA functions of OpenBSD proved to be really
ugly, but with a split things are really clear to the eye. The stuff I
got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
libpgcommon.a, so I am making it compile directly with the source
files, as it is doing on HEAD.

> I'm not sure how common a build without openssl is in the real world though.
> RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
> don't want to make it mandatory, no...

I don't think that it is that common to have an enterprise-class
build of Postgres without SSL, but each company always has its own
reasons, so such builds could exist.

And I continue to move on... Thanks for the feedback.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> OK, after hacking that for a bit I have finished with option 2 and the
> set of PG-like set of routines, the use of USE_SSL in the file
> containing all the SHA functions of OpenBSD has proved to be really
> ugly, but with a split things are really clear to the eye. The stuff I
> got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> libpgcommon.a, so I am making it compile directly with the source
> files, as it is doing on HEAD.

Btw, attached is the patch I did for this part if there is any interest in it.

Also, while working on the rest, I am not adding a new column to
pg_authid to identify the password verifier type. That's just to keep
the patch at a bare minimum size. Are there issues with that?
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Stephen Frost
Date:
* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Tue, Jul 5, 2016 at 5:50 PM, Magnus Hagander <magnus@hagander.net> wrote:
> > On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> > However, is there something that's fundamentally better with the OpenSSL
> > implementation? Or should we just keep *just* the #else branch in the code,
> > the part we've imported from OpenBSD?
>
> Good question. I think that we want both, giving priority to OpenSSL
> if it is there. Usually their things prove to have more entropy, but I
> didn't look at their code to be honest. If we only use the OpenBSD
> stuff, it would be a good idea to refresh the in-core code. This is
> from OpenBSD of 2002.

I agree that we definitely want to use the OpenSSL functions when they
are available.

> > I'm not sure how common a build without openssl is in the real world though.
> > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
> > don't want to make it mandatory, no...
>
> I don't think that it is this much common to have an enterprise-class
> build of Postgres without SSL, but each company has always its own
> reasons, so things could exist.

I agree that it's useful to have the support if PG isn't built with
OpenSSL for some reason.

Thanks!

Stephen

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Jul 7, 2016 at 7:51 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> > I'm not sure how common a build without openssl is in the real world though.
>> > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
>> > don't want to make it mandatory, no...
>>
>> I don't think that it is this much common to have an enterprise-class
>> build of Postgres without SSL, but each company has always its own
>> reasons, so things could exist.
>
> I agree that it's useful to have the support if PG isn't built with
> OpenSSL for some reason.

OK, that is what I am doing in the end.

And also while moving on...

On another topic, here are some ideas to extend CREATE/ALTER ROLE to
support SCRAM password directly:
1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
CREATE ROLE foorole SCRAM PASSWORD value;
2) PASSWORD (protocol) value.
3) Just add SCRAM PASSWORD
My mind is thinking about 1) as being the cleanest solution as this
does not touch the defaults, which may change a couple of releases
later. Other opinions?

Note that I am also switching password_encryption to an enum, accepting
the values on, off, md5, plain and scram. Of course, on => md5 and
off => plain to preserve the default.
Other things that I am making conservative:
- ENCRYPTED PASSWORD still implies MD5-encrypted password
- UNENCRYPTED PASSWORD still implies plain text password
- PASSWORD used alone depends on the value of password_encryption
So it would be possible to move to scram by default by setting
password_encryption to 'scram'.
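
The value mapping described above can be sketched in C as a small
lookup table. This is only an illustration under stated assumptions:
the type and function names (SketchPasswordType, sketch_enum_entry,
parse_password_encryption) are hypothetical, and the comparison is
case-sensitive for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for the three storage methods under discussion. */
typedef enum SketchPasswordType
{
	SKETCH_PASSWORD_PLAIN,
	SKETCH_PASSWORD_MD5,
	SKETCH_PASSWORD_SCRAM
} SketchPasswordType;

typedef struct sketch_enum_entry
{
	const char *name;
	SketchPasswordType value;
} sketch_enum_entry;

/* "on" and "off" are kept as aliases so existing settings still work. */
static const sketch_enum_entry password_encryption_options[] = {
	{"plain", SKETCH_PASSWORD_PLAIN},
	{"md5", SKETCH_PASSWORD_MD5},
	{"scram", SKETCH_PASSWORD_SCRAM},
	{"off", SKETCH_PASSWORD_PLAIN},
	{"on", SKETCH_PASSWORD_MD5},
	{NULL, SKETCH_PASSWORD_PLAIN}	/* end-of-list marker */
};

/* Returns true and sets *result if text is a known value. */
static bool
parse_password_encryption(const char *text, SketchPasswordType *result)
{
	const sketch_enum_entry *entry;

	for (entry = password_encryption_options; entry->name != NULL; entry++)
	{
		if (strcmp(entry->name, text) == 0)
		{
			*result = entry->value;
			return true;
		}
	}
	return false;
}
```

Keeping the boolean spellings as aliases means a configuration that
says password_encryption = on keeps producing MD5 verifiers, while
'scram' becomes an opt-in third value.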

Objections are welcome. I am moving toward something that respects the
default behavior as much as possible.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Fri, Jul 15, 2016 at 9:30 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> OK, I am doing that at the end.
>
> And also while moving on...
>
> On another topic, here are some ideas to extend CREATE/ALTER ROLE to
> support SCRAM password directly:
> 1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
> CREATE ROLE foorole SCRAM PASSWORD value;
> 2) PASSWORD (protocol) value.
> 3) Just add SCRAM PASSWORD
> My mind is thinking about 1) as being the cleanest solution as this
> does not touch the defaults, which may change a couple of releases
> later. Other opinions?

I can't really understand what you are saying here, but I'm going to
be -1 on adding SCRAM as a parser keyword.  Let's pick a syntax like
"PASSWORD SConst USING SConst" or "PASSWORD SConst ENCRYPTED WITH
SConst".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Alvaro Herrera
Date:
Michael Paquier wrote:
> On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > OK, after hacking that for a bit I have finished with option 2 and the
> > set of PG-like set of routines, the use of USE_SSL in the file
> > containing all the SHA functions of OpenBSD has proved to be really
> > ugly, but with a split things are really clear to the eye. The stuff I
> > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> > libpgcommon.a, so I am making it compile directly with the source
> > files, as it is doing on HEAD.
> 
> Btw, attached is the patch I did for this part if there is any interest in it.

After quickly eyeballing your patch, I agree with the decision of going
with (2), even if my gut initially told me that (1) would be better
because it'd require less makefile trickery.

I'm surprised that you say pgcrypto cannot link libpgcommon directly.
Is there some insurmountable problem there?  I notice your MSVC patch
uses libpgcommon while the Makefile symlinks the files.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Fetter
Date:
On Wed, Jul 20, 2016 at 02:12:57PM -0400, Alvaro Herrera wrote:
> Michael Paquier wrote:
> > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> > > OK, after hacking that for a bit I have finished with option 2 and the
> > > set of PG-like set of routines, the use of USE_SSL in the file
> > > containing all the SHA functions of OpenBSD has proved to be really
> > > ugly, but with a split things are really clear to the eye. The stuff I
> > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> > > libpgcommon.a, so I am making it compile directly with the source
> > > files, as it is doing on HEAD.
> > 
> > Btw, attached is the patch I did for this part if there is any interest in it.
> 
> After quickly eyeballing your patch, I agree with the decision of going
> with (2), even if my gut initially told me that (1) would be better
> because it'd require less makefile trickery.
> 
> I'm surprised that you say pgcrypto cannot link libpgcommon directly.
> Is there some insurmountable problem there?  I notice your MSVC patch
> uses libpgcommon while the Makefile symlinks the files.

People have, in the past, expressed concerns about linking in
pgcrypto.  Apparently, in some countries, it's a legal problem.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Jul 21, 2016 at 12:15 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 15, 2016 at 9:30 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> OK, I am doing that at the end.
>>
>> And also while moving on...
>>
>> On another topic, here are some ideas to extend CREATE/ALTER ROLE to
>> support SCRAM password directly:
>> 1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
>> CREATE ROLE foorole SCRAM PASSWORD value;
>> 2) PASSWORD (protocol) value.
>> 3) Just add SCRAM PASSWORD
>> My mind is thinking about 1) as being the cleanest solution as this
>> does not touch the defaults, which may change a couple of releases
>> later. Other opinions?
>
> I can't really understand what you are saying here, but I'm going to
> be -1 on adding SCRAM as a parser keyword.  Let's pick a syntax like
> "PASSWORD SConst USING SConst" or "PASSWORD SConst ENCRYPTED WITH
> SConst".

No, I do not mean to make SCRAM or MD5 keywords. While hacking on that,
I was at some point in the mood to use "PASSWORD Sconst Sconst", but
that's ugly. Sticking a keyword in between makes more sense, and USING
is a good idea. I hadn't thought of this one.

By the way, the core patch does not have any grammar extension. The
grammar extension will be on top of it and the core patch can just
activate scram passwords using password_encryption. That's user
unfriendly, but as the patch is large I try to cut it in as many
pieces as necessary.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Jul 21, 2016 at 5:25 AM, David Fetter <david@fetter.org> wrote:
> On Wed, Jul 20, 2016 at 02:12:57PM -0400, Alvaro Herrera wrote:
>> Michael Paquier wrote:
>> > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
>> > <michael.paquier@gmail.com> wrote:
>> > > OK, after hacking that for a bit I have finished with option 2 and the
>> > > set of PG-like set of routines, the use of USE_SSL in the file
>> > > containing all the SHA functions of OpenBSD has proved to be really
>> > > ugly, but with a split things are really clear to the eye. The stuff I
>> > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
>> > > libpgcommon.a, so I am making it compile directly with the source
>> > > files, as it is doing on HEAD.
>> >
>> > Btw, attached is the patch I did for this part if there is any interest in it.
>>
>> After quickly eyeballing your patch, I agree with the decision of going
>> with (2), even if my gut initially told me that (1) would be better
>> because it'd require less makefile trickery.

Yeah, I thought the same thing as well when putting my hands in the
dirt... But in the end (2) is really less ugly.

>> I'm surprised that you say pgcrypto cannot link libpgcommon directly.
>> Is there some insurmountable problem there?  I notice your MSVC patch
>> uses libpgcommon while the Makefile symlinks the files.

I am running into some weird things when linking both on OSX... But I
am not done with it completely yet. I'll adjust that a bit more when
producing the set of patches that will be published. So let's see.

> People have, in the past, expressed concerns about linking in
> pgcrypto.  Apparently, in some countries, it's a legal problem.

Do you have any references? I don't see that as a problem.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> People have, in the past, expressed concerns about linking in
>> pgcrypto.  Apparently, in some countries, it's a legal problem.
>
> Do you have any references? I don't see that as a problem.

I don't have a link to previous discussion handy, but I definitely
recall that it's been discussed.  I don't think that would mean that
libpgcrypto couldn't depend on libpgcommon, but the reverse direction
would make libpgcrypto essentially mandatory which I don't think is a
direction we want to go for both technical and legal reasons.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 7/21/16 12:19 PM, Robert Haas wrote:
> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>> People have, in the past, expressed concerns about linking in
>>> pgcrypto.  Apparently, in some countries, it's a legal problem.
>>
>> Do you have any references? I don't see that as a problem.
> 
> I don't have a link to previous discussion handy, but I definitely
> recall that it's been discussed.  I don't think that would mean that
> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
> would make libpgcrypto essentially mandatory which I don't think is a
> direction we want to go for both technical and legal reasons.

I searched a few different ways and finally came up with this post from Tom:

https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us

It's the only thing I could find, but thought it might jog something
loose for somebody else.

I know that export controls have been an issue for crypto in the past
but have no idea what the current state of that is.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Tom Lane
Date:
David Steele <david@pgmasters.net> writes:
> On 7/21/16 12:19 PM, Robert Haas wrote:
>> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>>> People have, in the past, expressed concerns about linking in
>>>> pgcrypto.  Apparently, in some countries, it's a legal problem.

>>> Do you have any references? I don't see that as a problem.

>> I don't have a link to previous discussion handy, but I definitely
>> recall that it's been discussed.  I don't think that would mean that
>> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
>> would make libpgcrypto essentially mandatory which I don't think is a
>> direction we want to go for both technical and legal reasons.

> I searched a few different ways and finally came up with this post from Tom:
> https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us
> It's the only thing I could find, but thought it might jog something
> loose for somebody else.

Way back when, like fifteen years ago, there absolutely were US export
control restrictions on software containing crypto.  I believe the US has
figured out that that was silly, but I'm not sure everyplace else has.
(And if you've been reading the news you will notice that legal
restrictions on crypto are back in vogue, so it would not be wise to
assume that the question is dead and buried.)  So our project policy
since at least the turn of the century has been that any crypto facility
has to be in a separable extension, where it would be fairly easy for
a packager to delete it if they need to ship a crypto-free version.

Note that "crypto" for this purpose generally means reversible encryption;
I've never heard that one-way hashes are illegal anywhere.  So password
hashing such as md5 is fine in core, and a stronger hash would be too.
But pulling in pgcrypto lock, stock, and barrel is not OK.
        regards, tom lane



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Jul 22, 2016 at 2:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Way back when, like fifteen years ago, there absolutely were US export
> control restrictions on software containing crypto.  I believe the US has
> figured out that that was silly, but I'm not sure everyplace else has.

England is these days waging a legal battle against data
encryption. I have not heard how this is evolving.

> (And if you've been reading the news you will notice that legal
> restrictions on crypto are back in vogue, so it would not be wise to
> assume that the question is dead and buried.)  So our project policy
> since at least the turn of the century has been that any crypto facility
> has to be in a separable extension, where it would be fairly easy for
> a packager to delete it if they need to ship a crypto-free version.
> Note that "crypto" for this purpose generally means reversible encryption;
> I've never heard that one-way hashes are illegal anywhere.  So password
> hashing such as md5 is fine in core, and a stronger hash would be too.
> But pulling in pgcrypto lock, stock, and barrel is not OK.

So it would be an issue if pgcrypto.so links directly to libpgcommon?
Because that's not what I am doing now, perhaps fortunately. I moved
the sha functions to src/common. But actually, thinking more about
that, I don't need to do so because the routines of SCRAM shared
between the frontend and the backend just need to be part of libpq, so
they could just be part of backend/libpq like md5.

Tom, if I get it correctly, it would not be an issue if the SHA
functions are directly part of the compiled backend like md5, right?
Because I would like to just change my set of patches to have the SHA
and the encoding functions in src/backend/libpq instead of src/common,
and then have pgcrypto be compiled with a link to those files. That's
a cleaner design btw, more in line with what is done for md5.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Fri, Jul 22, 2016 at 2:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Note that "crypto" for this purpose generally means reversible encryption;
>> I've never heard that one-way hashes are illegal anywhere.  So password
>> hashing such as md5 is fine in core, and a stronger hash would be too.
>> But pulling in pgcrypto lock, stock, and barrel is not OK.

> So it would be an issue if pgcrypto.so links directly to libpgcommon?

No, I don't see why that'd be an issue.  What we can't do is have
libpgcommon depending on pgcrypto.so, or containing anything more than
one-way-hash functionality itself.

> Because I would like to just change my set of patches to have the SHA
> and the encoding functions in src/backend/libpq instead of src/common,
> and then have pgcrypto be compiled with a link to those files. That's
> a cleaner design btw, more in line with what is done for md5.

I'm confused.  We need that code in both libpq and backend, no?
src/common is the place for stuff of that description.
        regards, tom lane



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> Because I would like to just change my set of patches to have the SHA
>> and the encoding functions in src/backend/libpq instead of src/common,
>> and then have pgcrypto be compiled with a link to those files. That's
>> a cleaner design btw, more in line with what is done for md5.
>
> I'm confused.  We need that code in both libpq and backend, no?
> src/common is the place for stuff of that description.

Not necessarily. src/interfaces/libpq/Makefile uses a set of files
like md5.c which is located in the backend code and directly compiles
libpq.so with them, so one possibility would be to do the same for
sha.c: locate the file in src/backend/libpq/ and then fetch the file
directly when compiling libpq's shared library.

One thing about my current set of patches is that I have begun adding
files from src/common/ to libpq's list of files. As that would be new
I am wondering if I should avoid doing so. Here is what I mean:
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -43,6 +43,14 @@ OBJS += $(filter crypt.o getaddrinfo.o getpeereid.o inet_aton.o open.o system.o
 OBJS += ip.o md5.o
 # utils/mb
 OBJS += encnames.o wchar.o
+# common/
+OBJS += encode.o scram-common.o
+
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm confused.  We need that code in both libpq and backend, no?
>> src/common is the place for stuff of that description.

> Not necessarily. src/interfaces/libpq/Makefile uses a set of files
> like md5.c which is located in the backend code and directly compiles
> libpq.so with them, so one possibility would be to do the same for
> sha.c: locate the file in src/backend/libpq/ and then fetch the file
> directly when compiling libpq's shared library.

Meh.  That seems like a hack left over from before we had src/common.

Having said that, src/interfaces/libpq/ does have some special
requirements, because it needs the code compiled with -fpic (on most
hardware), which means it can't just use the client-side libpgcommon.a
builds.  So maybe it's not worth improving this.

> One thing about my current set of patches is that I have begun adding
> files from src/common/ to libpq's list of files. As that would be new
> I am wondering if I should avoid doing so.

Well, it could link source files from there just as easily as from the
backend.  Not object files, though.
        regards, tom lane



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> One thing about my current set of patches is that I have begun adding
>> files from src/common/ to libpq's list of files. As that would be new
>> I am wondering if I should avoid doing so.
>
> Well, it could link source files from there just as easily as from the
> backend.  Not object files, though.

OK. I'll just keep things the current way then :)
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Craig Ringer
Date:
On 22 July 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Steele <david@pgmasters.net> writes:
> > On 7/21/16 12:19 PM, Robert Haas wrote:
> >> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
> >> <michael.paquier@gmail.com> wrote:
> >>>> People have, in the past, expressed concerns about linking in
> >>>> pgcrypto.  Apparently, in some countries, it's a legal problem.
>
> >>> Do you have any references? I don't see that as a problem.
>
> >> I don't have a link to previous discussion handy, but I definitely
> >> recall that it's been discussed.  I don't think that would mean that
> >> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
> >> would make libpgcrypto essentially mandatory which I don't think is a
> >> direction we want to go for both technical and legal reasons.
>
> > I searched a few different ways and finally came up with this post from Tom:
> > https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us
> > It's the only thing I could find, but thought it might jog something
> > loose for somebody else.
>
> Way back when, like fifteen years ago, there absolutely were US export
> control restrictions on software containing crypto.  I believe the US has
> figured out that that was silly, but I'm not sure everyplace else has.

Australia has recently enacted laws that are reminiscent of the US's defunct crypto export control laws, but they add penalties for *teaching* encryption too. Yup, you can be charged for talking about it. Of course they'll only actually USE those new powers to Stop The Terrorist Threat, they promise...


Unless recently amended, they even failed to exclude academic institutions. I haven't been following it closely because, frankly, it's too ridiculous to pay much attention to, and I don't work directly with crypto anyway. But it's far from the only such colossally ignorant and idiotic law floating around.

Despite the technical frustrations involved, we should keep crypto implementations in a separate library. I agree with Tom that one-way hashes are not a practical concern, even if the laws are probably written too poorly to draw a distinction.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Jul 22, 2016 at 9:06 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Michael Paquier <michael.paquier@gmail.com> writes:
>>> One thing about my current set of patches is that I have begun adding
>>> files from src/common/ to libpq's list of files. As that would be new
>>> I am wondering if I should avoid doing so.
>>
>> Well, it could link source files from there just as easily as from the
>> backend.  Not object files, though.
>
> OK. I'll just keep things the current way then :)

Note: I have put more energy into that and I think that I will be able
to publish a new patch set pretty soon, like at the beginning of next
week.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Jul 22, 2016 at 3:43 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Jul 22, 2016 at 9:06 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Michael Paquier <michael.paquier@gmail.com> writes:
>>>> One thing about my current set of patches is that I have begun adding
>>>> files from src/common/ to libpq's list of files. As that would be new
>>>> I am wondering if I should avoid doing so.
>>>
>>> Well, it could link source files from there just as easily as from the
>>> backend.  Not object files, though.
>>
>> OK. I'll just keep things the current way then :)
>
> Note: I have put more energy into that and I think that I will be able
> to publish a new patch set pretty soon, like at the beginning of next
> week.

Ok, here is the real deal. As discussed at PGCon, I have shaved off
from the set of patches the following things:
- No separate catalog pg_auth_verifiers
- No additional column in pg_authid to determine the password type.
All the logic now checks whether the password string has the expected
format. We do that for MD5 now, this set does it for SCRAM.
- Removal of the pg_upgrade stuff.
- Removal of password_protocols, so we don't care anymore about protocol aging.
In short, the SCRAM verifiers get stored in rolpassword.
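
To make the format-based detection concrete, here is a minimal sketch in C. It is not code from the patch set: the function and enum names are hypothetical, and the extra hex-digit check is an assumption of this sketch. It only covers the MD5 shape ("md5" followed by 32 hex digits); the SCRAM verifier layout is left out because it is not spelled out in this message.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification, for illustration only. */
typedef enum
{
	VERIFIER_PLAIN,
	VERIFIER_MD5
} VerifierKind;

#define MD5_PREFIX		"md5"
#define MD5_PREFIX_LEN	3
#define MD5_HEX_LEN		32		/* hex digits in an MD5 digest */

/* True if "s" is "md5" followed by exactly 32 hex digits. */
bool
looks_like_md5(const char *s)
{
	size_t		i;

	assert(s != NULL);

	if (strncmp(s, MD5_PREFIX, MD5_PREFIX_LEN) != 0)
		return false;
	if (strlen(s) != MD5_PREFIX_LEN + MD5_HEX_LEN)
		return false;

	for (i = MD5_PREFIX_LEN; i < MD5_PREFIX_LEN + MD5_HEX_LEN; i++)
	{
		if (!isxdigit((unsigned char) s[i]))
			return false;
	}
	return true;
}

/* Decide how to treat a stored string based only on its shape. */
VerifierKind
classify_verifier(const char *s)
{
	return looks_like_md5(s) ? VERIFIER_MD5 : VERIFIER_PLAIN;
}
```

A SCRAM check would slot into classify_verifier() the same way, as one more shape test ahead of the plain-text fallback.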

And here is what this set of patches does:
- Implementation of SCRAM-SHA-256, and not SHA1. I have moved to the
one that makes the most sense considering the current situation based
on RFC 5802 and 7677.
- No channel binding support. I guess that this could be added later on.
- password_encryption is now an enum, and gains three values: md5,
plain and scram. true => md5, false => plain for backward
compatibility
- Grammar of CREATE/ALTER ROLE is extended with PASSWORD val USING
protocol, that's a separate patch applying on top of the core patch
for SASL.
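
As a rough illustration of the password_encryption mapping described above, here is a minimal sketch in C. The enum and function names are hypothetical and are not taken from the patch. It handles only the spellings mentioned in this message (md5, plain, scram, plus true/false as aliases); whatever other boolean spellings the real setting accepts are out of scope here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names; the enum in the actual patch may differ. */
typedef enum
{
	PW_ENC_PLAIN,
	PW_ENC_MD5,
	PW_ENC_SCRAM
} PasswordEncryption;

/*
 * Map a setting string to an enum value.  Returns false, leaving
 * *result untouched, if the string is not recognized.  "true" and
 * "false" are kept as aliases for backward compatibility.
 */
bool
parse_password_encryption(const char *value, PasswordEncryption *result)
{
	assert(value != NULL && result != NULL);

	if (strcmp(value, "md5") == 0 || strcmp(value, "true") == 0)
		*result = PW_ENC_MD5;
	else if (strcmp(value, "plain") == 0 || strcmp(value, "false") == 0)
		*result = PW_ENC_PLAIN;
	else if (strcmp(value, "scram") == 0)
		*result = PW_ENC_SCRAM;
	else
		return false;

	return true;
}
```

With this shape, an old configuration using a boolean value keeps its meaning, and scram is only reachable through the new spelling.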

I have noticed as well a couple of bugs in the previous set(s) of patches:
- valid_until was not checked for SCRAM
- When using ENCRYPTED or UNENCRYPTED, already encrypted password
should be used as-is. The same is applied to PASSWORD USING protocol
to ease dump and reload. That's actually what is used for MD5.

And here is a detail of the patches:
- 0001, refactoring of SHA functions into src/common.
- 0002, refactoring for sendAuthRequest
- 0003, Refactoring for RandomSalt to accommodate the salt used by
scram (length of 10 bytes, md5 is 4).
- 0004, move encoding routines to src/common/
- 0005, make password_encryption an enum
- 0006, refactor some code in CREATE/ALTER role code paths related to the
use of password_encryption
- 0007, refactor some code to have a single routine to fetch password
and valid_until from pg_authid
- 0008, The core implementation of SCRAM-SHA-256, with the SASL
communication protocol. If you want to use SCRAM with that, things go
with password_encryption = 'scram'.
- 0009, addition of PASSWORD val USING protocol
- 0010, regression tests for passwords. Not sure how useful they would
be. But they helped me a bit.

I am adding an entry in the next CF. Comments are welcome.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 07/22/2016 03:02 AM, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm confused.  We need that code in both libpq and backend, no?
>>> src/common is the place for stuff of that description.
>
>> Not necessarily. src/interfaces/libpq/Makefile uses a set of files
>> like md5.c which is located in the backend code and directly compiles
>> libpq.so with them, so one possibility would be to do the same for
>> sha.c: locate the file in src/backend/libpq/ and then fetch the file
>> directly when compiling libpq's shared library.
>
> Meh.  That seems like a hack left over from before we had src/common.
>
> Having said that, src/interfaces/libpq/ does have some special
> requirements, because it needs the code compiled with -fpic (on most
> hardware), which means it can't just use the client-side libpgcommon.a
> builds.  So maybe it's not worth improving this.

src/common/Makefile says:

> # This makefile generates two outputs:
> #
> #    libpgcommon.a - contains object files with FRONTEND defined,
> #        for use by client application and libraries
> #
> #    libpgcommon_srv.a - contains object files without FRONTEND defined,
> #        for use only by the backend binaries

It claims that libpgcommon.a can be used by libraries, but without -fPIC,
that's a lie.

>> One thing about my current set of patches is that I have begun adding
>> files from src/common/ to libpq's list of files. As that would be new
>> I am wondering if I should avoid doing so.
>
> Well, it could link source files from there just as easily as from the
> backend.  Not object files, though.

I think that's the way to go (and that's what Michael's latest patch 
did). But let's update the comment in the Makefile, explaining that you 
can also copy or symlink source files directly from src/common as 
needed, for instance for shared libraries.

Let's take the opportunity and also move src/backend/libpq/ip.c and 
md5.c into src/common. It would be weird to have sha.c in src/common, 
but md5.c in src/backend/libpq. Looking at ip.c, it could be split into 
two: some of the functions in ip.c are clearly not needed in the client, 
like enumerating all interfaces.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> # This makefile generates two outputs:
>> #
>> #       libpgcommon.a - contains object files with FRONTEND defined,
>> #               for use by client application and libraries
>> #
>> #       libpgcommon_srv.a - contains object files without FRONTEND
>> defined,
>> #               for use only by the backend binaries
>
>
> It claims that libpgcommon.a can be used by libraries, but without -fPIC,
> that's a lie.

Yes.

>>> One thing about my current set of patches is that I have begun adding
>>> files from src/common/ to libpq's list of files. As that would be new
>>> I am wondering if I should avoid doing so.
>>
>>
>> Well, it could link source files from there just as easily as from the
>> backend.  Not object files, though.
>
>
> I think that's the way to go (and that's what Michael's latest patch did).
> But let's update the comment in the Makefile, explaining that you can also
> copy or symlink source files directly from src/common as needed, for
> instance for shared libraries.

Updating that is a good idea.

> Let's take the opportunity and also move src/backend/libpq/ip.c and md5.c
> into src/common. It would be weird to have sha.c in src/common, but md5.c in
> src/backend/libpq. Looking at ip.c, it could be split into two: some of the
> functions in ip.c are clearly not needed in the client, like enumerating all
> interfaces.

It would definitely be better to do all that before even moving sha.c.
For the current ip.c, I don't have a better idea than putting the set
of routines used by both the frontend and backend in src/common/ip.c,
and having a new file fe_ip.c hold the frontend-only things. Need a
patch?
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 08/18/2016 03:45 PM, Michael Paquier wrote:
> On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Let's take the opportunity and also move src/backend/libpq/ip.c and md5.c
>> into src/common. It would be weird to have sha.c in src/common, but md5.c in
>> src/backend/libpq. Looking at ip.c, it could be split into two: some of the
>> functions in ip.c are clearly not needed in the client, like enumerating all
>> interfaces.
>
> It would be definitely better to do all that before even moving sha.c.

Agreed.

> For the current ip.c, I don't have a better idea than putting in
> src/common/ip.c the set of routines used by both the frontend and
> backend, and have fe_ip.c the new file that has the frontend-only
> things. Need a patch?

Yes, please. I don't think there's anything there that's needed by only 
the frontend, but some of the functions are needed by only the backend. 
So I think we'll end up with src/common/ip.c, and 
src/backend/libpq/be-ip.c. (Not sure about those names, pick something 
that makes sense, given what's left in the files.)

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Aug 19, 2016 at 1:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 08/18/2016 03:45 PM, Michael Paquier wrote:
>>
>> On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi>
>> wrote:
>> For the current ip.c, I don't have a better idea than putting in
>> src/common/ip.c the set of routines used by both the frontend and
>> backend, and have fe_ip.c the new file that has the frontend-only
>> things. Need a patch?
>
>
> Yes, please. I don't think there's anything there that's needed by only the
> frontend, but some of the functions are needed by only the backend. So I
> think we'll end up with src/common/ip.c, and src/backend/libpq/be-ip.c. (Not
> sure about those names, pick something that makes sense, given what's left
> in the files.)

OK, so let's do that first correctly. Attached are two patches:
- 0001 moves md5 to src/common
- 0002 that does the same for ip.c.
By the way, it seems to me that having be-ip.c is not really worth
it. I am noticing that only pg_range_sockaddr could be marked as
backend-only. pg_foreach_ifaddr is used as well by
tools/ifaddrs/, and this one also calls pg_sockaddr_cidr_mask. Or
is there still some utility in having src/tools/ifaddrs? If not, we
could make pg_sockaddr_cidr_mask and pg_foreach_ifaddr
backend-only. Together with pg_range_sockaddr, that would make half
the routines backend-only.

I have not rebased the whole SCRAM series yet... I'll do that after
we agree on those two patches, with the two commits you have already
done cleaned up of course (thanks btw for those!).
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 08/19/2016 09:46 AM, Michael Paquier wrote:
> On Fri, Aug 19, 2016 at 1:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 08/18/2016 03:45 PM, Michael Paquier wrote:
>>>
>>> On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi>
>>> wrote:
>>> For the current ip.c, I don't have a better idea than putting in
>>> src/common/ip.c the set of routines used by both the frontend and
>>> backend, and have fe_ip.c the new file that has the frontend-only
>>> things. Need a patch?
>>
>> Yes, please. I don't think there's anything there that's needed by only the
>> frontend, but some of the functions are needed by only the backend. So I
>> think we'll end up with src/common/ip.c, and src/backend/libpq/be-ip.c. (Not
>> sure about those names, pick something that makes sense, given what's left
>> in the files.)
>
> OK, so let's do that first correctly. Attached are two patches:
> - 0001 moves md5 to src/common
> - 0002 that does the same for ip.c.
> By the way, it seems to me that having be-ip.c is not that much worth
> it. I am noticing that only pg_range_sockaddr could be marked as
> backend-only. pg_foreach_ifaddr is being used as well by
> tools/ifaddrs/, and this one calls as well pg_sockaddr_cidr_mask. Or
> is there still some utility in having src/tools/ifaddrs? If not we
> could move pg_sockaddr_cidr_mask and pg_foreach_ifaddr to be
> backend-only. With pg_range_sockaddr that would make half the routines
> to be marked as backend-only.

I decided to split ip.c anyway. I'd like to keep the files in 
src/common/ip.c as small as possible, so I think it makes sense to be 
quite surgical when moving things there. I kept the pg_foreach_ifaddr() 
function in src/backend/libpq/ifaddr.c (I renamed the file to avoid 
confusion with the ip.c that got moved), even though it means that 
test_ifaddrs will have to continue to copy the file directly from
src/backend/libpq. I'm OK with that, because test_ifaddrs is just a 
little test program that mimics the backend's behaviour of enumerating 
interfaces. I don't consider it to be a "real" frontend application.

Pushed, after splitting. Thanks! Now let's move on to the more 
substantial patches.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Sep 2, 2016 at 7:57 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I decided to split ip.c anyway. I'd like to keep the files in
> src/common/ip.c as small as possible, so I think it makes sense to be quite
> surgical when moving things there. I kept the pg_foreach_ifaddr() function
> in src/backend/libpq/ifaddr.c (I renamed the file to avoid confusion with
> the ip.c that got moved), even though it means that test_ifaddrs will have to
> continue to copy the file directly from src/backend/libpq. I'm OK with that,
> because test_ifaddrs is just a little test program that mimics the backend's
> behaviour of enumerating interfaces. I don't consider it to be a "real"
> frontend application.
>
> Pushed, after splitting. Thanks! Now let's move on to the more substantial
> patches.

Before I send a new series of patches... There is one thing that I am
still troubled with: the compilation of pgcrypto. First from
contrib/pgcrypto/Makefile I am noticing the following issue with this
block:
CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS))
CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS))
CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST))
How is that correct if src/Makefile.global is not loaded first?
Variables like with_openssl are still not loaded at that point.

Then, as per patch 0001 there are two files holding the SHA routines:
sha.c with the interface taken from OpenBSD, and sha_openssl.c that
uses the interface of OpenSSL. And when compiling pgcrypto, the choice
of file is made depending on the value of $(with_openssl).

As far as I know, the list of OBJS needs to be completely defined
before loading contrib-global.mk, but I fail to see how we can do that
with USE_PGXS=1... Or would it be fine to error if pgcrypto is
compiled with USE_PGXS?
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> Before I send a new series of patches... There is one thing that I am
> still troubled with: the compilation of pgcrypto. First from
> contrib/pgcrypto/Makefile I am noticing the following issue with this
> block:
> CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS))
> CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS))
> CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST))
> How is that correct if src/Makefile.global is not loaded first?
> Variables like with_openssl are still not loaded at that point.

Um, you do know that Make treats "=" definitions of variables as,
essentially, macro definitions?  The fact that with_openssl isn't
set yet doesn't necessarily mean these definitions are wrong.
Is it actually not working for you, or are you just not understanding
why it works?
        regards, tom lane



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Sep 2, 2016 at 10:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> Before I send a new series of patches... There is one thing that I am
>> still troubled with: the compilation of pgcrypto. First from
>> contrib/pgcrypto/Makefile I am noticing the following issue with this
>> block:
>> CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS))
>> CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS))
>> CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST))
>> How is that correct if src/Makefile.global is not loaded first?
>> Variables like with_openssl are still not loaded at that point.
>
> Um, you do know that Make treats "=" definitions of variables as,
> essentially, macro definitions?  The fact that with_openssl isn't
> set yet doesn't necessarily mean these definitions are wrong.
> Is it actually not working for you, or are you just not understanding
> why it works?

Oops, right. I was trying to use an ifeq on $with_openssl, and that did
not work, but just using those definitions as they are works correctly...
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Sep 2, 2016 at 10:23 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Sep 2, 2016 at 7:57 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I decided to split ip.c anyway. I'd like to keep the files in
>> src/common/ip.c as small as possible, so I think it makes sense to be quite
>> surgical when moving things there. I kept the pg_foreach_ifaddr() function
>> in src/backend/libpq/ifaddr.c (I renamed the file to avoid confusion with
>> the ip.c that got moved), even though it means that test_ifaddrs will have to
>> continue to copy the file directly from src/backend/libpq. I'm OK with that,
>> because test_ifaddrs is just a little test program that mimics the backend's
>> behaviour of enumerating interfaces. I don't consider it to be a "real"
>> frontend application.
>>
>> Pushed, after splitting. Thanks! Now let's move on to the more substantial
>> patches.

Thanks for the push.

> Before I send a new series of patches... There is one thing that I am
> still troubled with: the compilation of pgcrypto. First from
> contrib/pgcrypto/Makefile I am noticing the following issue with this
> block:
> CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS))
> CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS))
> CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST))
> How is that correct if src/Makefile.global is not loaded first?
> Variables like with_openssl are still not loaded at that point.
>
> Then, as per patch 0001 there are two files holding the SHA routines:
> sha.c with the interface taken from OpenBSD, and sha_openssl.c that
> uses the interface of OpenSSL. And when compiling pgcrypto, the choice
> of file is made depending on the value of $(with_openssl).

So I have solved my identity crisis here by just using INT_SRCS and
OSSL_SRCS to list the correct files holding the SHA routines. Thanks Tom
for the hint. I need to work on my Makefile-fu.

Attached is a new series:
- 0001, refactoring of SHA functions into src/common.
- 0002, move encoding routines to src/common/
- 0003, make password_encryption an enum
- 0004, refactor some code in CREATE/ALTER role code paths related to the
use of password_encryption
- 0005, refactor some code to have a single routine to fetch password
and valid_until from pg_authid
- 0006, The core implementation of SCRAM-SHA-256, with the SASL
communication protocol. If you want to use SCRAM with that, things go
with password_encryption = 'scram'. I have spotted here a bug with the
MSVC build on the way.
- 0007, addition of PASSWORD val USING protocol
- 0008, regression tests for passwords. Those do not trigger the
internal sha routines, which lead to inconsistent results.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 9/3/16 8:36 AM, Michael Paquier wrote:
>
> Attached is a new series:

* [PATCH 1/8] Refactor SHA functions and move them to src/common/

I'd like to see more code comments in sha.c (though I realize this was
copied directly from pgcrypto.)

I tested by building with and without --with-openssl and running make
check for the project as a whole and the pgcrypto extension.

I notice that the copyright from pgcrypto/sha1.c was carried over but
not the copyright from pgcrypto/sha2.c.  I'm no expert on how this
works, but I believe the copyright from sha2.c must be copied over.

Also, are there any plans to expose these functions directly to the user
without loading pgcrypto?  Now that the functionality is in core it
seems that would be useful.  In addition, it would make this patch stand
on its own rather than just being a building block.

* [PATCH 2/8] Move encoding routines to src/common/

I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
they should be renamed to make them distinct?

* [PATCH 3/8] Switch password_encryption to a enum

Does not apply on HEAD (98c2d3332):

error: patch failed: src/backend/commands/user.c:139
error: src/backend/commands/user.c: patch does not apply
error: patch failed: src/include/commands/user.h:15
error: src/include/commands/user.h: patch does not apply

From here on I used 39b691f251 for review and testing.

It seems you are keeping on/off for backwards compatibility; shouldn't
the default now be "md5"?

-#password_encryption = on
+#password_encryption = on        # on, off, md5 or plain

* [PATCH 4/8] Refactor decision-making of password encryption into a
single routine

+++ b/src/backend/commands/user.c
+        new_record[Anum_pg_authid_rolpassword - 1] =
+            CStringGetTextDatum(encrypted_passwd);

pfree(encrypted_passwd) here or let it get freed with the context?

* [PATCH 5/8] Create generic routine to fetch password and valid until
values for a role

Couldn't md5_crypt_verify() be made more general and take the hash type?
For instance, password_crypt_verify() with the last param as the new
password type enum.

* [PATCH 6/8] Support for SCRAM-SHA-256 authentication

+++ b/contrib/passwordcheck/passwordcheck.c
+        case PASSWORD_TYPE_SCRAM:
+            /* unfortunately not much can be done here */
+            break;

Why can't we at least do the same check as md5 to make sure the username
was not used as the password?

+++ b/src/backend/libpq/auth.c
+     * without relying on the length word, but we hardly care about protocol
+     * version or older anymore.)

Do you mean protocol version 2 or older?

+++ b/src/backend/libpq/crypt.c
         return STATUS_ERROR;    /* empty password */
+

Looks like a stray LF.

+++ b/src/backend/parser/gram.y
+    SAVEPOINT SCHEMA SCRAM SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE

Doesn't this belong in patch 7?  Even in patch 7 it doesn't appear that
SCRAM is a keyword since the protocol specified after USING is quoted.

I tested this patch using both md5 and scram and was able to get both of
them working separately.

However, it doesn't look like they can be used in conjunction since the
pg_hba.conf entry must specify either md5 or scram (though the database
can easily contain a mixture).  This would probably make a migration
very unpleasant.

Is there any chance of a mixed mode that will allow new passwords to be
set as scram while still honoring the old md5 passwords? Or does that
cause too many complications with the protocol?

* [PATCH 7/8] Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE

+++ b/doc/src/sgml/ref/create_role.sgml
+        Sets the role's password using the wanted protocol.

How about "Sets the role's password using the requested protocol."

+        an unencrypted password.   If the presented password string is already
+        in MD5-encrypted or SCRAM-encrypted format, then it is stored encrypted
+        as-is.

How about, "If the password string is..."

* [PATCH 8/8] Add regression tests for passwords

OK.

On the whole I find this patch set easier to digest than what was
submitted for 9.6.  It is more targeted but still provides very valuable
functionality.

I'm a bit concerned that a mixture of md5/scram could cause confusion
and think this may warrant discussion somewhere in the documentation
since the idea is for users to migrate from md5 to scram.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
> On 9/3/16 8:36 AM, Michael Paquier wrote:
>>
>> Attached is a new series:

Thanks for the review and the comments!

> * [PATCH 1/8] Refactor SHA functions and move them to src/common/
>
> I'd like to see more code comments in sha.c (though I realize this was
> copied directly from pgcrypto.)

OK... I have added some comments for the user-facing routines, as well
as the private routines that are doing step-by-step random
calculations.

> I notice that the copyright from pgcrypto/sha1.c was carried over but
> not the copyright from pgcrypto/sha2.c.  I'm no expert on how this
> works, but I believe the copyright from sha2.c must be copied over.

Right, those copyright bits are missing:
- * AUTHOR: Aaron D. Gifford <me@aarongifford.com>
[...]
- * Copyright (c) 2000-2001, Aaron D. Gifford
The license block being the same, it seems to me that there is no need
to copy it over. The copyright should be enough.

> Also, are there any plans to expose these functions directly to the user
> without loading pgcrypto?  Now that the functionality is in core it
> seems that would be useful.  In addition, it would make this patch stand
> on its own rather than just being a building block.

There have been discussions about avoiding enabling those functions by
default in the distribution. We'd rather not do that...

> * [PATCH 2/8] Move encoding routines to src/common/
>
> I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
> they should be renamed to make them distinct?

Yes it may be a good idea to rename that, like encode_utils.[c|h] for
the new files.

> * [PATCH 3/8] Switch password_encryption to a enum
>
> Does not apply on HEAD (98c2d3332):

Interesting, it works for me on da6c4f6.

> From here on I used 39b691f251 for review and testing.
> It seems you are keeping on/off for backwards compatibility, shouldn't
> the default now be "md5"?
>
> -#password_encryption = on
> +#password_encryption = on              # on, off, md5 or plain

That sounds like a good idea, so switched this way.

> * [PATCH 4/8] Refactor decision-making of password encryption into a
> single routine
>
> +++ b/src/backend/commands/user.c
> +               new_record[Anum_pg_authid_rolpassword - 1] =
> +                       CStringGetTextDatum(encrypted_passwd);
>
> pfree(encrypted_passwd) here or let it get freed with the context?

Calling encrypt_password did not guarantee that the returned string needs
to be freed, so at the time I coded that I just relied on the memory
context. But rereading it now, let's do this cleanly and have
encrypt_password always return a palloc'ed string. That's more consistent.
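
To illustrate the ownership rule, here is a minimal sketch in plain C.
The name encrypt_password_sketch and the "enc:" prefix are hypothetical
stand-ins (the real routine would use palloc and an actual hash), so only
the allocation pattern should be read into it:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for encrypt_password(): the result is always a
 * newly allocated string, whether or not the input was already encrypted,
 * so the caller can free it unconditionally.  malloc() stands in for
 * palloc(), and the "enc:" prefix is a placeholder, not a real hash.
 */
char *
encrypt_password_sketch(const char *passwd, int already_encrypted)
{
    const char *prefix = already_encrypted ? "" : "enc:";
    size_t      plen;
    size_t      len;
    char       *result;

    assert(passwd != NULL);
    plen = strlen(prefix);
    len = strlen(passwd);

    result = malloc(plen + len + 1);
    if (result == NULL)
        return NULL;

    memcpy(result, prefix, plen);
    memcpy(result + plen, passwd, len + 1);     /* copies the terminator too */
    return result;
}
```

Because the already-encrypted case returns a copy rather than the input
pointer, the caller never has to guess whether freeing the result is safe.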

> * [PATCH 5/8] Create generic routine to fetch password and valid until
> values for a role
>
> Couldn't md5_crypt_verify() be made more general and take the hash type?
>  For instance, password_crypt_verify() with the last param as the new
> password type enum.

This would mean incorporating the whole SASL message exchange into
this routine because the password string is part of the scram
initialization context, and it seems to me that it is better to just
do once a lookup at the entry in pg_authid. So we'd finish with a more
confusing code I am afraid. At least that's the conclusion I came up
with when doing that.. md5_crypt_verify does only the work on a
received password.

> * [PATCH 6/8] Support for SCRAM-SHA-256 authentication
>
> +++ b/contrib/passwordcheck/passwordcheck.c
> +               case PASSWORD_TYPE_SCRAM:
> +                       /* unfortunately not much can be done here */
> +                       break;
>
> Why can't we at least do the same check as md5 to make sure the username
> was not used as the password?

You are right. We could at least check that, so changed the way you suggest.

> +++ b/src/backend/libpq/auth.c
> +        * without relying on the length word, but we hardly care about protocol
> +        * version or older anymore.)
>
> Do you mean protocol version 2 or older?
>
> +++ b/src/backend/libpq/crypt.c
>                 return STATUS_ERROR;    /* empty password */
> +
>
> Looks like a stray LF.

Fixed.

> +++ b/src/backend/parser/gram.y
> +       SAVEPOINT SCHEMA SCRAM SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
>
> Doesn't this belong in patch 7?  Even in patch 7 it doesn't appear that
> SCRAM is a keyword since the protocol specified after USING is quoted.

This is some garbage from a past version. Fixed.

> However, it doesn't look like they can be used in conjunction since the
> pg_hba.conf entry must specify either md5 or scram (though the database
> can easily contain a mixture).  This would probably make a migration
> very unpleasant.

Yep, it uses a given auth-method once user and database match. This is
partially related to the problem to support multiple password
verifiers per users, which was submitted last CF but got rejected
because of a lack of interest, and removed to simplify this patch. You
need as well to think about other things like password and protocol
aging. But well, it is a problem that we don't have to tackle with
this patch...

> Is there any chance of a mixed mode that will allow new passwords to be
> set as scram while still honoring the old md5 passwords? Or does that
> cause too many complications with the protocol?

Hm. That looks complicated to me. This sounds to me like retry logic
for multiple authentication methods, and a different feature. What
you'd be looking for here is a connection parameter to specify a list
of protocols and try them all, no?

And that:
+    * multiple messags sent in both directions. First message is always from

> * [PATCH 7/8] Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE
>
> +++ b/doc/src/sgml/ref/create_role.sgml
> +        Sets the role's password using the wanted protocol.
>
> How about "Sets the role's password using the requested protocol."

Done.

> +        an unencrypted password.   If the presented password string is already
> +        in MD5-encrypted or SCRAM-encrypted format, then it is stored encrypted
> +        as-is.
>
> How about, "If the password string is..."

OK.

> On the whole I find this patch set easier to digest than what was
> submitted for 9.6.  It is more targeted but still provides very valuable
> functionality.

Thanks.

> I'm a bit concerned that a mixture of md5/scram could cause confusion
> and think this may warrant discussion somewhere in the documentation
> since the idea is for users to migrate from md5 to scram.

We could finish with a red warning in the docs to say that users are
recommended to use SCRAM instead of MD5. Just an idea, perhaps that's
not mandatory for the first shot though.
--
Michael

Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 09/26/2016 09:02 AM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
>> However, it doesn't look like they can be used in conjunction since the
>> pg_hba.conf entry must specify either md5 or scram (though the database
>> can easily contain a mixture).  This would probably make a migration
>> very unpleasant.
>
> Yep, it uses a given auth-method once user and database match. This is
> partially related to the problem to support multiple password
> verifiers per users, which was submitted last CF but got rejected
> because of a lack of interest, and removed to simplify this patch. You
> need as well to think about other things like password and protocol
> aging. But well, it is a problem that we don't have to tackle with
> this patch...
>
>> Is there any chance of a mixed mode that will allow new passwords to be
>> set as scram while still honoring the old md5 passwords? Or does that
>> cause too many complications with the protocol?
>
> Hm. That looks complicated to me. This sounds to me like retry logic
> for multiple authentication methods, and a different feature. What
> you'd be looking for here is a connection parameter to specify a list
> of protocols and try them all, no?

It would be possible to have a "md5-or-scram" authentication method in 
pg_hba.conf, such that the server would look up the pg_authid row of the 
user when it receives startup message, and send an MD5 or SCRAM 
challenge depending on which one the user's password is encrypted with. 
It has one drawback though: it allows an unauthenticated user to probe 
if there is a role with a given name in the system, because if a user 
doesn't exist, we'd have to still send an MD5 or SCRAM challenge, or a 
"user does not exist" error without a challenge. If we send a SCRAM 
challenge for a non-existent user, and the attacker knows that most 
users still have an MD5 password, that reveals that the username
most likely doesn't exist.

Hmm. The server could send a SCRAM challenge first, and if the client 
gives an incorrect response, or the username doesn't exist, or the 
user's password is actually MD5-encrypted, the server could then send an 
MD5 challenge. It would add one round-trip to the authentication of MD5 
passwords, but that seems acceptable.
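
As a rough sketch of that decision, assuming the server always starts
with SCRAM: the enum and function names below are made up for
illustration and are not from the patch. The point is that every failure
case takes the same path, so the client cannot tell them apart:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: what the server finds in pg_authid for the role. */
typedef enum
{
    VERIFIER_NONE,              /* role does not exist */
    VERIFIER_MD5,
    VERIFIER_SCRAM
} VerifierKind;

typedef enum
{
    STEP_AUTH_OK,
    STEP_SEND_MD5_CHALLENGE
} NextStep;

/*
 * Decide what to do once the SCRAM exchange has finished.
 * scram_proof_ok is only meaningful when the role has a SCRAM verifier.
 */
NextStep
next_step_after_scram(VerifierKind kind, bool scram_proof_ok)
{
    assert(kind == VERIFIER_NONE || kind == VERIFIER_MD5 ||
           kind == VERIFIER_SCRAM);

    if (kind == VERIFIER_SCRAM && scram_proof_ok)
        return STEP_AUTH_OK;

    /*
     * Wrong proof, unknown role, or MD5-only role: all three fall back to
     * an MD5 challenge, which costs MD5 users one extra round-trip.
     */
    return STEP_SEND_MD5_CHALLENGE;
}
```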

We can do this as a follow-up patch though. Let's try to keep this patch 
series small.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 9/26/16 4:54 AM, Heikki Linnakangas wrote:
> On 09/26/2016 09:02 AM, Michael Paquier wrote:
>> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net>
>> wrote:
>>> However, it doesn't look like they can be used in conjunction since the
>>> pg_hba.conf entry must specify either md5 or scram (though the database
>>> can easily contain a mixture).  This would probably make a migration
>>> very unpleasant.
>>
>> Yep, it uses a given auth-method once user and database match. This is
>> partially related to the problem to support multiple password
>> verifiers per users, which was submitted last CF but got rejected
>> because of a lack of interest, and removed to simplify this patch. You
>> need as well to think about other things like password and protocol
>> aging. But well, it is a problem that we don't have to tackle with
>> this patch...
>>
>>> Is there any chance of a mixed mode that will allow new passwords to be
>>> set as scram while still honoring the old md5 passwords? Or does that
>>> cause too many complications with the protocol?
>>
>> Hm. That looks complicated to me. This sounds to me like retry logic
>> for multiple authentication methods, and a different feature. What
>> you'd be looking for here is a connection parameter to specify a list
>> of protocols and try them all, no?
> 
> It would be possible to have a "md5-or-scram" authentication method in
> pg_hba.conf, such that the server would look up the pg_authid row of the
> user when it receives startup message, and send an MD5 or SCRAM
> challenge depending on which one the user's password is encrypted with.
> It has one drawback though: it allows an unauthenticated user to probe
> if there is a role with a given name in the system, because if a user
> doesn't exist, we'd have to still send an MD5 or SCRAM challenge, or a
> "user does not exist" error without a challenge. If we send a SCRAM
> challenge for a non-existent user, and the attacker knows that most
> users still have an MD5 password, that reveals that the username
> most likely doesn't exist.
> 
> Hmm. The server could send a SCRAM challenge first, and if the client
> gives an incorrect response, or the username doesn't exist, or the
> user's password is actually MD5-encrypted, the server could then send an
> MD5 challenge. It would add one round-trip to the authentication of MD5
> passwords, but that seems acceptable.
> 
> We can do this as a follow-up patch though. Let's try to keep this patch
> series small.

Fair enough.  I'm not even 100% sure we should do it, but wanted to
raise it as a possible issue.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Sep 26, 2016 at 9:22 PM, David Steele <david@pgmasters.net> wrote:
> On 9/26/16 4:54 AM, Heikki Linnakangas wrote:
>> Hmm. The server could send a SCRAM challenge first, and if the client
>> gives an incorrect response, or the username doesn't exist, or the
>> user's password is actually MD5-encrypted, the server could then send an
>> MD5 challenge. It would add one round-trip to the authentication of MD5
>> passwords, but that seems acceptable.

I don't think that this applies just to md5 or scram. Could we for
example use a connection parameter, like expected_auth_methods to do
that? We include that in the startup packet if the caller has defined
it, then the backend checks for matching entries in pg_hba.conf using
the username, database and the expected auth method if specified.
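
A rough sketch of the matching I have in mind, under loud assumptions:
expected_auth_methods is only a proposed parameter name, the struct is a
simplified pg_hba.conf line (real entries also carry connection type and
address), and the list is comma-separated without whitespace:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical view of one pg_hba.conf line. */
typedef struct
{
    const char *database;
    const char *user;
    const char *method;
} HbaLineSketch;

/* Return 1 if "method" appears in the comma-separated list "expected". */
int
method_in_list(const char *method, const char *expected)
{
    size_t      mlen = strlen(method);
    const char *p = expected;

    while (*p != '\0')
    {
        const char *comma = strchr(p, ',');
        size_t      len = comma ? (size_t) (comma - p) : strlen(p);

        if (len == mlen && strncmp(p, method, len) == 0)
            return 1;
        if (comma == NULL)
            break;
        p = comma + 1;
    }
    return 0;
}

/*
 * Return the index of the first line matching database and user whose
 * method is acceptable to the client, or -1 if there is none.
 * expected == NULL means the client did not send the parameter, which
 * keeps today's behavior of taking the first matching line.
 */
int
find_hba_line(const HbaLineSketch *lines, int nlines,
              const char *database, const char *user, const char *expected)
{
    int         i;

    assert(lines != NULL || nlines == 0);
    for (i = 0; i < nlines; i++)
    {
        if (strcmp(lines[i].database, database) != 0 ||
            strcmp(lines[i].user, user) != 0)
            continue;
        if (expected == NULL || method_in_list(lines[i].method, expected))
            return i;
    }
    return -1;
}
```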
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 09/26/2016 09:02 AM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
>> On 9/3/16 8:36 AM, Michael Paquier wrote:
>>>
>>> Attached is a new series:
>
> Thanks for the review and the comments!

I read-through this again, and did a bunch of little fixes:

* Added error-handling for OOM and other errors in libpq
* In libpq, added check that the server sent back the same client-nonce
* Turned ERRORs into COMMERRORs and removed DEBUG4 lines (they could
reveal useful information to an attacker)
* Improved comments
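
For reference, the client-nonce check amounts to something like the
sketch below. RFC 5802 has the server reply with a nonce that begins
with the client's nonce, and the client must verify that prefix. The
function name is made up here, and requiring at least one extra server
character is my reading of the grammar rather than the patch's exact
code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the nonce received in the server's first message starts
 * with the nonce the client sent and appends a non-empty server part.
 */
bool
server_nonce_is_valid(const char *client_nonce, const char *combined_nonce)
{
    size_t      clen;

    assert(client_nonce != NULL && combined_nonce != NULL);
    clen = strlen(client_nonce);

    return strncmp(combined_nonce, client_nonce, clen) == 0 &&
        strlen(combined_nonce) > clen;
}
```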

Some things that need to be resolved (I also added FIXME comments for
some of this):

* A source of random values. This currently uses PostmasterRandom()
similarly to how the MD5 salt is generated, in the server, but plain old
random() in the client. If built with OpenSSL, we should probably use
RAND_bytes(). But what if the client is built without OpenSSL? I believe
the protocol doesn't require cryptographically strong randomness for the
nonces, i.e. it's OK if they're predictable, but they should be
different for each session.

* Nonce and salt lengths. The patch currently uses 10 bytes for both,
but I think I just pulled that number out of thin air. The spec doesn't
say anything about nonce and salt lengths AFAICS. What do other
implementations use? Is 10 bytes enough?

* The spec defines a final "server-error" message that the server sends
on authentication failure, or e.g. if a required extension is not
supported. The patch just uses FATAL for those. Should we try to send a
server-error message instead, or before, the elog(FATAL) ?
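
For context, RFC 5802 defines the error form of the server's final
message as "e=" followed by a value such as invalid-proof,
extensions-not-supported or other-error. A minimal sketch of building
one, with a hypothetical helper name and no claim about how the patch
would wire it in:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write a SCRAM server-final-message carrying an error, "e=<value>",
 * into buf.  Returns 0 on success, -1 if buf is too small.
 */
int
build_scram_server_error(char *buf, size_t buflen, const char *error_value)
{
    size_t      len;

    assert(buf != NULL && error_value != NULL);
    len = strlen(error_value);

    if (buflen < len + 3)       /* "e=" + value + terminator */
        return -1;

    memcpy(buf, "e=", 2);
    memcpy(buf + 2, error_value, len + 1);
    return 0;
}
```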

I'll continue hacking this later, but need a little break for now.

>> I'm a bit concerned that a mixture of md5/scram could cause confusion
>> and think this may warrant discussion somewhere in the documentation
>> since the idea is for users to migrate from md5 to scram.
>
> We could finish with a red warning in the docs to say that users are
> recommended to use SCRAM instead of MD5. Just an idea, perhaps that's
> not mandatory for the first shot though.

Some sort of Migration Guide would certainly be in order. There isn't
any easy migration path with this patch series alone, so perhaps that
should be part of the follow-up patches that add the "MD5 or SCRAM"
authentication method to pg_hba.conf, or support for having both
verifiers for the same user in pg_authid.

- Heikki


Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Sep 27, 2016 at 9:01 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> * Added error-handling for OOM and other errors in libpq
> * In libpq, added check that the server sent back the same client-nonce
> * Turned ERRORs into COMMERRORs and removed DEBUG4 lines (they could reveal
> useful information to an attacker)
> * Improved comments

Thanks!

> * A source of random values. This currently uses PostmasterRandom()
> similarly to how the MD5 salt is generated, in the server, but plain old
> random() in the client. If built with OpenSSL, we should probably use
> RAND_bytes(). But what if the client is built without OpenSSL? I believe the
> protocol doesn't require cryptographically strong randomness for the nonces,
> i.e. it's OK if they're predictable, but they should be different for each
> session.

And what if we just replace PostmasterRandom()? pgcrypto is a useful
source of inspiration here. If the server is built with OpenSSL we use
RAND_bytes all the time. If not, let's use /dev/urandom. If urandom is
not there, we fallback to /dev/random. For WIN32, there is
CryptGenRandom(). This could just be done as an independent patch with
a routine in src/common/ for example to allow both frontend and
backend to use it. Do you think that this is a requirement for this
patch? I think not really for the first shot.
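
A minimal sketch of the non-OpenSSL, non-Windows part of that chain. The
function names are hypothetical, the RAND_bytes() and CryptGenRandom()
branches are left out, and stdio is used only to keep the example short:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Fill buf with len bytes read from the given device; false on failure. */
static bool
read_random_device(const char *path, unsigned char *buf, size_t len)
{
    FILE       *f = fopen(path, "rb");
    size_t      got;

    if (f == NULL)
        return false;

    /* unbuffered, so we do not consume more random bytes than requested */
    setvbuf(f, NULL, _IONBF, 0);
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
}

/* Try /dev/urandom first, then fall back to /dev/random. */
bool
get_random_bytes_sketch(unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (read_random_device("/dev/urandom", buf, len))
        return true;
    return read_random_device("/dev/random", buf, len);
}
```

Since nothing here depends on backend infrastructure, a routine of this
shape could live in src/common/ and be shared by frontend and backend.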

> * Nonce and salt lengths. The patch currently uses 10 bytes for both, but I
> think I just pulled that number out of thin air. The spec doesn't say
> anything about nonce and salt lengths AFAICS. What do other implementations
> use? Is 10 bytes enough?

Good question, but that seems rather short to me now that you mention
it. Mongo has implemented already SCRAM-SHA-1 and they are using 3
uint64 so that's 24 bytes (sasl_scramsha1_client_conversation.cpp for
example). For the salt I am seeing a reference to a string "salt"
only, which is too short.

> * The spec defines a final "server-error" message that the server sends on
> authentication failure, or e.g. if a required extension is not supported.
> The patch just uses FATAL for those. Should we try to send a server-error
> message instead, or before, the elog(FATAL) ?

It seems to me that sending back the error while the context is still
alive, aka before the FATAL would be the way to go. That could be
nicely done with an error callback while the exchange is happening. I
missed that while going through the spec.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 09/27/2016 04:19 PM, Michael Paquier wrote:
> On Tue, Sep 27, 2016 at 9:01 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> * A source of random values. This currently uses PostmasterRandom()
>> similarly to how the MD5 salt is generated, in the server, but plain old
>> random() in the client. If built with OpenSSL, we should probably use
>> RAND_bytes(). But what if the client is built without OpenSSL? I believe the
>> protocol doesn't require cryptographically strong randomness for the nonces,
>> i.e. it's OK if they're predictable, but they should be different for each
>> session.
>
> And what if we just replace PostmasterRandom()? pgcrypto is a useful
> source of inspiration here. If the server is built with OpenSSL we use
> RAND_bytes all the time. If not, let's use /dev/urandom. If urandom is
> not there, we fallback to /dev/random. For WIN32, there is
> CryptGenRandom(). This could just be done as an independent patch with
> a routine in src/common/ for example to allow both frontend and
> backend to use it.

Yeah, if built with OpenSSL, we probably should just always use 
RAND_bytes(). Without OpenSSL, we have to think a bit harder.

The server-side code in the patch is probably good enough. After all, we 
use the same mechanism for the MD5 salt today.

The libpq-side is not. Just calling random() won't do. We haven't needed
random numbers in libpq before, but now we do. Is the pgcrypto
solution portable enough that we can use it in libpq?

> Do you think that this is a requirement for this
> patch? I think not really for the first shot.

We need something for libpq. We can't just call random(), as that's not 
random unless you also do srandom(), and we don't want to do that 
because the application might have a different idea of what the seed 
should be.
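
To make the concern concrete: POSIX specifies that random() without a
prior srandom() behaves as if srandom(1) had been called, so every
unseeded process produces the same sequence. A small sketch, with
function names invented for illustration:

```c
#define _XOPEN_SOURCE 700       /* for random() and srandom() under strict -std modes */

#include <assert.h>
#include <stdlib.h>

/* Return the first value random() produces after seeding with "seed". */
long
first_random_after_seed(unsigned int seed)
{
    srandom(seed);
    return random();
}

/* Two "sessions" starting from the default seed get identical values. */
void
demonstrate_predictability(void)
{
    long        a = first_random_after_seed(1);
    long        b = first_random_after_seed(1);

    assert(a == b);             /* same seed, same nonce material */
}
```

Note that the sketch itself calls srandom(), which is exactly what a
library like libpq should not do behind the application's back.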

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Sep 27, 2016 at 10:42 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The libpq-side is not. Just calling random() won't do. We haven't needed
> random numbers in libpq before, but now we do. Is the pgcrypto solution
> portable enough that we can use it in libpq?

Do you think that urandom would be enough then? The last time I took a
look at that, I saw urandom on all modern platforms even those ones:
OpenBSD, NetBSD, Solaris, SunOS. For Windows the CryptGen stuff would
be nice enough I guess..
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 9/26/16 2:02 AM, Michael Paquier wrote:

> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
>
> Thanks for the review and the comments!
> 
>> I notice that the copyright from pgcrypto/sha1.c was carried over but
>> not the copyright from pgcrypto/sha2.c.  I'm no expert on how this
>> works, but I believe the copyright from sha2.c must be copied over.
> 
> Right, those copyright bits are missing:
> - * AUTHOR: Aaron D. Gifford <me@aarongifford.com>
> [...]
> - * Copyright (c) 2000-2001, Aaron D. Gifford
> The license block being the same, it seems to me that there is no need
> to copy it over. The copyright should be enough.

Looks fine to me.

>> Also, are there any plans to expose these functions directly to the user
>> without loading pgcrypto?  Now that the functionality is in core it
>> seems that would be useful.  In addition, it would make this patch stand
>> on its own rather than just being a building block.
> 
> There have been discussions about avoiding enabling those functions by
> default in the distribution. We'd rather not do that...

OK.

>> * [PATCH 2/8] Move encoding routines to src/common/
>>
>> I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
>> they should be renamed to make them distinct?
> 
> Yes it may be a good idea to rename that, like encode_utils.[c|h] for
> the new files.

I like that better.

>> Couldn't md5_crypt_verify() be made more general and take the hash type?
>>  For instance, password_crypt_verify() with the last param as the new
>> password type enum.
> 
> This would mean incorporating the whole SASL message exchange into
> this routine because the password string is part of the scram
> initialization context, and it seems to me that it is better to just
> do once a lookup at the entry in pg_authid. So we'd finish with a more
> confusing code I am afraid. At least that's the conclusion I came up
> with when doing that.. md5_crypt_verify does only the work on a
> received password.

Ah, yes, I see now.  I missed that when I reviewed patch 6.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 09/26/2016 09:02 AM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
>> * [PATCH 3/8] Switch password_encryption to a enum
>>
>> Does not apply on HEAD (98c2d3332):
>
> Interesting, it works for me on da6c4f6.
>
>> From here on I used 39b691f251 for review and testing.
>> It seems you are keeping on/off for backwards compatibility, shouldn't
>> the default now be "md5"?
>>
>> -#password_encryption = on
>> +#password_encryption = on              # on, off, md5 or plain
>
> That sounds like a good idea, so switched this way.

Committed this patch in the series, to turn password_encryption GUC into 
an enum.

There was one bug in the patch: if a plaintext password was given with 
CREATE/ALTER USER foo PASSWORD 'bar', but password_encryption was 'md5', 
it would incorrectly pass PASSWORD_TYPE_MD5 to the check-password hook. 
That would limit the amount of checking that the hook can do. Fixed 
that. Also edited the docs and comments a little bit, hopefully for the 
better.
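
In other words, the type handed to the hook should describe the string
the user actually supplied, not the password_encryption setting. A
simplified sketch of that classification follows; the enum, the constant
and the function name are hypothetical, and the hex-digit check is an
extra assumption on top of the usual "md5" prefix plus length test:

```c
#include <assert.h>
#include <string.h>

typedef enum
{
    SKETCH_TYPE_PLAINTEXT,
    SKETCH_TYPE_MD5
} SketchPasswordType;

#define SKETCH_MD5_PASSWD_LEN 35    /* "md5" followed by 32 hex digits */

/*
 * Classify the password string as given in CREATE/ALTER USER, before any
 * encryption requested by password_encryption is applied.
 */
SketchPasswordType
password_type_for_hook(const char *password)
{
    assert(password != NULL);

    if (strncmp(password, "md5", 3) == 0 &&
        strlen(password) == SKETCH_MD5_PASSWD_LEN &&
        strspn(password + 3, "0123456789abcdef") == 32)
        return SKETCH_TYPE_MD5;

    return SKETCH_TYPE_PLAINTEXT;
}
```

With this, PASSWORD 'bar' is reported as plaintext even when
password_encryption is 'md5', so the hook can still inspect it.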

Once we get the main SCRAM patch in, we may want to remove the "on" 
alias altogether. We don't promise backwards-compatibility of config 
files or GUC values, and not many people set password_encryption=on 
explicitly anyway, since it's the default. But I kept it now, as there's 
no ambiguity on what "on" means, yet.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 09/26/2016 09:02 AM, Michael Paquier wrote:
>> * [PATCH 2/8] Move encoding routines to src/common/
>> >
>> > I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
>> > they should be renamed to make them distinct?
> Yes it may be a good idea to rename that, like encode_utils.[c|h] for
> the new files.

Looking at these encoding functions, the SCRAM protocol actually uses 
base64 for everything. The hex encoding is only used in the server, to 
encode the StoredKey and ServerKey in pg_authid. So we don't need that 
in the client. It would actually make sense to use base64 for the fields 
in pg_authid, too. Takes less space, and seems more natural for SCRAM 
anyway.
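
For a rough sense of the space difference, assuming 32-byte StoredKey
and ServerKey values (the SHA-256 output size) and padded base64, the
encoded lengths can be computed as below. The helper names are made up
for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of the hex encoding of nbytes raw bytes: two digits per byte. */
size_t
hex_encoded_len(size_t nbytes)
{
    assert(nbytes <= SIZE_MAX / 2);     /* guard against overflow */
    return nbytes * 2;
}

/* Length of padded base64: four characters per (partial) 3-byte group. */
size_t
base64_encoded_len(size_t nbytes)
{
    assert(nbytes <= SIZE_MAX / 2);
    return (nbytes + 2) / 3 * 4;
}
```

Under those assumptions a 32-byte key takes 64 characters in hex and 44
in base64.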

libpq actually has its own implementation of hex encoding and decoding 
already, in fe-exec.c. So if we wanted to use hex-encoding for 
something, we could use that, or if we moved the routines from 
src/backend/utils/encode.c, then we should try to reuse them for the 
purposes of fe-exec.c, too. And libpq already has an implementation of 
the 'escape' encoding, too, in fe-exec.c.  But as I said above, I don't 
think we need to touch any of that.

In summary, I think we only need to move the base64 routines to 
src/common. I'd prefer to be quite surgical in what we put in 
src/common, and avoid moving stuff that's not strictly required by both 
the server and the client.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 09/28/2016 12:53 PM, Heikki Linnakangas wrote:
> On 09/26/2016 09:02 AM, Michael Paquier wrote:
>>> * [PATCH 2/8] Move encoding routines to src/common/
>>>>
>>>> I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
>>>> they should be renamed to make them distinct?
>> Yes it may be a good idea to rename that, like encode_utils.[c|h] for
>> the new files.
>
> Looking at these encoding functions, the SCRAM protocol actually uses
> base64 for everything.

Oh, one more thing. The SCRAM spec says:

> The use of base64 in SCRAM is restricted to the canonical form with
> no whitespace.

Our b64_encode routine does use whitespace, so we can't use it as is for 
SCRAM. As the patch stands, we might never output anything long enough 
to create linefeeds, but let's be tidy. The base64 implementation is 
about 100 lines of code, so perhaps we should just leave 
src/backend/utils/encode.c alone, and make a new copy of the base64 
routines in src/common.
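
Such a copy could look roughly like the sketch below: a base64 encoder
that pads with '=' and never emits line feeds or other whitespace. This
is not the code from encode.c, and b64_encode_nowrap is a hypothetical
name:

```c
#include <assert.h>
#include <stddef.h>

static const char b64_alphabet[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes from src into dst as base64 with '=' padding and no
 * whitespace.  dst must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the encoded length, not counting the terminator.
 */
size_t
b64_encode_nowrap(const unsigned char *src, size_t len, char *dst)
{
    size_t      i = 0;
    size_t      o = 0;

    assert(src != NULL && dst != NULL);

    /* full 3-byte groups become 4 output characters */
    while (i + 3 <= len)
    {
        unsigned int v = ((unsigned int) src[i] << 16) |
            ((unsigned int) src[i + 1] << 8) | src[i + 2];

        dst[o++] = b64_alphabet[(v >> 18) & 0x3f];
        dst[o++] = b64_alphabet[(v >> 12) & 0x3f];
        dst[o++] = b64_alphabet[(v >> 6) & 0x3f];
        dst[o++] = b64_alphabet[v & 0x3f];
        i += 3;
    }

    /* one or two trailing bytes are padded with '=' */
    if (len - i == 1)
    {
        unsigned int v = (unsigned int) src[i] << 16;

        dst[o++] = b64_alphabet[(v >> 18) & 0x3f];
        dst[o++] = b64_alphabet[(v >> 12) & 0x3f];
        dst[o++] = '=';
        dst[o++] = '=';
    }
    else if (len - i == 2)
    {
        unsigned int v = ((unsigned int) src[i] << 16) |
            ((unsigned int) src[i + 1] << 8);

        dst[o++] = b64_alphabet[(v >> 18) & 0x3f];
        dst[o++] = b64_alphabet[(v >> 12) & 0x3f];
        dst[o++] = b64_alphabet[(v >> 6) & 0x3f];
        dst[o++] = '=';
    }

    dst[o] = '\0';
    return o;
}
```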

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Sep 28, 2016 at 7:03 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 09/28/2016 12:53 PM, Heikki Linnakangas wrote:
>>
>> On 09/26/2016 09:02 AM, Michael Paquier wrote:
>>>>
>>>> * [PATCH 2/8] Move encoding routines to src/common/
>>>>>
>>>>>
>>>>> I wonder if it is confusing to have two of encode.h/encode.c.  Perhaps
>>>>> they should be renamed to make them distinct?
>>>
>>> Yes it may be a good idea to rename that, like encode_utils.[c|h] for
>>> the new files.
>>
>>
>> Looking at these encoding functions, the SCRAM protocol actually uses
>> base64 for everything.

OK, I thought that moving everything made more sense for consistency
but let's keep src/common/ as small as possible.

> Oh, one more thing. The SCRAM spec says:
>
>> The use of base64 in SCRAM is restricted to the canonical form with
>> no whitespace.
>
> Our b64_encode routine does use whitespace, so we can't use it as is for
> SCRAM. As the patch stands, we might never output anything long enough to
> create linefeeds, but let's be tidy. The base64 implementation is about 100
> lines of code, so perhaps we should just leave src/backend/utils/encode.c
> alone, and make a new copy of the base64 routines in src/common.

OK, I'll refresh that tomorrow with the rest. Thanks for the commit to
extend password_encryption.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
David Steele
Date:
On 9/28/16 5:25 AM, Heikki Linnakangas wrote:
> 
> Once we get the main SCRAM patch in, we may want to remove the "on"
> alias altogether. We don't promise backwards-compatibility of config
> files or GUC values, and not many people set password_encryption=on
> explicitly anyway, since it's the default.

+1.

-- 
-David
david@pgmasters.net



Re: Password identifiers, protocol aging and SCRAM protocol

From
Stephen Frost
Date:
Heikki, Michael, Magnus,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Tue, Sep 27, 2016 at 10:42 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> > The libpq-side is not. Just calling random() won't do. We haven't needed
> > random numbers in libpq before, but now we do. Is the pgcrypto solution
> > portable enough that we can use it in libpq?
>
> Do you think that urandom would be enough then? The last time I took a
> look at that, I saw urandom on all modern platforms even those ones:
> OpenBSD, NetBSD, Solaris, SunOS. For Windows the CryptGen stuff would
> be nice enough I guess..

Magnus had been working on a patch that, as I recall, he thought was
portable and I believe could be used on both sides.

Magnus, would what you were working on be helpful here...?

Thanks!

Stephen

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Sep 28, 2016 at 8:55 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> Our b64_encode routine does use whitespace, so we can't use it as is for
>> SCRAM. As the patch stands, we might never output anything long enough to
>> create linefeeds, but let's be tidy. The base64 implementation is about 100
>> lines of code, so perhaps we should just leave src/backend/utils/encode.c
>> alone, and make a new copy of the base64 routines in src/common.
>
> OK, I'll refresh that tomorrow with the rest. Thanks for the commit to
> extend password_encryption.

OK, so after more chatting with Heikki, here is a list of TODO items
and a summary of the state of things:
- base64 encoding routines should drop whitespace (' ', \r, \t), and
it would be better to just copy those from the backend's encode.c to
src/common/. No need to move escape and binary things, nor touch
backend's base64 routines.
- No need to move sha1.c to src/common/. Better to just get sha2.c
into src/common/ as we aim at SCRAM-SHA-256.
- random() called in the client is no good. We need something better here.
- The error handling needs to be reworked and should follow the
protocol presented by RFC5802, by sending back e= messages. This needs
a bit of work, not much I think though as the infra is in place in the
core patch.
- Let's discard the md5-or-scram optional thing in pg_hba.conf. This
complicates the error handling protocol.

I am marking this patch as returned with feedback for current CF and
will post a new set soon, moving it to the next CF once I have the new
set of patches ready for posting.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Sep 29, 2016 at 12:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Sep 28, 2016 at 8:55 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>>> Our b64_encode routine does use whitespace, so we can't use it as is for
>>> SCRAM. As the patch stands, we might never output anything long enough to
>>> create linefeeds, but let's be tidy. The base64 implementation is about 100
>>> lines of code, so perhaps we should just leave src/backend/utils/encode.c
>>> alone, and make a new copy of the base64 routines in src/common.
>>
>> OK, I'll refresh that tomorrow with the rest. Thanks for the commit to
>> extend password_encryption.
>
> OK, so after more chatting with Heikki, here is a list of TODO items
> and a summary of the state of things:
> - base64 encoding routines should drop whitespace (' ', \r, \t), and
> it would be better to just copy those from the backend's encode.c to
> src/common/. No need to move escape and binary things, nor touch
> backend's base64 routines.
> - No need to move sha1.c to src/common/. Better to just get sha2.c
> into src/common/ as we aim at SCRAM-SHA-256.
> - random() called in the client is no good. We need something better here.
> - The error handling needs to be reworked and should follow the
> protocol presented by RFC5802, by sending back e= messages. This needs
> a bit of work, not much I think though as the infra is in place in the
> core patch.
> - Let's discard the md5-or-scram optional thing in pg_hba.conf. This
> complicates the error handling protocol.
>
> I am marking this patch as returned with feedback for current CF and
> will post a new set soon, moving it to the next CF once I have the new
> set of patches ready for posting.

And so we are back on that, with a new set:
- 0001, introducing pg_strong_random() in src/port/ to have the
backend portion of SCRAM use it instead of random(). This patch is
from Magnus who has kindly sent it to me, so the authorship goes to
him. This patch replaces at the same time PostmasterRandom() with it,
this way once SCRAM gets integrated both the frontend and the backend
finish using the same facility. I think that's good for consistency.
Compared to the version Magnus has sent me, I have changed two things:
-- Reading from /dev/urandom and /dev/random is not influenced by
EINTR. read() handling is also made better in case of partial reads
from a given source.
-- Win32 Crypto routines use MS_DEF_PROV instead of NULL. I think it is
better not to leave the choice of the encryption source to the user
here.
- 0002, moving all the SHA2 functions to src/common/. As mentioned
upthread, this keeps the amount of code moved to src/common/ to a
minimum. I have been careful to get the header files and copyright
mentions into a correct shape at the same time. I have moved a couple
of code blocks in a shape that make a bit more sense, not sure how you
feel about that, Heikki.
- 0003, creating a set of base64 routines without whitespace handling.
That's more or less a copy of what is in encode.c, simplified for
SCRAM. At the same time I have prefixed the routines with pg_ to make
a difference with what is in encode.c.
- 0004 does some refactoring regarding encrypted passwords in user.c
- 0005 creates a generic routine to fetch password and valid until
values for a role
- 0006 adds support for SCRAM-SHA-256. I have not yet addressed the
concerns regarding the handling of e= messages. I have fixed the
nonce generation, which used random(), though.
- 0007 adds the extension for CREATE ROLE .. PASSWORD foo USING protocol
- 0008 is a basic set of regression tests to test passwords.
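
For reference, the EINTR and partial-read handling mentioned for 0001 could
look roughly like the sketch below. This is not the code from the patch:
read_random_bytes() and its signature are made up for illustration, and only
the POSIX open()/read()/close() calls are assumed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the patch's code): fill buf with len bytes read
 * from the given device, e.g. "/dev/urandom".  Returns false on failure so
 * that the caller decides how to report the problem.
 */
bool
read_random_bytes(const char *path, void *buf, size_t len)
{
	char	   *p = buf;
	int			fd;

	assert(path != NULL && buf != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;

	while (len > 0)
	{
		ssize_t		res = read(fd, p, len);

		if (res < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted by a signal, just retry */
			close(fd);
			return false;
		}
		if (res == 0)
		{
			/* unexpected end of file on a random device */
			close(fd);
			return false;
		}

		/* possibly a partial read: advance and ask for the remainder */
		p += res;
		len -= (size_t) res;
	}

	close(fd);
	return true;
}
```

The loop only gives up on a real error or an unexpected end of file; a short
read simply moves the write position forward.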

To be honest, I have now put some love into 0001~0004, but less in the
rest. The first refactoring patches are going to be subject to enough
comments I guess :) I'll put more love into 0005~ in the next couple
of days though while reworking the message interface.

Thanks,
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 10/12/2016 11:11 AM, Michael Paquier wrote:
> And so we are back on that, with a new set:

Great! I'm looking at this first one for now:

> - 0001, introducing pg_strong_random() in src/port/ to have the
> backend portion of SCRAM use it instead of random(). This patch is
> from Magnus who has kindly sent it to me, so the authorship goes to
> him. This patch replaces at the same time PostmasterRandom() with it,
> this way once SCRAM gets integrated both the frontend and the backend
> finish using the same facility. I think that's good for consistency.
> Compared to the version Magnus has sent me, I have changed two things:
> -- Reading from /dev/urandom and /dev/random is not influenced by
> EINTR. read() handling is also made better in case of partial reads
> from a given source.
> -- Win32 Crypto routines use MS_DEF_PROV instead of NULL. I think it is
> better not to leave the choice of the encryption source to the user
> here.

I spent some time whacking that around:

* Renamed the file to src/port/pg_strong_random.c. "pgsrandom" makes me
think of srandom(), which this isn't.

* Changed pg_strong_random() to return false on error, and let the 
callers handle errors. That's more error-prone than throwing an error in 
the function itself, as it's an easy mistake to forget to check for the 
return value, but we can't just "exit(1)" if called in the frontend. If 
it gets called from libpq during authentication, as it will with SCRAM, 
we want to close the connection and report an error, not exit the whole 
user application. Likewise, in postmaster, if we fail to generate a 
query cancel key when forking a backend, we don't want to FATAL and shut 
down the whole postmaster.

* There used to be this:

>         /*
> -        * Precompute password salt values to use for this connection. It's
> -        * slightly annoying to do this long in advance of knowing whether we'll
> -        * need 'em or not, but we must do the random() calls before we fork, not
> -        * after.  Else the postmaster's random sequence won't get advanced, and
> -        * all backends would end up using the same salt...
> -        */
> -       RandomSalt(port->md5Salt, sizeof(port->md5Salt));

But that whole business of advancing postmaster's random sequence is 
moot now. So I moved the generation of md5 salt from postmaster to where 
MD5 authentication is performed.

* This comment in postmaster.c was wrong:

> @@ -581,7 +571,7 @@ PostmasterMain(int argc, char *argv[])
>       * Note: the seed is pretty predictable from externally-visible facts such
>       * as postmaster start time, so avoid using random() for security-critical
>       * random values during postmaster startup.  At the time of first
> -     * connection, PostmasterRandom will select a hopefully-more-random seed.
> +     * connection, pg_strong_random will select a hopefully-more-random seed.
>       */
>      srandom((unsigned int) (MyProcPid ^ MyStartTime));

We don't use pg_strong_random() for that, the same PID+timestamp method 
is still used as before. Adjusted the comment to reflect reality.

* Added "#include <Wincrypt.h>", for the CryptAcquireContext and
CryptGenRandom functions. It compiled OK without that, so I guess it got
pulled in via some other header file, but it seems clearer and more
future-proof to #include it directly.

* random comment kibitzing (no pun intended).

This is pretty much ready for commit now, IMO, but please do review one 
more time. And I do have some small questions still:

* We now open and close /dev/(u)random on every pg_strong_random() call. 
Should we be worried about performance of that?

* Now that we don't call random() in postmaster anymore, is there any 
point in calling srandom() there (i.e. where the above incorrect comment 
was)? Should we remove it? random() might be used by pre-loaded 
extensions, though. (Hopefully not for cryptographic purposes.)

* Should we backport this? Sorry if we discussed that already, but I 
don't remember.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 10/14/2016 03:08 PM, Heikki Linnakangas wrote:
> I spent some time whacking that around:

Sigh, forgot attachment. Here you go.

- Heikki


Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Oct 14, 2016 at 9:08 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 10/12/2016 11:11 AM, Michael Paquier wrote:
> * Changed pg_strong_random() to return false on error, and let the callers
> handle errors. That's more error-prone than throwing an error in the
> function itself, as it's an easy mistake to forget to check for the return
> value, but we can't just "exit(1)" if called in the frontend. If it gets
> called from libpq during authentication, as it will with SCRAM, we want to
> close the connection and report an error, not exit the whole user
> application. Likewise, in postmaster, if we fail to generate a query cancel
> key when forking a backend, we don't want to FATAL and shut down the whole
> postmaster.

Okay for this one. Indeed that's a cleaner interface.
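
To make the "return false and let the caller decide" contract concrete, here
is a minimal sketch. Everything in it is hypothetical: strong_random_fn only
mirrors the bool-returning shape described above, the two sources are
stand-ins so the failure path can be exercised, and assign_cancel_key() is
not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the bool-returning shape of pg_strong_random(). */
typedef bool (*strong_random_fn) (void *buf, size_t len);

/* Illustrative source that always fails. */
bool
failing_source(void *buf, size_t len)
{
	(void) buf;
	(void) len;
	return false;
}

/* Illustrative source that always succeeds; obviously NOT random. */
bool
fixed_source(void *buf, size_t len)
{
	memset(buf, 0x5A, len);
	return true;
}

/*
 * Hypothetical caller: log the failure and hand it back, so that only the
 * one connection being set up is dropped, instead of calling exit(1) here.
 */
bool
assign_cancel_key(strong_random_fn source, int32_t *cancel_key)
{
	assert(source != NULL && cancel_key != NULL);

	if (!source(cancel_key, sizeof(*cancel_key)))
	{
		fprintf(stderr, "could not generate random cancel key\n");
		return false;
	}
	return true;
}
```

The cost of this design is the one already mentioned: every caller has to
remember to check the return value.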

> This is pretty much ready for commit now, IMO, but please do review one more
> time.

OK, I had another look and the patch looks to be in pretty good shape
seen from here.

-   MyCancelKey = PostmasterRandom();
+   if (!pg_strong_random(&MyCancelKey, sizeof(MyCancelKey)))
+   {
+       rw->rw_crashed_at = GetCurrentTimestamp();
+       return false;
+   }
It would be nice to LOG an entry here for bgworkers.

+               /*
+                * fork failed, fall through to report -- actual error message was
+                * logged by StartAutoVacWorker
+                */
Since you created a new block, the first line gets longer than 80 characters.

> * We now open and close /dev/(u)random on every pg_strong_random() call.
> Should we be worried about performance of that?

Actually I have hacked up a small program that can be used to compare
using /dev/urandom with random() calls (this emulates RandomSalt), and
opening/closing /dev/urandom causes a performance hit, but the
difference becomes noticeable with loop calls higher than 10k on my
Linux laptop. I recall that /dev/urandom is quite slow on Linux
compared to other platforms still... So for a single call per
connection attempt we won't actually notice it much. I am just
attaching that if you want to play with it, and you can use it as
follows:
./calc [dev|random] nbytes loops
That's really a quick hack but it does the job if you worry about the
performance.

> * Now that we don't call random() in postmaster anymore, is there any point
> in calling srandom() there (i.e. where the above incorrect comment was)?
> Should we remove it? random() might be used by pre-loaded extensions,
> though. (Hopefully not for cryptographic purposes.)

That's the business of the maintainers of such modules, so my heart is
telling me to rip it off, but my mind tells me that there is no point
in making them unhappy either if they rely on it. I'd trust my mind on
this one, other opinions are welcome.

> * Should we backport this? Sorry if we discussed that already, but I don't
> remember.

I think that we discussed quickly the point at last PGCon during the
SCRAM-committee-unofficial meeting, and that we talked about doing
that only for HEAD.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 10/15/2016 04:26 PM, Michael Paquier wrote:
>> * Now that we don't call random() in postmaster anymore, is there any point
>> in calling srandom() there (i.e. where the above incorrect comment was)?
>> Should we remove it? random() might be used by pre-loaded extensions,
>> though. (Hopefully not for cryptographic purposes.)
>
> That's the business of the maintainers of such modules, so my heart is
> telling me to rip it off, but my mind tells me that there is no point
> in making them unhappy either if they rely on it. I'd trust my mind on
> this one, other opinions are welcome.

I kept it for now. Doesn't do any harm either, even if it's unnecessary.

>> * Should we backport this? Sorry if we discussed that already, but I don't
>> remember.
>
> I think that we discussed quickly the point at last PGCon during the
> SCRAM-committee-unofficial meeting, and that we talked about doing
> that only for HEAD.

Ok, committed to HEAD.

Thanks!

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 10/15/2016 04:26 PM, Michael Paquier wrote:
>>>
>>> * Now that we don't call random() in postmaster anymore, is there any
>>> point
>>> in calling srandom() there (i.e. where the above incorrect comment was)?
>>> Should we remove it? random() might be used by pre-loaded extensions,
>>> though. (Hopefully not for cryptographic purposes.)
>>
>>
>> That's the business of the maintainers of such modules, so my heart is
>> telling me to rip it off, but my mind tells me that there is no point
>> in making them unhappy either if they rely on it. I'd trust my mind on
>> this one, other opinions are welcome.
>
>
> I kept it for now. Doesn't do any harm either, even if it's unnecessary.
>
>>> * Should we backport this? Sorry if we discussed that already, but I
>>> don't
>>> remember.
>>
>>
>> I think that we discussed quickly the point at last PGCon during the
>> SCRAM-committee-unofficial meeting, and that we talked about doing
>> that only for HEAD.
>
>
> Ok, committed to HEAD.

You removed the part of pgcrypto in charge of randomness, nice move. I
was wondering what to do with the perfc and the unix_std code at some
point, and ripping them out as you did is fine for me.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 10/17/2016 12:18 PM, Michael Paquier wrote:
> You removed the part of pgcrypto in charge of randomness, nice move. I
> was wondering what to do with the perfc and the unix_std code at some
> point, and ripping them out as you did is fine for me.

Yeah. I didn't understand the need for the perfc stuff. Are there 
Windows systems that don't have the Crypto APIs? I doubt it, but the 
buildfarm will tell us in a moment if there are.

And if we don't have a good source of randomness like /dev/random, I 
think it's better to fail, than try to collect entropy ourselves (which 
is what unix_std did). If there's a platform where that doesn't work, 
someone will hopefully send us a patch, rather than silently fall back 
to an iffy implementation.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 10/17/2016 12:27 PM, Heikki Linnakangas wrote:
> On 10/17/2016 12:18 PM, Michael Paquier wrote:
>> You removed the part of pgcrypto in charge of randomness, nice move. I
>> was wondering what to do with the perfc and the unix_std code at some
>> point, and ripping them out as you did is fine for me.
>
> Yeah. I didn't understand the need for the perfc stuff. Are there
> Windows systems that don't have the Crypto APIs? I doubt it, but the
> buildfarm will tell us in a moment if there are.
>
> And if we don't have a good source of randomness like /dev/random, I
> think it's better to fail, than try to collect entropy ourselves (which
> is what unix_std did). If there's a platform where that doesn't work,
> someone will hopefully send us a patch, rather than silently fall back
> to an iffy implementation.

Looks like Tom's old HP-UX box, pademelon, is not happy about this. Does 
(that version of) HP-UX not have /dev/urandom?

I think we're going to need a bit more logging if no randomness source 
is available. What we have now is just "could not generate random query 
cancel key", which isn't very informative. Perhaps we should also call 
pg_strong_random() once at postmaster startup, to check that it works, 
instead of starting up but not accepting any connections.

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Oct 17, 2016 at 6:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Ok, committed to HEAD.

Attached is a rebased patch set for SCRAM, with the following things:
- 0001, moving all the SHA2 functions to src/common/ and introducing a
PG-like interface. No actual changes here.
- 0002, creating a set of base64 routines without whitespace handling.
The previous version had a bug: I missed the point that the backend
version of base64 adds a newline every 76 characters. This is now
removed so that the encoding does not use any whitespace. The
routines are also reworked so that they return -1 in the event of an
error instead of generating an elog by themselves. That will be useful
for SCRAM, which needs to do its own error handling with the e=
messages from the server. I think that's cleaner this way. Encoding
does not have any error code paths, but decoding has, so one possible
improvement would be to add an argument holding a string that stores
an error message, to make things easier for callers to debug.
- 0003 does some refactoring regarding encrypted passwords in user.c.
I am pretty happy with this one as well.
- 0004 adds the extension for CREATE ROLE .. PASSWORD foo USING
protocol. I found a bug in this one: CREATE|ALTER ROLE .. PASSWORD
failed to update the given password correctly according to
password_encryption. I am happy with this one. Even if it depends on
0005 in this patch set, it is possible to make it independent of it
and introduce the grammar just for 'plain' and 'md5' first. In previous
sets it was located after SCRAM, but it looks cleaner to get that in
first. I don't think I am going to change that much more now.
- 0005 adds support for SCRAM-SHA-256. There is still some work to do
here, particularly the error handling that requires to be extended
with the e= messages sent back to the client before moving to a
PG-like error code path. Those need to be set in the context of the
SASL message exchange. I noticed as well that this is missing a hell
of a lot of error checks when building the exchange messages, and when
doing encoding and decoding of base64 strings. I'll address that in
the next couple of days.
- 0006 is the basic set of regression tests for passwords. Nothing new
here, they are useful as basic tests when checking the patch. I don't
think that they are worth having committed at the end.
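
As an illustration of the base64 behaviour described for 0002 (no newline
inserted on output, any whitespace rejected on input, -1 returned instead of
raising an error), a minimal sketch follows. The names b64_encode_nows() and
b64_decode_nows() are invented here; this is not the code of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char b64_alphabet[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes; never emits a newline.  dst must have room for
 * 4 * ((len + 2) / 3) characters.  Returns the number of characters written.
 */
int
b64_encode_nows(const unsigned char *src, size_t len, char *dst)
{
	char	   *p = dst;
	size_t		i;

	assert(src != NULL && dst != NULL);

	for (i = 0; i + 2 < len; i += 3)
	{
		unsigned int v = (src[i] << 16) | (src[i + 1] << 8) | src[i + 2];

		*p++ = b64_alphabet[(v >> 18) & 0x3F];
		*p++ = b64_alphabet[(v >> 12) & 0x3F];
		*p++ = b64_alphabet[(v >> 6) & 0x3F];
		*p++ = b64_alphabet[v & 0x3F];
	}
	if (i < len)
	{
		/* one or two trailing bytes, padded with '=' */
		unsigned int v = src[i] << 16;

		if (i + 1 < len)
			v |= src[i + 1] << 8;
		*p++ = b64_alphabet[(v >> 18) & 0x3F];
		*p++ = b64_alphabet[(v >> 12) & 0x3F];
		*p++ = (i + 1 < len) ? b64_alphabet[(v >> 6) & 0x3F] : '=';
		*p++ = '=';
	}
	return (int) (p - dst);
}

/*
 * Decode len characters.  Returns the number of bytes written, or -1 on any
 * invalid input, whitespace included; the caller reports the error.
 */
int
b64_decode_nows(const char *src, size_t len, unsigned char *dst)
{
	unsigned char *p = dst;
	size_t		i;

	assert(src != NULL && dst != NULL);

	if (len % 4 != 0)
		return -1;

	for (i = 0; i < len; i += 4)
	{
		unsigned int v = 0;
		int			pad = 0;
		int			j;

		for (j = 0; j < 4; j++)
		{
			char		c = src[i + j];
			const char *pos;

			if (c == '=')
			{
				/* padding is only valid at the end of the last group */
				if (i + 4 != len || j < 2)
					return -1;
				pad++;
				v <<= 6;
				continue;
			}
			if (pad > 0 || c == '\0' ||
				(pos = strchr(b64_alphabet, c)) == NULL)
				return -1;
			v = (v << 6) | (unsigned int) (pos - b64_alphabet);
		}
		*p++ = (v >> 16) & 0xFF;
		if (pad < 2)
			*p++ = (v >> 8) & 0xFF;
		if (pad < 1)
			*p++ = v & 0xFF;
	}
	return (int) (p - dst);
}
```

An extra output argument carrying an error string, as suggested above, could
be added to the decoder without changing its -1 convention.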
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Oct 18, 2016 at 4:35 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Oct 17, 2016 at 6:18 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Ok, committed to HEAD.
>
> Attached is a rebased patch set for SCRAM, with the following things:
> [...]

And as the PostmasterRandom() patch has been reverted, here is once
again a new set:
- 0001, moving all the SHA2 functions to src/common/ and introducing a
PG-like interface. No actual changes here.
- 0002, replacing PostmasterRandom by pg_strong_random(), with a fix
for the cancel key problem.
- 0003, adding a fallback in pg_strong_random() for any *nix platform
not having /dev/random. This should be grouped with 0002, but I split
it for clarity.
- 0004, Add encoding routines for base64 without whitespace in
src/common/. I improved the error handling here by making them return
-1 in case of error and let the caller handle the error.
- 0005, Refactor decision-making of password encryption into a single routine.
- 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.
- 0007, the SCRAM implementation. I have reworked the error handling
on both the frontend and the backend. In the frontend, there were many
code paths that did not bother much about many sanity checks like
OOMs, so I addressed that as a whole thing. For the backend, in the
event of an error, the backend sends back to the client an e= message
with an error string corresponding to what happened per RFC5802.
Sanity checks of the user data on the server (get the SCRAM verifier,
its validuntil, empty password and the user name itself), are made
part of the message exchange as in case of errors we need to return
errors like e=unknown-user, e=other-error and stuff similar to that.
This makes the code in auth.c slightly cleaner btw.
- 0008 is a set of regression tests.
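
A small sketch of how such e= messages can be produced. The value strings are
taken from the server-error-value list of RFC 5802 (only a few of them are
shown, and the catch-all is spelled "other-error" there); the enum and the
function names are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical internal error codes; only a subset of the RFC list. */
typedef enum
{
	SCRAM_ERR_INVALID_ENCODING,
	SCRAM_ERR_INVALID_PROOF,
	SCRAM_ERR_UNKNOWN_USER,
	SCRAM_ERR_OTHER
} scram_error;

/* Map an internal code to the RFC 5802 server-error-value string. */
const char *
scram_error_value(scram_error err)
{
	switch (err)
	{
		case SCRAM_ERR_INVALID_ENCODING:
			return "invalid-encoding";
		case SCRAM_ERR_INVALID_PROOF:
			return "invalid-proof";
		case SCRAM_ERR_UNKNOWN_USER:
			return "unknown-user";
		case SCRAM_ERR_OTHER:
			break;
	}
	return "other-error";
}

/*
 * Write "e=<value>" into buf.  Returns the message length, or -1 if buf is
 * too small to hold the message and its terminating zero byte.
 */
int
build_scram_error_message(scram_error err, char *buf, size_t buflen)
{
	int			n;

	assert(buf != NULL);

	n = snprintf(buf, buflen, "e=%s", scram_error_value(err));
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	return n;
}
```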

The PostmasterRandom() patch sent in this set contains the fix for
cancel keys that were previously broken. I have also implemented a
fallback method in 0003 inspired by pgcrypto's try_unix_std. It simply
uses gettimeofday() (should be put in the upper loop actually now that
I think about it!), getpid() and random() to generate some randomness,
and then processes the whole through a SHA-256 hash, generating chunks
of random data worth of SHA256_DIGEST_LENGTH bytes. I have not added a
./configure switch for it, but there were voices in favor of that. And
this is not available on Windows (no need to care anyway as there are
crypto APIs). A requirement of this patch is to have the SHA-256
routines in src/common/ first, and this will allow any platform
without /dev/random to generate random numbers like pademelon.

The fallback method for the pg_strong_random() is clearly not ready
for commit, one reason is that libpgport should stand at a level lower
than libpgcommon as far as I understand. But this patch makes
pg_strong_random() in src/port depend on the SHA2 routines in
src/common so it would make more sense if pg_strong_random() is moved
as well to src/common instead of src/port. Honestly I think that we'd
get away better with something like that than trying for example to
reimplement a dependency with PRNG knowing that OpenSSL does it
already, and perhaps better than we could do it.

Thoughts welcome. A lot of bits are independent of that part in the
patch set anyway.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Peter Eisentraut
Date:
The organization of these patches makes sense to me.

On 10/20/16 1:14 AM, Michael Paquier wrote:
> - 0001, moving all the SHA2 functions to src/common/ and introducing a
> PG-like interface. No actual changes here.

That's probably alright, although the patch contains a lot more changes
than I would imagine for a simple file move.  I'll still have to review
that in detail.

> - 0002, replacing PostmasterRandom by pg_strong_random(), with a fix
> for the cancel key problem.
> - 0003, adding for pg_strong_random() a fallback for any nix platform
> not having /dev/random. This should be grouped with 0002, but I split
> it for clarity.

Also makes sense, but will need more detailed review.  I did not follow
the previous PostmasterRandom issues closely.

> - 0004, Add encoding routines for base64 without whitespace in
> src/common/. I improved the error handling here by making them return
> -1 in case of error and let the caller handle the error.

I don't think we want to have two different copies of base64 routines.
Surely we can make the existing routines do what we want with a
parameter or two about whitespace and line length.

> - 0005, Refactor decision-making of password encryption into a single routine.

It makes sense to factor this out.  We probably don't need the pstrdup
if we just keep the string as is.  (You could make an argument for it if
the input values were const char *.)  We probably also don't need the
pfree.  The Assert(0) can probably be done better.  We usually use
elog() in such cases.

> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.

"protocol" is a weird choice here.  Maybe something like "method" is
better.  The way the USING clause is placed can be confusing.  It's not
clear that it belongs to PASSWORD.  If someone wants to augment another
clause in CREATE ROLE with a secondary argument, then it could get
really confusing.  I'd suggest something to group things together, like
PASSWORD (val USING method).  The method could be an identifier instead
of a string.

Please add an example to the documentation and explain better how this
interacts with the existing ENCRYPTED PASSWORD clause.

> - 0007, the SCRAM implementation.

No documentation about pg_hba.conf changes, so I don't know how to use
this. ;-)

This implements SASL and SCRAM and SHA256.  We need to be clear about
which term we advertise to users.  An explanation in the missing
documentation would probably be a good start.

I would also like to see a test suite that covers the authentication
specifically.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The organization of these patches makes sense to me.
>
> On 10/20/16 1:14 AM, Michael Paquier wrote:
>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>> PG-like interface. No actual changes here.
>
> That's probably alright, although the patch contains a lot more changes
> than I would imagine for a simple file move.  I'll still have to review
> that in detail.

The main point is to know if people are happy with having an interface
of the type pg_sha256_[init|update|finish] to tackle the fact that
core code contains a set of routines that map with some of the OpenSSL
APIs...

>> - 0002, replacing PostmasterRandom by pg_strong_random(), with a fix
>> for the cancel key problem.
>> - 0003, adding for pg_strong_random() a fallback for any nix platform
>> not having /dev/random. This should be grouped with 0002, but I split
>> it for clarity.
>
> Also makes sense, but will need more detailed review.  I did not follow
> the previous PostmasterRandom issues closely.

pademelon does not have /dev/random and /dev/urandom, so the issue is
related to having a fallback method... But Heikki feels that having a
method producing potentially weak keys should not be in
pg_strong_random(). I'd suggest to control that with a ./configure
switch and call it a day. Platforms without any of the four randomness
methods pg_strong_random includes play a dangerous game but...

>> - 0004, Add encoding routines for base64 without whitespace in
>> src/common/. I improved the error handling here by making them return
>> -1 in case of error and let the caller handle the error.
>
> I don't think we want to have two different copies of base64 routines.
> Surely we can make the existing routines do what we want with a
> parameter or two about whitespace and line length.

We could. Though after hacking on that I find it cleaner to copy the
code from encode.c after removing the whitespace handling, as Heikki
has suggested.

>> - 0005, Refactor decision-making of password encryption into a single routine.
>
> It makes sense to factor this out.  We probably don't need the pstrdup
> if we just keep the string as is.  (You could make an argument for it if
> the input values were const char *.)  We probably also don't need the
> pfree.  The Assert(0) can probably be done better.  We usually use
> elog() in such cases.

Hm, OK. Agreed with that.

>> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.
>
> "protocol" is a weird choice here.  Maybe something like "method" is
> better.  The way the USING clause is placed can be confusing.  It's not
> clear that it belongs to PASSWORD.  If someone wants to augment another
> clause in CREATE ROLE with a secondary argument, then it could get
> really confusing.  I'd suggest something to group things together, like
> PASSWORD (val USING method).  The method could be an identifier instead
> of a string.

Why not.

> Please add an example to the documentation and explain better how this
> interacts with the existing ENCRYPTED PASSWORD clause.

Sure.

>> - 0007, the SCRAM implementation.
>
> No documentation about pg_hba.conf changes, so I don't know how to use
> this. ;-)

Oops. I have focused on the code a lot during last rewrite of the
patch and forgot that. I'll think about something.

> This implements SASL and SCRAM and SHA256.  We need to be clear about
> which term we advertise to users.  An explanation in the missing
> documentation would probably be a good start.

pg_hba.conf uses "scram" as keyword, but scram refers to a family of
authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256
(what this patch does). Hence wouldn't it make sense to use
scram_sha256 in pg_hba.conf instead? If for example in the future
there is a SHA-512 version of SCRAM we could switch easily to that and
define scram_sha512.

There is also the channel binding to think about... So we could have a
list of keywords perhaps associated with SASL? Imagine for example:
sasl    $algo,$channel_binding
Giving potentially:
sasl    scram_sha256
sasl    scram_sha256,channel
sasl    scram_sha512
sasl    scram_sha512,channel
In the case of the patch of this thread just the first entry would
make sense, once channel binding support is added a second
keyword/option could be added. And there are of course other methods
that could replace SCRAM..

> I would also like to see a test suite that covers the authentication
> specifically.

What you have in mind is a TAP test with a couple of roles and
pg_hba.conf getting rewritten then reloaded? Adding it in
src/test/recovery/ is the first place that comes in mind but that's
not really something related to recovery... Any ideas?
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Victor Wagner
Date:
On Tue, 18 Oct 2016 16:35:27 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:
Hi
> Attached is a rebased patch set for SCRAM, with the following things:
> - 0001, moving all the SHA2 functions to src/common/ and introducing a
> PG-like interface. No actual changes here.

It seems, that client nonce generation in this patch is not
RFC-compliant.

RFC 5802 states that SCRAM nonce should be

a sequence of random printable ASCII     characters excluding ','

while this patch uses a sequence of random bytes from the
pg_strong_random function with a zero byte appended.

This could cause the following problems:

1. If a zero byte happens to occur inside the random sequence, the nonce
would be shorter than expected, or even empty.

2. If one of the bytes happens to be the ASCII code of a comma, then the
server's reply to the client-first message, which includes a copy of the
client nonce followed by the server nonce as one of the unquoted
comma-separated fields, would be parsed incorrectly.


Regards, Victor
-- 
    




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Nov 9, 2016 at 3:13 PM, Victor Wagner <vitus@wagner.pp.ru> wrote:
> On Tue, 18 Oct 2016 16:35:27 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>
>  Hi
>> Attached is a rebased patch set for SCRAM, with the following things:
>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>> PG-like interface. No actual changes here.
>
> It seems, that client nonce generation in this patch is not
> RFC-compliant.
>
> RFC 5802 states that SCRAM nonce should be
>
> a sequence of random printable ASCII
>       characters excluding ','
>
> while this patch uses sequence of random bytes from pg_strong_random
> function with zero byte appended.

(This is about patch 0007, not 0001)
Thanks, you are right. That's not good as-is. So this basically means
that the characters here should be from 32 to 127 included.
generate_nonce needs just to be made smarter in the way it selects the
character bytes.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Victor Wagner
Date:
On Wed, 9 Nov 2016 15:23:11 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:


> 
> (This is about patch 0007, not 0001)
> Thanks, you are right. That's not good as-is. So this basically means
> that the characters here should be from 32 to 127 included.

Really, most important is to exclude comma from the list of allowed
characters. And this prevents us from using a range.

I'd do something like:

const char printables[] = "0123456789ABCDE...xyz!@#*&+";
unsigned int r;
int i;

for (i = 0; i < SCRAM_NONCE_SIZE; i++)
{
    pg_strong_random(&r, sizeof(unsigned int));
    /* -1 is here to exclude the terminating zero byte */
    nonce[i] = printables[r % (sizeof(printables) - 1)];
}

> generate_nonce needs just to be made smarter in the way it selects the
> character bytes.




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Sat, Nov 5, 2016 at 9:36 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> The organization of these patches makes sense to me.
>>
>> On 10/20/16 1:14 AM, Michael Paquier wrote:
>>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>>> PG-like interface. No actual changes here.
>>
>> That's probably alright, although the patch contains a lot more changes
>> than I would imagine for a simple file move.  I'll still have to review
>> that in detail.
>
> The main point is to know if people are happy of having an interface
> of the type pg_sha256_[init|update|finish] to tackle the fact that
> core code contains a set of routines that map with some of the OpenSSL
> APIs...

Or in short that:
+extern void pg_sha256_init(pg_sha256_ctx *ctx);
+extern void pg_sha256_update(pg_sha256_ctx *ctx,
+                       const uint8 *input0, size_t len);
+extern void pg_sha256_final(pg_sha256_ctx *ctx, uint8 *dest);

>>> - 0005, Refactor decision-making of password encryption into a single routine.
>>
>> It makes sense to factor this out.  We probably don't need the pstrdup
>> if we just keep the string as is.  (You could make an argument for it if
>> the input values were const char *.)  We probably also don't need the
>> pfree.  The Assert(0) can probably be done better.  We usually use
>> elog() in such cases.
>
> Hm, OK. Agreed with that.

I have replaced the Assert(0) with an elog(ERROR). OK for the
additional palloc and pfree calls. I had added them only for consistency
across all the password types in the routine, but I have changed it your way.

>>> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.
>>
>> "protocol" is a weird choice here.  Maybe something like "method" is
>> better.  The way the USING clause is placed can be confusing.  It's not
>> clear that it belongs to PASSWORD.  If someone wants to augment another
>> clause in CREATE ROLE with a secondary argument, then it could get
>> really confusing.  I'd suggest something to group things together, like
>> PASSWORD (val USING method).  The method could be an identifier instead
>> of a string.
>
> Why not.

Done.

>> Please add an example to the documentation and explain better how this
>> interacts with the existing ENCRYPTED PASSWORD clause.
>
> Sure.

Done.

>>> - 0007, the SCRAM implementation.
>>
>> No documentation about pg_hba.conf changes, so I don't know how to use
>> this. ;-)
>
> Oops. I have focused on the code a lot during last rewrite of the
> patch and forgot that. I'll think about something.
>
>> This implements SASL and SCRAM and SHA256.  We need to be clear about
>> which term we advertise to users.  An explanation in the missing
>> documentation would probably be a good start.
>
> pg_hba.conf uses "scram" as keyword, but scram refers to a family of
> authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256
> (what this patch does). Hence wouldn't it make sense to use
> scram_sha256 in pg_hba.conf instead? If for example in the future
> there is a SHA-512 version of SCRAM we could switch easily to that and
> define scram_sha512.

OK, I have added more docs regarding the use of scram in pg_hba.conf,
particularly in client-auth.sgml to describe how scram is better than
md5 in terms of protection, and also completed the description of
pg_hba.conf with the new keyword used in it.

>> I would also like to see a test suite that covers the authentication
>> specifically.
>
> What you have in mind is a TAP test with a couple of roles and
> pg_hba.conf getting rewritten then reloaded? Adding it in
> src/test/recovery/ is the first place that comes in mind but that's
> not really something related to recovery... Any ideas?

OK, hearing no complaints I have done exactly that and added a test in
src/test/recovery/ with patch 0009. This place may not be the best fit
though, but it looks like an overkill to add a new module in
src/test/modules just for that and that's a pretty compact test.

On Wed, Nov 9, 2016 at 3:13 PM, Victor Wagner <vitus@wagner.pp.ru> wrote:
> On Tue, 18 Oct 2016 16:35:27 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>> Attached is a rebased patch set for SCRAM, with the following things:
>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>> PG-like interface. No actual changes here.
>
> It seems, that client nonce generation in this patch is not
> RFC-compliant.
>
> RFC 5802 states that SCRAM nonce should be
>
> a sequence of random printable ASCII
>       characters excluding ','
>
> while this patch uses sequence of random bytes from pg_strong_random
> function with zero byte appended.

Right, I have fixed that in 0007 with a solution less exotic than what
you suggested upthread by scanning the ASCII characters between '!'
and '~', ignoring comma if selected.
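
For readers without the attachment, a minimal sketch of that idea in C
could look like the following. This is not the code from 0007:
demo_random_bytes is a hypothetical stand-in for pg_strong_random built on
rand(), so it is NOT cryptographically secure, and demo_generate_nonce is
likewise an illustrative name. Bytes outside '!'..'~' and the comma are
simply thrown away and redrawn, which also avoids the bias a modulo would
introduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for pg_strong_random(); not secure, demo only. */
bool
demo_random_bytes(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t      i;

    for (i = 0; i < len; i++)
        p[i] = (unsigned char) (rand() & 0xFF);
    return true;
}

/*
 * Fill nonce[0..len-1] with random printable ASCII characters other than
 * ',' and zero-terminate it.  The caller must provide len + 1 bytes.
 */
bool
demo_generate_nonce(char *nonce, size_t len)
{
    size_t      count = 0;

    assert(nonce != NULL);
    while (count < len)
    {
        unsigned char c;

        if (!demo_random_bytes(&c, 1))
            return false;
        /* reject anything outside '!'..'~', and the comma itself */
        if (c < '!' || c > '~' || c == ',')
            continue;
        nonce[count++] = (char) c;
    }
    nonce[len] = '\0';
    return true;
}
```

Since no zero byte can be selected, the result is always exactly len
characters long when read as a C string.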
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Fri, Nov 4, 2016 at 11:58 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The organization of these patches makes sense to me.
>
> On 10/20/16 1:14 AM, Michael Paquier wrote:
>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>> PG-like interface. No actual changes here.
>
> That's probably alright, although the patch contains a lot more changes
> than I would imagine for a simple file move.  I'll still have to review
> that in detail.

Even with git diff -M, reviewing 0001 is very difficult.  It does
things that are considerably in excess of what is needed to move these
files from point A to point B, such as:

- Renaming static functions to have a "pg" prefix.
- Changing the order of the functions in the file.
- Renaming an argument called "context" to "cxt".

I think that is a bad plan.  I think we should insist that 0001
content itself with a minimal move of the files changing no more than
is absolutely necessary.  If refactoring is needed, those changes can
be submitted separately, which will be much easier to review.  My
preliminary judgement is that most of this change is pointless and
should be reverted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Nov 15, 2016 at 10:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Nov 4, 2016 at 11:58 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> The organization of these patches makes sense to me.
>>
>> On 10/20/16 1:14 AM, Michael Paquier wrote:
>>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>>> PG-like interface. No actual changes here.
>>
>> That's probably alright, although the patch contains a lot more changes
>> than I would imagine for a simple file move.  I'll still have to review
>> that in detail.
>
> Even with git diff -M, reviewing 0001 is very difficult.  It does
> things that are considerably in excess of what is needed to move these
> files from point A to point B, such as:
>
> - Renaming static functions to have a "pg" prefix.
> - Changing the order of the functions in the file.
> - Renaming an argument called "context" to "cxt".
>
> I think that is a bad plan.  I think we should insist that 0001
> content itself with a minimal move of the files changing no more than
> is absolutely necessary.  If refactoring is needed, those changes can
> be submitted separately, which will be much easier to review.  My
> preliminary judgement is that most of this change is pointless and
> should be reverted.

How do you plug in that with OpenSSL? Are you suggesting to use a set
of undef definitions in the new header in the same way as pgcrypto is
doing, which is rather ugly? Because that's what the deal is about in
this patch.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> How do you plug in that with OpenSSL? Are you suggesting to use a set
> of undef definitions in the new header in the same way as pgcrypto is
> doing, which is rather ugly? Because that's what the deal is about in
> this patch.

Perhaps that justifies renaming them -- although I would think the
fact that they are static would prevent conflicts -- but why reorder
them and change variable names?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> How do you plug in that with OpenSSL? Are you suggesting to use a set
>> of undef definitions in the new header in the same way as pgcrypto is
>> doing, which is rather ugly? Because that's what the deal is about in
>> this patch.
>
> Perhaps that justifies renaming them -- although I would think the
> fact that they are static would prevent conflicts -- but why reorder
> them and change variable names?

Yeah... Perhaps I should not have done that, which was just for
consistency's sake, and even if the new reordering makes more sense
actually...
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Tue, Nov 15, 2016 at 5:12 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> How do you plug in that with OpenSSL? Are you suggesting to use a set
>>> of undef definitions in the new header in the same way as pgcrypto is
>>> doing, which is rather ugly? Because that's what the deal is about in
>>> this patch.
>>
>> Perhaps that justifies renaming them -- although I would think the
>> fact that they are static would prevent conflicts -- but why reorder
>> them and change variable names?
>
> Yeah... Perhaps I should not have done that, which was just for
> consistency's sake, and even if the new reordering makes more sense
> actually...

Yeah, I don't see a point to that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Nov 16, 2016 at 4:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 5:12 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier
>>> <michael.paquier@gmail.com> wrote:
>>>> How do you plug in that with OpenSSL? Are you suggesting to use a set
>>>> of undef definitions in the new header in the same way as pgcrypto is
>>>> doing, which is rather ugly? Because that's what the deal is about in
>>>> this patch.
>>>
>>> Perhaps that justifies renaming them -- although I would think the
>>> fact that they are static would prevent conflicts -- but why reorder
>>> them and change variable names?
>>
>> Yeah... Perhaps I should not have done that, which was just for
>> consistency's sake, and even if the new reordering makes more sense
>> actually...
>
> Yeah, I don't see a point to that.

OK, by doing so here is what I have. The patch generated by
format-patch, as well as the diffs generated by git diff -M, are reduced
and the patch is halved in size. They could be reduced more by adding
at the top of sha2.c a couple of defines to map the old SHAXXX_YYY
variables to their PG_ equivalents, but that does not seem worth it
to me, and diffs are listed line by line.
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Nov 16, 2016 at 1:53 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> Yeah, I don't see a point to that.
>
> OK, by doing so here is what I have. The patch generated by
> format-patch, as well as the diffs generated by git diff -M, are reduced
> and the patch is halved in size. They could be reduced more by adding
> at the top of sha2.c a couple of defines to map the old SHAXXX_YYY
> variables to their PG_ equivalents, but that does not seem worth it
> to me, and diffs are listed line by line.

All right, this version is much easier to review.  I am a bit puzzled,
though.  It looks like src/common will include sha2.o if built without
OpenSSL and sha2_openssl.o if built with OpenSSL.  So far, so good.
One would think, then, that pgcrypto would not need to worry about
these functions any more because libpgcommon_srv.a is linked into the
server, so any references to those symbols would presumably just work.
However, that's not what you did.  On Windows, you added a dependency
on libpgcommon which I think is unnecessary because that stuff is
already linked into the server.  On non-Windows systems, however, you
have instead taught pgcrypto to copy the source file it needs from
src/common and recompile it.  I don't understand why you need to do
any of that, or why it should be different on Windows vs. non-Windows.
So I think that the changes for the pgcrypto Makefile could just look
like this:

diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
index 805db76..ddb0183 100644
--- a/contrib/pgcrypto/Makefile
+++ b/contrib/pgcrypto/Makefile
@@ -1,6 +1,6 @@
 # contrib/pgcrypto/Makefile

-INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
+INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
         fortuna.c random.c pgp-mpi-internal.c imath.c
 INT_TESTS = sha2

And for Mkvcbuild.pm I think you could just do this:

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index de764dd..1993764 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -114,6 +114,15 @@ sub mkvcbuild
       md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
       string.c username.c wait_error.c);

+    if ($solution->{options}->{openssl})
+    {
+        push(@pgcommonallfiles, 'sha2_openssl.c');
+    }
+    else
+    {
+        push(@pgcommonallfiles, 'sha2.c');
+    }
+
     our @pgcommonfrontendfiles = (
         @pgcommonallfiles, qw(fe_memutils.c file_utils.c
           restricted_token.c));
@@ -422,7 +431,7 @@ sub mkvcbuild
     {
         $pgcrypto->AddFiles(
             'contrib/pgcrypto',   'md5.c',
-            'sha1.c',             'sha2.c',
+            'sha1.c',
             'internal.c',         'internal-sha2.c',
             'blf.c',              'rijndael.c',
             'fortuna.c',          'random.c',

Is there some reason that won't work?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
> index 805db76..ddb0183 100644
> --- a/contrib/pgcrypto/Makefile
> +++ b/contrib/pgcrypto/Makefile
> @@ -1,6 +1,6 @@
>  # contrib/pgcrypto/Makefile
>
> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
>          fortuna.c random.c pgp-mpi-internal.c imath.c
>  INT_TESTS = sha2

I would like to do so. And while Linux is happy with that, macOS is
not, this results in linking resolution errors when compiling the
library.

> And for Mkvcbuild.pm I think you could just do this:
>
> diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
> index de764dd..1993764 100644
> --- a/src/tools/msvc/Mkvcbuild.pm
> +++ b/src/tools/msvc/Mkvcbuild.pm
> @@ -114,6 +114,15 @@ sub mkvcbuild
>        md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
>        string.c username.c wait_error.c);
>
> +    if ($solution->{options}->{openssl})
> +    {
> +        push(@pgcommonallfiles, 'sha2_openssl.c');
> +    }
> +    else
> +    {
> +        push(@pgcommonallfiles, 'sha2.c');
> +    }
> +
>      our @pgcommonfrontendfiles = (
>          @pgcommonallfiles, qw(fe_memutils.c file_utils.c
>            restricted_token.c));
> @@ -422,7 +431,7 @@ sub mkvcbuild
>      {
>          $pgcrypto->AddFiles(
>              'contrib/pgcrypto',   'md5.c',
> -            'sha1.c',             'sha2.c',
> +            'sha1.c',
>              'internal.c',         'internal-sha2.c',
>              'blf.c',              'rijndael.c',
>              'fortuna.c',          'random.c',
>
> Is there some reason that won't work?

Yes we could do that for consistency with the other nix platforms. But
is that really necessary as libpgcommon already has those objects?
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Nov 16, 2016 at 6:56 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
>> index 805db76..ddb0183 100644
>> --- a/contrib/pgcrypto/Makefile
>> +++ b/contrib/pgcrypto/Makefile
>> @@ -1,6 +1,6 @@
>>  # contrib/pgcrypto/Makefile
>>
>> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
>> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
>>          fortuna.c random.c pgp-mpi-internal.c imath.c
>>  INT_TESTS = sha2
>
> I would like to do so. And while Linux is happy with that, macOS is
> not, this results in linking resolution errors when compiling the
> library.

Well, I'm running macOS and it worked for me.  TBH, I don't even quite
understand how it could NOT work.  What makes the symbols provided by
libpgcommon any different from any other symbols that are part of the
binary?  How could one set work and the other set fail?  I can
understand how there might be some problem if the backend were
dynamically linked libpgcommon, but it's not.  It's doing this:

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -O2 -Wall -Werror -L../../src/port -L../../src/common
-Wl,-dead_strip_dylibs  -Wall -Werror   access/brin/brin.o [many more
.o files omitted for brevity] utils/fmgrtab.o
../../src/timezone/localtime.o ../../src/timezone/strftime.o
../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a
../../src/common/libpgcommon_srv.a -lm -o postgres

As I understand it, listing the .a file on the linker command line
like that is exactly equivalent to listing out each individual .o file
that is part of that static library.  There shouldn't be any
difference in how a symbol that's provided by one of the .o files
looks vs. how a symbol that's provided by one of the .a files looks.
Let's test it.

[rhaas pgsql]$ nm src/backend/postgres | grep -E 'GetUserIdAndContext|psprintf'
00000001003d71d0 T _GetUserIdAndContext
000000010040f160 T _psprintf

So... how would the dynamic loader know that it was supposed to find
the first one and fail to find the second one?  More to the point,
it's clear that it DOES find the second one on every platform in the
buildfarm, because adminpack, dblink, pageinspect, and pgstattuple all
use psprintf without the push-ups you are proposing to undertake here.
pg_md5_encrypt is used by passwordcheck, and forkname_to_number is
used by pageinspect and pg_prewarm.  It all just works.  No special
magic required.

> Yes we could do that for consistency with the other nix platforms. But
> is that really necessary as libpgcommon already has those objects?

The point is that *postgres* already has those objects.  You don't
need to include them twice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Andres Freund
Date:
Hi,

On 2016-11-16 19:29:41 -0500, Robert Haas wrote:
> On Wed, Nov 16, 2016 at 6:56 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
> >> index 805db76..ddb0183 100644
> >> --- a/contrib/pgcrypto/Makefile
> >> +++ b/contrib/pgcrypto/Makefile
> >> @@ -1,6 +1,6 @@
> >>  # contrib/pgcrypto/Makefile
> >>
> >> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
> >> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
> >>          fortuna.c random.c pgp-mpi-internal.c imath.c
> >>  INT_TESTS = sha2
> >
> > I would like to do so. And while Linux is happy with that, macOS is
> > not, this results in linking resolution errors when compiling the
> > library.
>
> Well, I'm running macOS and it worked for me.  TBH, I don't even quite
> understand how it could NOT work.  What makes the symbols provided by
> libpgcommon any different from any other symbols that are part of the
> binary?  How could one set work and the other set fail?  I can
> understand how there might be some problem if the backend were
> dynamically linked libpgcommon, but it's not.  It's doing this:

With -Wl,--as-needed the linker will dismiss unused symbols found in a
static library. Maybe that's the difference?

Andres



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Nov 16, 2016 at 7:36 PM, Andres Freund <andres@anarazel.de> wrote:
> With -Wl,--as-needed the linker will dismiss unused symbols found in a
> static library. Maybe that's the difference?

The man page entry for --as-needed says that it modifies the behavior
of dynamic libraries, not static ones.  If there is any such effect,
it is undocumented.  Here is the text:

LD> This option affects ELF DT_NEEDED tags for dynamic libraries mentioned
LD> on the command line after the --as-needed option. Normally the linker will
LD> add a DT_NEEDED tag for each dynamic library mentioned on the
LD> command line, regardless of whether the library is actually needed or not.
LD> --as-needed causes a DT_NEEDED tag to only be emitted for a library
LD> that at that point in the link satisfies a non-weak undefined symbol
LD> reference from a regular object file or, if the library is not found in
LD> the DT_NEEDED lists of other needed libraries, a non-weak undefined symbol
LD> reference from another needed dynamic library. Object files or libraries
LD> appearing on the command line after the library in question do not affect
LD> whether the library is seen as needed. This is similar to the rules for
LD> extraction of object files from archives. --no-as-needed restores the
LD> default behaviour.

Some experimentation on my Mac reveals that my previous statement
about how this works was incorrect.  See attached patch for what I
tried.  What I find is:

1. If I create an additional source file in src/common containing a
completely unused symbol (wunk) it appears in the nm output for
libpgcommon_srv.a but not in the nm output for the postgres binary.

2. If I add an additional function to an existing source file in
src/common containing a completely unused symbol (quux) it appears in
the nm output for both libpgcommon_srv.a and also in the nm output for
the postgres binary.

3. If I create an additional source file in src/backend containing a
completely unused symbol (blarfle) it appears in the nm output for the
postgres binary.

So, it seems that the linker is willing to drop archive members if the
entire .o file is unused, but not individual symbols.  That explains why
Michael thinks we need to do something special here, because with his
0001 patch, nothing in the new sha2(_openssl).c file would immediately
be used in the backend.  And indeed I see now that my earlier testing
was done incorrectly, and pgcrypto does in fact fail to build under my
proposal.  Oops.

But I think that's a temporary thing.  As soon as the backend is using
the sha2 routines for anything (which is the point, right?) the build
changes become unnecessary.  For example, if I apply this patch:

--- a/src/backend/lib/binaryheap.c
+++ b/src/backend/lib/binaryheap.c
@@ -305,3 +305,7 @@ sift_down(binaryheap *heap, int node_off)
                node_off = swap_off;
        }
 }
+
+#include "common/sha2.h"
+extern void ugh(void);
+void ugh(void) { pg_sha224_init(NULL); }

...then the backend ends up sucking in everything in sha2.c and the
pgcrypto build works again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Nov 16, 2016 at 6:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, it seems that the linker is willing to drop archive members if the
> entire .o file is unused, but not individual symbols.  That explains why
> Michael thinks we need to do something special here, because with his
> 0001 patch, nothing in the new sha2(_openssl).c file would immediately
> be used in the backend.  And indeed I see now that my earlier testing
> was done incorrectly, and pgcrypto does in fact fail to build under my
> proposal.  Oops.

Ah, thanks! I did not notice that before in configure.in:
if test "$PORTNAME" = "darwin"; then
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,-dead_strip_dylibs], $link_test_func)
elif test "$PORTNAME" = "openbsd"; then
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,-Bdynamic], $link_test_func)
else
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,--as-needed], $link_test_func)
fi

In the current set of patches, the sha2 functions would not get used
until the main patch for SCRAM gets committed so that's a couple of
steps and many months ahead.. And --as-needed/--no-as-needed are not
supported in macos. So I would believe that the best route is just to
use this patch with the way it does things, and once SCRAM gets in we
could switch the build into more appropriate linking. At least that's
far less ugly than having fake objects in the backend code. Of course
a comment in pgcrypto's Makefile would be appropriate.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Nov 16, 2016 at 8:04 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> In the current set of patches, the sha2 functions would not get used
> until the main patch for SCRAM gets committed so that's a couple of
> steps and many months ahead.. And --as-needed/--no-as-needed are not
> supported in macos. So I would believe that the best route is just to
> use this patch with the way it does things, and once SCRAM gets in we
> could switch the build into more appropriate linking. At least that's
> far less ugly than having fake objects in the backend code. Of course
> a comment in pgcrypto's Makefile would be appropriate.

Or a comment with an "ifeq ($(PORTNAME), darwin)" block containing the
additional objects, to make clear that this is specific to OS X only.
Other ideas are welcome.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Robert Haas
Date:
On Wed, Nov 16, 2016 at 11:28 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Nov 16, 2016 at 8:04 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> In the current set of patches, the sha2 functions would not get used
>> until the main patch for SCRAM gets committed so that's a couple of
>> steps and many months ahead.. And --as-needed/--no-as-needed are not
>> supported in macos. So I would believe that the best route is just to
>> use this patch with the way it does things, and once SCRAM gets in we
>> could switch the build into more appropriate linking. At least that's
>> far less ugly than having fake objects in the backend code. Of course
>> a comment in pgcrypto's Makefile would be appropriate.
>
> Or a comment with an "ifeq ($(PORTNAME), darwin)" block containing the
> additional objects, to make clear that this is specific to OS X only.
> Other ideas are welcome.

So, the problem isn't Darwin-specific.  I experimented with this on
Linux and found Linux does the same thing with libpgcommon_srv.a that
macOS does: a file in the archive that is totally unused is omitted
from the postgres binary.  In Linux, however, that doesn't prevent
pgcrypto from compiling anyway.  It does, however, prevent it from
working.  Instead of failing at compile time with a complaint about
missing symbols, it fails at load time.  I think that's because macOS
has -bundle_loader and we use it; without that, I think we'd get the
same behavior on macOS that we get on Windows.

The fundamental problem here is that the archive-member-dropping
behavior that we're getting here is not really what we want, and I
think that's going to happen on most or all architectures.  For GNU
ld, we could add -Wl,--whole-archive, and macOS has -all_load, but I
think that this is just a nest of portability problems waiting to happen.  I
think there are two things we can do here that are far simpler:

1. Rejigger things so that we don't build libpgcommon_srv.a in the
first place, and instead add $(top_builddir)/src/common to
src/backend/Makefile's value of SUBDIRS.  With appropriate adjustments
to src/common/Makefile, this should allow us to include all of the
object files on the linker command line individually instead of
building an archive library that is then used only for the postgres
binary itself anyway.  Then, things wouldn't get dropped.

2. Just postpone committing this patch until we're ready to use the
new code in the backend someplace (or add a dummy reference to it
someplace).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Nov 17, 2016 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, the problem isn't Darwin-specific.  I experimented with this on
> Linux and found Linux does the same thing with libpgcommon_srv.a that
> macOS does: a file in the archive that is totally unused is omitted
> from the postgres binary.  In Linux, however, that doesn't prevent
> pgcrypto from compiling anyway.  It does, however, prevent it from
> working.  Instead of failing at compile time with a complaint about
> missing symbols, it fails at load time.  I think that's because macOS
> has -bundle-loader and we use it; without that, I think we'd get the
> same behavior on macOS that we get on Windows.

Yes, right. I recall seeing the regression tests failing with pgcrypto
when doing that, though I could not recall whether this was specific to
macOS or Linux when I looked again at this patch yesterday. When testing
again yesterday I was able to make the pgcrypto tests pass, but
perhaps my build was not in a clean state...

> 1. Rejigger things so that we don't build libpgcommon_srv.a in the
> first place, and instead add $(top_builddir)/src/common to
> src/backend/Makefile's value of SUBDIRS.  With appropriate adjustments
> to src/common/Makefile, this should allow us to include all of the
> object files on the linker command line individually instead of
> building an archive library that is then used only for the postgres
> binary itself anyway.  Then, things wouldn't get dropped.
>
> 2. Just postpone committing this patch until we're ready to use the
> new code in the backend someplace (or add a dummy reference to it
> someplace).

In the end this refactoring makes sense because it will be used in the
backend with the SCRAM engine, so we could just wait for 2 instead of
having some workarounds. That defers the work for later, and there
will already be a lot of work for the SCRAM core part, though I don't
think that the SHA2 refactoring will change much going forward.

Option 3 would be to do things the way the patch does, aka just compiling
pgcrypto using the source files directly and putting a comment to revert
that once the APIs are used in the backend. I can guess that you don't
like that.
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Nov 18, 2016 at 2:51 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Nov 17, 2016 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> So, the problem isn't Darwin-specific.  I experimented with this on
>> Linux and found Linux does the same thing with libpgcommon_srv.a that
>> macOS does: a file in the archive that is totally unused is omitted
>> from the postgres binary.  In Linux, however, that doesn't prevent
>> pgcrypto from compiling anyway.  It does, however, prevent it from
>> working.  Instead of failing at compile time with a complaint about
>> missing symbols, it fails at load time.  I think that's because macOS
>> has -bundle-loader and we use it; without that, I think we'd get the
>> same behavior on macOS that we get on Windows.
>
> Yes, right. I recall seeing the regression tests failing with pgcrypto
> when doing that, though I could not recall whether this was specific to
> macOS or Linux when I looked again at this patch yesterday. When testing
> again yesterday I was able to make the pgcrypto tests pass, but
> perhaps my build was not in a clean state...
>
>> 1. Rejigger things so that we don't build libpgcommon_srv.a in the
>> first place, and instead add $(top_builddir)/src/common to
>> src/backend/Makefile's value of SUBDIRS.  With appropriate adjustments
>> to src/common/Makefile, this should allow us to include all of the
>> object files on the linker command line individually instead of
>> building an archive library that is then used only for the postgres
>> binary itself anyway.  Then, things wouldn't get dropped.
>>
>> 2. Just postpone committing this patch until we're ready to use the
>> new code in the backend someplace (or add a dummy reference to it
>> someplace).
>
> In the end this refactoring makes sense because it will be used in the
> backend with the SCRAM engine, so we could just wait for 2 instead of
> having some workarounds. That defers the work for later, and there
> will already be a lot of work for the SCRAM core part, though I don't
> think that the SHA2 refactoring will change much going forward.
>
> Option 3 would be to do things the way the patch does, aka just compiling
> pgcrypto using the source files directly and putting a comment to revert
> that once the APIs are used in the backend. I can guess that you don't
> like that.

Nothing more will likely happen in this CF, so I have moved it to
2017-01 with the same status of "Needs Review".
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Nov 29, 2016 at 1:36 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Nothing more will likely happen in this CF, so I have moved it to
> 2017-01 with the same status of "Needs Review".

Attached is a new set of patches using the new routines
pg_backend_random() and pg_strong_random() to handle the randomness in
SCRAM:
- 0001 refactors the SHA2 routines. pgcrypto uses raw files from
src/common when compiling with this patch. That works on any platform,
and this is the simplified version of upthread.
- 0002 adds base64 routines to src/common.
- 0003 does some refactoring regarding the password encryption in
ALTER/CREATE USER queries.
- 0004 adds the clause PASSWORD (val USING method) in CREATE/ALTER USER.
- 0005 is the code patch for SCRAM. Note that this switches pgcrypto
to link to libpgcommon as SHA2 routines are used by the backend.
- 0006 adds some regression tests for passwords.
- 0007 adds some TAP tests for authentication.
This is added to the upcoming CF.

Thanks,
--
Michael

Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 12/07/2016 08:39 AM, Michael Paquier wrote:
> On Tue, Nov 29, 2016 at 1:36 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Nothing more will likely happen in this CF, so I have moved it to
>> 2017-01 with the same status of "Needs Review".
>
> Attached is a new set of patches using the new routines
> pg_backend_random() and pg_strong_random() to handle the randomness in
> SCRAM:
> - 0001 refactors the SHA2 routines. pgcrypto uses raw files from
> src/common when compiling with this patch. That works on any platform,
> and this is the simplified version of upthread.
> - 0002 adds base64 routines to src/common.
> - 0003 does some refactoring regarding the password encryption in
> ALTER/CREATE USER queries.
> - 0004 adds the clause PASSWORD (val USING method) in CREATE/ALTER USER.
> - 0005 is the code patch for SCRAM. Note that this switches pgcrypto
> to link to libpgcommon as SHA2 routines are used by the backend.
> - 0006 adds some regression tests for passwords.
> - 0007 adds some TAP tests for authentication.
> This is added to the upcoming CF.

I spent a little time reading through this once again. Steady progress,
did some small fixes:

* Rewrote the nonce generation. On the server side, it first generated a
string of ASCII-printable characters, then base64-encoded them, which is
superfluous. Also, avoid calling pg_strong_random() one byte at a time,
for performance reasons.

* Added a more sophisticated fallback implementation in libpq, for the
--disable-strong-random cases, similar to pg_backend_random().

* No need to disallow SCRAM with db_user_namespace. It doesn't include
the username in the salt like MD5 does.
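
As an illustration of the nonce change described in the first item, here is a minimal Python sketch (not the patch's C code). It assumes the random source can fill a whole buffer in one call and that the raw bytes are base64-encoded directly. The name make_nonce and the 18-byte length are hypothetical, and secrets.token_bytes() merely stands in for pg_strong_random().

```python
import base64
import secrets

# Assumed raw length; 18 bytes encode to exactly 24 base64 characters
# with no "=" padding.
RAW_NONCE_LEN = 18


def make_nonce(raw_len: int = RAW_NONCE_LEN) -> str:
    """Return a printable nonce built from one bulk request for random bytes."""
    # One call for the whole buffer instead of one call per byte.
    raw = secrets.token_bytes(raw_len)
    # Encode the raw bytes once; no intermediate printable string is needed.
    return base64.b64encode(raw).decode("ascii")
```

The base64 alphabet never contains a comma, so a nonce built this way cannot collide with the "," separator used between SCRAM attributes.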

Attached those here, as add-on patches to your latest patch set. I'll
continue reviewing, but a couple of things caught my eye that you may
want to jump on, in the meanwhile:

On error messages, the spec says:

> o  e: This attribute specifies an error that occurred during
>       authentication exchange.  It is sent by the server in its final
>       message and can help diagnose the reason for the authentication
>       exchange failure.  On failed authentication, the entire server-
>       final-message is OPTIONAL; specifically, a server implementation
>       MAY conclude the SASL exchange with a failure without sending the
>       server-final-message.  This results in an application-level error
>       response without an extra round-trip.  If the server-final-message
>       is sent on authentication failure, then the "e" attribute MUST be
>       included.

Note that it says that the server can send the error message with the e=
attribute, in the *final message*. It's not a valid response in the
earlier state, before sending server-first-message. I think we need to
change the INIT state handling in pg_be_scram_exchange() to not send e=
messages to the client. On an error at that state, it needs to just bail
out without a message. The spec allows that. We can always log the
detailed reason in the server log, anyway.

As Peter E pointed out earlier, the documentation is lacking, on how to
configure MD5 and/or SCRAM. If you put "scram" as the authentication
method in pg_hba.conf, what does it mean? If you have a line for both
"scram" and "md5" in pg_hba.conf, with the same database/user/hostname
combo, what does that mean? Answer: The first one takes effect, the
second one has no effect. Yet the example in the docs now has that,
which is nonsense :-). Hopefully we'll have some kind of a "both"
option, before the release, but in the meanwhile, we need to describe how
this works now in the docs.

- Heikki


Attachment

Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Dec 8, 2016 at 5:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Attached those here, as add-on patches to your latest patch set.

Thanks for looking at it!

> I'll continue reviewing, but a couple of things caught my eye that you may want
> to jump on, in the meanwhile:
>
> On error messages, the spec says:
>
>> o  e: This attribute specifies an error that occurred during
>>       authentication exchange.  It is sent by the server in its final
>>       message and can help diagnose the reason for the authentication
>>       exchange failure.  On failed authentication, the entire server-
>>       final-message is OPTIONAL; specifically, a server implementation
>>       MAY conclude the SASL exchange with a failure without sending the
>>       server-final-message.  This results in an application-level error
>>       response without an extra round-trip.  If the server-final-message
>>       is sent on authentication failure, then the "e" attribute MUST be
>>       included.
>
>
> Note that it says that the server can send the error message with the e=
> attribute, in the *final message*. It's not a valid response in the earlier
> state, before sending server-first-message. I think we need to change the
> INIT state handling in pg_be_scram_exchange() to not send e= messages to the
> client. On an error at that state, it needs to just bail out without a
> message. The spec allows that. We can always log the detailed reason in the
> server log, anyway.

Hmmm. How do we handle the case where the user name does not match
then? The spec gives an error message e= specifically for this case.
If this is taken into account we need to perform sanity checks at
initialization phase I am afraid as the number of iterations and the
salt are part of the verifier. So you mean that just sending out a
normal ERROR message is fine at an earlier step (with *logdetails
filled for the backend)? I just want to be sure I understand what you
mean here.

> As Peter E pointed out earlier, the documentation is lacking, on how to
> configure MD5 and/or SCRAM. If you put "scram" as the authentication method
> in pg_hba.conf, what does it mean? If you have a line for both "scram" and
> "md5" in pg_hba.conf, with the same database/user/hostname combo, what does
> that mean? Answer: The first one takes effect, the second one has no effect.
> Yet the example in the docs now has that, which is nonsense :-). Hopefully
> we'll have some kind of a "both" option, before the release, but in the
> meanwhile, we need to describe how this works now in the docs.

OK, it would be better to add a paragraph in client-auth.sgml
regarding the mapping of the two settings. For the example
pg_hba.conf file, I would have really thought that directly adding a
line with "scram" listed was enough though. Perhaps add a comment saying
that if md5 and scram are both specified, the first one wins where the
user and database names match?
-- 
Michael



Re: Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 12/08/2016 10:18 AM, Michael Paquier wrote:
> On Thu, Dec 8, 2016 at 5:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Attached those here, as add-on patches to your latest patch set.
>
> Thanks for looking at it!
>
>> I'll continue reviewing, but a couple of things caught my eye that you may want
>> to jump on, in the meanwhile:
>>
>> On error messages, the spec says:
>>
>>> o  e: This attribute specifies an error that occurred during
>>>       authentication exchange.  It is sent by the server in its final
>>>       message and can help diagnose the reason for the authentication
>>>       exchange failure.  On failed authentication, the entire server-
>>>       final-message is OPTIONAL; specifically, a server implementation
>>>       MAY conclude the SASL exchange with a failure without sending the
>>>       server-final-message.  This results in an application-level error
>>>       response without an extra round-trip.  If the server-final-message
>>>       is sent on authentication failure, then the "e" attribute MUST be
>>>       included.
>>
>>
>> Note that it says that the server can send the error message with the e=
>> attribute, in the *final message*. It's not a valid response in the earlier
>> state, before sending server-first-message. I think we need to change the
>> INIT state handling in pg_be_scram_exchange() to not send e= messages to the
>> client. On an error at that state, it needs to just bail out without a
>> message. The spec allows that. We can always log the detailed reason in the
>> server log, anyway.
>
> Hmmm. How do we handle the case where the user name does not match
> then? The spec gives an error message e= specifically for this case.

Hmm, interesting. I wonder how/when they imagine that error message to 
be used. I suppose you could send a dummy server-first message, with a 
made-up salt and iteration count, if the user is not found, so that you 
can report that in the server-final message. But that seems 
unnecessarily complicated, compared to just sending the error 
immediately. I could imagine using a dummy server-first message to hide 
whether the user exists, but that argument doesn't hold water if you're 
going to report an "unknown-user" error, anyway.

Actually, we don't give away that information currently. If you try to 
log in with password or MD5 authentication, and the user doesn't exist, 
you get the same error as with an incorrect password. So, I think we do 
need to give the client a made-up salt and iteration count in that case, 
to hide the fact that the user doesn't exist. Furthermore, you can't 
just generate random salt and iteration count, because then you could 
simply try connecting twice, and see if you get the same salt and 
iteration count. We need to deterministically derive the salt from the 
username, so that you get the same salt/iteration count every time you 
try connecting with that username. But it needs to be indistinguishable from a 
random salt, to the client. Perhaps a SHA hash of the username and some 
per-cluster secret value, created by initdb. There must be research 
papers out there on how to do this..

To be really pedantic about that, we should also ward off timing 
attacks, by making sure that the dummy authentication is no 
faster/slower than a real one..
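
The deterministic fake salt discussed above could be sketched as follows. This is only an illustration in Python under stated assumptions: it uses HMAC-SHA-256 keyed with the per-cluster secret rather than a bare SHA hash (my substitution, not something decided in this thread), and the names mock_salt, SALT_LEN and DEFAULT_ITERATIONS as well as their values are hypothetical.

```python
import hashlib
import hmac

SALT_LEN = 10              # assumed salt length in bytes
DEFAULT_ITERATIONS = 4096  # assumed default iteration count


def mock_salt(username: str, cluster_secret: bytes,
              salt_len: int = SALT_LEN) -> bytes:
    """Derive a stable, random-looking salt for a role that does not exist."""
    # Keyed hash: without the secret a client cannot recompute the value,
    # yet the same username always maps to the same salt.
    digest = hmac.new(cluster_secret, username.encode("utf-8"),
                      hashlib.sha256).digest()
    return digest[:salt_len]


def mock_challenge(username: str, cluster_secret: bytes):
    """Salt and iteration count to present when the role lookup fails."""
    return mock_salt(username, cluster_secret), DEFAULT_ITERATIONS
```

Connecting twice with the same unknown username then yields the same salt and iteration count, which is the property asked for above. It does nothing about the timing concern, which would need separate treatment.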

> If this is taken into account we need to perform sanity checks at
> initialization phase I am afraid as the number of iterations and the
> salt are part of the verifier. So you mean that just sending out a
> normal ERROR message is fine at an earlier step (with *logdetails
> filled for the backend)? I just want to be sure I understand what you
> mean here.

That's right, we can send a normal ERROR message. (But not for the 
"user-not-found" case, as discussed above.)

>> As Peter E pointed out earlier, the documentation is lacking, on how to
>> configure MD5 and/or SCRAM. If you put "scram" as the authentication method
>> in pg_hba.conf, what does it mean? If you have a line for both "scram" and
>> "md5" in pg_hba.conf, with the same database/user/hostname combo, what does
>> that mean? Answer: The first one takes effect, the second one has no effect.
>> Yet the example in the docs now has that, which is nonsense :-). Hopefully
>> we'll have some kind of a "both" option, before the release, but in the
>> meanwhile, we need to describe how this works now in the docs.
>
> OK, it would be better to add a paragraph in client-auth.sgml
> regarding the mapping of the two settings. For the example of file in
> postgresql.conf, I would have really thought that adding directly a
> line with "scram" listed was enough though. Perhaps a comment to say
> that if md5 and scram are specified the first one wins where a user
> and database name map?

So, I think this makes no sense:

>  # Allow any user from host 192.168.12.10 to connect to database
> -# "postgres" if the user's password is correctly supplied.
> +# "postgres" if the user's password is correctly supplied and is
> +# using the correct password method.
>  #
>  # TYPE  DATABASE        USER            ADDRESS                 METHOD
>  host    postgres        all             192.168.12.10/32        md5
> +host    postgres        all             192.168.12.10/32        scram

But this is OK:

> +# Same as previous entry, except that the supplied password must be
> +# encrypted with SCRAM-SHA-256.
> +host    all             all             .example.com            scram
> +

Although, currently, the whole pg_hba.conf file in that example is a 
valid file that someone might have on a real server. With the above 
addition, it would not be. You would never have the two lines with the 
same host/database/user combination in pg_hba.conf.

Overall, I think something like this would make sense in the example:

# Allow any user from hosts in the example.com domain to connect to
# any database, if the user's password is correctly supplied.
#
# Most users use SCRAM authentication, but some users use older clients
# that don't support SCRAM authentication, and need to be able to log
# in using MD5 authentication. Such users are put in the @md5users
# group, everyone else must use SCRAM.
#
# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    all             @md5users       .example.com            md5
host    all             all             .example.com            scram
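
To make the "first one takes effect" rule concrete, here is a small Python sketch of first-match rule selection. It is a simplified model, not the server's actual pg_hba.conf parser: address matching is left out, an "@name" entry is modelled as a set of user names loaded from a file, and HbaRule, role_matches and pick_method are hypothetical names.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class HbaRule(NamedTuple):
    database: str
    user: str
    method: str


def role_matches(rule_user: str, role: str,
                 name_files: Dict[str, Set[str]]) -> bool:
    """Match 'all', an '@file' list of names, or one literal role name."""
    if rule_user == "all":
        return True
    if rule_user.startswith("@"):
        return role in name_files.get(rule_user[1:], set())
    return rule_user == role


def pick_method(rules: List[HbaRule], database: str, role: str,
                name_files: Dict[str, Set[str]]) -> Optional[str]:
    """Return the method of the first matching rule; later rules are ignored."""
    for rule in rules:
        if rule.database in ("all", database) and \
                role_matches(rule.user, role, name_files):
            return rule.method
    return None
```

With the two rules of the example above, a member of md5users is offered md5 and everyone else gets scram. With two rules that share the same database and user fields, the second one can never be selected.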

- Heikki




Re: Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/08/2016 10:18 AM, Michael Paquier wrote:
>> Hmmm. How do we handle the case where the user name does not match
>> then? The spec gives an error message e= specifically for this case.
>
> Hmm, interesting. I wonder how/when they imagine that error message to be
> used. I suppose you could send a dummy server-first message, with a made-up
> salt and iteration count, if the user is not found, so that you can report
> that in the server-final message. But that seems unnecessarily complicated,
> compared to just sending the error immediately. I could imagine using a
> dummy server-first message to hide whether the user exists, but that
> argument doesn't hold water if you're going to report an "unknown-user"
> error, anyway.

Directly using an error message would match MD5 and plain, but
this is definitely a new protocol piece, so I'd rather think that once
the client has sent its first message in the exchange, a failure should
be answered with an appropriate SASL error using e=...

> Actually, we don't give away that information currently. If you try to log
> in with password or MD5 authentication, and the user doesn't exist, you get
> the same error as with an incorrect password. So, I think we do need to give
> the client a made-up salt and iteration count in that case, to hide the fact
> that the user doesn't exist. Furthermore, you can't just generate random
> salt and iteration count, because then you could simply try connecting
> twice, and see if you get the same salt and iteration count. We need to
> deterministically derive the salt from the username, so that you get the
> same salt/iteration count every time you try connecting with that username.
> But it needs to be indistinguishable from a random salt, to the client. Perhaps a
> SHA hash of the username and some per-cluster secret value, created by
> initdb. There must be research papers out there on how to do this..

A simple idea would be to use the system ID when generating this fake
salt? That's generated by initdb, once per cluster. I am wondering if
it would be risky to use it for the salt. For the number of iterations
the default number could be used.

> To be really pedantic about that, we should also ward off timing attacks, by
> making sure that the dummy authentication is no faster/slower than a real
> one..

There is one catalog lookup when extracting the verifier from
pg_authid, I'd guess that if we generate a fake verifier things should
get pretty close.

>> If this is taken into account we need to perform sanity checks at
>> initialization phase I am afraid as the number of iterations and the
>> salt are part of the verifier. So you mean that just sending out a
>> normal ERROR message is fine at an earlier step (with *logdetails
>> filled for the backend)? I just want to be sure I understand what you
>> mean here.
>
> That's right, we can send a normal ERROR message. (But not for the
> "user-not-found" case, as discussed above.)

I'd think that the cases where the password is empty or the password
has passed its validity period should be answered with e=other-error. If
the caller sends a SCRAM request, it would be impolite (?) to just
throw an error once the exchange has begun.

> Although, currently, the whole pg_hba.conf file in that example is a valid
> file that someone might have on a real server. With the above addition, it
> would not be. You would never have the two lines with the same
> host/database/user combination in pg_hba.conf.

Okay.
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Dec 8, 2016 at 10:05 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 12/08/2016 10:18 AM, Michael Paquier wrote:
>>> Hmmm. How do we handle the case where the user name does not match
>>> then? The spec gives an error message e= specifically for this case.
>>
>> Hmm, interesting. I wonder how/when they imagine that error message to be
>> used. I suppose you could send a dummy server-first message, with a made-up
>> salt and iteration count, if the user is not found, so that you can report
>> that in the server-final message. But that seems unnecessarily complicated,
>> compared to just sending the error immediately. I could imagine using a
>> dummy server-first message to hide whether the user exists, but that
>> argument doesn't hold water if you're going to report an "unknown-user"
>> error, anyway.
>
> Using directly an error message would map with MD5 and plain, but
> that's definitely a new protocol piece so I'd rather think that using
> e= once the client has sent its first message in the exchange should
> be answered with an appropriate SASL error...
>
>> Actually, we don't give away that information currently. If you try to log
>> in with password or MD5 authentication, and the user doesn't exist, you get
>> the same error as with an incorrect password. So, I think we do need to give
>> the client a made-up salt and iteration count in that case, to hide the fact
>> that the user doesn't exist. Furthermore, you can't just generate random
>> salt and iteration count, because then you could simply try connecting
>> twice, and see if you get the same salt and iteration count. We need to
>> deterministically derive the salt from the username, so that you get the
>> same salt/iteration count every time you try connecting with that username.
>> But it needs to be indistinguishable from a random salt, to the client. Perhaps a
>> SHA hash of the username and some per-cluster secret value, created by
>> initdb. There must be research papers out there on how to do this..
>
> A simple idea would be to use the system ID when generating this fake
> salt? That's generated by initdb, once per cluster. I am wondering if
> it would be risky to use it for the salt. For the number of iterations
> the default number could be used.

I have been thinking more about this part quite a bit, and here is the
most simple thing that we could do while respecting the protocol.
That's more or less what I think you have in mind by re-reading
upthread, but it does not hurt to rewrite the whole flow to be clear:
1) Server gets the startup packet, maps pg_hba.conf and moves on to
the scram authentication code path.
2) Server sends back sendAuthRequest() to request user to provide a
password. This maps to the plain/md5 behavior as no errors would be
issued to user until he has provided a password.
3) Client sends back the password, and the first message with the user name.
4) Server receives it, and checks the data. If a failure happens at
this stage, just ERROR on PG-side without sending back a e= message.
This includes the username-mismatch, empty password and end of
password validity. So we would never use e=unknown-user. This sticks
with what you quoted upthread that the server may end the exchange
before sending the final message.
5) Server sends back the challenge, and client answers back with its
reply to it.

Then enters the final stage of the exchange, at which point the server
would issue its final message that would be e= in case of errors. If
something like an OOM happens, no message would be sent so failing on
an OOM ERROR on PG side would be fine as well.

6) Read final message from client and validate.
7) Issue final message of server.

On failure at steps 6) or 7), an e= message is returned instead of the
final message. Does that look right?

One thing is: when do we look up pg_authid? After receiving the
first message from client or before beginning the exchange? As the
first message from client has the user name, it would make sense to do
the lookup after receiving it, but from PG's perspective it would just
make sense to use the data already present in the startup packet. The
current patch does the latter. What do you think?

By the way, I have pushed the extra patches you sent into this branch:
https://github.com/michaelpq/postgres/tree/scram
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 12/09/2016 05:58 AM, Michael Paquier wrote:
> On Thu, Dec 8, 2016 at 10:05 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Actually, we don't give away that information currently. If you try to log
>>> in with password or MD5 authentication, and the user doesn't exist, you get
>>> the same error as with an incorrect password. So, I think we do need to give
>>> the client a made-up salt and iteration count in that case, to hide the fact
>>> that the user doesn't exist. Furthermore, you can't just generate random
>>> salt and iteration count, because then you could simply try connecting
>>> twice, and see if you get the same salt and iteration count. We need to
>>> deterministically derive the salt from the username, so that you get the
>>> same salt/iteration count every time you try connecting with that username.
>>> But it needs to be indistinguishable from a random salt, to the client. Perhaps a
>>> SHA hash of the username and some per-cluster secret value, created by
>>> initdb. There must be research papers out there on how to do this..
>>
>> A simple idea would be to use the system ID when generating this fake
>> salt? That's generated by initdb, once per cluster. I am wondering if
>> it would be risky to use it for the salt. For the number of iterations
>> the default number could be used.

I think I'd feel better with a completely separate randomly-generated 
value for this. System ID is not too difficult to guess, and there's no 
need to skimp on this. Yes, default number of iterations makes sense.

We cannot completely avoid leaking information through this, 
unfortunately. For example, if you have a user with a non-default number 
of iterations, and an attacker probes that, he'll know that the username 
was valid, because he got back a non-default number of iterations. But 
let's do our best.

> I have been thinking more about this part quite a bit, and here is the
> most simple thing that we could do while respecting the protocol.
> That's more or less what I think you have in mind by re-reading
> upthread, but it does not hurt to rewrite the whole flow to be clear:
> 1) Server gets the startup packet, maps pg_hba.conf and moves on to
> the scram authentication code path.
> 2) Server sends back sendAuthRequest() to request user to provide a
> password. This maps to the plain/md5 behavior as no errors would be
> issued to user until he has provided a password.
> 3) Client sends back the password, and the first message with the user name.
> 4) Server receives it, and checks the data. If a failure happens at
> this stage, just ERROR on PG-side without sending back a e= message.
> This includes the username-mismatch, empty password and end of
> password validity. So we would never use e=unknown-user. This sticks
> with what you quoted upthread that the server may end the exchange
> before sending the final message.

If we want to mimic the current behavior with MD5 authentication, I 
think we need to follow through with the challenge, and only fail in the 
last step, even if we know the password was empty or expired. MD5 
authentication doesn't currently give away that information to the user.

But it's OK to bail out early on OOM, or if the client sends an outright 
broken message. Those don't give away any information on the user account.
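
The behavior described here (follow through with the challenge for unusable accounts, fail only in the last step, but bail out early on a broken message) can be sketched as a tiny state machine. This is a hypothetical Python model, not the backend code: message contents are placeholders, proof checking is reduced to a boolean, and the class and exception names are invented for illustration.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    SALT_SENT = auto()
    FINISHED = auto()


class MalformedMessage(Exception):
    """Abort before the server-final-message; no e= attribute is sent."""


class ScramServerSketch:
    def __init__(self, account_usable: bool):
        self.state = State.INIT
        # Unknown user, empty or expired password: remember it, say nothing.
        self.doomed = not account_usable

    def handle(self, msg: str, proof_ok: bool = False) -> str:
        if self.state is State.INIT:
            # Outright broken input reveals nothing about the account,
            # so an early plain error is acceptable here.
            if not msg.startswith("n,,"):
                raise MalformedMessage("bad client-first-message")
            self.state = State.SALT_SENT
            return "server-first-message"
        if self.state is State.SALT_SENT:
            self.state = State.FINISHED
            # e= is only ever emitted in the server-final-message.
            if self.doomed or not proof_ok:
                return "e=invalid-proof"
            return "v=<server signature>"
        raise RuntimeError("exchange already finished")
```

A doomed exchange and a wrong password produce the same final message in this model, so the reply does not distinguish the two cases.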

> 5) Server sends back the challenge, and client answers back with its
> reply to it.

> Then enters the final stage of the exchange, at which point the server
> would issue its final message that would be e= in case of errors. If
> something like an OOM happens, no message would be sent so failing on
> an OOM ERROR on PG side would be fine as well.

> 6) Read final message from client and validate.
> 7) issue final message of server.
>
> On failure at steps 6) or 7), an e= message is returned instead of the
> final message. Does that look right?

Yep.

> One thing is: when do we look up pg_authid? After receiving the
> first message from client or before beginning the exchange? As the
> first message from client has the user name, it would make sense to do
> the lookup after receiving it, but from PG's perspective it would just
> make sense to use the data already present in the startup packet. The
> current patch does the latter. What do you think?

Let's see what fits the program flow best. Probably best to do it before 
beginning the exchange. I'm hacking on this right now...

> By the way, I have pushed the extra patches you sent into this branch:
> https://github.com/michaelpq/postgres/tree/scram

Thanks! We had a quick chat with Michael, and agreed that we'd hack 
together on that github repository, to avoid stepping on each other's 
toes, and cut rebased patch sets from there to pgsql-hackers every now 
and then.

- Heikki




Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
Couple of things I should write down before I forget:

1. It's a bit cumbersome that the scram verifiers stored in 
pg_authid.rolpassword don't have any clear indication that they're scram 
verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I 
think we should use a "scram-sha-256:" prefix for scram verifiers.

Actually, I think it'd be awfully nice to also prefix plaintext 
passwords with "plain:", but I'm not sure it's worth breaking the 
compatibility, if there are tools out there that peek into rolpassword. 
Thoughts?

2. It's currently not possible to use the plaintext "password" 
authentication method, for a user that has a SCRAM verifier in 
rolpassword. That seems like an oversight. We can't do MD5 
authentication with a SCRAM verifier, but "password" we could.

- Heikki




Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Couple of things I should write down before I forget:
>
> 1. It's a bit cumbersome that the scram verifiers stored in
> pg_authid.rolpassword don't have any clear indication that they're scram
> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
> we should use a "scram-sha-256:" prefix for scram verifiers.

scram-sha-256 would make the most sense to me.

> Actually, I think it'd be awfully nice to also prefix plaintext passwords
> with "plain:", but I'm not sure it's worth breaking the compatibility, if
> there are tools out there that peek into rolpassword. Thoughts?

pgbouncer is the only thing that comes to mind. It looks at pg_shadow
for password values. pg_dump'ing data from pre-10 instances will also
need to adapt. The compatibility with the existing CREATE USER PASSWORD
command looks tricky though, so I am wondering if that's worth the
complication.

> 2. It's currently not possible to use the plaintext "password"
> authentication method, for a user that has a SCRAM verifier in rolpassword.
> That seems like an oversight. We can't do MD5 authentication with a SCRAM
> verifier, but "password" we could.

Yeah, that should be possible...
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 12/09/2016 05:58 AM, Michael Paquier wrote:
>
> One thing is: when do we look up at pg_authid? After receiving the
> first message from client or before beginning the exchange? As the
> first message from client has the user name, it would make sense to do
> the lookup after receiving it, but from PG perspective it would just
> make sense to use the data already present in the startup packet. The
> current patch does the latter. What do you think?

While hacking on this, I came up with the attached refactoring, against 
current master. I think it makes the current code more readable, anyway, 
and it provides a get_role_password() function that SCRAM can use, to 
look up the stored password. (This is essentially the same refactoring 
that was included in the SCRAM patch set, that introduced the 
get_role_details() function.)

Barring objections, I'll go ahead and commit this first.

- Heikki



Attachment

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Dec 09, 2016 at 11:51:45AM +0200, Heikki Linnakangas wrote:
> On 12/09/2016 05:58 AM, Michael Paquier wrote:
> >
> > One thing is: when do we look up at pg_authid? After receiving the
> > first message from client or before beginning the exchange? As the
> > first message from client has the user name, it would make sense to do
> > the lookup after receiving it, but from PG perspective it would just
> > make sense to use the data already present in the startup packet. The
> > current patch does the latter. What do you think?
>
> While hacking on this, I came up with the attached refactoring, against
> current master. I think it makes the current code more readable, anyway, and
> it provides a get_role_password() function that SCRAM can use, to look up
> the stored password. (This is essentially the same refactoring that was
> included in the SCRAM patch set, that introduced the get_role_details()
> function.)
>
> Barring objections, I'll go ahead and commit this first.

Here are some comments.

> @@ -720,12 +721,16 @@ CheckMD5Auth(Port *port, char **logdetail)
>      sendAuthRequest(port, AUTH_REQ_MD5, md5Salt, 4);
>
>      passwd = recv_password_packet(port);
> -
>      if (passwd == NULL)
>          return STATUS_EOF;        /* client wouldn't send password */

This looks like useless noise.

> -    shadow_pass = TextDatumGetCString(datum);
> +    *shadow_pass = TextDatumGetCString(datum);
>
>      datum = SysCacheGetAttr(AUTHNAME, roleTup,
>                              Anum_pg_authid_rolvaliduntil, &isnull);
> @@ -83,100 +83,146 @@ md5_crypt_verify(const char *role, char *client_pass,
>      {
>          *logdetail = psprintf(_("User \"%s\" has an empty password."),
>                                role);
> +        *shadow_pass = NULL;
>          return STATUS_ERROR;    /* empty password */
>      }

Here the password is allocated by text_to_cstring(). That's only 1 byte,
but it should be pfree()'d.
--
Michael

Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 12/09/2016 01:10 PM, Michael Paquier wrote:
> On Fri, Dec 09, 2016 at 11:51:45AM +0200, Heikki Linnakangas wrote:
>> On 12/09/2016 05:58 AM, Michael Paquier wrote:
>>>
>>> One thing is: when do we look up at pg_authid? After receiving the
>>> first message from client or before beginning the exchange? As the
>>> first message from client has the user name, it would make sense to do
>>> the lookup after receiving it, but from PG perspective it would just
>>> make sense to use the data already present in the startup packet. The
>>> current patch does the latter. What do you think?
>>
>> While hacking on this, I came up with the attached refactoring, against
>> current master. I think it makes the current code more readable, anyway, and
>> it provides a get_role_password() function that SCRAM can use, to look up
>> the stored password. (This is essentially the same refactoring that was
>> included in the SCRAM patch set, that introduced the get_role_details()
>> function.)
>>
>> Barring objections, I'll go ahead and commit this first.

Ok, committed.

>> -    shadow_pass = TextDatumGetCString(datum);
>> +    *shadow_pass = TextDatumGetCString(datum);
>>
>>      datum = SysCacheGetAttr(AUTHNAME, roleTup,
>>                              Anum_pg_authid_rolvaliduntil, &isnull);
>> @@ -83,100 +83,146 @@ md5_crypt_verify(const char *role, char *client_pass,
>>      {
>>          *logdetail = psprintf(_("User \"%s\" has an empty password."),
>>                                role);
>> +        *shadow_pass = NULL;
>>          return STATUS_ERROR;    /* empty password */
>>      }
>
> Here the password is allocated by text_to_cstring(). That's only 1 byte,
> but it should be pfree()'d.

Fixed. Thanks, good catch! It doesn't matter in practice as we'll 
disconnect shortly afterwards anyway, but given that the callers pfree() 
other things on error, let's be tidy.

- Heikki




Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
A couple more things that caught my eye while hacking on this:

1. We don't use SASLPrep to scrub usernames and passwords. That's by 
choice, for usernames, because historically in PostgreSQL usernames can 
be stored in any encoding, but SASLPrep assumes UTF-8. We dodge that by 
passing an empty username in the authentication exchange anyway, because 
we always use the username we got from the startup packet. But for 
passwords, I think we need to fix that. The spec is very clear on that:

> Note that implementations MUST either implement SASLprep or disallow
> use of non US-ASCII Unicode codepoints in "str".


2. I think we should check nonces, etc. more carefully, to not contain 
invalid characters. For example, in the server, we use the 
read_attr_value() function to read the client's nonce. Per the spec, the 
nonce should consist of ASCII printable characters, but we will accept 
anything except the comma. That's no trouble to the server, but let's be 
strict.
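
As a rough illustration of the stricter nonce check, here is a small Python sketch. It is not the server's C code, and the function name is made up for this example. RFC 5802 defines the nonce as one or more "printable" characters, that is, ASCII 0x21 to 0x7E excluding the comma.

```python
def is_valid_scram_nonce(nonce: str) -> bool:
    """Return True if `nonce` is non-empty and consists only of printable
    ASCII characters (0x21-0x7E) other than the comma."""
    if not nonce:
        return False
    return all(0x21 <= ord(ch) <= 0x7E and ch != "," for ch in nonce)
```

A space (0x20), a comma, or any non-ASCII character makes the nonce invalid under this rule.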


To summarize, here's the overall TODO list so far:

* Use SASLPrep for passwords.

* Check nonces, etc. to not contain invalid characters.

* Derive mock SCRAM verifier for non-existent users deterministically 
from username.

* Allow plain 'password' authentication for users with a SCRAM verifier 
in rolpassword.

* Throw an error if an "authorization identity" is given. ATM, we just 
ignore it, but seems better to reject the attempt than do something that 
might not be what the client expects.

* Add "scram-sha-256" prefix to SCRAM verifiers stored in 
pg_authid.rolpassword.

Anything else I'm missing?

I've created a wiki page, mostly to host that TODO list, while we hack 
this to completion: 
https://wiki.postgresql.org/wiki/SCRAM_authentication. Feel free to add 
stuff that comes to mind, and remove stuff as you push patches to the 
branch on github.

- Heikki




Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Craig Ringer
Date:
On 12 December 2016 at 22:39, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> * Throw an error if an "authorization identity" is given. ATM, we just
> ignore it, but seems better to reject the attempt than do something that
> might not be what the client expects.

Yeah. That might be an opportunity to make admins' and connection
poolers' lives much happier down the track, but first we'd need a way
of specifying a mapping for the other users a given user is permitted
to masquerade as (like we have for roles and role membership). We have
SET SESSION AUTHORIZATION already, which has all the same benefits and
security problems as allowing connect-time selection of authorization
identity without such a framework. And we have SET ROLE.

ERRORing is the right thing to do here, so we can safely use this
protocol functionality later if we want to allow user masquerading.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Mon, Dec 12, 2016 at 11:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> A couple more things that caught my eye while hacking on this:
>
> 1. We don't use SASLPrep to scrub usernames and passwords. That's by
> choice, for usernames, because historically in PostgreSQL usernames can be
> stored in any encoding, but SASLPrep assumes UTF-8. We dodge that by passing
> an empty username in the authentication exchange anyway, because we always
> use the username we got from the startup packet. But for passwords, I think
> we need to fix that. The spec is very clear on that:
>
>> Note that implementations MUST either implement SASLprep or disallow
>> use of non US-ASCII Unicode codepoints in "str".
>
> 2. I think we should check nonces, etc. more carefully, to not contain
> invalid characters. For example, in the server, we use the read_attr_value()
> function to read the client's nonce. Per the spec, the nonce should consist
> of ASCII printable characters, but we will accept anything except the comma.
> That's no trouble to the server, but let's be strict.
>
> To summarize, here's the overall TODO list so far:
>
> * Use SASLPrep for passwords.
>
> * Check nonces, etc. to not contain invalid characters.
>
> * Derive mock SCRAM verifier for non-existent users deterministically from
> username.
>
> * Allow plain 'password' authentication for users with a SCRAM verifier in
> rolpassword.
>
> * Throw an error if an "authorization identity" is given. ATM, we just
> ignore it, but seems better to reject the attempt than do something that
> might not be what the client expects.
>
> * Add "scram-sha-256" prefix to SCRAM verifiers stored in
> pg_authid.rolpassword.
>
> Anything else I'm missing?
>
> I've created a wiki page, mostly to host that TODO list, while we hack this
> to completion: https://wiki.postgresql.org/wiki/SCRAM_authentication. Feel
> free to add stuff that comes to mind, and remove stuff as you push patches
> to the branch on github.

Based on the current code, I think you have the whole list. I'll try
to look once again at the code to see if I have anything else in mind.
Improving the TAP regression tests is also an item, with SCRAM
authentication support when a plain password is stored.
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Dec 13, 2016 at 10:43 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Dec 12, 2016 at 11:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> A couple more things that caught my eye while hacking on this:

Looking at what we have now, in the branch...

>> * Use SASLPrep for passwords.

SASLPrep is defined here:
https://tools.ietf.org/html/rfc4013
And stringprep is here:
https://tools.ietf.org/html/rfc3454
So that's roughly applying a conversion from the mapping table, taking
into account prohibited, bi-directional, mapping characters, etc. The
spec says that the password should be in unicode. But we cannot be
sure of that, right? Those mapping tables should be likely a separated
thing.. (perl has Unicode::Stringprep::Mapping for example).
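
To make the steps concrete, here is a hedged Python sketch of SASLprep built on the standard library's `stringprep` tables. It is only an illustration of RFC 4013 (map, normalize, prohibit, bidi check), not a proposal for the server code. It assumes the input is already a Unicode string, which is exactly the open question above, and it skips the check for unassigned code points that applies to stored strings.

```python
import stringprep
import unicodedata

# Prohibited-output tables listed in RFC 4013, section 2.3.
_PROHIBITED = (
    stringprep.in_table_c12,      # non-ASCII space characters
    stringprep.in_table_c21_c22,  # ASCII and non-ASCII control characters
    stringprep.in_table_c3,       # private use
    stringprep.in_table_c4,       # non-character code points
    stringprep.in_table_c5,       # surrogate codes
    stringprep.in_table_c6,       # inappropriate for plain text
    stringprep.in_table_c7,       # inappropriate for canonical representation
    stringprep.in_table_c8,       # change display properties / deprecated
    stringprep.in_table_c9,       # tagging characters
)


def saslprep(text: str) -> str:
    """Apply the SASLprep profile to `text`; raise ValueError if prohibited."""
    # 1. Map: non-ASCII spaces become U+0020, "mapped to nothing" is dropped.
    mapped = "".join(
        " " if stringprep.in_table_c12(ch) else ch
        for ch in text
        if not stringprep.in_table_b1(ch)
    )
    # 2. Normalize with NFKC, using the Unicode 3.2 data stringprep is based on.
    normalized = unicodedata.ucd_3_2_0.normalize("NFKC", mapped)
    # 3. Prohibit: reject any character found in the tables above.
    for ch in normalized:
        if any(check(ch) for check in _PROHIBITED):
            raise ValueError("prohibited character U+%04X" % ord(ch))
    # 4. Bidi: a string with right-to-left characters must not contain
    #    left-to-right ones and must start and end with a right-to-left one.
    if any(stringprep.in_table_d1(ch) for ch in normalized):
        if any(stringprep.in_table_d2(ch) for ch in normalized):
            raise ValueError("mixed right-to-left and left-to-right text")
        if not (stringprep.in_table_d1(normalized[0])
                and stringprep.in_table_d1(normalized[-1])):
            raise ValueError("right-to-left text must start and end with RTL")
    return normalized
```

The examples in RFC 4013 give a feel for the effect: a soft hyphen is dropped, so "I<U+00AD>X" becomes "IX", and the Roman numeral character U+2168 is normalized to "IX" as well.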

>> * Check nonces, etc. to not contain invalid characters.

Fixed this one.

>> * Derive mock SCRAM verifier for non-existent users deterministically from
>> username.

You have put in place the facility to allow that. The only thing that
comes to mind to generate something per-cluster is to have
BootStrapXLOG() generate an "authentication secret identifier" with a
uint64 and add that in the control file. Using pg_backend_random()
would be a good idea here.
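
A small Python sketch of that idea, under stated assumptions: the per-cluster secret is stood in for by 8 random bytes (the uint64 in the control file), and the mock salt is taken from HMAC-SHA-256 over the username. The function name, the choice of HMAC and the 10-byte salt length are all assumptions for illustration, not what the branch does.

```python
import hashlib
import hmac
import secrets

# Stand-in for the per-cluster "authentication secret identifier" that
# would be generated once at bootstrap and kept in the control file.
cluster_secret = secrets.token_bytes(8)


def mock_scram_salt(secret: bytes, username: str, length: int = 10) -> bytes:
    """Derive a fake but stable salt for a role that does not exist.

    The same username always yields the same salt on a given cluster, so
    repeated attempts cannot tell a missing role from a real one by
    watching the salt change between connections.
    """
    digest = hmac.new(secret, username.encode("utf-8"), hashlib.sha256).digest()
    return digest[:length]
```

Without the secret, an outsider cannot recompute the mock salt and compare it with what the server sent.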

>> * Allow plain 'password' authentication for users with a SCRAM verifier in
>> rolpassword.

Done.

>> * Throw an error if an "authorization identity" is given. ATM, we just
>> ignore it, but seems better to reject the attempt than do something that
>> might not be what the client expects.

Done.

>> * Add "scram-sha-256" prefix to SCRAM verifiers stored in
>> pg_authid.rolpassword.

You did it.
-- 
Michael



On 12/09/2016 10:19 AM, Michael Paquier wrote:
> On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Couple of things I should write down before I forget:
>>
>> 1. It's a bit cumbersome that the scram verifiers stored in
>> pg_authid.rolpassword don't have any clear indication that they're scram
>> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
>> we should use a "scram-sha-256:" prefix for scram verifiers.
>
> scram-sha-256 would make the most sense to me.
>
>> Actually, I think it'd be awfully nice to also prefix plaintext passwords
>> with "plain:", but I'm not sure it's worth breaking the compatibility, if
>> there are tools out there that peek into rolpassword. Thoughts?
>
> pgbouncer is the only thing that comes to mind. It looks at pg_shadow
> for password values. pg_dump'ing data from pre-10 instances will also
> need to adapt. The compatibility with the existing CREATE USER PASSWORD
> command looks tricky though, so I am wondering if that's worth the
> complication.
>
>> 2. It's currently not possible to use the plaintext "password"
>> authentication method, for a user that has a SCRAM verifier in rolpassword.
>> That seems like an oversight. We can't do MD5 authentication with a SCRAM
>> verifier, but "password" we could.
>
> Yeah, that should be possible...

The tip of the work branch can now do SCRAM authentication, when a user 
has a plaintext password in pg_authid.rolpassword. The reverse doesn't 
work, however: you cannot do plain "password" authentication, when the 
user has a SCRAM verifier in pg_authid.rolpassword. It gets worse: plain 
"password" authentication doesn't check if the string stored in 
pg_authid.rolpassword is a SCRAM authenticator, and treats it as a 
plaintext password, so you can do this:


PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user
 

I think we're going to have more bugs like this, if we don't start to 
explicitly label plaintext passwords as such.

So, let's add "plain:" prefix to plaintext passwords, in 
pg_authid.rolpassword. With that, these would be valid values in 
pg_authid.rolpassword:

plain:foo
md55a962ce7a24371a10e85627a484cac28

scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f

But anything that doesn't begin with "plain:", "md5", or 
"scram-sha-256:" would be invalid. You shouldn't have invalid values in 
the column, but if you do, all the authentication mechanisms would 
reject it.
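
The labelling rule can be sketched in a few lines of Python. This is only an illustration of the proposal, not the server code: the function name is hypothetical, and requiring exactly 35 characters for an MD5 entry is an assumption that mirrors the existing MD5 length check.

```python
def classify_rolpassword(value: str) -> str:
    """Return 'plain', 'md5', 'scram-sha-256' or 'invalid' for a stored
    pg_authid.rolpassword value, going by its prefix alone."""
    if value.startswith("plain:"):
        return "plain"
    if value.startswith("scram-sha-256:"):
        return "scram-sha-256"
    # "md5" followed by 32 hex digits gives 35 characters in total.
    if value.startswith("md5") and len(value) == 35:
        return "md5"
    # Anything unlabelled would be rejected by every authentication method.
    return "invalid"
```

With such a check in place, the plain "password" method would no longer mistake a SCRAM verifier for a plaintext password, because the stored value is never compared as-is unless it carries the "plain:" label.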

It would be nice to also change the format of MD5 passwords to have a 
colon, as in "md5:<hash>", but that's probably not worth breaking 
compatibility for. Almost no-one stores passwords in plaintext, so 
changing the format of that wouldn't affect many people, but there might 
well be tools out there that peek into MD5 hashes.

- Heikki




On Wed, Dec 14, 2016 at 5:51 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The tip of the work branch can now do SCRAM authentication, when a user has
> a plaintext password in pg_authid.rolpassword. The reverse doesn't work,
> however: you cannot do plain "password" authentication, when the user has a
> SCRAM verifier in pg_authid.rolpassword. It gets worse: plain "password"
> authentication doesn't check if the string stored in pg_authid.rolpassword
> is a SCRAM authenticator, and treats it as a plaintext password, so you can
> do this:
>
>
> PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user

This one's fun.

> I think we're going to have more bugs like this, if we don't start to
> explicitly label plaintext passwords as such.
>
> So, let's add "plain:" prefix to plaintext passwords, in
> pg_authid.rolpassword. With that, these would be valid values in
> pg_authid.rolpassword:
>
> [...]
>
> But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:"
> would be invalid. You shouldn't have invalid values in the column, but if
> you do, all the authentication mechanisms would reject it.

I would be tempted to suggest adding the verifier type as a new column
of pg_authid, but as CREATE USER PASSWORD accepts strings with md5
prefix as-is for ages using the "plain:" prefix is definitely a better
plan. My opinion on the matter has changed compared to a couple of
months back.

> It would be nice to also change the format of MD5 passwords to have a colon,
> as in "md5:<hash>", but that's probably not worth breaking compatibility
> for. Almost no-one stores passwords in plaintext, so changing the format of
> that wouldn't affect many people, but there might well be tools out there
> that peek into MD5 hashes.

Yes, let's not take this road.

This work is definitely something that should be done before anything
else. Need a patch or are you on it?
-- 
Michael





On Wed, Dec 14, 2016 at 9:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/09/2016 10:19 AM, Michael Paquier wrote:
>> On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Couple of things I should write down before I forget:
>>>
>>> 1. It's a bit cumbersome that the scram verifiers stored in
>>> pg_authid.rolpassword don't have any clear indication that they're scram
>>> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
>>> we should use a "scram-sha-256:" prefix for scram verifiers.
>>
>> scram-sha-256 would make the most sense to me.
>>
>>> Actually, I think it'd be awfully nice to also prefix plaintext passwords
>>> with "plain:", but I'm not sure it's worth breaking the compatibility, if
>>> there are tools out there that peek into rolpassword. Thoughts?
>>
>> pgbouncer is the only thing that comes to mind. It looks at pg_shadow
>> for password values. pg_dump'ing data from pre-10 instances will also
>> need to adapt. The compatibility with the existing CREATE USER PASSWORD
>> command looks tricky though, so I am wondering if that's worth the
>> complication.
>>
>>> 2. It's currently not possible to use the plaintext "password"
>>> authentication method, for a user that has a SCRAM verifier in rolpassword.
>>> That seems like an oversight. We can't do MD5 authentication with a SCRAM
>>> verifier, but "password" we could.
>>
>> Yeah, that should be possible...
>
> The tip of the work branch can now do SCRAM authentication, when a user has a plaintext password in pg_authid.rolpassword. The reverse doesn't work, however: you cannot do plain "password" authentication, when the user has a SCRAM verifier in pg_authid.rolpassword. It gets worse: plain "password" authentication doesn't check if the string stored in pg_authid.rolpassword is a SCRAM authenticator, and treats it as a plaintext password, so you can do this:
>
> PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user
>
> I think we're going to have more bugs like this, if we don't start to explicitly label plaintext passwords as such.
>
> So, let's add "plain:" prefix to plaintext passwords, in pg_authid.rolpassword. With that, these would be valid values in pg_authid.rolpassword:
>
> plain:foo
> md55a962ce7a24371a10e85627a484cac28
> scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f

I would so like to just drop support for plain passwords completely :) But there's a backwards compatibility issue to think about of course.

But -- is there any actual usecase for them anymore?

If not, another option could be to just specifically check that it's *not* "md5<something>" or "scram-<something>:<something>". That would invalidate plaintext passwords that have those texts in them of course, but what's the likelihood of that in reality?

Though I guess that might at least in theory be more bug-prone, so going with a "plain:" prefix seems like a good idea as well.

> But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:" would be invalid. You shouldn't have invalid values in the column, but if you do, all the authentication mechanisms would reject it.
>
> It would be nice to also change the format of MD5 passwords to have a colon, as in "md5:<hash>", but that's probably not worth breaking compatibility for. Almost no-one stores passwords in plaintext, so changing the format of that wouldn't affect many people, but there might well be tools out there that peek into MD5 hashes.

There are definitely tools that do that, so +1 on leaving that alone.

--
On 12/14/2016 12:15 PM, Michael Paquier wrote:
> This work is definitely something that should be done before anything
> else. Need a patch or are you on it?

I'm on it..

- Heikki




On 12/14/2016 12:27 PM, Magnus Hagander wrote:
> I would so like to just drop support for plain passwords completely :) But
> there's a backwards compatibility issue to think about of course.
>
> But -- is there any actual usecase for them anymore?

Hmm. At the moment, I don't think there is.

But, a password stored in plaintext works with either MD5 or SCRAM, or 
any future authentication mechanism. So as soon as we have SCRAM 
authentication, it becomes somewhat useful again.

In a nutshell:

auth / stored   MD5     SCRAM   plaintext
-----------------------------------------
password        Y       Y       Y
md5             Y       N       Y
scram           N       Y       Y

If a password is stored in plaintext, it can be used with any 
authentication mechanism. And the plaintext 'password' authentication 
mechanism works with any kind of a stored password. But an MD5 hash 
cannot be used with SCRAM authentication, or vice versa.
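
The same matrix as a lookup, in a short Python sketch. The names are made up for illustration; rows are the authentication method from pg_hba.conf and columns are the format of the stored password.

```python
# For each authentication method, the stored formats it can work with.
COMPATIBLE = {
    "password": {"md5", "scram", "plaintext"},
    "md5": {"md5", "plaintext"},
    "scram": {"scram", "plaintext"},
}


def can_authenticate(auth_method: str, stored_format: str) -> bool:
    """Return True if `auth_method` can verify a password stored as
    `stored_format`, following the table above."""
    return stored_format in COMPATIBLE.get(auth_method, set())
```

For example, `can_authenticate("md5", "scram")` is False, while `can_authenticate("password", "scram")` is True.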


I just noticed that the manual for CREATE ROLE says:

> Note that older clients might lack support for the MD5 authentication
> mechanism that is needed to work with passwords that are stored
> encrypted.

That's incorrect. The alternative to MD5 authentication is plain 
'password' authentication, and that works just fine with MD5-hashed 
passwords. I think that sentence is a leftover from when we still 
supported "crypt" authentication (so I actually get to blame you for 
that ;-), commit 53a5026b). Back then, it was true that if an MD5 hash 
was stored in pg_authid, you couldn't do "crypt" authentication. That 
might have left old clients out in the cold.

Now that we're getting SCRAM authentication, we'll need a similar notice 
there again, for the incompatibility of a SCRAM verifier with MD5 
authentication and vice versa.


> If not, another option could be to just specifically check that it's *not*
> "md5<something>" or "scram-<something>:<something>". That would invalidate
> plaintext passwords that have those texts in them of course, but what's the
> likelihood of that in reality?

Hmm, we have dismissed that risk for the MD5 hashes (and we also have a 
length check for them), but as we get new hash formats, the risk 
increases. Someone might well want to use "plain:of:jars" as password. 
Perhaps we should use a more complicated pattern.

I googled around for how others store SCRAM and other password hashes. 
Many other systems seem to have similar naming schemes. The closest 
thing to a standard I could find was:

https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md

Perhaps we should also use something like "$plain$<password>" or 
"$scram-sha-256$<iterations>$<salt>$<key>$"?

There's also https://tools.ietf.org/html/rfc5803, which specifies how to 
store SCRAM verifiers in LDAP. I don't understand enough of LDAP to 
understand what those actually look like, though, and there were no 
examples in the RFC.

I wonder if we should also worry about storing multiple verifiers in 
rolpassword? We don't support that now, but we might in the future. It 
might come in handy, if you could easily store multiple hashes in a single 
string, separated by commas for example.

- Heikki



On 12/14/16 5:15 AM, Michael Paquier wrote:
> I would be tempted to suggest adding the verifier type as a new column
> of pg_authid

Yes please.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 12/14/16 5:15 AM, Michael Paquier wrote:
> > I would be tempted to suggest adding the verifier type as a new column
> > of pg_authid
>
> Yes please.

This discussion seems to continue to come up and I don't entirely
understand why we keep trying to shove more things into pg_authid, or
worse, into rolpassword.

We should have an independent table for the verifiers, which has a
different column for the verifier type, and either starts off supporting
multiple verifiers per role or at least gives us the ability to add that
easily later.  We should also move rolvaliduntil to that new table.

No, I am specifically *not* concerned with "backwards compatibility" of
that table- we continually add to it and change it and applications
which are so closely tied to PG that they look at pg_authid need to be
updated with nearly every release anyway.  What we *do* need to make
sure we get correct is what pg_dump/pg_upgrade do, but that's entirely
within our control to manage and shouldn't be that much of an issue to
implement.

Thanks!

Stephen

On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
> I would so like to just drop support for plain passwords completely :) But
> there's a backwards compatibility issue to think about of course.
> 
> But -- is there any actual usecase for them anymore?

I thought we recommended 'password' for SSL connections because if you
use MD5 passwords the password text layout is known and that simplifies
cryptanalysis.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +




On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
>On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
>> I would so like to just drop support for plain passwords completely :) But
>> there's a backwards compatibility issue to think about of course.
>> 
>> But -- is there any actual usecase for them anymore?
>
>I thought we recommended 'password' for SSL connections because if you
>use MD5 passwords the password text layout is known and that simplifies
>cryptanalysis.

No, that makes no sense. And whether you use 'password' or 'md5'
authentication is a different question than whether you store passwords
in plaintext or as md5 hashes. Magnus was asking whether it ever makes
sense to *store* passwords in plaintext.

Since you brought it up, there is a legitimate argument to be made that
'password' authentication is more secure than 'md5', when SSL is used.
Namely, if an attacker can acquire contents of pg_authid e.g. by
stealing a backup tape, with 'md5' authentication he can log in as any
user, using just the stolen hashes. But with 'password', he needs to
reverse the hash first. It's not a great difference, but it's something.
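
To see why the stolen hash is enough with 'md5' authentication, here is a short Python sketch of the MD5 challenge-response computation. The function names are made up for this illustration. The stored value is "md5" followed by md5(password || username), and the client's reply is "md5" followed by md5(stored hex digest || salt), so the reply never needs the password itself.

```python
import hashlib


def md5_stored_hash(password: str, username: str) -> str:
    """What ends up in pg_authid.rolpassword for an MD5-hashed password."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


def md5_auth_response(stored_hash: str, salt: bytes) -> str:
    """Client reply to an MD5 challenge, computed from the stored hash only."""
    inner = stored_hash[3:].encode("ascii")  # strip the "md5" prefix
    return "md5" + hashlib.md5(inner + salt).hexdigest()
```

With plain 'password' authentication the server instead hashes whatever the client sends and compares the result with the stored value, so sending the stolen hash gets the attacker nowhere.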
 
- Heikki



* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
> >On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
> >> I would so like to just drop support for plain passwords completely :) But
> >> there's a backwards compatibility issue to think about of course.
> >>
> >> But -- is there any actual usecase for them anymore?
> >
> >I thought we recommended 'password' for SSL connections because if you
> >use MD5 passwords the password text layout is known and that simplifies
> >cryptanalysis.
>
> No, that makes no sense. And whether you use 'password' or 'md5'
> authentication is a different question than whether you store passwords
> in plaintext or as md5 hashes. Magnus was asking whether it ever makes
> sense to *store* passwords in plaintext.

Right.

> Since you brought it up, there is a legitimate argument to be made that
> 'password' authentication is more secure than 'md5', when SSL is used.
> Namely, if an attacker can acquire contents of pg_authid e.g. by
> stealing a backup tape, with 'md5' authentication he can log in as any
> user, using just the stolen hashes. But with 'password', he needs to
> reverse the hash first. It's not a great difference, but it's something.

Tunnelling passwords which are stored as hashes is also well understood
and comparable to SSH with passwords in /etc/passwd.

Storing plaintext passwords has been bad form for just about forever and
I wouldn't be sad to see our support of it go.  At the least, as was
discussed somewhere, but I'm not sure where it ended up, we should give
administrators the ability to control what ways a password can be
stored.  In particular, once a user has migrated all of their users to
SCRAM, they should be able to say "don't let new passwords be in any
format other than SCRAM-SHA-256".

Thanks!

Stephen

On 12/14/2016 11:41 AM, Stephen Frost wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>> On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
>>> On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:

> Storing plaintext passwords has been bad form for just about forever and
> I wouldn't be sad to see our support of it go.  At the least, as was
> discussed somewhere, but I'm not sure where it ended up, we should give
> administrators the ability to control what ways a password can be
> stored.  In particular, once a user has migrated all of their users to
> SCRAM, they should be able to say "don't let new passwords be in any
> format other than SCRAM-SHA-256".

It isn't as bad as it used to be. I remember when PASSWORD was the
default. I agree that we should be able to set a policy that says, "we
only allow X for password storage".

JD


>
> Thanks!
>
> Stephen
>





On Wed, Dec 14, 2016 at 8:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> But, a password stored in plaintext works with either MD5 or SCRAM, or any
> future authentication mechanism. So as soon as we have SCRAM authentication,
> it becomes somewhat useful again.
>
> In a nutshell:
>
> auth / stored   MD5     SCRAM   plaintext
> -----------------------------------------
> password        Y       Y       Y
> md5             Y       N       Y
> scram           N       Y       Y
>
> If a password is stored in plaintext, it can be used with any authentication
> mechanism. And the plaintext 'password' authentication mechanism works with
> any kind of a stored password. But an MD5 hash cannot be used with SCRAM
> authentication, or vice versa.

So.. I have been thinking about this portion of the thread. And what I
find the most scary is not the fact that we use plain passwords for
SCRAM authentication, it is the fact that we would need to do a
catalog lookup earlier in the connection workflow to decide what is
the connection protocol to use depending on the username provided in
the startup packet if the pg_hba.conf entry matching the user and
database names uses "password".

And, honestly, why do we actually need a support table that
broad? SCRAM is designed to be secure, so it seems to me that it
would, on the contrary, be a bad idea to encourage the use of plain
passwords if we actually think that they should never be used (they
are actually useful for local development instances, not production
ones). So what I would suggest would be to have a support table like
this:
auth / stored   MD5     SCRAM   plaintext
-----------------------------------------
password        Y       Y       N
md5             Y       N       Y
scram           N       N       Y

So here is an idea for things to do now:
1) do not change the format of the existing passwords
2) do not change pg_authid
3) block access to instances if "password" or "md5" is used in
pg_hba.conf and the user has a SCRAM verifier.
4) block access if "scram" is used and the user has a plain or md5 verifier.
5) allow access if "scram" is used and the user has a SCRAM verifier.
We had a similar discussion regarding verifier/password formats last
year but that did not end well. It would be sad to fall back again
into this discussion and get no result. If somebody wants to support
access to SCRAM with plain password entries, why not. But that would
gain a -1 from me regarding the earlier lookup of pg_authid needed to
decide which protocol to use. And I think that we
want SCRAM to be designed to be as stable and secure as possible.
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Dec 13, 2016 at 2:44 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> SASLPrep is defined here:
> https://tools.ietf.org/html/rfc4013
> And stringprep is here:
> https://tools.ietf.org/html/rfc3454
> So that's roughly applying a conversion from the mapping table, taking
> into account prohibited, bi-directional, mapping characters, etc. The
> spec says that the password should be in unicode. But we cannot be
> sure of that, right? Those mapping tables should be likely a separated
> thing.. (perl has Unicode::Stringprep::Mapping for example).

OK. I have looked at that and I have bumped into libidn, which offers a
couple of APIs that could be used directly for this purpose.
In particular, what has caught my eye is stringprep_profile():
https://www.gnu.org/software/libidn/manual/html_node/Stringprep-Functions.html
res = stringprep_profile (input, output, "SASLprep", STRINGPREP_NO_UNASSIGNED);

libidn can be installed on Windows, and I have found packages for
cygwin, mingw, linux, freebsd and macos via brew. In the case where
libidn is not installed, I think that the safest path would be to
check if the input string has any high bits set (0x80) and bail out
because that would mean that it is a UTF-8 string that we cannot
change. Any thoughts about using libidn?

Also, after discussion with Heikki, here are the things that we need to do:
1) In libpq, we need to check if the string is valid UTF-8. If it is
valid UTF-8, apply SASLprep. If not, copy the string as-is. We could
error as well in this case... Perhaps a WARNING would be more appropriate;
that's the trickiest case, and if the client does not use UTF-8 that
may lead to unexpected behavior.
2) In the server, when the password verifier is created: if
client_encoding is UTF-8, but not server_encoding, convert the
password to UTF-8 and build the verifier after applying SASLprep.

In the case where the binaries are *not* built with libidn, I think
that we had better reject strings containing non-ASCII UTF-8 characters
directly and just allow ASCII? SASLprep is a no-op on ASCII characters.
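
As an illustration only, the "any high bits set" test for builds without
libidn could look like the sketch below. The name password_is_ascii is
hypothetical and not taken from any patch in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if no byte of the NUL-terminated
 * string has its high bit (0x80) set, i.e. the string is pure ASCII.
 */
bool
password_is_ascii(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    assert(s != NULL);

    for (; *p != '\0'; p++)
    {
        if (*p & 0x80)
            return false;       /* non-ASCII byte found */
    }
    return true;
}
```

A caller would reject (or warn about) the password when this returns
false and no SASLprep implementation is available.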

Thoughts about this approach?
-- 
Michael



On 12/14/2016 04:57 PM, Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 12/14/16 5:15 AM, Michael Paquier wrote:
>>> I would be tempted to suggest adding the verifier type as a new column
>>> of pg_authid
>>
>> Yes please.
>
> This discussion seems to continue to come up and I don't entirely
> understand why we keep trying to shove more things into pg_authid, or
> worse, into rolpassword.

I understand the relational beauty of having a separate column for the 
verifier type, but I don't think it would be practical. For starters, 
we'd still like to have a self-identifying string format like 
"scram-sha-256:<stuff>", so that you can conveniently pass the verifier 
as a string to CREATE USER. I think it'll be much better to stick to one 
format, than try to split the verifier into type and the string, when it 
enters the catalog table.

> We should have an independent table for the verifiers, which has a
> different column for the verifier type, and either starts off supporting
> multiple verifiers per role or at least gives us the ability to add that
> easily later.  We should also move rolvaliduntil to that new table.

I agree we'll probably need a new table for verifiers. Or turn 
rolpassword into an array or something. We discussed that before, 
however, and it didn't really go anywhere, so right now I'd like to get 
SCRAM in with minimal changes to the rest of the system. There is a lot 
of room for improvement once it's in.

- Heikki




On 12/15/2016 03:00 AM, Michael Paquier wrote:
> On Wed, Dec 14, 2016 at 8:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> But, a password stored in plaintext works with either MD5 or SCRAM, or any
>> future authentication mechanism. So as soon as we have SCRAM authentication,
>> it becomes somewhat useful again.
>>
>> In a nutshell:
>>
>> auth / stored   MD5     SCRAM   plaintext
>> -----------------------------------------
>> password        Y       Y       Y
>> md5             Y       N       Y
>> scram           N       Y       Y
>>
>> If a password is stored in plaintext, it can be used with any authentication
>> mechanism. And the plaintext 'password' authentication mechanism works with
>> any kind of a stored password. But an MD5 hash cannot be used with SCRAM
>> authentication, or vice versa.
>
> So.. I have been thinking about this portion of the thread. And what I
> find the most scary is not the fact that we use plain passwords for
> SCRAM authentication, it is the fact that we would need to do a
> catalog lookup earlier in the connection workflow to decide what is
> the connection protocol to use depending on the username provided in
> the startup packet if the pg_hba.conf entry matching the user and
> database names uses "password".

I don't see why we would need to do a catalog lookup any earlier. With 
"password" authentication, the server can simply request the client to 
send its password. When it receives it, it performs the catalog lookup 
to get pg_authid.rolpassword. If it's in plaintext, just compare it, if 
it's an MD5 hash, hash the client's password and compare, and if it's a 
SCRAM verifier, build a verifier with the same salt and iteration count 
and compare.
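
The support matrix quoted above can be written down as a small lookup,
which may make the later disagreement easier to follow. This is only a
sketch of the first table (Y = the combination works); the enum and
function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* How the password is stored in pg_authid.rolpassword. */
typedef enum
{
    STORED_MD5,
    STORED_SCRAM,
    STORED_PLAINTEXT
} StoredKind;

/* Authentication method named by the matching pg_hba.conf entry. */
typedef enum
{
    AUTH_PASSWORD,
    AUTH_MD5,
    AUTH_SCRAM
} AuthMethod;

/* Hypothetical helper: does this auth method work with this stored form? */
bool
auth_works_with(AuthMethod auth, StoredKind stored)
{
    /* rows: auth method; columns: MD5, SCRAM, plaintext */
    static const bool matrix[3][3] = {
        /* password */ {true, true, true},
        /* md5      */ {true, false, true},
        /* scram    */ {false, true, true},
    };

    assert(auth <= AUTH_SCRAM && stored <= STORED_PLAINTEXT);
    return matrix[auth][stored];
}
```

The "password" row is all true because the server receives the cleartext
password and can rebuild whatever verifier is stored; the plaintext
column is all true because any verifier can be derived from it.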

> And, honestly, why do we actually need a support table that
> broad? SCRAM is designed to be secure, so it seems to me that it
> would, on the contrary, be a bad idea to encourage the use of plain
> passwords if we actually think that they should never be used (they
> are actually useful for local development instances, not production
> ones).

I agree we should not encourage bad password practices. But as long as 
we support passwords to be stored in plaintext at all, it makes no sense 
to not allow them to be used with SCRAM. The fact that you can use a 
password stored in plaintext with both MD5 and SCRAM is literally the 
only reason you would store a password in plaintext, so if we don't want 
to allow that, we should disallow storing passwords in plaintext altogether.

> So what I would suggest would be to have a support table like
> that:
> auth / stored   MD5     SCRAM   plaintext
> -----------------------------------------
> password        Y       Y       N
> md5             Y       N       Y
> scram           N       N       Y

I was using 'Y' to indicate that the combination works, and 'N' to 
indicate that it does not. Assuming you're using the same notation, the 
above doesn't make any sense.

> So here is an idea for things to do now:
> 1) do not change the format of the existing passwords
> 2) do not change pg_authid
> 3) block access to instances if "password" or "md5" is used in
> pg_hba.conf and the user has a SCRAM verifier.
> 4) block access if "scram" is used and the user has a plain or md5 verifier.
> 5) allow access if "scram" is used and the user has a SCRAM verifier.
> We had a similar discussion regarding verifier/password formats last
> year but that did not end well. It would be sad to fall back again
> into this discussion and get no result. If somebody wants to support
> access to SCRAM with plain password entries, why not. But that would
> gain a -1 from me regarding the earlier lookup of pg_authid needed to
> decide which protocol to use. And I think that we
> want SCRAM to be designed to be as stable and secure as possible.

The bottom line is that at the moment, when plaintext passwords are 
stored as is, without any indicator that it's a plaintext password, it's 
ambiguous whether a password is a SCRAM verifier, or if it's a plaintext 
password that just happens to begin with the word "scram:". That is 
completely unrelated to which combinations of stored passwords and 
authentication mechanisms we actually support or allow to work.

The only way to distinguish, is to know about every verifier kind there 
is, and check whether rolpassword looks valid as anything else than a 
plaintext password. And we already got tripped by a bug-of-omission on 
that once. If we add more verifier formats in the future, it's bound to 
happen again. Let's nip that source of bugs in the bud. Attached is a 
patch to implement what I have in mind.

Alternatively, you could argue that we should forbid storing passwords 
in plaintext altogether. I'm OK with that, too, if that's what people 
prefer. Then you cannot have a user that can log in with both MD5 and 
SCRAM authentication, but it's certainly more secure, and it's easier to 
document.

- Heikki



Attachment
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 12/14/2016 04:57 PM, Stephen Frost wrote:
> >* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> >>On 12/14/16 5:15 AM, Michael Paquier wrote:
> >>>I would be tempted to suggest adding the verifier type as a new column
> >>>of pg_authid
> >>
> >>Yes please.
> >
> >This discussion seems to continue to come up and I don't entirely
> >understand why we keep trying to shove more things into pg_authid, or
> >worse, into rolpassword.
>
> I understand the relational beauty of having a separate column for
> the verifier type, but I don't think it would be practical.

I disagree.

> For
> starters, we'd still like to have a self-identifying string format
> like "scram-sha-256:<stuff>", so that you can conveniently pass the
> verifier as a string to CREATE USER.

I don't follow why we can't change the syntax for CREATE USER to allow
specifying the verifier type independently.  Generally speaking, I don't
expect *users* to be providing actual encoded *verifiers* very often, so
it seems like a bit of extra syntax that pg_dump has to use isn't that
big of a deal.

> I think it'll be much better to
> stick to one format, than try to split the verifier into type and
> the string, when it enters the catalog table.

Apparently, multiple people disagree with this approach.  I don't think
history is really on your side here either.

> >We should have an independent table for the verifiers, which has a
> >different column for the verifier type, and either starts off supporting
> >multiple verifiers per role or at least gives us the ability to add that
> >easily later.  We should also move rolvaliduntil to that new table.
>
> I agree we'll probably need a new table for verifiers. Or turn
> rolpassword into an array or something. We discussed that before,
> however, and it didn't really go anywhere, so right now I'd like to
> get SCRAM in with minimal changes to the rest of the system. There
> is a lot of room for improvement once it's in.

Using an array strikes me as an absolutely terrible idea- how are you
going to handle having different valid_until times then?

I do agree with trying to get SCRAM in without changing too much of the
rest of the system, but I wanted to make it clear that it's the only
point that I agree with for continuing down this path and that we should
absolutely be looking to change the CREATE USER syntax to specify the
verifier independently, plan to use a different table for the verifiers
with an independent column for the verifier type, support multiple
verifiers per role, etc, in the (hopefully very near...) future.

Thanks!

Stephen


On Thu, Dec 15, 2016 at 9:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The only way to distinguish, is to know about every verifier kind there is,
> and check whether rolpassword looks valid as anything else than a plaintext
> password. And we already got tripped by a bug-of-omission on that once. If
> we add more verifier formats in the future, it's bound to happen again.
> Let's nip that source of bugs in the bud. Attached is a patch to implement
> what I have in mind.

OK, I had a look at the patch proposed.

-    if (!pg_md5_encrypt(username, username, namelen, encrypted))
-        elog(ERROR, "password encryption failed");
-    if (strcmp(password, encrypted) == 0)
-        ereport(ERROR,
-                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-                 errmsg("password must not contain user name")));

This patch removes the only check on MD5 hashes that passwordcheck has
ever done. It may be fine to remove it, but I would
think that it is a good example of what can be done with
MD5 hashes, though limited. So it seems to me that this check should still
run pg_md5_encrypt on the username and compare the result with the MD5 hash
given by the caller. The new code is being careful about trying to pass
down a plain password, but it is possible to load MD5 hashes directly as
well, as pg_dumpall does.

A simple ALTER USER role PASSWORD 'foo' causes a crash:
#0  0x00000000004764d7 in heap_compute_data_size (tupleDesc=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:106
106                VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
(gdb) bt
#0  0x00000000004764d7 in heap_compute_data_size (tupleDesc=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:106
#1  0x00000000004781e9 in heap_form_tuple (tupleDescriptor=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:736
#2  0x00000000004784d0 in heap_modify_tuple (tuple=0x277adc8, tupleDesc=0x277f090, replValues=0x7fff1369d030, replIsnull=0x7fff1369d020 "", doReplace=0x7fff1369d010 "") at heaptuple.c:833
#3  0x0000000000673788 in AlterRole (stmt=0x27a4f78) at user.c:845
#4  0x000000000082aa49 in standard_ProcessUtility (parsetree=0x27a4f78, queryString=0x27a43e8 "alter role ioltas password 'toto';", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x27a5300, completionTag=0x7fff1369d5b0 "") at utility.c:711

+        case PASSWORD_TYPE_PLAINTEXT:
+            shadow_pass = &shadow_pass[strlen("plain:")];
+            break;
It would be a good idea to have a generic routine able to get the plain
password value. In short I think that we should reduce the amount of
locations where "plain:" prefix is hardcoded.
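
A generic routine of the kind suggested here might look like the sketch
below. It is an assumption-laden illustration: the name
get_plain_password and the PLAIN_PREFIX macro are invented for the
example, and only the "plain:" prefix itself comes from the patch hunk
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The one place where the prefix would be spelled out. */
#define PLAIN_PREFIX "plain:"

/*
 * Hypothetical helper: if shadow_pass carries the "plain:" prefix,
 * return a pointer to the password that follows it; otherwise NULL.
 */
const char *
get_plain_password(const char *shadow_pass)
{
    size_t      prefix_len = strlen(PLAIN_PREFIX);

    assert(shadow_pass != NULL);

    if (strncmp(shadow_pass, PLAIN_PREFIX, prefix_len) != 0)
        return NULL;
    return shadow_pass + prefix_len;
}
```

Callers would then test the return value instead of indexing past
strlen("plain:") themselves.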

> Alternatively, you could argue that we should forbid storing passwords in
> plaintext altogether. I'm OK with that, too, if that's what people prefer.
> Then you cannot have a user that can log in with both MD5 and SCRAM
> authentication, but it's certainly more secure, and it's easier to document.

In the end this may prove to be a bad idea for some developers. In local
deployments, when working on an application with Postgres as backend,
it is actually useful to have plain passwords. At least I have found that
useful in some stuff I did many years ago.
-- 
Michael



On Thu, Dec 15, 2016 at 8:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>> On 12/14/2016 04:57 PM, Stephen Frost wrote:
>> >* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> >>On 12/14/16 5:15 AM, Michael Paquier wrote:
>> >>>I would be tempted to suggest adding the verifier type as a new column
>> >>>of pg_authid
>> >>
>> >>Yes please.
>> >
>> >This discussion seems to continue to come up and I don't entirely
>> >understand why we keep trying to shove more things into pg_authid, or
>> >worse, into rolpassword.
>>
>> I understand the relational beauty of having a separate column for
>> the verifier type, but I don't think it would be practical.
>
> I disagree.

Me, too.  I think the idea of moving everything into a separate table
that allows multiple verifiers is probably not a good thing to do just
right now, because that introduces a bunch of additional issues above
and beyond what we need to do to get SCRAM implemented.  There are
administration and policy decisions to be made there that we should
not conflate with SCRAM proper.

However, Heikki's proposal seems to be that it's reasonable to force
rolpassword to be of the form 'type:verifier' in all cases but not
reasonable to have separate columns for type and verifier.  Eh?

>> For
>> starters, we'd still like to have a self-identifying string format
>> like "scram-sha-256:<stuff>", so that you can conveniently pass the
>> verifier as a string to CREATE USER.
>
> I don't follow why we can't change the syntax for CREATE USER to allow
> specifying the verifier type independently.  Generally speaking, I don't
> expect *users* to be providing actual encoded *verifiers* very often, so
> it seems like a bit of extra syntax that pg_dump has to use isn't that
> big of a deal.

We don't have to change the CREATE USER syntax at all.  It could just
split on the first colon and put the two halves of the string in
different places.  Of course, changing the syntax might be a good idea
anyway -- or not --- but the point is, right now, when you look at
rolpassword, there's not a clear rule for what kind of thing you've
got in there.  That's absolutely terrible design and has got to be
fixed.  Heikki's proposal of prefixing every entry with a type and a
':' will solve that problem and I'm not going to roll over in my grave
if we do it that way, but there is such a thing as normalization and
that technique could be applied here.
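
For what it is worth, "split on the first colon" is only a few lines.
The sketch below is hypothetical (split_verifier is not a function from
any posted patch) and assumes every stored entry really is of the form
'type:verifier'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "type:verifier" at the first colon.
 * The type is copied into the caller's buffer; *verifier points into
 * the input string. Returns false if there is no colon, the type is
 * empty, or the type does not fit in the buffer.
 */
bool
split_verifier(const char *input, char *type, size_t type_size,
               const char **verifier)
{
    const char *colon;
    size_t      type_len;

    assert(input != NULL && type != NULL && verifier != NULL);

    colon = strchr(input, ':');
    if (colon == NULL)
        return false;

    type_len = (size_t) (colon - input);
    if (type_len == 0 || type_len >= type_size)
        return false;

    memcpy(type, input, type_len);
    type[type_len] = '\0';
    *verifier = colon + 1;
    return true;
}
```

Note that this only works if plaintext entries carry a prefix too;
otherwise a plaintext password containing a colon would be split as if
it had a type, which is the ambiguity discussed earlier in the thread.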

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 12/15/16 8:40 AM, Stephen Frost wrote:
> I don't follow why we can't change the syntax for CREATE USER to allow
> specifying the verifier type independently.

That's what the last patch set I looked at actually does.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 12/15/16 8:40 AM, Stephen Frost wrote:
> > I don't follow why we can't change the syntax for CREATE USER to allow
> > specifying the verifier type independently.
>
> That's what the last patch set I looked at actually does.

Well, same here, but it was quite a while ago and things have progressed
since then wrt SCRAM, as I understand it...

Thanks!

Stephen

On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 12/15/16 8:40 AM, Stephen Frost wrote:
>> > I don't follow why we can't change the syntax for CREATE USER to allow
>> > specifying the verifier type independently.
>>
>> That's what the last patch set I looked at actually does.
>
> Well, same here, but it was quite a while ago and things have progressed
> since then wrt SCRAM, as I understand it...

From the discussions of last year on -hackers, it was decided to *not*
have an additional column, per complaints from a couple of hackers
(Robert you were in this set at this point), and the same thing was
concluded during the informal lunch meeting at PGCon. The point is,
the existing SCRAM patch set can survive without touching the
format of pg_authid at *all*. We could block SCRAM authentication when
"password" is used in pg_hba.conf, as well as when "scram" is used
with a plain password stored in pg_authid. Or look at the format of
the string in the catalog if "password" is defined and decide the
authentication protocol to follow based on that.
-- 
Michael



Michael,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> >> On 12/15/16 8:40 AM, Stephen Frost wrote:
> >> > I don't follow why we can't change the syntax for CREATE USER to allow
> >> > specifying the verifier type independently.
> >>
> >> That's what the last patch set I looked at actually does.
> >
> > Well, same here, but it was quite a while ago and things have progressed
> > since then wrt SCRAM, as I understand it...
>
> From the discussions of last year on -hackers, it was decided to *not*
> have an additional column, per complaints from a couple of hackers

It seems that, at best, we didn't have consensus on it.  Hopefully we
are moving in a direction of consensus.

> (Robert you were in this set at this point), and the same thing was
> concluded during the informal lunch meeting at PGcon. The point is,
> the existing SCRAM patch set can survive without touching at *all* the
> format of pg_authid. We could block SCRAM authentication when
> "password" is used in pg_hba.conf and as well as when "scram" is used
> with a plain password stored in pg_authid. Or look at the format of
> the string in the catalog if "password" is defined and decide the
> authentication protocol to follow based on that.

As I mentioned up-thread, moving forward with minimal changes to get
SCRAM in certainly makes sense, but I do think we should be open to
(and, ideally, encouraging people to work towards) having a separate
table for verifiers with independent columns for type and verifier.

Thanks!

Stephen

On Sat, Dec 17, 2016 at 10:23 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> (Robert you were in this set at this point), and the same thing was
>> concluded during the informal lunch meeting at PGcon. The point is,
>> the existing SCRAM patch set can survive without touching at *all* the
>> format of pg_authid. We could block SCRAM authentication when
>> "password" is used in pg_hba.conf and as well as when "scram" is used
>> with a plain password stored in pg_authid. Or look at the format of
>> the string in the catalog if "password" is defined and decide the
>> authentication protocol to follow based on that.
>
> As I mentioned up-thread, moving forward with minimal changes to get
> SCRAM in certainly makes sense, but I do think we should be open to
> (and, ideally, encouraging people to work towards) having a separate
> table for verifiers with independent columns for type and verifier.

Definitely, and you know my position on the matter or I would not have
written last year's patch series. Both things are just orthogonal IMO
at this point. And it would be good to focus just on one problem at
the moment to get it out.
-- 
Michael



On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>> On 12/15/16 8:40 AM, Stephen Frost wrote:
>>> > I don't follow why we can't change the syntax for CREATE USER to allow
>>> > specifying the verifier type independently.
>>>
>>> That's what the last patch set I looked at actually does.
>>
>> Well, same here, but it was quite a while ago and things have progressed
>> since then wrt SCRAM, as I understand it...
>
> From the discussions of last year on -hackers, it was decided to *not*
> have an additional column, per complaints from a couple of hackers
> (Robert you were in this set at this point), ...

Hmm, I don't recall taking that position, but then there are a lot of
things that I ought to recall and don't.  (Ask my wife!)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Sun, Dec 18, 2016 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> From the discussions of last year on -hackers, it was decided to *not*
>> have an additional column, per complaints from a couple of hackers
>> (Robert you were in this set at this point), ...
>
> Hmm, I don't recall taking that position, but then there are a lot of
> things that I ought to recall and don't.  (Ask my wife!)

[... digging objects of the past ...]
From the past thread:
https://www.postgresql.org/message-id/CA+TgmoY790rphHBogXMbTG6MzSeNdoxdBXebEkAet9ZpZ8gvtw@mail.gmail.com
The complaint is directed at having multiple verifiers per user,
though, not at having the type in a separate column.
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Dec 15, 2016 at 3:17 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> In the case where the binaries are *not* built with libidn, I think
> that we had better reject valid UTF-8 string directly and just allow
> ASCII? SASLprep is a no-op on ASCII characters.
>
> Thoughts about this approach?

And Heikki has mentioned to me that he'd prefer not having an extra
dependency for the normalization, which is LGPL-licensed by the way.
So I have looked at the SASLprep business to see what should be done
to get a complete implementation in core, completely independent of
any external library.

The first thing is to be able to tell in the SCRAM code whether a
string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
offers a set of routines exactly for this purpose; it is built with
libpq but is not available to src/common/. So instead of moving
the whole file, I'd like to create a new file, src/common/utf8.c, which
includes pg_utf_mblen() and pg_utf8_islegal(). On top of that I think
that having a routine able to check a full string would be useful for
many users, as pg_utf8_islegal() can only check one character at a time.
If the password string is found to be valid UTF-8, SASLprep is
applied. If not, the string is copied as-is, with perhaps unexpected
effects for the client. But he's in trouble already if the client is not
using UTF-8.
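
To make the "check a full string" idea concrete, here is a standalone
sketch of such a routine. It does not use pg_utf_mblen() or
pg_utf8_islegal(); the name utf8_string_is_valid is hypothetical, and
the byte ranges follow the usual well-formedness rules for UTF-8 (no
overlong forms, no surrogates, nothing above U+10FFFF).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: are the len bytes at s well-formed UTF-8? */
bool
utf8_string_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    assert(s != NULL || len == 0);

    while (i < len)
    {
        unsigned char c = s[i];
        unsigned char lo = 0x80;    /* allowed range of the first */
        unsigned char hi = 0xBF;    /* continuation byte */
        size_t      n;              /* continuation bytes expected */
        size_t      k;

        if (c < 0x80)
        {
            i++;                    /* plain ASCII */
            continue;
        }
        else if (c >= 0xC2 && c <= 0xDF)
            n = 1;
        else if (c >= 0xE0 && c <= 0xEF)
        {
            n = 2;
            if (c == 0xE0)
                lo = 0xA0;          /* reject overlong 3-byte forms */
            else if (c == 0xED)
                hi = 0x9F;          /* reject UTF-16 surrogates */
        }
        else if (c >= 0xF0 && c <= 0xF4)
        {
            n = 3;
            if (c == 0xF0)
                lo = 0x90;          /* reject overlong 4-byte forms */
            else if (c == 0xF4)
                hi = 0x8F;          /* reject values above U+10FFFF */
        }
        else
            return false;           /* stray continuation, 0xC0/0xC1, > 0xF4 */

        if (len - i <= n)
            return false;           /* sequence is truncated */
        if (s[i + 1] < lo || s[i + 1] > hi)
            return false;
        for (k = 2; k <= n; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
        }
        i += n + 1;
    }
    return true;
}
```

With something like this, the decision "apply SASLprep or copy as-is"
becomes a single call on the whole password string.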

Then comes the real business... Note that's my first time touching
encoding, particularly UTF-8 in depth, so please be nice. I may write
things that are incorrect or sound so from here :)

The second thing is the normalization itself. Per RFC4013, NFKC needs
to be applied to the string.  The operation is described completely in
[1], and consists of 1) a compatibility decomposition
of the characters of the string, followed by 2) a canonical composition.

About 1). The compatibility decomposition is defined in [2], "by
recursively applying the canonical and compatibility mappings, then
applying the canonical reordering algorithm". Canonical and
compatibility mappings are data available in UnicodeData.txt, the
6th column of the set defined in [3] to be precise. The meaning of the
decomposition mappings is defined in [2] as well. The canonical
decomposition basically looks up a given Unicode character and
replaces it with the sequence of characters given by its mapping. The
compatibility mapping should be applied as well, but [5], a perl tool
called charlint.pl doing this normalization work, does not care about
this phase... Do we?

About 2)... Once the decomposition has been applied, those characters need
to be recomposed using the Canonical_Combining_Class field of
UnicodeData.txt in [3], which is the 4th column of the set. Its values
are defined in [4]. Another interesting thing: charlint.pl [5] does
not care about this phase. I am wondering if we should not
just drop this part as well...

Once 1) and 2) are done, NFKC is complete, and so is the normalization part of SASLprep.

So what we need from the Postgres side is a mapping table with the
following fields:
1) Hexadecimal code of the UTF-8 character.
2) Its canonical combining class.
3) The kind of decomposition mapping, if defined.
4) The decomposition mapping, in hexadecimal format.
Based on what I looked at, either perl or python could be used to
process UnicodeData.txt and to generate a header file that would be
included in the tree. There are 30k entries in UnicodeData.txt, 5k of
them have a mapping, so that will result in many tables. One thing to
improve performance would be to store the length of the table in a
static variable, order the entries by their hexadecimal keys and do a
binary search to find an entry. We could as well use fancier
things like a set of tables forming a radix tree keyed on the
bytes of each character. In any case we should end up doing just one table
lookup per character.
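
A rough sketch of the sorted-table-plus-binary-search variant is below.
Everything about its shape is an assumption made for illustration: the
struct layout, the names UnicodeEntry and unicode_lookup, and the
handful of sample rows (a generated header would carry the full data
from UnicodeData.txt).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_DECOMP 2            /* enough for the sample rows only */

typedef struct
{
    uint32_t    codepoint;      /* key; the table is sorted on this */
    uint8_t     comb_class;     /* Canonical_Combining_Class */
    bool        compat;         /* true for a compatibility mapping */
    uint8_t     ndecomp;        /* number of entries used in decomp[] */
    uint32_t    decomp[MAX_DECOMP];
} UnicodeEntry;

/* Sample rows only, ordered by code point. */
static const UnicodeEntry unicode_table[] = {
    {0x00A0, 0, true, 1, {0x0020}},             /* NO-BREAK SPACE */
    {0x00C5, 0, false, 2, {0x0041, 0x030A}},    /* A WITH RING ABOVE */
    {0x00E9, 0, false, 2, {0x0065, 0x0301}},    /* E WITH ACUTE (small) */
    {0x0301, 230, false, 0, {0}},               /* COMBINING ACUTE ACCENT */
    {0x030A, 230, false, 0, {0}},               /* COMBINING RING ABOVE */
    {0x2126, 0, false, 1, {0x03A9}},            /* OHM SIGN */
    {0xFB01, 0, true, 2, {0x0066, 0x0069}},     /* LATIN SMALL LIGATURE FI */
};

/* bsearch() comparator: key is a code point, elem is a table row. */
static int
entry_cmp(const void *key, const void *elem)
{
    uint32_t    cp = *(const uint32_t *) key;
    const UnicodeEntry *e = (const UnicodeEntry *) elem;

    if (cp < e->codepoint)
        return -1;
    if (cp > e->codepoint)
        return 1;
    return 0;
}

/* Return the row for cp, or NULL if cp has no entry in the table. */
const UnicodeEntry *
unicode_lookup(uint32_t cp)
{
    assert(cp <= 0x10FFFF);     /* highest Unicode code point */

    return bsearch(&cp, unicode_table,
                   sizeof(unicode_table) / sizeof(unicode_table[0]),
                   sizeof(UnicodeEntry), entry_cmp);
}
```

Characters without a row are simply left alone by the decomposition
step, so a NULL result is the common case and costs one search.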

In conclusion, at this point I am looking for feedback regarding the
following items:
1) Where to put the UTF8 check routines and what to move.
2) How to generate the mapping table using UnicodeData.txt. I'd think
that using perl would be better.
3) The shape of the mapping table, which depends on how many
operations we want to support in the normalization of the strings.
The decisions for those items will drive the implementation in one
sense or another.

[1]: http://www.unicode.org/reports/tr15/#Description_Norm
[2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings
[3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
[4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values
[5]: https://www.w3.org/International/charlint/

Heikki, others, thoughts?
-- 
Michael



On Sat, Dec 17, 2016 at 5:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sun, Dec 18, 2016 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> From the discussions of last year on -hackers, it was decided to *not*
>>> have an additional column per complaints from a couple of hackers
>>> (Robert you were in this set at this point), ...
>>
>> Hmm, I don't recall taking that position, but then there are a lot of
>> things that I ought to recall and don't.  (Ask my wife!)
>
> [... digging objects of the past ...]
> From the past thread:
> https://www.postgresql.org/message-id/CA+TgmoY790rphHBogXMbTG6MzSeNdoxdBXebEkAet9ZpZ8gvtw@mail.gmail.com
> The complaint is directed at multiple verifiers per user
> though, not at having the type in a separate column.

Yes, I rather like the separate column.  But since Heikki is doing the
work (or if he is) I'm not going to gripe too much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 12/16/2016 05:48 PM, Robert Haas wrote:
> On Thu, Dec 15, 2016 at 8:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>>> On 12/14/2016 04:57 PM, Stephen Frost wrote:
>>>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>>>> On 12/14/16 5:15 AM, Michael Paquier wrote:
>>>>>> I would be tempted to suggest adding the verifier type as a new column
>>>>>> of pg_authid
>>>>>
>>>>> Yes please.
>>>>
>>>> This discussion seems to continue to come up and I don't entirely
>>>> understand why we keep trying to shove more things into pg_authid, or
>>>> worse, into rolpassword.
>>>
>>> I understand the relational beauty of having a separate column for
>>> the verifier type, but I don't think it would be practical.
>>
>> I disagree.
>
> Me, too.  I think the idea of moving everything into a separate table
> that allows multiple verifiers is probably not a good thing to do just
> right now, because that introduces a bunch of additional issues above
> and beyond what we need to do to get SCRAM implemented.  There are
> administration and policy decisions to be made there that we should
> not conflate with SCRAM proper.
>
> However, Heikki's proposal seems to be that it's reasonable to force
> rolpassword to be of the form 'type:verifier' in all cases but not
> reasonable to have separate columns for type and verifier.  Eh?

I fear we'll just have to agree to disagree here, but I'll try to 
explain myself one more time.

Even if you have a separate "verifier type" column, it's not fully 
normalized, because there's still a dependency between the verifier and 
verifier type columns. You will always need to look at the verifier type 
to make sense of the verifier itself.

It's more convenient to carry the type information with the verifier 
itself, in backend code, in pg_dump, etc. Sure, you could have a 
separate "transfer" text format that has the prefix, and strip it out 
when the datum enters the system. But it is even simpler to have only 
one format, with the prefix, and use that everywhere.
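
To illustrate the single-format idea, here is a minimal sketch of
classifying a stored value by looking only at the value itself. It is
not the code from the attached patch: the enum and function names are
hypothetical, and the only formats assumed are the "plain:" prefix
discussed in this thread and the existing "md5" followed by 32 hex
digits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type tags, named so as not to be mistaken for the patch's. */
typedef enum
{
    SKETCH_TYPE_UNKNOWN,
    SKETCH_TYPE_PLAINTEXT,
    SKETCH_TYPE_MD5
} sketch_password_type;

/* Decide the verifier type from the stored string alone. */
static sketch_password_type
sketch_get_password_type(const char *shadow_pass)
{
    assert(shadow_pass != NULL);

    /* Plaintext passwords carry an explicit "plain:" prefix. */
    if (strncmp(shadow_pass, "plain:", strlen("plain:")) == 0)
        return SKETCH_TYPE_PLAINTEXT;

    /* MD5 hashes are "md5" followed by exactly 32 lowercase hex digits. */
    if (strncmp(shadow_pass, "md5", 3) == 0 &&
        strlen(shadow_pass) == 35 &&
        strspn(shadow_pass + 3, "0123456789abcdef") == 32)
        return SKETCH_TYPE_MD5;

    /* Anything else is not guessed at; the caller decides what to do. */
    return SKETCH_TYPE_UNKNOWN;
}
```

The point of the sketch is that an unprefixed string is never silently
taken for a plaintext password, which is the bug-of-omission risk the
prefix is meant to remove.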

It might make sense to add a separate column, e.g. to make it easier to 
query for users that have an MD5 verifier. You could do "WHERE 
rolverifiertype = 'md5'", instead of "WHERE rolpassword LIKE 'md5%'". 
It's not a big difference, though. But even if we did that, I would 
still love to have the type information *also* included with the 
verifier itself, for convenience. And if we include it in the verifier 
itself, adding a separate type column seems more trouble than it's worth.

For comparison, imagine that we added a column to pg_authid for a 
picture of the user, stored as a bytea. The picture can be in JPEG or 
PNG format. Looking at the first few bytes of the image, you can tell 
which one it is. Would it make sense to add a separate "type" column, to 
tell what format the image is in? I think it would be more convenient 
and robust to rely on the first bytes of the image data instead.

- Heikki




On 12/16/2016 03:31 AM, Michael Paquier wrote:
> On Thu, Dec 15, 2016 at 9:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> The only way to distinguish, is to know about every verifier kind there is,
>> and check whether rolpassword looks valid as anything else than a plaintext
>> password. And we already got tripped by a bug-of-omission on that once. If
>> we add more verifier formats in the future, it's bound to happen again.
>> Let's nip that source of bugs in the bud. Attached is a patch to implement
>> what I have in mind.
>
> OK, I had a look at the patch proposed.
>
> -    if (!pg_md5_encrypt(username, username, namelen, encrypted))
> -        elog(ERROR, "password encryption failed");
> -    if (strcmp(password, encrypted) == 0)
> -        ereport(ERROR,
> -                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -                 errmsg("password must not contain user name")));
>
> This patch removes the only check on MD5 hashes that has ever been
> done in passwordcheck. It may be fine to remove it, but I would
> think that it is a good example of what could be done with
> MD5 hashes, though limited. So it seems to me that this check should still
> apply pg_md5_encrypt to the username and compare it with the MD5 hash
> given by the caller.

Actually, it does still perform that check. There's a new function, 
plain_crypt_verify, that passwordcheck uses now. plain_crypt_verify() is 
intended to work with any hash formats we might introduce in the 
future (including SCRAM), so that passwordcheck doesn't need to know 
about all the hash formats.

> A simple ALTER USER role PASSWORD 'foo' causes a crash:

Ah, fixed.

> +        case PASSWORD_TYPE_PLAINTEXT:
> +            shadow_pass = &shadow_pass[strlen("plain:")];
> +            break;
> It would be a good idea to have a generic routine able to get the plain
> password value. In short I think that we should reduce the amount of
> locations where "plain:" prefix is hardcoded.

There is such a function included in the patch, get_plain_password(char 
*shadow_pass), actually. Contrib/passwordcheck uses it. I figured that 
in crypt.c itself, it's OK to do the above directly, but 
get_plain_password() is intended to be used elsewhere.

Thanks for having a look! Attached is a new version, with that bug fixed.

- Heikki


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

On Tue, Dec 20, 2016 at 6:37 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> It's more convenient to carry the type information with the verifier itself,
> in backend code, in pg_dump, etc. Sure, you could have a separate "transfer"
> text format that has the prefix, and strip it out when the datum enters the
> system. But it is even simpler to have only one format, with the prefix, and
> use that everywhere.

I see your point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Heikki,

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> Even if you have a separate "verifier type" column, it's not fully
> normalized, because there's still a dependency between the verifier
> and verifier type columns. You will always need to look at the
> verifier type to make sense of the verifier itself.

That's true- but you don't need to look at the verifier, or even have
*access* to the verifier, to look at the verifier type.  That is
actually very useful when you start thinking about the downstream side
of this- what about the monitoring tool which will want to check and
make sure there are only certain verifier types being used?  It'll have
to be a superuser, or have access to some superuser security definer
function, and that really sucks.  I'm not saying that we would
necessarily want the verifier type to be publicly visible, but being
able to see it without being a superuser would be good, imv.

> It's more convenient to carry the type information with the verifier
> itself, in backend code, in pg_dump, etc. Sure, you could have a
> separate "transfer" text format that has the prefix, and strip it
> out when the datum enters the system. But it is even simpler to have
> only one format, with the prefix, and use that everywhere.

It's more convenient when you need to look at both- it's not more
convenient when you only wish to look at the verifier type.  Further, it
means that we have to have a construct that assumes things about the
verifier type and verifier- what if a verifier type came along that used
a colon?  We'd have to do some special magic to handle that correctly,
and that just sucks, and anyone who is writing code to generically deal
with these fields will end up writing that same code (or forgetting to,
and not handling the case correctly).

> It might make sense to add a separate column, to e.g. make it easier
> to e.g. query for users that have an MD5 verifier. You could do
> "WHERE rolverifiertype = 'md5'", instead of "WHERE rolpassword LIKE
> 'md5%'". It's not a big difference, though. But even if we did that,
> I would still love to have the type information *also* included with
> the verifier itself, for convenience. And if we include it in the
> verifier itself, adding a separate type column seems more trouble
> than it's worth.

I don't agree that it's "not a big difference."  As I argue above- your
approach also assumes that anyone who would like to investigate the
verifier type should have access to the verifier itself, which I do not
agree with.  I also have a hard time buying the argument that it's
really so much more convenient to have the verifier type included in the
same string as the verifier that we should duplicate that information
and then run the risk that we end up with the two not matching or that
we won't ever run into complications down the road when our chosen
separator causes us difficulties.

Thanks!

Stephen

On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> Heikki,
> 
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > Even if you have a separate "verifier type" column, it's not fully
> > normalized, because there's still a dependency between the
> > verifier and verifier type columns. You will always need to look
> > at the verifier type to make sense of the verifier itself.
> 
> That's true- but you don't need to look at the verifier, or even
> have *access* to the verifier, to look at the verifier type.

Would a view that shows only what's to the left of the first colon
suit this purpose?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



On Wed, Dec 21, 2016 at 1:08 AM, David Fetter <david@fetter.org> wrote:
> Would a view that shows only what's to the left of the first colon
> suit this purpose?

Of course it would, you would just need to make the routines now
checking the shape of MD5 and SCRAM identifiers available at SQL level
and feed the strings into them. Now I am not sure that it's worth
having a new superuser view for that. pg_roles and pg_shadow hide the
information about verifiers.
-- 
Michael



David,

* David Fetter (david@fetter.org) wrote:
> On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> > * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > > Even if you have a separate "verifier type" column, it's not fully
> > > normalized, because there's still a dependency between the
> > > verifier and verifier type columns. You will always need to look
> > > at the verifier type to make sense of the verifier itself.
> >
> > That's true- but you don't need to look at the verifier, or even
> > have *access* to the verifier, to look at the verifier type.
>
> Would a view that shows only what's to the left of the first colon
> suit this purpose?

Obviously a (security barrier...) view or a (security definer) function
could be used, but I don't believe either is actually a good idea.

Thanks!

Stephen

On Tue, Dec 20, 2016 at 06:14:40PM -0500, Stephen Frost wrote:
> David,
> 
> * David Fetter (david@fetter.org) wrote:
> > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > > > Even if you have a separate "verifier type" column, it's not fully
> > > > normalized, because there's still a dependency between the
> > > > verifier and verifier type columns. You will always need to look
> > > > at the verifier type to make sense of the verifier itself.
> > > 
> > > That's true- but you don't need to look at the verifier, or even
> > > have *access* to the verifier, to look at the verifier type.
> > 
> > Would a view that shows only what's to the left of the first colon
> > suit this purpose?
> 
> Obviously a (security barrier...) view or a (security definer) function
> could be used, but I don't believe either is actually a good idea.

Would you be so kind as to help me understand what's wrong with that idea?

Best,
David.



David,

* David Fetter (david@fetter.org) wrote:
> On Tue, Dec 20, 2016 at 06:14:40PM -0500, Stephen Frost wrote:
> > * David Fetter (david@fetter.org) wrote:
> > > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote:
> > > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > > > > Even if you have a separate "verifier type" column, it's not fully
> > > > > normalized, because there's still a dependency between the
> > > > > verifier and verifier type columns. You will always need to look
> > > > > at the verifier type to make sense of the verifier itself.
> > > >
> > > > That's true- but you don't need to look at the verifier, or even
> > > > have *access* to the verifier, to look at the verifier type.
> > >
> > > Would a view that shows only what's to the left of the first colon
> > > suit this purpose?
> >
> > Obviously a (security barrier...) view or a (security definer) function
> > could be used, but I don't believe either is actually a good idea.
>
> Would you be so kind as to help me understand what's wrong with that idea?

For starters, it doubles-down on the assumption that we'll always be
happy with that particular separator and implies to anyone watching that
they'll be able to trust it.  Further, it's additional complication
which, at least to my eyes, is entirely in the wrong direction.

We could push everything in pg_authid into a single colon-separated text
field and call it simpler because we don't have to deal with those silly
column things, and we'd have something a lot closer to a unix passwd
file too!, but it wouldn't make it a terribly smart thing to do.  We
aren't a bunch of individual C programs having to parse things out
of flat text files, after all.

Thanks!

Stephen

On Tue, Dec 20, 2016 at 9:23 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/16/2016 03:31 AM, Michael Paquier wrote:
> Actually, it does still perform that check. There's a new function,
> plain_crypt_verify, that passwordcheck uses now. plain_crypt_verify() is
> intended to work with any future hash formats we might introduce in the
> future (including SCRAM), so that passwordcheck doesn't need to know about
> all the hash formats.

Bah. I have misread the first version of the patch, and it is indeed
keeping the username checks. Now that things don't crash, this behaves
as expected with the patch:
=# load 'passwordcheck';
LOAD
=# alter role mpaquier password 'mpaquier';
ERROR:  22023: password must not contain user name
LOCATION:  check_password, passwordcheck.c:101
=# alter role mpaquier password 'md58349d3a1bc8f4f7399b1ff9dea493b15';
ERROR:  22023: password must not contain user name
LOCATION:  check_password, passwordcheck.c:82

>> +        case PASSWORD_TYPE_PLAINTEXT:
>> +            shadow_pass = &shadow_pass[strlen("plain:")];
>> +            break;
>> It would be a good idea to have a generic routine able to get the plain
>> password value. In short I think that we should reduce the amount of
>> locations where "plain:" prefix is hardcoded.
>
> There is such a function included in the patch, get_plain_password(char
> *shadow_pass), actually. Contrib/passwordcheck uses it. I figured that in
> crypt.c itself, it's OK to do the above directly, but get_plain_password()
> is intended to be used elsewhere.

The idea would be to have the function not return an allocated string,
just a position to it. That would be useful in plain_crypt_verify()
for example, for a total of 4 places, including get_plain_password()
where the new string allocation is done. Well, it's not like this
prefix "plain:" would change anyway in the future nor that it is going
to spread much.
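
For what it's worth, the non-allocating variant could be sketched as
below. This is only an illustration of the idea, not code from the
patch: the function and macro names are hypothetical, and it assumes
the "plain:" prefix is the only thing to skip.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PLAIN_PREFIX "plain:"

/*
 * Return a pointer to the password part inside shadow_pass, without
 * allocating a copy, or NULL if the value has no "plain:" prefix.
 * The returned pointer is only valid as long as shadow_pass is.
 */
static const char *
sketch_skip_plain_prefix(const char *shadow_pass)
{
    const size_t prefix_len = strlen(SKETCH_PLAIN_PREFIX);

    assert(shadow_pass != NULL);
    if (strncmp(shadow_pass, SKETCH_PLAIN_PREFIX, prefix_len) != 0)
        return NULL;
    return shadow_pass + prefix_len;
}
```

A caller that needs its own copy can still duplicate the result, so the
allocating get_plain_password() could be layered on top of such a
helper.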

> Thanks for having a look! Attached is a new version, with that bug fixed.

I have been able to do more advanced testing without the crash and things
seem to work properly. The attached set of tests is also able to pass
for all the combinations of hba configurations and password formats.
And looking at the code I don't have more comments.
-- 
Michael

On 12/14/2016 01:33 PM, Heikki Linnakangas wrote:
> I just noticed that the manual for CREATE ROLE says:
>
>> Note that older clients might lack support for the MD5 authentication
>> mechanism that is needed to work with passwords that are stored
>> encrypted.
>
> That's incorrect. The alternative to MD5 authentication is plain
> 'password' authentication, and that works just fine with MD5-hashed
> passwords. I think that sentence is a leftover from when we still
> supported "crypt" authentication (so I actually get to blame you for
> that ;-), commit 53a5026b). Back then, it was true that if an MD5 hash
> was stored in pg_authid, you couldn't do "crypt" authentication. That
> might have left old clients out in the cold.
>
> Now that we're getting SCRAM authentication, we'll need a similar notice
> there again, for the incompatibility of a SCRAM verifier with MD5
> authentication and vice versa.

I went ahead and removed the current bogus notice from the docs. We 
might need to put back something like it, with the SCRAM patch, but it 
needs to be rewritten anyway.

- Heikki




On 12/21/2016 04:09 AM, Michael Paquier wrote:
>> Thanks for having a look! Attached is a new version, with that bug fixed.
>
> I have been able to do more advanced testing without the crash and things
> seem to work properly. The attached set of tests is also able to pass
> for all the combinations of hba configurations and password formats.
> And looking at the code I don't have more comments.

Thanks!

Since not everyone agrees with this approach, I split this patch into 
two. The first patch refactors things, replacing the isMD5() function 
with get_password_type(), without changing the representation of 
pg_authid.rolpassword. That is hopefully uncontroversial. And the second 
patch adds the "plain:" prefix, which not everyone agrees on.

Barring objections I'm going to at least commit the first patch. I think 
we should commit the second one too, but it's not as critical, and the 
first patch matters more for the SCRAM patch, too.

- Heikki


On Tue, Jan 3, 2017 at 11:09 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Since not everyone agrees with this approach, I split this patch into two.
> The first patch refactors things, replacing the isMD5() function with
> get_password_type(), without changing the representation of
> pg_authid.rolpassword. That is hopefully uncontroversial. And the second
> patch adds the "plain:" prefix, which not everyone agrees on.
>
> Barring objections I'm going to at least commit the first patch. I think we
> should commit the second one too, but it's not as critical, and the first
> patch matters more for the SCRAM patch, too.

The split does not look correct to me. 0001 has references to the
prefix "plain:".
-- 
Michael



On 1/3/17 9:09 AM, Heikki Linnakangas wrote:
> Since not everyone agrees with this approach, I split this patch into 
> two. The first patch refactors things, replacing the isMD5() function 
> with get_password_type(), without changing the representation of 
> pg_authid.rolpassword. That is hopefully uncontroversial. And the second 
> patch adds the "plain:" prefix, which not everyone agrees on.
> 
> Barring objections I'm going to at least commit the first patch. I think 
> we should commit the second one too, but it's not as critical, and the 
> first patch matters more for the SCRAM patch, too.

Is there currently anything to review here for the commit fest?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Thu, Jan 5, 2017 at 10:31 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 1/3/17 9:09 AM, Heikki Linnakangas wrote:
>> Since not everyone agrees with this approach, I split this patch into
>> two. The first patch refactors things, replacing the isMD5() function
>> with get_password_type(), without changing the representation of
>> pg_authid.rolpassword. That is hopefully uncontroversial. And the second
>> patch adds the "plain:" prefix, which not everyone agrees on.
>>
>> Barring objections I'm going to at least commit the first patch. I think
>> we should commit the second one too, but it's not as critical, and the
>> first patch matters more for the SCRAM patch, too.
>
> Is there currently anything to review here for the commit fest?

The patches sent here make sense as part of the SCRAM set:
https://www.postgresql.org/message-id/6831df67-7641-1a66-4985-268609a4821f@iki.fi
I was just waiting for Heikki to fix the split of the patches before
moving on with an extra lookup though.
-- 
Michael



On 1/3/17 9:09 AM, Heikki Linnakangas wrote:
> Since not everyone agrees with this approach, I split this patch into 
> two. The first patch refactors things, replacing the isMD5() function 
> with get_password_type(), without changing the representation of 
> pg_authid.rolpassword. That is hopefully uncontroversial.

I have checked these patches.

The refactoring in the first patch seems sensible.  As Michael pointed
out, there is still a reference to "plain:" in the first patch.

The commit message needs to be updated, because the function
plain_crypt_verify() was already added in a previous patch.

I'm not fond of this kind of coding
   password = encrypt_password(password_type, stmt->role, password);

where the 'password' variable has a different meaning before and after.

This error message might be a mistake:
   elog(ERROR, "unrecognized password type conversion");

I think some pieces from the second patch could be included in the first
patch, e.g., the parts for passwordcheck.c and user.c.

> And the second 
> patch adds the "plain:" prefix, which not everyone agrees on.

The code also gets a little bit dubious, as it introduces an "unknown"
password type, which is sometimes treated as plaintext and sometimes as
an error.  I think this is going to be messy.

I would skip this patch for now at least.  Too much controversy, and we
don't know what the rest of the patches for this feature will look like,
so we cannot yet tell if it's worth it.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Noah Misch
Date:
On Tue, Nov 15, 2016 at 07:52:06AM +0900, Michael Paquier wrote:
> On Sat, Nov 5, 2016 at 9:36 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> > On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> > pg_hba.conf uses "scram" as keyword, but scram refers to a family of
> > authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256
> > (what this patch does). Hence wouldn't it make sense to use
> > scram_sha256 in pg_hba.conf instead? If for example in the future
> > there is a SHA-512 version of SCRAM we could switch easily to that and
> > define scram_sha512.
> 
> OK, I have added more docs regarding the use of scram in pg_hba.conf,
> particularly in client-auth.sgml to describe what scram is better than
> md5 in terms of protection, and also completed the data of pg_hba.conf
> about the new keyword used in it.

The latest versions document this precisely, but I agree with Peter's concern
about plain "scram".  Suppose it's 2025 and PostgreSQL supports SASL mechanisms
OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512.  What
should the pg_hba.conf options look like at that time?  I don't think having a
single "scram" option fits in such a world.  I see two strategies that fit:

1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
   mechanisms to offer.
2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch <noah@leadboat.com> wrote:
> The latest versions document this precisely, but I agree with Peter's concern
> about plain "scram".  Suppose it's 2025 and PostgreSQL supports SASL mechanisms
> OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512.  What
> should the pg_hba.conf options look like at that time?  I don't think having a
> single "scram" option fits in such a world.

Sure.

> I see two strategies that fit:
>
> 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
>    mechanisms to offer.
> 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.

Or we could have a sasl option, with a mandatory array of mechanisms
to define one or more items, so method entries in pg_hba.conf would
look like that:
sasl mechanism=scram_sha_256,scram_sha3_512

Users could define a different set of mechanisms in each hba line,
depending on which user and database the line matches. I am not sure
if many people would care about that though.
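
As a rough sketch of what checking such an option could involve, here
is a small validator for the value. Everything in it is an assumption
made for illustration: the function names are hypothetical, the list of
known mechanisms is just the names floated in this thread, and nothing
here reflects how hba.c actually parses options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of accepted names, taken from the examples above. */
static const char *const sketch_known_mechs[] = {
    "scram_sha_256", "scram_sha3_512", "oauthbearer"
};

/* Check whether name[0..len) matches one of the known mechanisms. */
static bool
sketch_mech_known(const char *name, size_t len)
{
    size_t      n = sizeof(sketch_known_mechs) / sizeof(sketch_known_mechs[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strlen(sketch_known_mechs[i]) == len &&
            strncmp(sketch_known_mechs[i], name, len) == 0)
            return true;
    }
    return false;
}

/*
 * Validate an option of the form "mechanism=a,b,c".  Returns the number
 * of mechanisms listed, or -1 if the option is malformed, has an empty
 * item, or names an unknown mechanism.
 */
static int
sketch_count_mechanisms(const char *option)
{
    static const char key[] = "mechanism=";
    const char *p;
    int         count = 0;

    assert(option != NULL);
    if (strncmp(option, key, strlen(key)) != 0)
        return -1;
    p = option + strlen(key);

    for (;;)
    {
        size_t      len = strcspn(p, ",");   /* length of current item */

        if (len == 0 || !sketch_mech_known(p, len))
            return -1;
        count++;
        p += len;
        if (*p == '\0')
            break;
        p++;                                  /* skip the comma */
    }
    return count;
}
```

Rejecting an empty list is what would make the array "mandatory" in
the sense described above.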
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Tue, Dec 20, 2016 at 10:47 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> And Heikki has mentioned me that he'd prefer not having an extra
> dependency for the normalization, which is LGPL-licensed by the way.
> So I have looked at the SASLprep business to see what should be done
> to get a complete implementation in core, completely independent of
> anything known.
>
> The first thing is to be able to understand in the SCRAM code if a
> string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
> offers a set of routines exactly for this purpose, which is built with
> libpq but that's not available for src/common/. So instead of moving
> all the file, I'd like to create a new file in src/common/utf8.c which
> includes pg_utf_mblen() and pg_utf8_islegal(). On top of that I think
> that having a routine able to check a full string would be useful for
> many users, as pg_utf8_islegal() can only check one set of characters.
> If the password string is found to be of UTF-8 format, SASLprepare is
> applied. If not, the string is copied as-is with perhaps unexpected
> effects for the client. But he's in trouble already if the client is not
> using UTF-8.
>
> Then comes the real business... Note that's my first time touching
> encoding, particularly UTF-8 in depth, so please be nice. I may write
> things that are incorrect or sound so from here :)
>
> The second thing is the normalization itself. Per RFC4013, NFKC needs
> to be applied to the string.  The operation is described in [1]
> completely, and it is named as doing 1) a compatibility decomposition
> of the bytes of the string, followed by 2) a canonical composition.
>
> About 1). The compatibility decomposition is defined in [2], "by
> recursively applying the canonical and compatibility mappings, then
> applying the canonical reordering algorithm". Canonical and
> compatibility mapping are some data available in UnicodeData.txt, the
> 6th column of the set defined in [3] to be precise. The meaning of the
> decomposition mappings is defined in [2] as well. The canonical
> decomposition is basically to look for a given UTF-8 character, and
> then apply the multiple characters resulting in its new shape. The
> compatibility mapping should as well be applied, but [5], a perl tool
> called charlint.pl doing this normalization work, does not care about
> this phase... Do we?
>
> About 2)... Once the decomposition has been applied, those bytes need
> to be recomposed using the Canonical_Combining_Class field of
> UnicodeData.txt in [3], which is the 3rd column of the set. Its values
> are defined in [4]. Another interesting thing: charlint.pl [5] does
> not care about this phase either. I am wondering if we should not
> just drop this part as well...
>
> Once 1) and 2) are done, NFKC is complete, and so is SASLPrepare.
>
> So what we need on the Postgres side is a mapping table having the
> following fields:
> 1) Hexa sequence of UTF8 character.
> 2) Its canonical combining class.
> 3) The kind of decomposition mapping if defined.
> 4) The decomposition mapping, in hexadecimal format.
> Based on what I looked at, either perl or python could be used to
> process UnicodeData.txt and to generate a header file that would be
> included in the tree. There are 30k entries in UnicodeData.txt, 5k of
> them have a mapping, so that will result in many tables. One thing to
> improve performance would be to store the length of the table in a
> static variable, order the entries by their hexadecimal keys and do a
> dichotomy lookup to find an entry. We could as well use more fancy
> things like a set of tables using a Radix tree using decomposed by
> bytes. We should finish by just doing one lookup of the table for each
> character sets anyway.
>
> In conclusion, at this point I am looking for feedback regarding the
> following items:
> 1) Where to put the UTF8 check routines and what to move.
> 2) How to generate the mapping table using UnicodeData.txt. I'd think
> that using perl would be better.
> 3) The shape of the mapping table, which depends on how many
> operations we want to support in the normalization of the strings.
> The decisions for those items will drive the implementation in one
> sense or another.
>
> [1]: http://www.unicode.org/reports/tr15/#Description_Norm
> [2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings
> [3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
> [4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values
> [5]: https://www.w3.org/International/charlint/
>
> Heikki, others, thoughts?

FWIW, this patch is in a "waiting on author" state and that's right.
As the discussion on SASLprepare() and the decisions regarding the way
to implement it, or at least have it, are still pending, I am not
planning to move on with any implementation until we have a plan about
what to do. Just using libidn (LGPL) for a first shot is rather
painless but... I am not alone here.
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Noah Misch
Date:
On Wed, Jan 18, 2017 at 02:30:38PM +0900, Michael Paquier wrote:
> On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch <noah@leadboat.com> wrote:
> > The latest versions document this precisely, but I agree with Peter's concern
> > about plain "scram".  Suppose it's 2025 and PostgreSQL supports SASL mechanisms
> > OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512.  What
> > should the pg_hba.conf options look like at that time?  I don't think having a
> > single "scram" option fits in such a world.
> 
> Sure.
> 
> > I see two strategies that fit:
> >
> > 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
> >    mechanisms to offer.
> > 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.
> 
> Or we could have a sasl option, with a mandatory array of mechanisms
> to define one or more items, so method entries in pg_hba.conf would
> look like this:
> sasl mechanism=scram_sha_256,scram_sha3_512

I like that.
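
For illustration only, here is a minimal sketch of how the value of such a "mechanism=" option could be split into individual names. The function name parse_sasl_mechanisms() and the two size limits are hypothetical; nothing like this exists in the patches on this thread, and the real pg_hba.conf parser works differently.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MECHS 8             /* hypothetical limit on list length */
#define MAX_MECH_LEN 32         /* hypothetical limit on one name */

/*
 * Hypothetical helper: split the value of a "mechanism=" option, e.g.
 * "scram_sha_256,scram_sha3_512", into individual mechanism names.
 * Returns the number of names stored, or -1 if the list is empty,
 * contains an empty item, or exceeds the limits above.
 */
int
parse_sasl_mechanisms(const char *value, char out[][MAX_MECH_LEN], int max)
{
    int         n = 0;
    const char *p = value;

    for (;;)
    {
        /* length of the current item, up to the next comma or the end */
        size_t      len = strcspn(p, ",");

        if (len == 0 || len >= MAX_MECH_LEN || n >= max)
            return -1;
        memcpy(out[n], p, len);
        out[n][len] = '\0';
        n++;
        if (p[len] == '\0')
            break;
        p += len + 1;           /* skip the comma */
    }
    return n;
}
```

Rejecting an empty list matches the idea that the array of mechanisms is mandatory and must name one or more items.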



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Simon Riggs
Date:
On 19 January 2017 at 06:32, Noah Misch <noah@leadboat.com> wrote:
> On Wed, Jan 18, 2017 at 02:30:38PM +0900, Michael Paquier wrote:
>> On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch <noah@leadboat.com> wrote:
>> > The latest versions document this precisely, but I agree with Peter's concern
>> > about plain "scram".  Suppose it's 2025 and PostgreSQL supports SASL mechanisms
>> > OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512.  What
>> > should the pg_hba.conf options look like at that time?  I don't think having a
>> > single "scram" option fits in such a world.
>>
>> Sure.
>>
>> > I see two strategies that fit:
>> >
>> > 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
>> >    mechanisms to offer.
>> > 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.
>>
>> Or we could have a sasl option, with a mandatory array of mechanisms
>> to define one or more items, so method entries in pg_hba.conf would
>> look like this:
>> sasl mechanism=scram_sha_256,scram_sha3_512
>
> I like that.

Michael, I support your good work on this patch and it's certainly shaping up.

Noah's general point is that we need to have a general, future-proof
design for the UI and I agree.

We seem to be caught between adding lots of new things as parameters
and adding new detail into pg_hba.conf.

Parameters like password_encryption are difficult here because they
essentially repeat what has already been said in the pg_hba.conf. If
we have two entries in pg_hba.conf, one saying md5 and the other
saying "scram" (or whatever), what would we set password_encryption
to? It seems clear to me that if the pg_hba.conf says md5 then
password_encryption should be md5 and if pg_hba.conf says scram then
it should be scram.

I'd like to float another idea, as a way of finding a way forwards
that will last over time:

* pg_hba.conf entry would say sasl='methodX' (no spaces)
* we have a new catalog called pg_sasl that allows us to add new
methods, with appropriate function calls
* remove password_encryption parameter and always use default
encryption as specified for that session in pg_hba.conf

Which sounds nice, but many users will wish to upgrade their current
mechanisms from using md5 to scram. How will we update passwords
slowly, so that different users change from md5 to scram at different
times? Having to specify the mechanism in the pg_hba.conf makes that
almost impossible, forcing a big bang approach which subsequently may
never happen.

As a way of solving that problem, another idea would be to make the
mechanism session specific depending upon what is stored for a
particular user. That allows us to have a single pg_hba.conf entry of
"sasl", and then use md5, scram-256 or future-mechanism on a per user
basis.

I'm not sure I see a clear way forwards yet; these are just ideas and
questions to help the discussion.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Thu, Jan 19, 2017 at 6:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> We seem to be caught between adding lots of new things as parameters
> and adding new detail into pg_hba.conf.
>
> Parameters like password_encryption are difficult here because they
> essentially repeat what has already been said in the pg_hba.conf. If
> we have two entries in pg_hba.conf, one saying md5 and the other
> saying "scram" (or whatever), what would we set password_encryption
> to? It seems clear to me that if the pg_hba.conf says md5 then
> password_encryption should be md5 and if pg_hba.conf says scram then
> it should be scram.
>
> I'd like to float another idea, as a way of finding a way forwards
> that will last over time
>
> * pg_hba.conf entry would say sasl='methodX' (no spaces)
> * we have a new catalog called pg_sasl that allows us to add new
> methods, with appropriate function calls

This would make sense if we supported a mountain of protocols and
wanted a handler with a set of APIs used for authentication. This is a
step beyond simple SCRAM, and it basically requires designing a set of
generic routines that can cover *any* protocol with this handler. I'd
think this is rather hard given the slight differences in SASL
exchanges between protocols.
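
To make the shape of such a handler concrete, here is a rough sketch of a callback table plus a lookup by mechanism name. Every name in it (SaslMechanism, find_mechanism, the three callbacks) is hypothetical and exists neither in PostgreSQL nor in the patches here; it only assumes that each mechanism can be driven as "one message in, one message out".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table; none of these names exist in PostgreSQL. */
typedef struct SaslMechanism
{
    const char *name;           /* e.g. "SCRAM-SHA-256" */

    /* allocate per-connection state for the given user */
    void       *(*init) (const char *user);

    /*
     * Consume one client message and produce one server message.
     * Assumed convention: 1 = done, 0 = continue, -1 = failure.
     */
    int         (*exchange) (void *state,
                             const char *input, size_t inlen,
                             char **output, size_t *outlen);

    /* release the per-connection state */
    void        (*cleanup) (void *state);
} SaslMechanism;

/* Find a mechanism by name in a table of handlers, or return NULL. */
const SaslMechanism *
find_mechanism(const SaslMechanism *mechs, size_t nmechs, const char *name)
{
    for (size_t i = 0; i < nmechs; i++)
    {
        if (strcmp(mechs[i].name, name) == 0)
            return &mechs[i];
    }
    return NULL;
}
```

The hard part mentioned above is precisely whether one exchange() signature is general enough for every mechanism; the sketch does not answer that.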

> * remove password_encryption parameter and always use default
> encryption as specified for that session in pg_hba.conf

So if user X creates user Y with a password (defined by CREATE USER
PASSWORD), it should by default follow what pg_hba.conf dictates, which
could be pam or gss? That does not look very intuitive to me. The
advantage of the current system is that password creation and the
protocol allowed for authentication are two separate, independent
things, password_encryption being basically a wrapper for CREATE USER.
Mixing both makes things more confusing. If you are willing to move
away from password_encryption, one option is to extend CREATE USER so
that it can enforce the associated password protocol; that's what the
patches on this thread do with PASSWORD (val USING protocol).

> Which sounds nice, but many users will wish to upgrade their current
> mechanisms from using md5 to scram. How will we update passwords
> slowly, so that different users change from md5 to scram at different
> times? Having to specify the mechanism in the pg_hba.conf makes that
> almost impossible, forcing a big bang approach which subsequently may
> never happen.

This is where the possibility of defining multiple password types for
a single user comes in, instead of rolling multiple roles and renaming
them.

> As a way of solving that problem, another idea would be to make the
> mechanism session specific depending upon what is stored for a
> particular user. That allows us to have a single pg_hba.conf entry of
> "sasl", and then use md5, scram-256 or future-mechanism on a per user
> basis.

Isn't that specifying multiple users in a single sasl entry in
pg_hba.conf? Once a user is updated, you could just move him from one
line of pg_hba.conf to the other, or use a @file in the hba entry.

> I'm not sure I see a clear way forwards yet, these are just ideas and
> questions to help the discussion.

Thanks, I find the catalog idea interesting. It is hard though, given
the potential range of SASL protocols that likely have different needs
in the way messages are exchanged.
-- 
Michael



Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Wed, Jan 18, 2017 at 2:46 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> FWIW, this patch is in a "waiting on author" state and that's right.
> As the discussion on SASLprepare() and the decisions regarding the way
> to implement it, or at least have it, are still pending, I am not
> planning to move on with any implementation until we have a plan about
> what to do. Just using libidn (LGPL) for a first shot is rather
> painless but... I am not alone here.

With decisions on this matter pending, I am marking this patch as
"returned with feedback". If there is a consensus on what to do, I'll
be happy to do the implementation with the last CF in March in sight.
If not, this feature will not be part of PG 10.
-- 
Michael



On 01/17/2017 11:51 PM, Peter Eisentraut wrote:
> On 1/3/17 9:09 AM, Heikki Linnakangas wrote:
>> Since not everyone agrees with this approach, I split this patch into
>> two. The first patch refactors things, replacing the isMD5() function
>> with get_password_type(), without changing the representation of
>> pg_authid.rolpassword. That is hopefully uncontroversial.
>
> I have checked these patches.
>
> The refactoring in the first patch seems sensible.  As Michael pointed
> out, there is still a reference to "plain:" in the first patch.

Fixed.
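
As a rough illustration of the kind of check get_password_type() performs in place of isMD5(), here is a self-contained sketch. It is not the committed code: the enum values and the bare prefix-plus-length test are assumptions, based only on the fact that an MD5 verifier is stored as "md5" followed by 32 hexadecimal digits (35 characters in total).

```c
#include <string.h>

/* Assumed names, for illustration only. */
typedef enum
{
    PASSWORD_TYPE_PLAINTEXT,
    PASSWORD_TYPE_MD5
} PasswordType;

#define MD5_VERIFIER_LEN 35     /* "md5" + 32 hex digits */

/*
 * Classify the string stored in pg_authid.rolpassword without changing
 * its representation: anything that does not look like an MD5 verifier
 * is treated as plaintext.
 */
PasswordType
get_password_type(const char *shadow_pass)
{
    if (strncmp(shadow_pass, "md5", 3) == 0 &&
        strlen(shadow_pass) == MD5_VERIFIER_LEN)
        return PASSWORD_TYPE_MD5;
    return PASSWORD_TYPE_PLAINTEXT;
}
```

Returning an enum instead of a boolean leaves room for further verifier types, such as SCRAM, to be added later.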

> The commit message needs to be updated, because the function
> plain_crypt_verify() was already added in a previous patch.

Fixed.

> I'm not fond of this kind of coding
>
>     password = encrypt_password(password_type, stmt->role, password);
>
> where the 'password' variable has a different meaning before and after.

Added a new local variable to avoid the confusion.

> This error message might be a mistake:
>
>     elog(ERROR, "unrecognized password type conversion");

I rephrased the error as "cannot encrypt password to requested type",
and added a comment explaining that it cannot happen. I hope that
helps; I'm not sure why you thought it might've been a mistake.

> I think some pieces from the second patch could be included in the first
> patch, e.g., the parts for passwordcheck.c and user.c.

I refrained from doing that for now. It would've changed the 
passwordcheck hook API in an incompatible way. Breaking the API 
explicitly would be a good thing, if we added the "plain:" prefix, 
because modules would need to deal with the prefix anyway. But until we 
do that, better to not break the API for no good reason.

>> And the second
>> patch adds the "plain:" prefix, which not everyone agrees on.
>
> The code also gets a little bit dubious, as it introduces an "unknown"
> password type, which is sometimes treated as plaintext and sometimes as
> an error.  I think this is going to be messy.
>
> I would skip this patch for now at least.  Too much controversy, and we
> don't know what the rest of the patches for this feature will look like
> to be able to know if it's worth it.

Ok, I'll drop the second patch for now. I committed the first patch 
after fixing the things you and Michael pointed out. Thanks for the review!

- Heikki




On 2 February 2017 at 00:13, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Ok, I'll drop the second patch for now. I committed the first patch after
> fixing the things you and Michael pointed out. Thanks for the review!

dbd69118 caused a small compiler warning for me.

The attached fixes it.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

On 02/02/2017 05:50 AM, David Rowley wrote:
> On 2 February 2017 at 00:13, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Ok, I'll drop the second patch for now. I committed the first patch after
>> fixing the things you and Michael pointed out. Thanks for the review!
>
> dbd69118 caused a small compiler warning for me.
>
> The attached fixes it.

Fixed, thanks!

- Heikki




Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Heikki Linnakangas
Date:
On 12/20/2016 03:47 AM, Michael Paquier wrote:
> The first thing is to be able to understand in the SCRAM code if a
> string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
> offers a set of routines exactly for this purpose, which is built with
> libpq but that's not available for src/common/. So instead of moving
> all the file, I'd like to create a new file in src/common/utf8.c which
> includes pg_utf_mblen() and pg_utf8_islegal().

Sounds reasonable. They're short functions, might also be ok to just 
copy-paste them to scram-common.c.

> On top of that I think that having a routine able to check a full
> string would be useful for many users, as pg_utf8_islegal() can only
> check one character sequence at a time. If the password string is
> found to be valid UTF-8, SASLprepare is applied. If not, the string
> is copied as-is, with perhaps unexpected effects for the client. But
> the client is in trouble already if it is not using UTF-8.

Yeah.
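
A whole-string check along those lines could look roughly like the sketch below. It does not call pg_utf_mblen() or pg_utf8_islegal(); the names utf8_seq_len() and utf8_string_is_valid() are made up for this example, and the byte ranges follow the UTF-8 definition in RFC 3629.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sequence length implied by a lead byte, or 0 for an invalid lead byte. */
static int
utf8_seq_len(unsigned char c)
{
    if (c < 0x80)
        return 1;
    if (c >= 0xC2 && c <= 0xDF)
        return 2;
    if (c >= 0xE0 && c <= 0xEF)
        return 3;
    if (c >= 0xF0 && c <= 0xF4)
        return 4;
    return 0;                   /* continuation byte, 0xC0/0xC1, or > 0xF4 */
}

/* Return true if the whole buffer is well-formed UTF-8. */
bool
utf8_string_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        int         n = utf8_seq_len(s[i]);

        /* invalid lead byte, or sequence truncated by end of buffer */
        if (n == 0 || (size_t) n > len - i)
            return false;

        /* all trailing bytes must look like 10xxxxxx */
        for (int k = 1; k < n; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
        }

        if (n == 3)
        {
            if (s[i] == 0xE0 && s[i + 1] < 0xA0)
                return false;   /* overlong encoding */
            if (s[i] == 0xED && s[i + 1] > 0x9F)
                return false;   /* UTF-16 surrogate range */
        }
        else if (n == 4)
        {
            if (s[i] == 0xF0 && s[i + 1] < 0x90)
                return false;   /* overlong encoding */
            if (s[i] == 0xF4 && s[i + 1] > 0x8F)
                return false;   /* beyond U+10FFFF */
        }
        i += (size_t) n;
    }
    return true;
}
```

With such a routine the caller can decide once per password whether to run the normalization or copy the bytes unchanged.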

> The second thing is the normalization itself. Per RFC4013, NFKC needs
> to be applied to the string.  The operation is described in [1]
> completely, and it is named as doing 1) a compatibility decomposition
> of the bytes of the string, followed by 2) a canonical composition.
>
> About 1). The compatibility decomposition is defined in [2], "by
> recursively applying the canonical and compatibility mappings, then
> applying the canonical reordering algorithm". Canonical and
> compatibility mapping are some data available in UnicodeData.txt, the
> 6th column of the set defined in [3] to be precise. The meaning of the
> decomposition mappings is defined in [2] as well. The canonical
> decomposition is basically to look for a given UTF-8 character, and
> then apply the multiple characters resulting in its new shape. The
> compatibility mapping should as well be applied, but [5], a perl tool
> called charlint.pl doing this normalization work, does not care about
> this phase... Do we?

Not sure. We need to do whatever the "right thing" is, according to the
RFC. I would assume that the spec is not ambiguous about this, but I
haven't looked into the details. If it's ambiguous, then I think we need
to look at some popular implementations to see what they do.
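
For reference, the "canonical reordering algorithm" mentioned in the quoted text is small. The sketch below shows it on code points that are assumed to be already decomposed. The function names are hypothetical, and the combining-class table is a four-entry toy standing in for data that would be generated from UnicodeData.txt.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy combining-class lookup; real data would come from UnicodeData.txt. */
uint8_t
toy_combining_class(uint32_t cp)
{
    switch (cp)
    {
        case 0x0300:            /* COMBINING GRAVE ACCENT */
        case 0x0301:            /* COMBINING ACUTE ACCENT */
            return 230;
        case 0x0315:            /* COMBINING COMMA ABOVE RIGHT */
            return 232;
        case 0x0323:            /* COMBINING DOT BELOW */
            return 220;
        default:
            return 0;           /* starter */
    }
}

/*
 * Canonical ordering: within each run of characters whose combining
 * class is non-zero, sort by combining class, keeping the original order
 * of characters with equal class. Starters (class 0) never move and are
 * never crossed. Done here as an in-place insertion sort.
 */
void
canonical_reorder(uint32_t *cps, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        uint32_t    cp = cps[i];
        uint8_t     cc = toy_combining_class(cp);
        size_t      j = i;

        if (cc == 0)
            continue;

        /* shift left past marks with a strictly higher class */
        while (j > 0 && toy_combining_class(cps[j - 1]) > cc)
        {
            cps[j] = cps[j - 1];
            j--;
        }
        cps[j] = cp;
    }
}
```

For example, "a" followed by acute accent (230) and dot below (220) is reordered so that the dot below comes first, since 220 sorts before 230.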

> About 2)... Once the decomposition has been applied, those bytes need
> to be recomposed using the Canonical_Combining_Class field of
> UnicodeData.txt in [3], which is the 4th column of the set. Its values
> are defined in [4]. Another interesting thing: charlint.pl [5] does
> not care about this phase. I am wondering if we should not just drop
> this part as well...
>
> Once 1) and 2) are done, NFKC is complete, and so is SASLPrepare.

Ok.

> So what we need on the Postgres side is a mapping table with the
> following fields:
> 1) Hexadecimal sequence of the UTF-8 character.
> 2) Its canonical combining class.
> 3) The kind of decomposition mapping, if defined.
> 4) The decomposition mapping, in hexadecimal format.
> Based on what I looked at, either perl or python could be used to
> process UnicodeData.txt and to generate a header file that would be
> included in the tree. There are 30k entries in UnicodeData.txt, 5k of
> them have a mapping, so that will result in many tables. One thing to
> improve performance would be to store the length of the table in a
> static variable, order the entries by their hexadecimal keys and do a
> binary search (dichotomy lookup) to find an entry. We could also use
> fancier things like a set of tables forming a radix tree keyed on the
> decomposed bytes. We should end up doing just one table lookup for
> each character anyway.

Ok. I'm not too worried about the performance of this. It's only used 
for passwords, which are not that long, and it's only done when 
connecting. I'm more worried about the disk/memory usage. How small can 
we pack the tables? 10kB? 100kB? Even a few MB would probably not be too 
bad in practice, but I'd hate to bloat up libpq just for this.
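
To get a feeling for the footprint, here is a sketch of one possible packed layout with the binary-search lookup described in the quoted text. The struct, the field names and the four-entry table are all hypothetical; a real table would be generated from UnicodeData.txt. The key here is the code point rather than the UTF-8 byte sequence, which is an assumption made to keep entries small.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed entry: 8 bytes on common platforms. */
typedef struct
{
    uint32_t    codepoint;      /* lookup key, table is sorted on it */
    uint8_t     comb_class;     /* canonical combining class */
    uint8_t     dec_size;       /* code points in the decomposition */
    uint16_t    dec_index;      /* offset into toy_decomp_codepoints */
} decomp_entry;

/* Decompositions stored back to back in one shared array. */
const uint32_t toy_decomp_codepoints[] = {
    0x0041, 0x0300,             /* U+00C0 -> A + grave accent */
    0x0065, 0x0301              /* U+00E9 -> e + acute accent */
};

/* Must stay sorted by codepoint for the binary search below. */
const decomp_entry toy_decomp_table[] = {
    {0x00C0, 0, 2, 0},
    {0x00E9, 0, 2, 2},
    {0x0300, 230, 0, 0},
    {0x0301, 230, 0, 0}
};

/* Binary search over the sorted table; NULL if there is no entry. */
const decomp_entry *
lookup_decomp(uint32_t cp)
{
    size_t      lo = 0;
    size_t      hi = sizeof(toy_decomp_table) / sizeof(toy_decomp_table[0]);

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (toy_decomp_table[mid].codepoint == cp)
            return &toy_decomp_table[mid];
        if (toy_decomp_table[mid].codepoint < cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

As rough arithmetic under this assumed layout: at 8 bytes per entry, 30k entries would take about 240kB and 5k entries about 40kB, plus the shared array of decomposition code points. Storing only characters that have a non-zero combining class or a mapping would be the obvious way to shrink it.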

> In conclusion, at this point I am looking for feedback regarding the
> following items:
> 1) Where to put the UTF8 check routines and what to move.

Covered that above.

> 2) How to generate the mapping table using UnicodeData.txt. I'd think
> that using perl would be better.

Agreed, it needs to be in Perl. That's what we require to be present
when building PostgreSQL, and it's what we use for generating other
tables and functions.

> 3) The shape of the mapping table, which depends on how many
> operations we want to support in the normalization of the strings.
> The decisions for those items will drive the implementation in one
> sense or another.

Let's aim for small disk/memory footprint.

- Heikki

> [1]: http://www.unicode.org/reports/tr15/#Description_Norm
> [2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings
> [3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
> [4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values
> [5]: https://www.w3.org/International/charlint/




Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol

From
Michael Paquier
Date:
On Fri, Feb 3, 2017 at 9:52 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/20/2016 03:47 AM, Michael Paquier wrote:
>>
>> The first thing is to be able to understand in the SCRAM code if a
>> string is UTF-8 or not, and this code is in src/common/. pg_wchar.c
>> offers a set of routines exactly for this purpose, which is built with
>> libpq but that's not available for src/common/. So instead of moving
>> all the file, I'd like to create a new file in src/common/utf8.c which
>> includes pg_utf_mblen() and pg_utf8_islegal().
>
> Sounds reasonable. They're short functions, might also be ok to just
> copy-paste them to scram-common.c.

Having a separate file makes the most sense to me, I think; if we can
avoid code duplication, that's better.

>> The second thing is the normalization itself. Per RFC4013, NFKC needs
>> to be applied to the string.  The operation is described in [1]
>> completely, and it is named as doing 1) a compatibility decomposition
>> of the bytes of the string, followed by 2) a canonical composition.
>>
>> About 1). The compatibility decomposition is defined in [2], "by
>> recursively applying the canonical and compatibility mappings, then
>> applying the canonical reordering algorithm". Canonical and
>> compatibility mapping are some data available in UnicodeData.txt, the
>> 6th column of the set defined in [3] to be precise. The meaning of the
>> decomposition mappings is defined in [2] as well. The canonical
>> decomposition is basically to look for a given UTF-8 character, and
>> then apply the multiple characters resulting in its new shape. The
>> compatibility mapping should as well be applied, but [5], a perl tool
>> called charlint.pl doing this normalization work, does not care about
>
> Not sure. We need to do whatever the "right thing" is, according to the RFC.
> I would assume that the spec is not ambiguous about this, but I haven't looked
> into the details. If it's ambiguous, then I think we need to look at some
> popular implementations to see what they do.

The spec defines quite clearly what should be done. Implementations
are sometimes quite loose on some points though (see charlint.pl).

>> So what we need on the Postgres side is a mapping table with the
>> following fields:
>> 1) Hexadecimal sequence of the UTF-8 character.
>> 2) Its canonical combining class.
>> 3) The kind of decomposition mapping, if defined.
>> 4) The decomposition mapping, in hexadecimal format.
>> Based on what I looked at, either perl or python could be used to
>> process UnicodeData.txt and to generate a header file that would be
>> included in the tree. There are 30k entries in UnicodeData.txt, 5k of
>> them have a mapping, so that will result in many tables. One thing to
>> improve performance would be to store the length of the table in a
>> static variable, order the entries by their hexadecimal keys and do a
>> binary search (dichotomy lookup) to find an entry. We could also use
>> fancier things like a set of tables forming a radix tree keyed on the
>> decomposed bytes. We should end up doing just one table lookup for
>> each character anyway.
>
> Ok. I'm not too worried about the performance of this. It's only used for
> passwords, which are not that long, and it's only done when connecting. I'm
> more worried about the disk/memory usage. How small can we pack the tables?
> 10kB? 100kB? Even a few MB would probably not be too bad in practice, but
> I'd hate to bloat up libpq just for this.

Indeed. I think I'll first develop a small utility able to do the
operation. There is likely some knowledge in mb/Unicode that we can
use here. The radix tree patch would perhaps help?

>> 3) The shape of the mapping table, which depends on how many
>> operations we want to support in the normalization of the strings.
>> The decisions for those items will drive the implementation in one
>> sense or another.
>
> Let's aim for small disk/memory footprint.

OK, I'll try to give it a shot in a couple of days in the shape of an
extension or something like that. Thanks for the feedback.
-- 
Michael