Thread: Password identifiers, protocol aging and SCRAM protocol
Hi all,

As a continuation of the thread first dedicated to SCRAM:
http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
here is a new thread aimed at gathering all the ideas of that previous thread and at clarifying what has been discussed so far regarding password protocols, verifiers, and SCRAM itself.

Attached is a set of patches implementing a couple of things that have been discussed, so let's roll in.

There are a couple of concepts introduced in this set of patches, and those patches are aimed at resolving the following things:
- Introduce in Postgres an extensible password aging facility, by having a new concept of 1 user/multiple password verifiers, one password verifier per protocol.
- Give system administrators tools to decide which protocols are unsupported, and have pg_upgrade use that.
- Introduce new password protocols for Postgres, aimed at replacing the existing, say limited, ones.

Note that password verifier rolling is not discussed here, that is, the possibility of having multiple verifiers of the same protocol for the same user (this maps with the fact that valid_until is still part of pg_authid here, but in order to support authentication rolling it would be necessary to move it to pg_auth_verifiers).

Here is a short description of each patch and what it does:

1) 0001, removing the password column from pg_authid and putting it into a new catalog called pg_auth_verifiers that has the following format:
- Role OID
- Password protocol
- Password verifier
The protocols proposed in this patch are "plain" and "md5", which map to what Postgres currently has, so there is nothing new. What is new is the clause PASSWORD VERIFIERS usable by CREATE/ALTER USER, like this:
ALTER ROLE foo PASSWORD VERIFIERS (md5 = 'foo', plain = 'foo');
This is easily extensible as new protocols can be added on top of that. This has been discussed in the previous thread.
As discussed previously as well, password_encryption is switched from a boolean switch to a list of protocols, which is md5 by default in this patch. Also, as discussed in 6174.1455501497@sss.pgh.pa.us, pg_shadow has been changed so that the password value is replaced by '*****'. This patch adds docs, regression tests, pg_dump support, etc.

2) 0002, introduction of a new GUC parameter password_protocols (superuser-only) aimed at controlling the protocols for which password verifiers can be created. This is quite simple: the protocols specified in this list are the ones allowed when creating password verifiers using CREATE/ALTER ROLE. By default, and in this patch, this is set to 'plain,md5', which is the current default in Postgres, though a system admin could set it to 'md5' to forbid the creation of unencrypted passwords, for example. Docs and regression tests are added on the stack, the regression tests taking advantage of the fact that this is a superuser parameter. This patch is an answer to remarks made in the last thread regarding the fact that there is no way for a system to control which password verifier types are created, and protocol aging makes sense with this patch and 0003...

3) 0003, introduction of a system function, which I called pg_auth_verifiers_sanitize, which is superuser-only and aimed at cleaning up password verifiers in pg_auth_verifiers depending on what the user has defined in password_protocols. This basically does a heap scan of pg_auth_verifiers and deletes the tuples whose protocol is not listed in password_protocols. I have hesitated to put that in pg_upgrade_support.c, perhaps it would make more sense to have it there, but feedback is welcome. I have in mind that it is actually useful for users to have this function at hand to do post-upgrade cleanup operations. Regression tests cannot be added for this one, I guess the reason to not have them is obvious when considering installcheck...
4) 0004, have pg_upgrade make use of the system function introduced by 0003. This is quite simple, and it allows pg_upgrade to remove entries of outdated protocols.

Those 4 patches are aimed at putting in core the basics of the concept I call password protocol aging, which is a way to allow multiple password protocols to be defined in Postgres, and at easing administration as well as retirement of outdated protocols, which is something that is not doable now in Postgres.

The second set of patches 0005~0008 introduces a new protocol, SCRAM. This is a brushed-up, rebased version of the previous patches, and is divided as follows:

5) 0005, move of the SHA1 routines of pgcrypto to src/common to allow the frontend authentication code path to use SHA1.
6) 0006 is a refactoring of sendAuthRequest that makes sense taken independently.
7) 0007 is a small refactoring of RandomSalt(), to allow this function to handle salt values of different lengths.
8) 0008 is another refactoring, moving a set of encoding routines from the backend's encode.c to src/common. escape, base64 and hex are moved as such, though SCRAM uses only base64. For consistency, moving the whole set made more sense to me.
9) 0009 is the SCRAM authentication itself....

The first 4 patches obviously are the core portion that I would like to discuss in this CF, as they put in the base for the rest, and will surely help Postgres long-term. 0005~0008 are just refactoring patches, so they are quite simple. 0009 though is quite difficult, and needs careful review because it manipulates areas of the code where it is not necessary to be an authenticated user, so if there are bugs in it, it would be possible for example to crash Postgres just by sending authentication requests.

Regards,
--
Michael
Attachment
- 0001-Add-facility-to-store-multiple-password-verifiers.patch
- 0002-Introduce-password_protocols.patch
- 0003-Add-pg_auth_verifiers_sanitize.patch
- 0004-Remove-password-verifiers-for-unsupported-protocols-.patch
- 0005-Move-sha1.c-to-src-common.patch
- 0006-Refactor-sendAuthRequest.patch
- 0007-Refactor-RandomSalt-to-handle-salts-of-different-len.patch
- 0008-Move-encoding-routines-to-src-common.patch
- 0009-SCRAM-authentication.patch
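
As an illustration of how password_protocols (0002) and pg_auth_verifiers_sanitize (0003) are described as fitting together, here is a minimal Python sketch. It is not the patch's C code: the catalog is modelled as a plain list of (role OID, protocol, verifier) rows, and the names Verifier, parse_protocols, check_verifier_allowed and sanitize are hypothetical, chosen only for this example.

```python
from typing import List, NamedTuple, Set


class Verifier(NamedTuple):
    roleid: int    # role OID
    protocol: str  # e.g. "plain" or "md5"
    value: str     # the stored verifier


def parse_protocols(setting: str) -> Set[str]:
    """Split a comma-separated value such as 'plain,md5' into a set."""
    return {item.strip() for item in setting.split(",") if item.strip()}


def check_verifier_allowed(protocol: str, setting: str) -> None:
    """Refuse to create a verifier whose protocol is not listed (idea of 0002)."""
    if protocol not in parse_protocols(setting):
        raise ValueError(f'password protocol "{protocol}" is not allowed')


def sanitize(rows: List[Verifier], setting: str) -> List[Verifier]:
    """Keep only the rows whose protocol is still listed (idea of 0003)."""
    allowed = parse_protocols(setting)
    return [row for row in rows if row.protocol in allowed]


if __name__ == "__main__":
    # Hypothetical catalog content, for illustration only.
    catalog = [Verifier(16387, "md5", "md5<hash>"), Verifier(16388, "plain", "testu")]
    # With the list reduced to 'md5', the plain verifier is dropped.
    print(sanitize(catalog, "md5"))
```

The point of the sketch is that one setting drives both steps: it rejects new verifiers of retired protocols at creation time, and the cleanup pass removes the ones that already exist.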
Hi, Michael

23.02.2016 10:17, Michael Paquier wrote:
> Attached is a set of patches implementing a couple of things that have
> been discussed, so let's roll in.
>
> Those 4 patches are aimed at putting in-core basics for the concept I
> call password protocol aging, which is a way to allow multiple
> password protocols to be defined in Postgres, and aimed at easing
> administration as well as retirement of outdated protocols, which is
> something that is not doable now in Postgres.
>
> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
> 9) 0009 is the SCRAM authentication itself....

The topic of password checking is interesting to me, and I can review some of the features for this CF.
I think that reviewing all the suggested features will require a lot of time.
Is it possible to make a subset of patches concerning only password strength and its aging?
The patches you have attached are not independent. They should be applied consecutively, one by one.
Thus patch 0009 cannot be applied without a git error before 0001.
Under these conditions all patches were successfully applied and compiled.
All tests passed successfully.

> The first 4 patches obviously are the core portion that I would like
> to discuss about in this CF, as they put in the base for the rest, and
> will surely help Postgres long-term. 0005~0008 are just refactoring
> patches, so they are quite simple. 0009 though is quite difficult, and
> needs careful review because it manipulates areas of the code where it
> is not necessary to be an authenticated user, so if there are bugs in
> it it would be possible for example to crash down Postgres just by
> sending authentication requests.
>

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company
On Fri, Feb 26, 2016 at 1:38 AM, Valery Popov <v.popov@postgrespro.ru> wrote:
> Hi, Michael
>
> 23.02.2016 10:17, Michael Paquier wrote:
>>
>> Attached is a set of patches implementing a couple of things that have
>> been discussed, so let's roll in.
>>
>> Those 4 patches are aimed at putting in-core basics for the concept I
>> call password protocol aging, which is a way to allow multiple
>> password protocols to be defined in Postgres, and aimed at easing
>> administration as well as retirement of outdated protocols, which is
>> something that is not doable now in Postgres.
>>
>> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
>> 9) 0009 is the SCRAM authentication itself....
>
> The theme with password checking is interesting for me, and I can give
> review for CF for some features.
> I think that review of all suggested features will require a lot of time.
> Is it possible to make subset of patches concerning only password strength
> and its aging?
> The patches you have applied are non-independent. They should be apply
> consequentially one by one.
> Thus the patch 0009 can't be applied without git error before 0001.
> In this conditions all patches were successfully applied and compiled.
> All tests successfully passed.

If you want to focus on the password protocol aging, you could just have a look at 0001~0004.
--
Michael
26.02.2016 01:10, Michael Paquier wrote:
> On Fri, Feb 26, 2016 at 1:38 AM, Valery Popov <v.popov@postgrespro.ru> wrote:
>> Hi, Michael
>>
>> 23.02.2016 10:17, Michael Paquier wrote:
>>> Attached is a set of patches implementing a couple of things that have
>>> been discussed, so let's roll in.
>>>
>>> Those 4 patches are aimed at putting in-core basics for the concept I
>>> call password protocol aging, which is a way to allow multiple
>>> password protocols to be defined in Postgres, and aimed at easing
>>> administration as well as retirement of outdated protocols, which is
>>> something that is not doable now in Postgres.
>>>
>>> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
>>> 9) 0009 is the SCRAM authentication itself....
>> The theme with password checking is interesting for me, and I can give
>> review for CF for some features.
>> I think that review of all suggested features will require a lot of time.
>> Is it possible to make subset of patches concerning only password strength
>> and its aging?
>> The patches you have applied are non-independent. They should be apply
>> consequentially one by one.
>> Thus the patch 0009 can't be applied without git error before 0001.
>> In this conditions all patches were successfully applied and compiled.
>> All tests successfully passed.
> If you want to focus on the password protocol aging, you could just
> have a look at 0001~0004.

OK, I will review patches 0001-0004 to start with.

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company
>>> Hi, Michael
>>>
>>> 23.02.2016 10:17, Michael Paquier wrote:
>>>> Attached is a set of patches implementing a couple of things that have
>>>> been discussed, so let's roll in.
>>>>
>>>> Those 4 patches are aimed at putting in-core basics for the concept I
>>>> call password protocol aging, which is a way to allow multiple
>>>> password protocols to be defined in Postgres, and aimed at easing
>>>> administration as well as retirement of outdated protocols, which is
>>>> something that is not doable now in Postgres.
>>>>
>>>> The second set of patch 0005~0008 introduces a new protocol, SCRAM.
>>>> 9) 0009 is the SCRAM authentication itself....
>>> The theme with password checking is interesting for me, and I can give
>>> review for CF for some features.
>>> I think that review of all suggested features will require a lot of
>>> time.
>>> Is it possible to make subset of patches concerning only password
>>> strength
>>> and its aging?
>>> The patches you have applied are non-independent. They should be apply
>>> consequentially one by one.
>>> Thus the patch 0009 can't be applied without git error before 0001.
>>> In this conditions all patches were successfully applied and compiled.
>>> All tests successfully passed.
>> If you want to focus on the password protocol aging, you could just
>> have a look at 0001~0004.
> OK, I will review patches 0001-0004, for starting.
>

Below are the results of compiling and testing.
============================
I got the latest version of the sources from git://git.postgresql.org/git/postgresql.git.

vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch
* master

Then I applied patches 0001-0004, with two warnings:

vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0001-Add-facility-to-store-multiple-password-verifiers.patch
0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing whitespace.
warning: 1 line adds whitespace errors.
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0002-Introduce-password_protocols.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0003-Add-pg_auth_verifiers_sanitize.patch
0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
if (!superuser())
warning: 1 line adds whitespace errors.
vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git apply 0004-Remove-password-verifiers-for-unsupported-protocols-.patch

Compilation with the options
./configure --enable-debug --enable-nls --enable-cassert --enable-tap-tests --with-perl
was successful.
Regression tests and all TAP tests also passed successfully.

I also applied patches 0005-0008 to a clean source directory, with no warnings.

vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0005-Move-sha1.c-to-src-common.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0006-Refactor-sendAuthRequest.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0007-Refactor-RandomSalt-to-handle-salts-of-different-len.patch
vpopov@vpopov-Ubuntu:~/Projects/pwdtest2/postgresql$ git apply 0008-Move-encoding-routines-to-src-common.patch

Compilation with the options
./configure --enable-debug --enable-nls --enable-cassert --enable-tap-tests --with-perl
was successful.
Regression tests and the TAP tests also passed successfully.

Patch 0009 depends on all the previous patches 0001-0008: first we need to apply patches 0001-0008, then 0009.
Then all patches compiled successfully.
All tests passed.

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company
On Mon, Feb 29, 2016 at 8:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch

Thanks for the input!

> 0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing
> whitespace.
> warning: 1 line adds whitespace errors.
> 0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
> if (!superuser())
> warning: 1 line adds whitespace errors.

Argh, yes. Those two have slipped through my successive rebases, I think. Will fix in my tree. I don't think it is worth sending the whole series again just for that, though.
--
Michael
On 1 March 2016 at 06:34, Michael Paquier <michael.paquier@gmail.com> wrote:
Hi, Michael

On Mon, Feb 29, 2016 at 8:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> vpopov@vpopov-Ubuntu:~/Projects/pwdtest/postgresql$ git branch
Thanks for the input!
> 0001-Add-facility-to-store-multiple-password-verifiers.patch:2547: trailing
> whitespace.
> warning: 1 line adds whitespace errors.
> 0003-Add-pg_auth_verifiers_sanitize.patch:87: indent with spaces.
> if (!superuser())
> warning: 1 line adds whitespace errors.
Argh, yes. Those two ones have slipped though my successive rebases I
think. Will fix in my tree, I don't think that it is worth sending
again the whole series just for that though.
--
Michael
Few questions about the documentation.
config.sgml:1200
> <listitem>
> <para>
> Specifies a comma-separated list of supported password formats by
> the server. Supported formats are currently <literal>plain</> and
> <literal>md5</>.
> </para>
>
> <para>
> When a password is specified in <xref linkend="sql-createuser"> or
> <xref linkend="sql-alterrole">, this parameter determines if the
> password specified is authorized to be stored or not, returning
> an error message to caller if it is not.
> </para>
>
> <para>
> The default is <literal>plain,md5,scram</>, meaning that MD5-encrypted
> passwords, plain passwords, and SCRAM-encrypted passwords are accepted.
> </para>
> </listitem>
The default value contains "scram". Shouldn't it also be listed here:
> Specifies a comma-separated list of supported password formats by
> the server. Supported formats are currently <literal>plain</>,
> <literal>md5</> and <literal>scram</>.
Or did I miss something?
And one more:
config.sgml:1284
> <para>
> <varname>db_user_namespace</> causes the client's and
> server's user name representation to differ.
> Authentication checks are always done with the server's user name
> so authentication methods must be configured for the
> server's user name, not the client's. Because
> <literal>md5</> uses the user name as salt on both the
> client and server, <literal>md5</> cannot be used with
> <varname>db_user_namespace</>.
> </para>
It looks like the same restriction (please correct me if I'm wrong) applies to "scram", as I see from the code below. Shouldn't "scram" be mentioned here as well? Here's the code:
> diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
> index 28f9fb5..df0cc1d 100644
> --- a/src/backend/libpq/hba.c
> +++ b/src/backend/libpq/hba.c
> @@ -1184,6 +1184,19 @@ parse_hba_line(List *line, int line_num, char *raw_line)
> }
> parsedline->auth_method = uaMD5;
> }
>+ else if (strcmp(token->string, "scram") == 0)
>+ {
>+ if (Db_user_namespace)
>+ {
>+ ereport(LOG,
>+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
>+ errmsg("SCRAM authentication is not supported when \"db_user_namespace\" is enabled"),
>+ errcontext("line %d of configuration file \"%s\"",
>+ line_num, HbaFileName)));
>+ return NULL;
>+ }
>+ parsedline->auth_method = uaSASL;
>+ }
> else if (strcmp(token->string, "pam") == 0)
> #ifdef USE_PAM
> parsedline->auth_method = uaPAM;
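
The md5 restriction in the documentation paragraph quoted above can be illustrated with a small Python sketch. It assumes the usual layout of a stored md5 verifier, the string "md5" followed by the hex MD5 digest of the password concatenated with the user name. The function name md5_verifier and the sample names joe and mydb are hypothetical, and the sketch says nothing about how SCRAM derives its own verifier.

```python
import hashlib


def md5_verifier(password: str, username: str) -> str:
    """Build an md5-style verifier: 'md5' + hex(md5(password || username))."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


# With db_user_namespace, the server appends '@dbname' to the user name
# passed by the client, so the two sides would salt with different strings.
client_side = md5_verifier("secret", "joe")
server_side = md5_verifier("secret", "joe@mydb")
print(client_side == server_side)  # False
```

Because the user name acts as the salt, the same password yields different verifiers on each side as soon as the two user name representations differ, which is the mismatch the documentation warns about for md5.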
On Wed, Mar 2, 2016 at 4:05 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> [...]

Thanks for the review.

> The default value contains "scram". Shouldn't be here also:
>
>> Specifies a comma-separated list of supported password formats by
>> the server. Supported formats are currently <literal>plain</>,
>> <literal>md5</> and <literal>scram</>.
>
> Or I missed something?

Ah, I see. That's in the documentation of password_protocols. Yes, scram should be listed there as well. That should be fixed in 0009.

>> <para>
>> <varname>db_user_namespace</> causes the client's and
>> server's user name representation to differ.
>> Authentication checks are always done with the server's user name
>> so authentication methods must be configured for the
>> server's user name, not the client's. Because
>> <literal>md5</> uses the user name as salt on both the
>> client and server, <literal>md5</> cannot be used with
>> <varname>db_user_namespace</>.
>> </para>
>
> Looks like the same (pls, correct me if I'm wrong) is applicable for "scram"
> as I see from the code below. Shouldn't be "scram" mentioned here also?

Oops. Good catch. Yes, it should be mentioned as part of the SCRAM patch (0009).
--
Michael
>> <para>
>> <varname>db_user_namespace</> causes the client's and
>> server's user name representation to differ.
>> Authentication checks are always done with the server's user name
>> so authentication methods must be configured for the
>> server's user name, not the client's. Because
>> <literal>md5</> uses the user name as salt on both the
>> client and server, <literal>md5</> cannot be used with
>> <varname>db_user_namespace</>.
>> </para>

Also, in doc/src/sgml/ref/create_role.sgml, instead of

<term>PASSWORD VERIFIERS ( <replaceable class="PARAMETER">verifier_type</replaceable> = '<replaceable class="PARAMETER">password</replaceable>'</term>

it should be like this:

<term><literal>PASSWORD VERIFIERS</> ( <replaceable class="PARAMETER">verifier_type</replaceable> = '<replaceable class="PARAMETER">password</replaceable>'</term>

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company
On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
>
>>> <para>
>>> <varname>db_user_namespace</> causes the client's and
>>> server's user name representation to differ.
>>> Authentication checks are always done with the server's user name
>>> so authentication methods must be configured for the
>>> server's user name, not the client's. Because
>>> <literal>md5</> uses the user name as salt on both the
>>> client and server, <literal>md5</> cannot be used with
>>> <varname>db_user_namespace</>.
>>> </para>
>
> Also in doc/src/sgml/ref/create_role.sgml is should be instead of
> <term>PASSWORD VERIFIERS ( <replaceable
> class="PARAMETER">verifier_type</replaceable> = '<replaceable
> class="PARAMETER">password</replaceable>'</term>
> like this
> <term><literal>PASSWORD VERIFIERS</> ( <replaceable
> class="PARAMETER">verifier_type</replaceable> = '<replaceable
> class="PARAMETER">password</replaceable>'</term>

So the <literal> markup is missing. Thanks. I am taking note of it.
--
Michael
This is a review of the "Password identifiers, protocol aging and SCRAM protocol" patches
http://www.postgresql.org/message-id/CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com

Contents & Purpose
--------------------------
There was a discussion dedicated to SCRAM:
http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi

This set of patches implements the following:
- Introduce in Postgres an extensible password aging facility, by having a new concept of 1 user/multiple password verifier, one password verifier per protocol.
- Give to system administrators tools to decide unsupported protocols, and have pg_upgrade use that
- Introduce new password protocols for Postgres, aimed at replacing existing, say limited ones.

This set consists of 9 separate patches.
Each patch is well described in the initial thread email and in the comments.
The first set of patches, 0001-0004, adds a facility to store multiple password verifiers: CREATE ROLE and ALTER ROLE are extended with PASSWORD VERIFIERS, a new superuser GUC parameter specifies the list of password protocols supported by the Postgres backend, the pg_auth_verifiers_sanitize function is added, pg_upgrade removes password verifiers for unsupported protocols, and more.
The second set of patches, 0005~0008, prepares for a new protocol, SCRAM, and 0009 is SCRAM itself.

Initial Run
-------------
Included in the patches are:
- source code
- regression tests
- documentation
The source code is well commented.
The patches are in context diff format and applied correctly to HEAD (there were 2 warnings, which were fixed by the author).
There were several markup warnings, which should be fixed by the author.
Regression tests pass successfully, without errors. It seems that the patches work as expected.
Patch 0009 depends on all the previous patches 0001-0008: first we need to apply patches 0001-0008, then 0009.

Performance
-----------
I have not tested possible performance issues yet.
Conclusion
--------------
I think the introduced features are useful and I vote for commit, +1.

On 03/02/2016 02:55 PM, Michael Paquier wrote:
> On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov <v.popov@postgrespro.ru> wrote:
> So the <literal> markup is missing. Thanks. I am taking note of it.

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company
On 2/23/16 2:17 AM, Michael Paquier wrote:
> As a continuation of the thread firstly dedicated to SCRAM:
> http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
> Here is a new thread aimed at gathering all the ideas of this previous
> thread and aimed at clarifying a bit what has been discussed until now
> regarding password protocols, verifiers, and SCRAM itself.

It looks like this patch set is a bit out of date.

When applying 0004:

$ git apply ../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch
error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply

Then I tried to build with just 0001-0003:

cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
3318
3319
3320
3321
3322
make[3]: *** [postgres.bki] Error 1

Could you provide an updated set of patches for review? Meanwhile I am marking this as "waiting for author".

Thanks,
--
-David
david@pgmasters.net
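
The duplicate_oids failure above is the kind of check that flags an OID assigned more than once in the catalog headers. The following Python sketch only illustrates that idea and is not the actual script; the helper name find_duplicate_oids and the sample data are hypothetical, and the assumption that the patch's OIDs collide with ones already assigned on master is a guess, since the thread does not say what took them.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_oids(oids: Iterable[int]) -> List[int]:
    """Return every OID that appears more than once, in sorted order."""
    counts = Counter(oids)
    return sorted(oid for oid, n in counts.items() if n > 1)


# Hypothetical data: OIDs assumed to be taken on master, plus the ones
# a stale patch would assign again.
on_master = [3317, 3318, 3319, 3320, 3321, 3322]
added_by_patch = [3318, 3319, 3320, 3321, 3322]
print(find_duplicate_oids(on_master + added_by_patch))
```

With this sample data the five colliding values are reported, matching the list in the build output; a rebase that picks unused OIDs would make the list empty.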
On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
> On 2/23/16 2:17 AM, Michael Paquier wrote:
>
>> As a continuation of the thread firstly dedicated to SCRAM:
>> http://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
>> Here is a new thread aimed at gathering all the ideas of this previous
>> thread and aimed at clarifying a bit what has been discussed until now
>> regarding password protocols, verifiers, and SCRAM itself.
>
>
> It looks like this patch set is a bit out of date.
>
> When applying 0004:
>
> $ git apply
> ../other/0004-Remove-password-verifiers-for-unsupported-protocols-.patch
> error: patch failed: src/bin/pg_upgrade/pg_upgrade.c:262
> error: src/bin/pg_upgrade/pg_upgrade.c: patch does not apply
>
> Then I tried to build with just 0001-0003:
>
> cd /postgres/src/include/catalog && '/usr/bin/perl' ./duplicate_oids
> 3318
> 3319
> 3320
> 3321
> 3322
> make[3]: *** [postgres.bki] Error 1
>
> Could you provide an updated set of patches for review? Meanwhile I am
> marking this as "waiting for author".

Sure. I'll provide them shortly with all the comments addressed. Up to now I just had a couple of comments about docs and whitespace, so I didn't really bother sending a new set, but this merits a rebase.
--
Michael
On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>> Could you provide an updated set of patches for review? Meanwhile I am
>> marking this as "waiting for author".
>
> Sure. I'll provide them shortly with all the comments addressed. Up to
> now I just had a couple of comments about docs and whitespaces, so I
> didn't really bother sending a new set, but this meritates a rebase.

And here they are. I have addressed the documentation and the whitespace issues reported up to now at the same time.
--
Michael
Attachment
- 0001-Add-facility-to-store-multiple-password-verifiers.patch
- 0002-Introduce-password_protocols.patch
- 0003-Add-pg_auth_verifiers_sanitize.patch
- 0004-Remove-password-verifiers-for-unsupported-protocols-.patch
- 0005-Move-sha1.c-to-src-common.patch
- 0006-Refactor-sendAuthRequest.patch
- 0007-Refactor-RandomSalt-to-handle-salts-of-different-len.patch
- 0008-Move-encoding-routines-to-src-common.patch
- 0009-SCRAM-authentication.patch
Hi, All

On 03/15/2016 02:07 AM, Michael Paquier wrote:
> Sure. I'll provide them shortly with all the comments addressed. Up to
> now I just had a couple of comments about docs and whitespaces, so I
> didn't really bother sending a new set, but this meritates a rebase.
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.

I've applied all of the 0001-0009 patches from the new set to today's master branch without any warnings.
Then I compiled with the configure options:
./configure --enable-debug --enable-nls --enable-cassert --enable-tap-tests --with-perl
All regression tests passed successfully.
make check-world passed successfully.
make installcheck-world failed on several contrib modules:
dblink, file_fdw, hstore, pgcrypto, pgstattuple, postgres_fdw, tablefunc.
The test results are attached.
Documentation looks good.
What could explain the difference between the make check-world and make installcheck-world results?

--
Regards,
Valery Popov
Postgres Professional http://www.postgrespro.com
The Russian Postgres Company
Attachment
On Tue, Mar 15, 2016 at 3:46 PM, Valery Popov wrote:
> make installcheck-world failed on several contrib modules:
> dblink, file_fdw, hstore, pgcrypto, pgstattuple, postgres_fdw, tablefunc.
> The tests results are attached.
> Documentation looks good.
> Where may be a problem with make check-world and make installcheck-world
> results?

I cannot reproduce this, and my guess is that the binaries of those contrib/ modules are not up to date for the installed instance of Postgres you are running the tests on. In particular, I find this portion doubtful:

SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost

The set of patches I am proposing here does not go through those code paths, and this is likely an aggregate failure.
--
Michael
Hi Michael,

On 3/14/16 7:07 PM, Michael Paquier wrote:
> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
>
>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>>
>>> Could you provide an updated set of patches for review? Meanwhile I am
>>> marking this as "waiting for author".
>>
>> Sure. I'll provide them shortly with all the comments addressed. Up to
>> now I just had a couple of comments about docs and whitespaces, so I
>> didn't really bother sending a new set, but this meritates a rebase.
>
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.

For this first review I would like to focus on the user visible changes introduced in 0001-0002.

First I created two new users, one with each type of supported verifier:

postgres=# create user test with password 'test';
CREATE ROLE
postgres=# create user testu with unencrypted password 'testu' valid until '2017-01-01';
CREATE ROLE

1) I see that rolvaliduntil is still in pg_authid:

postgres=# select oid, rolname, rolvaliduntil from pg_authid;
  oid  | rolname |     rolvaliduntil
-------+---------+------------------------
    10 | vagrant |
 16387 | test    |
 16388 | testu   | 2017-01-01 00:00:00+00

I think that's OK if we now define it to be "role validity" (it's still password validity in the patched docs). I would also like to see a validuntil column in pg_auth_verifiers so we can track password expiration for each verifier separately. For now I think it's enough to copy the same validity to both places since there can only be one verifier.
2) I don't think the column naming in pg_auth_verifiers is consistent with other catalogs:

postgres=# select * from pg_auth_verifiers;
 roleid | verimet |               verival
--------+---------+-------------------------------------
  16387 | m       | md505a671c66aefea124cc08b76ea6d30bb
  16388 | p       | testu

System catalogs generally use a 3 character prefix so I would expect the columns to be (if we pick avr as a prefix):

avrrole
avrmethod
avrverifier
avrvaliduntil

I'm not a big fan of abbreviating too much, so you can see I've expanded the names a bit.

3) rolpassword is still in pg_shadow even though it is not useful anymore:

postgres=# select usename, passwd, valuntil from pg_shadow;
 usename |  passwd  |        valuntil
---------+----------+------------------------
 vagrant | ******** |
 test    | ******** |
 testu   | ******** | 2017-01-01 00:00:00+00

If anyone is actually using this column in a meaningful way, they are in for a nasty surprise when trying to use the value in passwd as a verifier. I would prefer to drop the column entirely and produce a clear error.

Perhaps a better option would be to drop pg_shadow entirely since it seems to have no further purpose in life.

Thanks,
--
-David
david@pgmasters.net
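
The per-verifier expiration suggested in this review can be sketched in Python as follows. This only illustrates the proposal and is not anything present in the patches: the field names reuse the avr-prefixed columns suggested above, while the AuthVerifier class, the usable_verifiers helper, and the rule that a missing avrvaliduntil means "never expires" are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class AuthVerifier:
    avrrole: int                       # role OID
    avrmethod: str                     # 'm' for md5, 'p' for plain
    avrverifier: str                   # stored verifier value
    avrvaliduntil: Optional[datetime]  # None is assumed to mean "no expiration"


def usable_verifiers(rows: List[AuthVerifier], role: int,
                     now: datetime) -> List[AuthVerifier]:
    """Return the verifiers of a role that have not expired at 'now'."""
    return [
        row for row in rows
        if row.avrrole == role
        and (row.avrvaliduntil is None or now < row.avrvaliduntil)
    ]


if __name__ == "__main__":
    expiry = datetime(2017, 1, 1, tzinfo=timezone.utc)
    rows = [
        AuthVerifier(16387, "m", "md505a671c66aefea124cc08b76ea6d30bb", None),
        AuthVerifier(16388, "p", "testu", expiry),
    ]
    # After the expiry date, role 16388 has no usable verifier left.
    print(usable_verifiers(rows, 16388, datetime(2017, 6, 1, tzinfo=timezone.utc)))
```

Keeping the expiration on each row rather than on the role is what would allow one verifier to be retired while another for the same role stays valid.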
Hi! On 03/15/2016 06:59 PM, Michael Paquier wrote: > The set of patches I am proposing here does not go through those code > paths, and this is likely an aggregate failure. Michael, you were right. It was incorrect installation of contrib binaries. Now all tests pass OK, both check-world and installcheck-world, Thanks. -- Regards, Valery Popov Postgres Professional http://www.postgrespro.com The Russian Postgres Company
On Tue, Mar 15, 2016 at 6:38 PM, David Steele <david@pgmasters.net> wrote: > Hi Michael, > > On 3/14/16 7:07 PM, Michael Paquier wrote: > >> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> >>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote: >>> >>>> Could you provide an updated set of patches for review? Meanwhile I am >>>> marking this as "waiting for author". >>> >>> Sure. I'll provide them shortly with all the comments addressed. Up to >>> now I just had a couple of comments about docs and whitespaces, so I >>> didn't really bother sending a new set, but this meritates a rebase. >> >> And here they are. I have addressed the documentation and the >> whitespaces reported up to now at the same time. > > For this first review I would like to focus on the user visible changes > introduced in 0001-0002. Thanks for the input! > 1) I see that rolvaliduntil is still in pg_authid: > I think that's OK if we now define it to be "role validity" (it's still > password validity in the patched docs). I would also like to see a > validuntil column in pg_auth_verifiers so we can track password > expiration for each verifier separately. For now I think it's enough to > copy the same validity both places since there can only be one verifier. FWIW, this is an intentional change, and my goal is to focus on only the protocol aging for now. We will need to move rolvaliduntil to pg_auth_verifiers if we want to allow rolling updates of password verifiers for a given role, but that's a different patch, and we need to think about the SQL interface carefully. This infrastructure makes the move easier by the way to do that, and honestly I don't really see what we gain now by copying the same value to two different system catalogs. 
> 2) I don't think the column naming in pg_auth_verifiers is consistent > with other catalogs: > postgres=# select * from pg_auth_verifiers; > roleid | verimet | verival > --------+---------+------------------------------------- > 16387 | m | md505a671c66aefea124cc08b76ea6d30bb > 16388 | p | testu > > System catalogs generally use a 3 character prefix so I would expect the > columns to be (if we pick avr as a prefix): OK, this makes sense. > avrrole > avrmethod > avrverifier Assuming "ver" is the prefix, we get: verroleid, vermethod, vervalue. I kind of like those ones, more than with "avr" as prefix actually. Other ideas are of course welcome. > I'm not a big fan in abbreviating too much so you can see I've expanded > the names a bit. Sure. > 3) rolpassword is still in pg_shadow even though it is not useful anymore: > postgres=# select usename, passwd, valuntil from pg_shadow; > > usename | passwd | valuntil > ---------+----------+------------------------ > vagrant | ******** | > test | ******** | > testu | ******** | 2017-01-01 00:00:00+00 > > If anyone is actually using this column in a meaningful way they are in > for a nasty surprise when trying use the value in passwd as a verifier. > I would prefer to drop the column entirely and produce a clear error. > > Perhaps a better option would be to drop pg_shadow entirely since it > seems to have no further purpose in life. We discussed that on the previous thread and the conclusion was to keep pg_shadow, but to clobber the password value with "***", explaining this choice: http://www.postgresql.org/message-id/6174.1455501497@sss.pgh.pa.us -- Michael
On 3/16/16 9:00 AM, Michael Paquier wrote: > On Tue, Mar 15, 2016 at 6:38 PM, David Steele <david@pgmasters.net> wrote: > >> 1) I see that rolvaliduntil is still in pg_authid: >> I think that's OK if we now define it to be "role validity" (it's still >> password validity in the patched docs). I would also like to see a >> validuntil column in pg_auth_verifiers so we can track password >> expiration for each verifier separately. For now I think it's enough to >> copy the same validity both places since there can only be one verifier. > > FWIW, this is an intentional change, and my goal is to focus on only > the protocol aging for now. We will need to move rolvaliduntil to > pg_auth_verifiers if we want to allow rolling updates of password > verifiers for a given role, but that's a different patch, and we need > to think about the SQL interface carefully. This infrastructure makes > the move easier by the way to do that, and honestly I don't really see > what we gain now by copying the same value to two different system > catalogs. Here's my thinking. If validuntil is moved to pg_auth_verifiers now then people can start using it there. That will make it less traumatic when/if validuntil in pg_authid is removed later. The field in pg_authid could be deprecated in this release to let people know not to use it. Or, as I suggested it could be recast as role validity, which right now happens to be the same as password validity. >> 2) I don't think the column naming in pg_auth_verifiers is consistent >> with other catalogs: >> postgres=# select * from pg_auth_verifiers; >> roleid | verimet | verival >> --------+---------+------------------------------------- >> 16387 | m | md505a671c66aefea124cc08b76ea6d30bb >> 16388 | p | testu >> >> System catalogs generally use a 3 character prefix so I would expect the >> columns to be (if we pick avr as a prefix): > > OK, this makes sense. 
> >> avrrole >> avrmethod >> avrverifier > > Assuming "ver" is the prefix, we get: verroleid, vermethod, vervalue. > I kind of like those ones, more than with "avr" as prefix actually. > Other ideas are of course welcome. ver is fine as a prefix. >> 3) rolpassword is still in pg_shadow even though it is not useful anymore: >> postgres=# select usename, passwd, valuntil from pg_shadow; >> >> usename | passwd | valuntil >> ---------+----------+------------------------ >> vagrant | ******** | >> test | ******** | >> testu | ******** | 2017-01-01 00:00:00+00 >> >> If anyone is actually using this column in a meaningful way they are in >> for a nasty surprise when trying use the value in passwd as a verifier. >> I would prefer to drop the column entirely and produce a clear error. >> >> Perhaps a better option would be to drop pg_shadow entirely since it >> seems to have no further purpose in life. > > We discussed that on the previous thread and the conclusion was to > keep pg_shadow, but to clobber the password value with "***", > explaining this choice: > http://www.postgresql.org/message-id/6174.1455501497@sss.pgh.pa.us Ah, I missed that one. -- -David david@pgmasters.net
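To make the per-verifier expiration idea from this exchange concrete, here is a minimal sketch in C. It is not code from the patch set. The struct uses the column names agreed on above (verroleid, vermethod, vervalue); vervaliduntil, has_validuntil, VerifierRow, and both function names are hypothetical, and the row is a plain in-memory struct rather than a real catalog tuple.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical in-memory image of one pg_auth_verifiers row. */
typedef struct VerifierRow
{
    unsigned int verroleid;      /* OID of the role owning the verifier */
    char         vermethod;      /* 'p' = plain, 'm' = md5 */
    const char  *vervalue;       /* the verifier string itself */
    bool         has_validuntil; /* false models a NULL expiration */
    time_t       vervaliduntil;  /* hypothetical column, not in the patch */
} VerifierRow;

/* A verifier without an expiration never expires; otherwise compare to now. */
bool
verifier_is_valid(const VerifierRow *row, time_t now)
{
    if (!row->has_validuntil)
        return true;
    return now <= row->vervaliduntil;
}

/* Return the first unexpired verifier of the given method for a role, or NULL. */
const VerifierRow *
find_valid_verifier(const VerifierRow *rows, size_t nrows,
                    unsigned int roleid, char method, time_t now)
{
    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].verroleid == roleid &&
            rows[i].vermethod == method &&
            verifier_is_valid(&rows[i], now))
            return &rows[i];
    }
    return NULL;
}
```

With expiration stored per row, rolvaliduntil in pg_authid could be left to mean "role validity" only, which is the recasting David suggests.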
Hi Michael,

On 3/14/16 7:07 PM, Michael Paquier wrote:
> On Mon, Mar 14, 2016 at 5:06 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Mar 14, 2016 at 4:32 PM, David Steele <david@pgmasters.net> wrote:
>>> Could you provide an updated set of patches for review? Meanwhile I am
>>> marking this as "waiting for author".
>>
>> Sure. I'll provide them shortly with all the comments addressed. Up to
>> now I just had a couple of comments about docs and whitespaces, so I
>> didn't really bother sending a new set, but this merits a rebase.
>
> And here they are. I have addressed the documentation and the
> whitespaces reported up to now at the same time.

Here's my full review of this patch set.

First let me thank you for submitting this patch for the current CF. I feel a bit guilty that I requested it and am only now posting a full review. In my defense I can only say that being CFM has been rather more work than I was expecting, but I'm sure you know the feeling.

* [PATCH 1/9] Add facility to store multiple password verifiers

This is a pretty big patch but I went through it carefully and found nothing to complain about. Your attention to detail is impressive as always.

Be sure to update the column names for pg_auth_verifiers as we discussed in [1].

* [PATCH 2/9] Introduce password_protocols

diff --git a/src/test/regress/expected/password.out b/src/test/regress/expected/password.out
+SET password_protocols = 'plain';
+ALTER ROLE role_passwd5 PASSWORD VERIFIERS (plain = 'foo'); -- ok
+ALTER ROLE role_passwd5 PASSWORD VERIFIERS (md5 = 'foo'); -- error
+ERROR: specified password protocol not allowed
+DETAIL: List of authorized protocols is specified by password_protocols.

So that makes sense but you get the same result if you do:

postgres=# alter user role_passwd5 password 'foo';
ERROR: specified password protocol not allowed
DETAIL: List of authorized protocols is specified by password_protocols.

I don't think this makes sense - if I have explicitly set password_protocols to 'plain' and I don't specify a verifier for alter user then it seems like it should work. If nothing else the error message lacks information needed to identify the problem.

* [PATCH 3/9] Add pg_auth_verifiers_sanitize

This function is just a little scary but since password_protocols defaults to 'plain,md5' I can live with it.

* [PATCH 4/9] Remove password verifiers for unsupported protocols in pg_upgrade

Same as above - it will always be important for password_protocols to default to *all* protocols to avoid data being dropped during the pg_upgrade by accident. You've done that here (and later in the SCRAM patch) so I'm satisfied but it bears watching.

What I would do is add some extra comments in the GUC code to make it clear to always update the default when adding new verifiers.

* [PATCH 5/9] Move sha1.c to src/common

This looks fine to me and is a good reuse of code.

* [PATCH 6/9] Refactor sendAuthRequest

I tested this across different client versions and it seems to work fine.

* [PATCH 7/9] Refactor RandomSalt to handle salts of different lengths

A simple enough refactor.

* [PATCH 8/9] Move encoding routines to src/common/

A bit surprising that these functions were never used by any front end code.

* Subject: [PATCH 9/9] SCRAM authentication

diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
@@ -1616,18 +1619,34 @@ FlattenPasswordIdentifiers(List *verifiers, char *rolname)
          * instances of Postgres, an md5 hash passed as a plain verifier
          * should still be treated as an MD5 entry.
          */
-        if (spec->veriftype == AUTH_VERIFIER_MD5 &&
-            !isMD5(spec->value))
+        switch (spec->veriftype)
         {
-            char encrypted_passwd[MD5_PASSWD_LEN + 1];
-            if (!pg_md5_encrypt(spec->value, rolname, strlen(rolname),
-                                encrypted_passwd))
-                elog(ERROR, "password encryption failed");
-            spec->value = pstrdup(encrypted_passwd);
+            case AUTH_VERIFIER_MD5:

It seems like this case statement should have been introduced in patch 0001. Were you just trying to avoid churn in the code unless SCRAM is committed?

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
+
+static char *
+read_attr_value(char **input, char attr)
+{

Numerous functions like the above in auth-scram.c do not have comments.

diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
+    else if (strcmp(token->string, "scram") == 0)
+    {
+        if (Db_user_namespace)
+        {
+            ereport(LOG,
+                    (errcode(ERRCODE_CONFIG_FILE_ERROR),
+                     errmsg("SCRAM authentication is not supported when \"db_user_namespace\" is enabled"),
+                     errcontext("line %d of configuration file \"%s\"",
+                                line_num, HbaFileName)));
+            return NULL;
+        }
+        parsedline->auth_method = uaSASL;
+    }

Why is that? Is it because gss auth should be expected in this case or some limitation of SCRAM? Anyway, it wasn't clear to me why this would be true so some comments here would be good.

diff --git a/src/common/scram-common.c b/src/common/scram-common.c
+void
+scram_HMAC_update(scram_HMAC_ctx *ctx, const char *str, int slen)
+{
+    SHA1Update(&ctx->sha1ctx, (const uint8 *) str, slen);
+}

Same in scram-common.c WRT comments.

diff --git a/src/include/common/scram-common.h b/src/include/common/scram-common.h
+extern void scram_ClientOrServerKey(const char *password, const char *salt, int saltlen, int iterations, const char *keystr, uint8 *result);

My, that's a very long line!

* A few general things:

Most of the new scram modules are seriously in need of better comments - I pointed out a few but all the new files suffer from this lack.

The strings "plain", "md5", and "scram" are used often enough that I think it would be nice if they were constants. I feel the same way about verifier methods 'm', 'p', 's' -- perhaps more so because they aren't very verbose.

It looks like this will need a bit of work if the GSSAPI patch goes in (and vice versa). Not a problem but you'll need to be prepared to do that quickly in the event - time is flying.

--
-David
david@pgmasters.net

[1] http://www.postgresql.org/message-id/CAB7nPqSGm-9c4yFULt4GS9TzoSuz8XbO-K7TGGGw08sztfG2Uw@mail.gmail.com
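As an illustration of the last suggestion (named constants for the protocol strings and for the one-character method codes), a small sketch follows. It is not taken from the patches. AUTH_VERIFIER_MD5 appears in the quoted diff, but its value here is an assumption, and AUTH_VERIFIER_PLAIN, AUTH_VERIFIER_SCRAM, the *_NAME macros, and both helper functions are hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <string.h>

/* One-character method codes as displayed in pg_auth_verifiers (values assumed). */
#define AUTH_VERIFIER_PLAIN 'p'
#define AUTH_VERIFIER_MD5   'm'
#define AUTH_VERIFIER_SCRAM 's'

/* Hypothetical constants for the protocol names used in SQL and GUCs. */
#define AUTH_VERIFIER_PLAIN_NAME "plain"
#define AUTH_VERIFIER_MD5_NAME   "md5"
#define AUTH_VERIFIER_SCRAM_NAME "scram"

/* Map a method code to its protocol name; NULL for an unknown code. */
const char *
verifier_method_name(char method)
{
    switch (method)
    {
        case AUTH_VERIFIER_PLAIN:
            return AUTH_VERIFIER_PLAIN_NAME;
        case AUTH_VERIFIER_MD5:
            return AUTH_VERIFIER_MD5_NAME;
        case AUTH_VERIFIER_SCRAM:
            return AUTH_VERIFIER_SCRAM_NAME;
        default:
            return NULL;
    }
}

/* Map a protocol name to its method code; '\0' for an unknown name. */
char
verifier_method_code(const char *name)
{
    if (strcmp(name, AUTH_VERIFIER_PLAIN_NAME) == 0)
        return AUTH_VERIFIER_PLAIN;
    if (strcmp(name, AUTH_VERIFIER_MD5_NAME) == 0)
        return AUTH_VERIFIER_MD5;
    if (strcmp(name, AUTH_VERIFIER_SCRAM_NAME) == 0)
        return AUTH_VERIFIER_SCRAM;
    return '\0';
}
```

Keeping both directions of the mapping in one place means a new protocol is added in a single spot instead of being scattered across string comparisons.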
On Fri, Mar 18, 2016 at 3:16 AM, David Steele <david@pgmasters.net> wrote: > Here's my full review of this patch set. Thanks! > First let me thank you for submitting this patch for the current CF. I > feel a bit guilty that I requested it and am only now posting a full > review. In my defense I can only say that being CFM has been rather > more work than I was expecting, but I'm sure you know the feeling. I get the idea. That's a very draining activity and I can see what you are doing. That's impressive. Really. > * [PATCH 1/9] Add facility to store multiple password verifiers > > This is a pretty big patch but I went through it carefully and found > nothing to complain about. Your attention to detail is impressive as > always. > > Be sure to update the column names for pg_auth_verifiers as we discussed > in [1]. Done. I have added as well the block of 0009 you pointed out into this patch for clarity. > * [PATCH 2/9] Introduce password_protocols > > diff --git a/src/test/regress/expected/password.out > b/src/test/regress/expected/password.out > +SET password_protocols = 'plain'; > +ALTER ROLE role_passwd5 PASSWORD VERIFIERS (plain = 'foo'); -- ok > +ALTER ROLE role_passwd5 PASSWORD VERIFIERS (md5 = 'foo'); -- error > +ERROR: specified password protocol not allowed > +DETAIL: List of authorized protocols is specified by password_protocols. > > So that makes sense but you get the same result if you do: > > postgres=# alter user role_passwd5 password 'foo'; > ERROR: specified password protocol not allowed > DETAIL: List of authorized protocols is specified by password_protocols. > > I don't think this makes sense - if I have explicitly set > password_protocols to 'plain' and I don't specify a verifier for alter > user then it seems like it should work. If nothing else the error > message lacks information needed to identify the problem. Hm. The problem here is the interaction between the new password_protocols and the existing password_encryption. 
password_protocols implies that password_encryption must not contain elements that are not listed in it, in short password_protocols @> password_encryption. So I think that the GUC callbacks checking the validity of those parameter values should each check that the other parameter is not set to an incompatible value. One thing to simplify those validity checks would be to make password_protocols a PGC_POSTMASTER, aka it needs a restart to be updated. This sacrifices a large portion of the regression tests though... Do others have thoughts to share? I have not updated the patch yet, and I would personally leave both parameters as they are now, aka password_protocols as PGC_SUSET and password_encryption as PGC_USERSET, and check their validity when they are updated, but I am not alone here (hopefully). > * [PATCH 3/9] Add pg_auth_verifiers_sanitize > > This function is just a little scary but since password_protocols > defaults to 'plain,md5' I can live with it. Another thing that I thought about was to integrate it as part of pg_upgrade_support. It is no big deal to do it this way as well, though I thought that it could be useful for admins. So extra ideas are welcome. That's superuser-only anyway... And a critical part to manage old protocol deprecation. > * [PATCH 4/9] Remove password verifiers for unsupported protocols in > pg_upgrade > > Same as above - it will always be important for password_protocols to > default to *all* protocols to avoid data being dropped during the > pg_upgrade by accident. You've done that here (and later in the SCRAM > patch) so I'm satisfied but it bears watching. We could have an extra keyword like "all" mapping to all the existing protocols, but I find listing the protocols explicitly a more verbose and simple concept, that's why I chose that. > What I would do is add some extra comments in the GUC code to make it > clear to always update the default when adding new verifiers. Good idea.
> * [PATCH 5/9] Move sha1.c to src/common > > This looks fine to me and is a good reuse of code. Yes. > * [PATCH 6/9] Refactor sendAuthRequest > > I tested this across different client versions and it seems to work fine. OK, cool! > * [PATCH 7/9] Refactor RandomSalt to handle salts of different lengths > > A simple enough refactor. That's something we should do as an independent change I think. > * [PATCH 8/9] Move encoding routines to src/common/ > > A bit surprising that these functions were never used by any front end code. Perhaps there are some client tools that copy-paste it. I cannot be sure. At least it seems to me that this is useful enough as an independent change. > * Subject: [PATCH 9/9] SCRAM authentication > > diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c > @@ -1616,18 +1619,34 @@ FlattenPasswordIdentifiers(List *verifiers, char > *rolname) > * instances of Postgres, an md5 hash passed as a plain verifier > * should still be treated as an MD5 entry. > */ > - if (spec->veriftype == AUTH_VERIFIER_MD5 && > - !isMD5(spec->value)) > + switch (spec->veriftype) > { > - char encrypted_passwd[MD5_PASSWD_LEN + 1]; > - if (!pg_md5_encrypt(spec->value, rolname, strlen(rolname), > - encrypted_passwd)) > - elog(ERROR, "password encryption failed"); > - spec->value = pstrdup(encrypted_passwd); > + case AUTH_VERIFIER_MD5: > > It seems like this case statement should have been introduced in patch > 0001. Were you just trying to avoid churn in the code unless SCRAM is > committed? Yeah, right. I have now plugged this portion into 0001. > diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c > + > +static char * > +read_attr_value(char **input, char attr) > +{ > > Numerous functions like the above in auth-scram.c do not have comments. Noted. I have done nothing on that yet though :) And I am lowering the priority for 0009 in this CF to keep focus on the core machinery instead, as well as other patches that need feedback. 
> diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c > + else if (strcmp(token->string, "scram") == 0) > + { > + if (Db_user_namespace) > + { > + ereport(LOG, > + (errcode(ERRCODE_CONFIG_FILE_ERROR), > + errmsg("SCRAM authentication is not supported when > \"db_user_namespace\" is enabled"), > + errcontext("line %d of configuration file \"%s\"", > + line_num, HbaFileName))); > + return NULL; > + } > + parsedline->auth_method = uaSASL; > + } > > Why is that? Is it because gss auth should be expected in this case or > some limitation of SCRAM? Anyway, it wasn't clear to me why this would > be true so some comments here would be good. The username is part of the identifier used as part of the protocol, so we cannot rely on mappings of db_user_namespace. > diff --git a/src/common/scram-common.c b/src/common/scram-common.c > +void > +scram_HMAC_update(scram_HMAC_ctx *ctx, const char *str, int slen) > +{ > + SHA1Update(&ctx->sha1ctx, (const uint8 *) str, slen); > +} > > Same in scram-common.c WRT comments. OK, noted. I have not updated those comments yet though. At this stage of the game considering 0009 for integration is a rather difficult task, and I suspect enough work with the underlying patches. For 9.6, I would be happy enough if we got the basic infra in core. > diff --git a/src/include/common/scram-common.h > b/src/include/common/scram-common.h > +extern void scram_ClientOrServerKey(const char *password, const char > *salt, int saltlen, int iterations, const char *keystr, uint8 *result); > > My, that's a very long line! Oops. Sorry. > * A few general things: > > Most of the new scram modules are seriously in need of better comments - > I pointed out a few but all the new files suffer from this lack. Indeed. Honestly, as you say, time flies, and by the time of the feature freeze I am thinking that the only sane target for the CF would be to focus on 0001~0004. That's the basic infrastructure I think we need anyway. 
0005~0008 are things that I think are useful taken independently and are simple refactoring, so they could be considered within the time frame we have. 0009 is a bit too complex. I expect enough comments on the first patches to keep me busy until the end of this CF even without it, though it is still useful for testing by the way. > The strings "plain", "md5", and "scram" are used often enough that I > think it would be nice if they were constants. This makes sense. So I switched the code this way. Note that for md5 I think that it makes sense to use a #define constant when referring to the verifier method, not when referring to the prefix of a md5 verifier. Those full names are added in pg_auth_verifiers.h. > I feel the same way > about verifier methods 'm', 'p', 's' -- perhaps more so because they > aren't very verbose. I am thinking of the verifier abbreviations in the system catalog in a way similar to pg_class' relkind, which explains the one-character identifier, so I would like to leave them as-is. > It looks like this will need a bit of work if the GSSAPI patch goes in > (and vice versa). Not a problem but you'll need to be prepared to do > that quickly in the event - time is flying. That's not an issue for me to rebase this set of patches. The only conflicts that I anticipate are on 0009, but I don't have high hopes to get this portion integrated into core for 9.6, the rest of the patches are complicated enough, and everyone's bandwidth is limited. -- Michael
Attachment
- 0001-Add-facility-to-store-multiple-password-verifiers.patch
- 0002-Introduce-password_protocols.patch
- 0003-Add-pg_auth_verifiers_sanitize.patch
- 0004-Remove-password-verifiers-for-unsupported-protocols-.patch
- 0005-Move-sha1.c-to-src-common.patch
- 0006-Refactor-sendAuthRequest.patch
- 0007-Refactor-RandomSalt-to-handle-salts-of-different-len.patch
- 0008-Move-encoding-routines-to-src-common.patch
- 0009-SCRAM-authentication.patch
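The constraint Michael describes in this reply, password_protocols @> password_encryption, can be sketched as a plain containment test over two comma-separated lists. This is only an illustration under simplifying assumptions: the lists contain no whitespace (as in 'plain,md5'), and list_contains and protocols_contain are hypothetical helpers, not GUC check hooks from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the token tok[0..len) equals one element of the comma-separated list. */
static bool
list_contains(const char *list, const char *tok, size_t len)
{
    const char *p = list;

    while (*p != '\0')
    {
        size_t n = strcspn(p, ",");   /* length of the current element */

        if (n == len && strncmp(p, tok, len) == 0)
            return true;
        p += n;
        if (*p == ',')
            p++;                      /* skip the separator */
    }
    return false;
}

/* True when every element of `encryption` also appears in `protocols`. */
bool
protocols_contain(const char *protocols, const char *encryption)
{
    const char *p = encryption;

    while (*p != '\0')
    {
        size_t n = strcspn(p, ",");

        if (n > 0 && !list_contains(protocols, p, n))
            return false;
        p += n;
        if (*p == ',')
            p++;
    }
    return true;
}
```

Under this model, David's failing case corresponds to protocols_contain("plain", "md5") returning false: password_protocols was set to 'plain' while password_encryption still held its default of md5, so a bare ALTER USER ... PASSWORD was rejected.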
On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > That's not an issue for me to rebase this set of patches. The only > conflicts that I anticipate are on 0009, but I don't have high hopes > to get this portion integrating into core for 9.6, the rest of the > patches is complicated enough, and everyone bandwidth is limited. I really think we ought to consider pushing this whole thing out to 9.7. I don't see how we're going to get all of this into 9.6, and these are big, user-facing changes that I don't think we should rush into under time pressure. I think it'd be better to do this early in the 9.7 cycle so that it has time to settle before the time crunch at the end. I predict this is going to have a lot of loose ends that are going to take months to settle, and we don't have that time right now. And I'd rather see all of the changes in one release than split them across two releases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Mar 19, 2016 at 12:28 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> That's not an issue for me to rebase this set of patches. The only >> conflicts that I anticipate are on 0009, but I don't have high hopes >> to get this portion integrating into core for 9.6, the rest of the >> patches is complicated enough, and everyone bandwidth is limited. > > I really think we ought to consider pushing this whole thing out to > 9.7. I don't see how we're going to get all of this into 9.6, and > these are big, user-facing changes that I don't think we should rush > into under time pressure. I think it'd be better to do this early in > the 9.7 cycle so that it has time to settle before the time crunch at > the end. I predict this is going to have a lot of loose ends that are > going to take months to settle, and we don't have that time right now. > And I'd rather see all of the changes in one release than split them > across two releases. FWIW, the catalog separation is not that complicated a patch, and it is really a change independent of SCRAM: the main matter is managing critical index and relation entries correctly, and it does not touch the authentication code, which IMO is the sensitive part. The catalog separation also opens the door to multiple verifiers for the same protocol for a single role, facilitating password rolling policies, which is a feature that has been asked for a lot. Nothing prevents moving validuntil into pg_auth_verifiers in parallel with the SCRAM work during the 9.7 release cycle, though having some basic infra in place would make that easier. Just my 2c. -- Michael
Robert, all, * Robert Haas (robertmhaas@gmail.com) wrote: > On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: > > That's not an issue for me to rebase this set of patches. The only > > conflicts that I anticipate are on 0009, but I don't have high hopes > > to get this portion integrating into core for 9.6, the rest of the > > patches is complicated enough, and everyone bandwidth is limited. > > I really think we ought to consider pushing this whole thing out to > 9.7. I don't see how we're going to get all of this into 9.6, and > these are big, user-facing changes that I don't think we should rush > into under time pressure. I think it'd be better to do this early in > the 9.7 cycle so that it has time to settle before the time crunch at > the end. I predict this is going to have a lot of loose ends that are > going to take months to settle, and we don't have that time right now. I'm not sure that I agree with the above. This patch has been through the wringer multiple times regarding the user-facing bits and, by and large, the results appear reasonable. Further, getting a better auth method into PG is something which I do view as a priority considering the concerns and complaints that have been, justifiably, raised against our current password-based authentication support. This isn't a new patch set either, it was submitted initially over the summer after it was pointed out, over a year ago, that people actually do care about the problems with our current implementation (amusingly, I recall having pointed out the same 5+ years ago, but only did so to this list). I've been following along on this patch set and asked David to spend time reviewing it as I feel that it's still got a chance for 9.6, since it's been through multiple CF rounds and has had a fair bit of discussion, review, and consideration. > And I'd rather see all of the changes in one release than split them > across two releases. I agree with this.
If we aren't going to get SCRAM into 9.6 then the rest is just breaking things with little benefit. I'm optimistic that we will be able to include SCRAM support in 9.6, but if that ends up not being feasible then we need to push all of the changes to the next release. I do think that if we push this off to 9.7 then we're going to have SCRAM *plus* a bunch of other changes around password policies in that release, and it'd be better to introduce SCRAM independently of the other changes. All that said, this is just my voice from having followed this thread and discussing it with David and I'm not trying to force anything. It'd certainly be nice to have and to be able to tell people that we do have a strong and recognized approach to password-based authentication in PG, but I've long been telling everyone that they should be using GSSAPI and/or SSL and can continue to do so for another year if necessary. Thanks! Stephen
On Fri, Mar 18, 2016 at 2:12 PM, Stephen Frost <sfrost@snowman.net> wrote: > I'm not sure that I agree with the above. This patch has been through > the ringer multiple times regarding the user-facing bits and, by and > large, the results appear reasonable. Further, getting a better auth > method into PG is something which I do view as a priority considering > the concerns and complaints that have been, justifiably, raised against > our current password-based authentication support. > > This isn't a new patch set either, it was submitted initially over the > summer after it was pointed out, over a year ago, that people actually > do care about the problems with our current implementation (amusingly, I > recall having pointed out the same 5+ years ago, but only did so to this > list). I am not disputing the importance of the topic, and I do realize that the patch has been around in some form since March. However, I don't think there's been a whole heck of a lot in terms of detailed code-level review, and I think that's pretty important for something that necessarily involves wire protocol changes. Doing that with the level of detail and care that it seems to me to require seems like an almost-impossible task. Most of the major features I've committed this CommitFest are patches where I've personally done multiple rounds of review on over the last several months, and in many cases, other people have been doing code reviews for months before that. I'm not denying that this patch has prompted a good deal of discussion and what I would call design review, but detailed code review? I just haven't seen much of that. >> And I'd rather see all of the changes in one release than split them >> across two releases. > > I agree with this. If we aren't going to get SCRAM into 9.6 then the > rest is just breaking things with little benefit. 
I'm optomistic that > we will be able to include SCRAM support in 9.6, but if that ends up not > being feasible then we need to put all of the changes to the next > release. OK, glad we agree on that. > I do think that if we push this off to 9.7 then we're going to have > SCRAM *plus* a bunch of other changes around password policies in that > release, and it'd be better to introduce SCRAM independently of the > other changes. Well, for my part, I'd be happy enough to do all of that in a release cycle - maybe SCRAM at the beginning and those other changes a little later on. I don't see that as a real conflict, and in fact, sometimes when you do several things like that in a single cycle, people start to see whatever the common theme is - security, say - as part of the message of that release a little more than they would if a feature lands here and another there. That's not all a bad thing. > All that said, this is just my voice from having followed this thread > and discussing it with David and I'm not trying to force anything. It'd > certainly be nice to have and to be able to tell people that we do have > a strong and recognized approach to password-based authentication in PG, > but I've long been telling everyone that they should be using GSSAPI > and/or SSL and can continue to do so for another year if necessary. I agree it's unfortunate, but IMHO that's kinda where we are at. If Heikki were still involved and had been working on this, I strongly suspect it would have been committed already. But he's not, and it's not clear when or if he's coming back, and I cannot imagine how we are going to begin and complete pushing in a feature of this magnitude in the three weeks before feature freeze without a lot of collateral damage. That is an opinion, not a fact, but it's one I feel pretty confident about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Mar 19, 2016 at 3:52 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Mar 18, 2016 at 2:12 PM, Stephen Frost <sfrost@snowman.net> wrote: >> I'm not sure that I agree with the above. This patch has been through >> the ringer multiple times regarding the user-facing bits and, by and >> large, the results appear reasonable. Further, getting a better auth >> method into PG is something which I do view as a priority considering >> the concerns and complaints that have been, justifiably, raised against >> our current password-based authentication support. >> >> This isn't a new patch set either, it was submitted initially over the >> summer after it was pointed out, over a year ago, that people actually >> do care about the problems with our current implementation (amusingly, I >> recall having pointed out the same 5+ years ago, but only did so to this >> list). > > I am not disputing the importance of the topic, and I do realize that > the patch has been around in some form since March. However, I don't > think there's been a whole heck of a lot in terms of detailed > code-level review, and I think that's pretty important for something > that necessarily involves wire protocol changes. Yep, that's the desert here, though there are surely a lot of people would like a way to get out of md5 and get into something more modern (see STIG), and many companies want to get something, my company included, though this is really a complicated task, and there are few people who could really help out here I guess. > Doing that with the > level of detail and care that it seems to me to require seems like an > almost-impossible task. Most of the major features I've committed > this CommitFest are patches where I've personally done multiple rounds > of review on over the last several months, and in many cases, other > people have been doing code reviews for months before that. 
I'm not > denying that this patch has prompted a good deal of discussion and > what I would call design review, but detailed code review? I just > haven't seen much of that. There has been none, as well as no real discussion regarding what we want to do. The current result, particularly for the management of protocol aging, is based on things I wrote by myself which negate the many negative opinions received up to now for the past patches (mainly the feedback was "I don't like that", without real output or fresh ideas during discussion to explain why that's the case). >>> And I'd rather see all of the changes in one release than split them >>> across two releases. >> >> I agree with this. If we aren't going to get SCRAM into 9.6 then the >> rest is just breaking things with little benefit. I'm optimistic that >> we will be able to include SCRAM support in 9.6, but if that ends up not >> being feasible then we need to put all of the changes to the next >> release. > > OK, glad we agree on that. Speaking as a co-author of the stuff of this thread, the two main patches are 0001, introducing pg_auth_verifiers and 0009, adding SCRAM-SHA1. The rest is just refactoring and addition of a couple of utilities to manage the protocol aging, which are really straight-forward, and all the user-visible changes are introduced by 0001. While I really like the shape of 0001, 0009 is not there yet, and really requires more time than 3 weeks, that's more than what I can do by feature freeze of 9.6. So if the conclusion is if there is no SCRAM, all the other changes don't make much sense, let's bump it to 9.7. There is honestly still interest from here, and I would guess that the only thing I could do on top of having patches for the first CF of 9.7 is discussing the topic at the dev unconference of PGCon. 
>> I do think that if we push this off to 9.7 then we're going to have >> SCRAM *plus* a bunch of other changes around password policies in that >> release, and it'd be better to introduce SCRAM independently of the >> other changes. > > Well, for my part, I'd be happy enough to do all of that in a release > cycle - maybe SCRAM at the beginning and those other changes a little > later on. I don't see that as a real conflict, and in fact, sometimes > when you do several things like that in a single cycle, people start > to see whatever the common theme is - security, say - as part of the > message of that release a little more than they would if a feature > lands here and another there. That's not all a bad thing. Having a centralized theme for a given release cycle is not a bad thing, I agree. And I'd like to think that the same discussion is not going to happen again in one year... -- Michael
On Sat, Mar 19, 2016 at 8:30 AM, Michael Paquier <michael.paquier@gmail.com> wrote: >> Doing that with the >> level of detail and care that it seems to me to require seems like an >> almost-impossible task. Most of the major features I've committed >> this CommitFest are patches where I've personally done multiple rounds >> of review on over the last several months, and in many cases, other >> people have been doing code reviews for months before that. I'm not >> denying that this patch has prompted a good deal of discussion and >> what I would call design review, but detailed code review? I just >> haven't seen much of that. > > There has been none, as well as no real discussion regarding what we > want to do. The current result, particularly for the management of > protocol aging, is based on things I wrote by myself which negate the > many negative opinions received up to now for the past patches (mainly > the feedback was "I don't like that", without real output or fresh > ideas during discussion to explain why that's the case). Well, I said before and I'll say again that I don't like the idea of multiple password verifiers. I think that's an accident waiting to happen, and I'm not prepared to put in the amount of time and energy that it would take to get that feature committed despite not wanting it myself, or for being responsible for it afterwards. I'd prefer we didn't do it at all, although I'm not going to dig in my heels. I might be willing to deal with SCRAM itself, but this whole area is not my strongest suit. So ideally some other committer would be willing to pick this up. But the problem isn't even just that somebody has to hit the final commit button - as we've both said, there's a woeful lack of any meaningful review on this thread, and this sort of change really needs quite a lot of review. This has implications for backward-compatibility, for connectors that don't use libpq, etc. Really, I'm not even sure we have consensus on the direction. 
I mean, Heikki's proposal to adopt SCRAM sounds good enough at a broad level, but I don't really know what the alternatives are, I'm mostly just taking his word for it, and like you say, there's been a fair amount of miscellaneous negativity floating around. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Mar 21, 2016 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Well, I said before and I'll say again that I don't like the idea of > multiple password verifiers. I think that's an accident waiting to > happen, and I'm not prepared to put in the amount of time and energy > that it would take to get that feature committed despite not wanting > it myself, or for being responsible for it afterwards. I'd prefer we > didn't do it at all, although I'm not going to dig in my heels. I > might be willing to deal with SCRAM itself, but this whole area is not > my strongest suit. So ideally some other committer would be willing > to pick this up. I won't bet my hand on that. > But the problem isn't even just that somebody has to hit the final > commit button - as we've both said, there's a woeful lack of any > meaningful review on this thread, and this sort of change really needs > quite a lot of review. Yep. > This has implications for > backward-compatibility, for connectors that don't use libpq, etc. > Really, I'm not even sure we have consensus on the direction. I mean, > Heikki's proposal to adopt SCRAM sounds good enough at a broad level, > but I don't really know what the alternatives are, I'm mostly just > taking his word for it, and like you say, there's been a fair amount > of miscellaneous negativity floating around. PAKE or J-PAKE are other alternatives I have in mind. I have marked the patch as returned with feedback. -- Michael
On Tue, Mar 22, 2016 at 2:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Mar 21, 2016 at 11:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Well, I said before and I'll say again that I don't like the idea of
>> multiple password verifiers. I think that's an accident waiting to
>> happen, and I'm not prepared to put in the amount of time and energy
>> that it would take to get that feature committed despite not wanting
>> it myself, or for being responsible for it afterwards. I'd prefer we
>> didn't do it at all, although I'm not going to dig in my heels. I
>> might be willing to deal with SCRAM itself, but this whole area is not
>> my strongest suit. So ideally some other committer would be willing
>> to pick this up.
> I won't bet my hand on that.
In principle I'd be happy to look at it, but I doubt that I will have enough time to get it done within this CF unfortunately. Thus I'd rather not commit to doing it. It kind of fell off my radar too long ago, as I was originally planning to look at it back in the autumn, but failed.
So basically, if somebody else has the cycles to do it in time for 9.6, please do.
> I have marked the patch as returned with feedback.
Yeah, unfortunately I think that's probably right. Let's focus on things that have a better chance of making it.
----[This is a rather informal user-review]----

Here are some thoughts and experiences on using the new features. I focused
on testing the basic functionality of setting password_encryption to scram
and then generating some users with passwords. After that, I took a look at
the documentation, specifically all those parts that mentioned "md5" but
not SCRAM, so I took some time to write those down and add my thoughts on
them.

We're quite keen on seeing these features in a future release, so I suggest
that we add these patches to the next commitfest asap in order to keep the
discussion on this topic flowing.

For those of you who like to put the authentication method itself up for
discussion, I'd like to add that it seems fairly simple to insert code for
new authentication mechanisms.
In conclusion I think these patches are very useful.

My remarks follow below.

Kind regards,
Julian Markwort
julian.markwort@uni-muenster.de

Things I noticed:
1.
    when using either
        CREATE ROLE
        ALTER ROLE
    with the parameter
        ENCRYPTED
    md5 encryption is always assumed (I've come to realize that UNENCRYPTED
    always equals plain and, in the past, ENCRYPTED equaled md5 since there were
    no other options)
    I don't know if this is intended behaviour. Maybe this option should be
    omitted (or marked as deprecated in the documentation) from the CREATE/ALTER
    functions (since without this option, the password_encryption from
    postgresql.conf is used)
    or maybe it should have its own parameter like
        CREATE ROLE testuser WITH LOGIN ENCRYPTED 'SCRAM' PASSWORD 'test';
    so that the desired encryption is used.
    From my point of view, this would be the sensible thing to do,
    especially if different verifiers should be allowed (as proposed by these
    patches).
    In either case, a bit of text explaining the (UN)ENCRYPTED option should
    be added to the documentation of the CREATE/ALTER ROLE functions.
2.
    Documentation
    III.
        17. Server Setup and Operation
            17.2. Creating a Database Cluster: maybe list SCRAM as a
            possible method for securing the db-admin
        19. Client Authentication
            19.1. The pg_hba.conf File: SCRAM is not listed in the list of
            available auth_methods to be specified in pg_hba.conf
            19.3 Authentication Methods
                19.3.2 Password Authentication: SCRAM would belong to the
                same category as md5 and password, as they are all password-based.
        20. Database Roles
            20.2. Role Attributes: password : list SCRAM as authentication
            method as well
    VI.
        ALTER ROLE: is SCRAM also dependent on the role name for salting? if
        so, add warning. (it doesn't seem that way, however I'm curious as to why
        the function FlattenPasswordIdentifiers in src/backend/commands/user.c
        called by AlterRole passes rolname to scram_build_verifier(), when that
        function does absolutely nothing with this argument?)
        CREATE ROLE: can SCRAM also be used in the list of PASSWORD
        VERIFIERS?
    VII.
        49. System Catalogs:
            49.9 pg_auth_verifiers: Column names and types are mixed up
            in description for column vervalue: explain some basic stuff about
            md5 maybe as well?
            remark: the statements about the composition of the string that is
            md5-hashed are contradictory. (concatenating "bar" to "foo" results
            in foobar, not the other way round, as it is implied in the
            explanation of the md5 hashing), this however, is not really linked
            to the changes introduced with these patches.
            remark: naming inconsistency: md5 vervalues are stored "md5*", why
            don't we take the same approach and use it on SCRAM hashes (i.e.
            "scram*"). (if this is a general convention thing, please ignore
            this comment, however I couldn't find anything in the relevant RFCs
            while skimming through them).
        50. Frontend/Backend Protocol
            50.2.1 Start-up: add explanation for "AuthenticationSCRAMPassword"
            authentication request message. (?)
            50.5 message formats: see 50.2.1
On Wed, Mar 30, 2016 at 1:44 AM, Julian Markwort <julian.markwort@uni-muenster.de> wrote: > ----[This is a rather informal user-review]---- > > Here are some thoughts and experiences on using the new features, I focused > on testing the basic funcionality of setting password_encryption to scram > and then generating some users with passwords. After that, I took a look at > the documentation, specifically all those parts that mentioned "md5", but > not SCRAM, so i took some time to write those down and add my thoughts on > them. > > We're quite keen on seeing these features in a future release, so I suggest > that we add these patches to the next commitfest asap in order to keep the > discussion on this topic flowing. > > For those of you who like to put the authentication method itself up for > discussion, I'd like to add that it seems fairly simple to insert code for > new authentication mechanisms. > In conclusion I think these patches are very useful. The reception of the concept of multiple password verifiers for a single role was rather... cold. So unless a committer pushes hard for it, it is never going to show up. There is clear consensus that SCRAM is something needed though, so we may as well just focus on that. > Things I noticed: > 1. > when using either > CREATE ROLE > ALTER ROLE > with the parameter > ENCRYPTED > md5 encryption is always assumed (I've come to realize that UNENCRYPTED > always equals plain and, in the past, ENCRYPTED equaled md5 since there were > no other options) Yes, that's to match the current behavior, and make something fully backward-compatible. Switching to md5 + scram may have made sense as well though. > I don't know if this is intended behaviour. This is intended behavior.
> Maybe this option should be > omitted (or marked as deprecated in the documentation) from the CREATE/ALTER > functions (since without this Option, the password_encryption from > pg_conf.hba is used) > or maybe it should have it's own parameter like > CREATE ROLE testuser WITH LOGIN ENCRYPTED 'SCRAM' PASSWORD 'test'; > so that the desired encryption is used. > From my point of view, this would be the sensible thing to do, > especially if different verifiers should be allowed (as proposed by these > patches). The extension PASSWORD VERIFIERS is aimed at covering this need. The grammar of those queries is not a fixed thing though. > In either case, a bit of text explaining the (UN)ENCRYPTED option should > be added to the documentation of the CREATE/ALTER ROLE functions. It is specified here; http://www.postgresql.org/docs/devel/static/sql-createrole.html And the patch does not ignore that. > 2. > Documentation > III. > 17. Server Setup and Operation > 17.2. Creating a Database Cluster: maybe list SCRAM as a > possible method for securing the db-admin Indeed. > 19. Client Authentication > 19.1. The pg_hba.conf File: SCRAM is not listed in the list of > available auth_methods to be specified in pg_conf.hba > 19.3 Authentication Methods > 19.3.2 Password Authentication: SCRAM would belong to the > same category as md5 and password, as they are all password-based. > > 20. Database Roles > 20.2. Role Attributes: password : list SCRAM as authentication > method as well Indeed. > VI. > ALTER ROLE: is SCRAM also dependent on the role name for salting? if > so, add warning. No. > (it doesn't seem that way, however I'm curious as to why > the function FlattenPasswordIdentifiers in src/backend/commands/user.c > called by AlterRole passes rolname to scram_build_verifier(), when that > function does absolutely nothing with this argument?) Yeah, this argument could be removed. > CREATE ROLE: can SCRAM also be used in the list of PASSWORD > VERIFIERS? Yes. > VII. > 49. 
System Catalogs: > 49.9 pg_auth_verifiers: Column names and types are mixed up > in description for column vervalue: Yes, things are messed up a bit there. Thanks for noticing. > remark: naming inconsistency: md5 > vervalues are stored "md5*" why don't we take the same approach and use it > on SCRAM hashes (i.e. "scram*" ). Perhaps this makes sense if there is no pg_auth_verifiers. -- Michael
On Wed, Mar 30, 2016 at 9:46 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
>> Things I noticed:
>> 1.
>> when using either
>> CREATE ROLE
>> ALTER ROLE
>> with the parameter
>> ENCRYPTED
>> md5 encryption is always assumed (I've come to realize that UNENCRYPTED
>> always equals plain and, in the past, ENCRYPTED equaled md5 since there were
>> no other options)
>
> Yes, that's to match the current behavior, and make something fully
> backward-compatible. Switching to md5 + scram may have made sense as
> well though.

I think we're not going to have much luck getting people to switch over to SCRAM if the default remains MD5. Perhaps there should be a GUC for this - and we can initially set that GUC to md5, allowing people who are ready to adopt SCRAM to change it. And then in a later release we can change the default, once we're pretty confident that most connectors have added support for the new authentication method. This is going to take a long time to roll out.

Alternatively, we could control it strictly through DDL.

Note that the existing behavior is pretty wonky:

alter user rhaas unencrypted password 'foo'; -> rolpassword foo
alter user rhaas encrypted password 'foo'; -> rolpassword md5e748797a605a1c95f3d6b5f140b2d528
alter user rhaas encrypted password 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword md5e748797a605a1c95f3d6b5f140b2d528
alter user rhaas unencrypted password 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword md5e748797a605a1c95f3d6b5f140b2d528

So basically the use of the ENCRYPTED keyword means "if it does already seem to be the sort of MD5 blob we're expecting, turn it into that". And we just rely on the format to distinguish between an MD5 verifier and an unencrypted password. Personally, I think a good start here, and I think you may have something like this in the patch already, would be to split rolpassword into two columns, say rolencryption and rolpassword. rolencryption says how the password verifier is encrypted and rolpassword contains the verifier itself. Initially, rolencryption will be 'plain' or 'md5', but later we can add 'scram' as another choice, or maybe it'll be more specific like 'scram-hmac-doodad'. And then maybe introduce syntax like this:

alter user rhaas set password 'raw-unencrypted-password' using 'verifier-method';
alter user rhaas set password verifier 'verifier-goes-here' using 'verifier-method';

That might require making verifier a key word, which would be good to avoid. Perhaps we could use "password validator" instead?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
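[Editor's sketch] The four ALTER USER cases in this message can be modelled in a few lines. This is only an approximation written for illustration, not the backend code: the function names are hypothetical, the hex-digit check is an assumption on top of the prefix-and-length test, and the verifier layout assumed here is "md5" followed by md5(password || role name).

```python
import hashlib
import string

MD5_PASSWD_LEN = 35  # "md5" prefix plus 32 hex digits


def looks_like_md5_verifier(value: str) -> bool:
    """Format sniffing: "md5" followed by exactly 32 hex digits."""
    return (
        len(value) == MD5_PASSWD_LEN
        and value.startswith("md5")
        and all(c in string.hexdigits for c in value[3:])
    )


def md5_verifier(password: str, rolname: str) -> str:
    """Assumed layout: "md5" || md5(password || rolname), role name as salt."""
    digest = hashlib.md5((password + rolname).encode("utf-8")).hexdigest()
    return "md5" + digest


def stored_rolpassword(value: str, rolname: str, encrypted: bool) -> str:
    """Hypothetical model of what ends up in rolpassword."""
    if looks_like_md5_verifier(value):
        return value  # already looks like a verifier: stored as-is either way
    if encrypted:
        return md5_verifier(value, rolname)  # ENCRYPTED: hash the raw password
    return value  # UNENCRYPTED: raw password stored in the clear
```

The point of the model is that the keyword alone does not decide the outcome; the shape of the string does, which is the argument for an explicit rolencryption column.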
On 03/30/2016 06:14 PM, Robert Haas wrote: > So basically the use of the ENCRYPTED keyword means "if it does > already seem to be the sort of MD5 blob we're expecting, turn it into > that". If it does NOT already seem to be... I guess? > And we just rely on the format to distinguish between an MD5 verifier > and an unencrypted password. Personally, I think a good start here, > and I think you may have something like this in the patch already, > would be to split rolpassword into two columns, say rolencryption and > rolpassword. This inches closer to Michael's suggestion to have multiple verifiers per pg_authid user ... > rolencryption says how the password verifier is encrypted and > rolpassword contains the verifier itself. Initially, rolencryption > will be 'plain' or 'md5', but later we can add 'scram' as another > choice, or maybe it'll be more specific like 'scram-hmac-doodad'. May I suggest using "{" <scheme>["."<encoding>] "}" just like Dovecot does? e.g. "{md5.hex}e748797a605a1c95f3d6b5f140b2d528" where no "{ ... }" prefix means just fallback to the old method of trying to guess what the blob contains? This would invalidate PLAIN passwords beginning with "{", though, so some measures would be needed. > And then maybe introduce syntax like this: alter user rhaas set > password 'raw-unencrypted-passwordt' using 'verifier-method'; alter > user rhaas set password verifier 'verifier-goes-here' using > 'verifier-method'; That might require making verifier a key word, > which would be good to avoid. Perhaps we could use "password > validator" instead? I'd like USING best ... though by prepending the schema for ENCRYPTED, the required information is already conveyed within the verifier, so no need to specify it again :) Just my .02€ / J.L.
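[Editor's sketch] The "{scheme.encoding}" prefix idea can be illustrated with a small parser. Everything here is hypothetical (the `Verifier` type, `parse_verifier`, and the character set allowed in a scheme name are assumptions for the example), and it also shows the ambiguity mentioned above for plain passwords that start with "{".

```python
import re
from typing import NamedTuple, Optional

# "{" scheme [ "." encoding ] "}" followed by the stored blob.
_PREFIX = re.compile(r"^\{([A-Za-z0-9-]+)(?:\.([A-Za-z0-9]+))?\}(.*)$", re.DOTALL)


class Verifier(NamedTuple):
    scheme: Optional[str]    # None means "no prefix, guess from the blob"
    encoding: Optional[str]
    value: str


def parse_verifier(stored: str) -> Verifier:
    """Split a stored string into (scheme, encoding, value)."""
    m = _PREFIX.match(stored)
    if m is None:
        # No well-formed prefix: fall back to the old format sniffing.
        return Verifier(None, None, stored)
    return Verifier(m.group(1).lower(), m.group(2), m.group(3))


# A prefixed md5 verifier parses cleanly...
print(parse_verifier("{md5.hex}e748797a605a1c95f3d6b5f140b2d528"))
# ...but a plain password that happens to look like "{x}secret" is misread.
print(parse_verifier("{x}secret"))
```

A separate catalog column avoids that second case entirely, since the scheme never shares a field with user-supplied text.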
On Wed, Mar 30, 2016 at 12:31 PM, José Luis Tallón <jltallon@adv-solutions.net> wrote: > On 03/30/2016 06:14 PM, Robert Haas wrote: >> So basically the use of the ENCRYPTED keyword means "if it does already >> seem to be the sort of MD5 blob we're expecting, turn it into that". > > If it does NOT already seem to be... I guess? Yes, that's what I meant. Sorry. >> rolencryption says how the password verifier is encrypted and rolpassword >> contains the verifier itself. Initially, rolencryption will be 'plain' or >> 'md5', but later we can add 'scram' as another choice, or maybe it'll be >> more specific like 'scram-hmac-doodad'. > > May I suggest using "{" <scheme>["."<encoding>] "}" just like Dovecot does? Doesn't seem very SQL-ish to me... I think we should normalize. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Mar 31, 2016 at 1:14 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 30, 2016 at 9:46 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >>> Things I noticed: >>> 1. >>> when using either >>> CREATE ROLE >>> ALTER ROLE >>> with the parameter >>> ENCRYPTED >>> md5 encryption is always assumed (I've come to realize that UNENCRYPTED >>> always equals plain and, in the past, ENCRYPTED equaled md5 since there were >>> no other options) >> >> Yes, that's to match the current behavior, and make something fully >> backward-compatible. Switching to md5 + scram may have made sense as >> well though. > > I think we're not going to have much luck getting people to switch > over to SCRAM if the default remains MD5. Perhaps there should be a > GUC for this - and we can initially set that GUC to md5, allowing > people who are ready to adopt SCRAM to change it. And then in a later > release we can change the default, once we're pretty confident that > most connectors have added support for the new authentication method. > This is going to take a long time to roll out. > Alternatively, we could control it strictly through DDL. This maps quite a lot with the existing password_encryption, so adding a GUC to control only the format of protocols only for ENCRYPTED is disturbing, say password_encryption_encrypted. I'd rather keep ENCRYPTED to md5 as default when password_encryption is 'on', switch to scram a couple of releases later, and extend the DDL grammar with something like PROTOCOL {'md5' | 'plain' | 'scram'}, which can be used instead of UNENCRYPTED | ENCRYPTED as an additional keyword. Smooth transition to a more-extensive system. 
> Note that the existing behavior is pretty wonky: > alter user rhaas unencrypted password 'foo'; -> rolpassword foo > alter user rhaas encrypted password 'foo'; -> rolpassword > md5e748797a605a1c95f3d6b5f140b2d528 > alter user rhaas encrypted password > 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword > md5e748797a605a1c95f3d6b5f140b2d528 > alter user rhaas unencrypted password > 'md5e748797a605a1c95f3d6b5f140b2d528'; -> rolpassword > md5e748797a605a1c95f3d6b5f140b2d528 I actually wrote some regression tests for that. Those are upthread as part of 0001, have for example a look at password.sql. > So basically the use of the ENCRYPTED keyword means "if it does > already seem to be the sort of MD5 blob we're expecting, turn it into > that". And we just rely on the format to distinguish between an MD5 > verifier and an unencrypted password. Personally, I think a good > start here, and I think you may have something like this in the patch > already, would be to split rolpassword into two columns, say > rolencryption and rolpassword. rolencryption says how the password > verifier is encrypted and rolpassword contains the verifier itself. The patch has something like that. And doing this split is not that complicated to be honest. Surely that would be clearer than relying on the prefix of the identifier to see if it is md5 or not. > Initially, rolencryption will be 'plain' or 'md5', but later we can > add 'scram' as another choice, or maybe it'll be more specific like > 'scram-hmac-doodad'. And then maybe introduce syntax like this: > > alter user rhaas set password 'raw-unencrypted-passwordt' using > 'verifier-method'; > alter user rhaas set password verifier 'verifier-goes-here' using > 'verifier-method'; > > That might require making verifier a key word, which would be good to > avoid. Perhaps we could use "password validator" instead? Yes, that matches what I wrote above. 
At this point putting that back on board and discuss it openly at PGCon is the best course of action IMO. -- Michael
So, the consensus so far seems to be: We don't want the support for multiple password verifiers per user. At least not yet. Let's get SCRAM working first, in a way that a user can only have SCRAM or an MD5 hash stored in the database, not both. We can add support for multiple verifiers per user, password aging, etc. later. Hopefully we'll make some progress on those before 9.7 is released, too, but let's treat them as separate issues and focus on SCRAM. I took a quick look at the patch set now again, and except that it needs to have the multiple password verifier support refactored out, I think it's in a pretty good shape. I don't like the pg_upgrade changes and its support function, that also seems like an orthogonal or add-on feature that would be better discussed separately. I think pg_upgrade should just do the upgrade with as little change to the system as possible, and let the admin reset/rehash/deprecate the passwords separately, when she wants to switch all users to SCRAM. So I suggest that we rip out those changes from the patch set as well. In related news, RFC 7677 that describes a new SCRAM-SHA-256 authentication mechanism, was published in November 2015. It's identical to SCRAM-SHA-1, which is what this patch set implements, except that SHA-1 has been replaced with SHA-256. Perhaps we should forget about SCRAM-SHA-1 and jump straight to SCRAM-SHA-256. RFC 7677 also adds some verbiage, in response to vulnerabilities that have been found with the "tls-unique" channel binding mechanism: > To be secure, either SCRAM-SHA-256-PLUS and SCRAM-SHA-1-PLUS MUST be > used over a TLS channel that has had the session hash extension > [RFC7627] negotiated, or session resumption MUST NOT have been used. So that doesn't affect details of the protocol per se, but once we implement channel binding, we need to check for those conditions somehow (or make sure that OpenSSL checks for them). 
Michael, do you plan to submit a new version of this patch set for the next commitfest? I'd like to get this committed early in the 9.7 release cycle, so that we have time to work on all the add-on stuff before the release. - Heikki
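[Editor's sketch] To make the "identical except for the hash" remark concrete, here is a minimal derivation of the two values a SCRAM server stores, following the key definitions in RFC 5802. It is not the patch's code: the function name is hypothetical, SASLprep normalization of the password is skipped, and the hash is just a parameter, so switching from SHA-1 to SHA-256 is a one-argument change.

```python
import hashlib
import hmac


def scram_verifier(password: bytes, salt: bytes, iterations: int,
                   hash_name: str = "sha256"):
    """Return (StoredKey, ServerKey) as defined in RFC 5802.

    Assumption: the password is already normalized (no SASLprep here).
    """
    # SaltedPassword := Hi(password, salt, i), which is PBKDF2 with HMAC.
    salted = hashlib.pbkdf2_hmac(hash_name, password, salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key")
    client_key = hmac.new(salted, b"Client Key", hash_name).digest()
    # StoredKey := H(ClientKey)
    stored_key = hashlib.new(hash_name, client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key")
    server_key = hmac.new(salted, b"Server Key", hash_name).digest()
    return stored_key, server_key


# Same code path for both mechanisms; only the digest size differs.
sha1_keys = scram_verifier(b"pencil", b"some-salt", 4096, "sha1")
sha256_keys = scram_verifier(b"pencil", b"some-salt", 4096, "sha256")
print(len(sha1_keys[0]), len(sha256_keys[0]))
```

The server keeps the salt, the iteration count, StoredKey and ServerKey; the password itself and SaltedPassword are not stored.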
On Sun, Jul 3, 2016 at 4:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I took a quick look at the patch set now again, and except that it needs to > have the multiple password verifier support refactored out, I think it's in > a pretty good shape. I don't like the pg_upgrade changes and its support > function, that also seems like an orthogonal or add-on feature that would be > better discussed separately. I think pg_upgrade should just do the upgrade > with as little change to the system as possible, and let the admin > reset/rehash/deprecate the passwords separately, when she wants to switch > all users to SCRAM. So I suggest that we rip out those changes from the > patch set as well. That's as well what I recall from the consensus at PGCon: only focus on the protocol addition and storage of the scram verifier. It was not mentioned directly but that's what I guess should be done. So no complaints here. > In related news, RFC 7677 that describes a new SCRAM-SHA-256 authentication > mechanism, was published in November 2015. It's identical to SCRAM-SHA-1, > which is what this patch set implements, except that SHA-1 has been replaced > with SHA-256. Perhaps we should forget about SCRAM-SHA-1 and jump straight > to SCRAM-SHA-256. That's worth considering. I don't think switching to that is very complicated. > RFC 7677 also adds some verbiage, in response to vulnerabilities that have > been found with the "tls-unique" channel binding mechanism: > >> To be secure, either SCRAM-SHA-256-PLUS and SCRAM-SHA-1-PLUS MUST be >> used over a TLS channel that has had the session hash extension >> [RFC7627] negotiated, or session resumption MUST NOT have been used. > > So that doesn't affect details of the protocol per se, but once we implement > channel binding, we need to check for those conditions somehow (or make sure > that OpenSSL checks for them). Yes. > Michael, do you plan to submit a new version of this patch set for the next > commitfest?
I'd like to get this committed early in the 9.7 release cycle, > so that we have time to work on all the add-on stuff before the release. Thanks. That's good news! Yes, I am still on track to submit a patch for CF1. -- Michael
On 7/2/16 6:32 PM, Michael Paquier wrote: > On Sun, Jul 3, 2016 at 4:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > >> Michael, do you plan to submit a new version of this patch set for the next >> commitfest? I'd like to get this committed early in the 9.7 release cycle, >> so that we have time to work on all the add-on stuff before the release. > > Thanks. That's good news! Yes, I am still on track to submit a patch for CF1. And I'm on board for reviews, testing, and whatever else I can help with. -- -David david@pgmasters.net
On 7/2/16 3:54 PM, Heikki Linnakangas wrote: > In related news, RFC 7677 that describes a new SCRAM-SHA-256 > authentication mechanism, was published in November 2015. It's identical > to SCRAM-SHA-1, which is what this patch set implements, except that > SHA-1 has been replaced with SHA-256. Perhaps we should forget about > SCRAM-SHA-1 and jump straight to SCRAM-SHA-256. I think a global change from SHA-1 to SHA-256 is in the air already, so if we're going to release something brand new in 2017 or so, it should be SHA-256. I suspect this would be a relatively simple change, so I wouldn't mind seeing a SHA-1-based variant in CF1 to get things rolling. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jul 4, 2016 at 6:34 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 7/2/16 3:54 PM, Heikki Linnakangas wrote:
>>
>> In related news, RFC 7677 that describes a new SCRAM-SHA-256
>> authentication mechanism, was published in November 2015. It's identical
>> to SCRAM-SHA-1, which is what this patch set implements, except that
>> SHA-1 has been replaced with SHA-256. Perhaps we should forget about
>> SCRAM-SHA-1 and jump straight to SCRAM-SHA-256.
>
> I think a global change from SHA-1 to SHA-256 is in the air already, so if
> we're going to release something brand new in 2017 or so, it should be
> SHA-256.
>
> I suspect this would be a relatively simple change, so I wouldn't mind
> seeing a SHA-1-based variant in CF1 to get things rolling.

I'd just move this thing to SHA256, we are likely going to use that at
the end.

As I am coming back into that, I would as well suggest do the
following, that the current set of patches is clearly missing:
- Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
is a bit a shame to not reuse what is currently available, then I
would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
- Move *all* the SHA-related things of pgcrypto to src/common,
including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
top of memset, we should clean up that first.
Any other things to consider that I am forgetting?
--
Michael
On Mon, Jul 4, 2016 at 12:54 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> As I am coming back into that, I would as well suggest do the
> following, that the current set of patches is clearly missing:
> - Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
> is a bit a shame to not reuse what is currently available, then I
> would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
> - Move *all* the SHA-related things of pgcrypto to src/common,
> including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
> top of memset, we should clean up that first.
> Any other things to consider that I am forgetting?

After looking more into that, I have come up with PG-like equivalents
of things in openssl/sha.h:

    pg_shaXX_init(pg_shaXX_ctx *ctx);
    pg_shaXX_update(pg_shaXX_ctx *ctx, uint8 *data, size_t len);
    pg_shaXX_final(uint8 *dest, pg_shaXX_ctx *ctx);

Then think about shaXX as 1, 224, 256, 384 and 512.

Hence all those functions, moved to src/common, finish with the
following shape, take an init() one:

    #ifdef USE_SSL
    #include <openssl/sha.h>
    #endif

    void
    pg_shaXX_init(pg_shaXX_ctx *ctx)
    {
    #ifdef USE_SSL
        SHAXX_Init((SHAXX_CTX *) ctx);
    #else
        /* Here goes the OpenBSD stuff, now part of pgcrypto */
    #endif
    }

And that's really ugly, all the OpenBSD things that are used by
pgcrypto when the code is not built with --with-openssl gather into a
single place with parts wrapped around USE_SSL. A less ugly solution
would be to split that into two files, and one or the other gets
included in OBJS depending on if the build is done with or without
OpenSSL. We do a rather similar thing with fe/be-secure-openssl.c.

Another possibility is that we could say that SCRAM is designed to
work with TLS, as mentioned a bit upthread via the RFC, so we would
not support it in builds compiled without OpenSSL. I think that would
be a shame, but it would simplify all this refactoring juggling.

So, 3 possibilities here:
1) Use a single file src/common/sha.c that includes a set of functions
using USE_SSL
2) Have two files in src/common, one when build is used with OpenSSL,
and the second one when built-in methods are used
3) Disable the use of SCRAM when OpenSSL is not present in the build.

Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
appealing in terms of flexibility.
--
Michael
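Editor's sketch, for readers following the option list above: under option 2 the built-in (non-OpenSSL) file would carry its own hash code behind the same three entry points, and an OpenSSL file would expose identical names as thin wrappers. The block below is only an illustration of what the built-in side could look like for SHA-256. The context layout, the helper sha256_transform and the use of stdint types instead of PostgreSQL's uint8 are assumptions made for this sketch, not taken from the patch or from the OpenBSD code in pgcrypto.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context layout; the real patch may differ. */
typedef struct pg_sha256_ctx
{
    uint32_t state[8];   /* running hash value */
    uint64_t bitcount;   /* message length so far, in bits */
    uint8_t  buffer[64]; /* partial input block */
} pg_sha256_ctx;

/* Round constants from FIPS 180-4. */
static const uint32_t K256[64] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
};

#define ROTR(x, n) (((x) >> (n)) | ((x) << (32 - (n))))

/* Compress one 64-byte block into the state. */
static void
sha256_transform(uint32_t *state, const uint8_t *p)
{
    uint32_t w[64], v[8], t1, t2;
    int i;
    for (i = 0; i < 16; i++)    /* load the block as big-endian words */
        w[i] = ((uint32_t) p[4 * i] << 24) | ((uint32_t) p[4 * i + 1] << 16) |
               ((uint32_t) p[4 * i + 2] << 8) | p[4 * i + 3];
    for (i = 16; i < 64; i++)   /* message schedule */
        w[i] = w[i - 16] + w[i - 7] +
               (ROTR(w[i - 15], 7) ^ ROTR(w[i - 15], 18) ^ (w[i - 15] >> 3)) +
               (ROTR(w[i - 2], 17) ^ ROTR(w[i - 2], 19) ^ (w[i - 2] >> 10));
    memcpy(v, state, sizeof(v));
    for (i = 0; i < 64; i++)
    {
        t1 = v[7] + (ROTR(v[4], 6) ^ ROTR(v[4], 11) ^ ROTR(v[4], 25)) +
             ((v[4] & v[5]) ^ (~v[4] & v[6])) + K256[i] + w[i];
        t2 = (ROTR(v[0], 2) ^ ROTR(v[0], 13) ^ ROTR(v[0], 22)) +
             ((v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]));
        memmove(&v[1], &v[0], 7 * sizeof(uint32_t)); /* rotate working vars */
        v[4] += t1;
        v[0] = t1 + t2;
    }
    for (i = 0; i < 8; i++)
        state[i] += v[i];
}

void
pg_sha256_init(pg_sha256_ctx *ctx)
{
    static const uint32_t iv[8] = {
        0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
        0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19
    };
    memcpy(ctx->state, iv, sizeof(iv));
    ctx->bitcount = 0;
}

void
pg_sha256_update(pg_sha256_ctx *ctx, const uint8_t *data, size_t len)
{
    size_t used = (size_t) ((ctx->bitcount / 8) % 64); /* bytes already buffered */
    ctx->bitcount += (uint64_t) len * 8;
    while (len > 0)
    {
        size_t n = (len < 64 - used) ? len : 64 - used;
        memcpy(ctx->buffer + used, data, n);
        used += n; data += n; len -= n;
        if (used == 64)
        {
            sha256_transform(ctx->state, ctx->buffer);
            used = 0;
        }
    }
}

void
pg_sha256_final(uint8_t *dest, pg_sha256_ctx *ctx)
{
    uint8_t pad[64] = {0x80};   /* 0x80 followed by zero bytes */
    uint8_t lenbuf[8];
    size_t used = (size_t) ((ctx->bitcount / 8) % 64);
    int i;

    for (i = 0; i < 8; i++)     /* big-endian bit length, captured before padding */
        lenbuf[i] = (uint8_t) (ctx->bitcount >> (56 - 8 * i));
    pg_sha256_update(ctx, pad, (used < 56) ? 56 - used : 120 - used);
    pg_sha256_update(ctx, lenbuf, 8);
    for (i = 0; i < 32; i++)
        dest[i] = (uint8_t) (ctx->state[i / 4] >> (24 - 8 * (i % 4)));
}
```

Callers would only ever see pg_sha256_init(), pg_sha256_update() and pg_sha256_final(); which of the two files gets compiled is then a build-system decision, which is the appeal of option 2 over sprinkling USE_SSL through a single file.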
On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Jul 4, 2016 at 12:54 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > As I am coming back into that, I would as well suggest do the
> > following, that the current set of patches is clearly missing:
> > - Put the HMAC infrastructure stuff of pgcrypto into src/common/. It
> > is a bit a shame to not reuse what is currently available, then I
> > would suggest to reuse that with HMAC_SCRAM_SHAXXX as label.
> > - Move *all* the SHA-related things of pgcrypto to src/common,
> > including SHA1, SHA224 and SHA256. px_memset is a simple wrapper on
> > top of memset, we should clean up that first.
> > Any other things to consider that I am forgetting?
>
> After looking more into that, I have come up with PG-like equivalents
> of things in openssl/sha.h:
> pg_shaXX_init(pg_shaXX_ctx *ctx);
> pg_shaXX_update(pg_shaXX_ctx *ctx, uint8 *data, size_t len);
> pg_shaXX_final(uint8 *dest, pg_shaXX_ctx *ctx);
> Then think about shaXX as 1, 224, 256, 384 and 512.
>
> Hence all those functions, moved to src/common, finish with the
> following shape, take an init() one:
> #ifdef USE_SSL
> #include <openssl/sha.h>
> #endif
> void
> pg_shaXX_init(pg_shaXX_ctx *ctx)
> {
> #ifdef USE_SSL
>     SHAXX_Init((SHAXX_CTX *) ctx);
> #else
>     /* Here goes the OpenBSD stuff, now part of pgcrypto */
> #endif
> }
>
> And that's really ugly, all the OpenBSD things that are used by
> pgcrypto when the code is not built with --with-openssl gather into a
> single place with parts wrapped around USE_SSL. A less ugly solution
> would be to split that into two files, and one or the other gets
> included in OBJS depending on if the build is done with or without
> OpenSSL. We do a rather similar thing with fe/be-secure-openssl.c.

FWIW, the main reason for be-secure-openssl.c is that we could have support for another external SSL library. The idea was never to have a builtin replacement for it :)
However, is there something that's fundamentally better with the OpenSSL implementation? Or should we just keep *just* the #else branch in the code, the part we've imported from OpenBSD?
TLS is complex, we don't want to do that in that case. But just the sha functions isn't *that* complex, is it?

> Another possibility is that we could say that SCRAM is designed to
> work with TLS, as mentioned a bit upthread via the RFC, so we would
> not support it in builds compiled without OpenSSL. I think that would
> be a shame, but it would simplify all this refactoring juggling.
>
> So, 3 possibilities here:
> 1) Use a single file src/common/sha.c that includes a set of functions
> using USE_SSL
> 2) Have two files in src/common, one when build is used with OpenSSL,
> and the second one when built-in methods are used
> 3) Disable the use of SCRAM when OpenSSL is not present in the build.
>
> Opinions? My heart goes for 2) because 1) is ugly, and 3) is not
> appealing in terms of flexibility.

I really dislike #3 - we want everybody to start using this...
I'm not sure how common a build without openssl is in the real world though. RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably don't want to make it mandatory, no...
On Tue, Jul 5, 2016 at 5:50 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > However, is there something that's fundamentally better with the OpenSSL > implementation? Or should we just keep *just* the #else branch in the code, > the part we've imported from OpenBSD? Good question. I think that we want both, giving priority to OpenSSL if it is there. Usually their things prove to have more entropy, but I didn't look at their code to be honest. If we only use the OpenBSD stuff, it would be a good idea to refresh the in-core code. This is from OpenBSD of 2002. > TLS is complex, we don't want to do that in that case. But just the sha > functions isn't *that* complex, is it? No, they are not. >> Another possibility is that we could say that SCRAM is designed to >> work with TLS, as mentioned a bit upthread via the RFC, so we would >> not support it in builds compiled without OpenSSL. I think that would >> be a shame, but it would simplify all this refactoring juggling. >> >> So, 3 possibilities here: >> 1) Use a single file src/common/sha.c that includes a set of functions >> using USE_SSL >> 2) Have two files in src/common, one when build is used with OpenSSL, >> and the second one when built-in methods are used >> 3) Disable the use of SCRAM when OpenSSL is not present in the build. >> >> Opinions? My heart goes for 2) because 1) is ugly, and 3) is not >> appealing in terms of flexibility. > > I really dislike #3 - we want everybody to start using this... OK, after hacking that for a bit I have finished with option 2 and the set of PG-like set of routines, the use of USE_SSL in the file containing all the SHA functions of OpenBSD has proved to be really ugly, but with a split things are really clear to the eye. The stuff I got builds on OSX, Linux and MSVC. 
pgcrypto cannot link directly to libpgcommon.a, so I am making it compile directly with the source files, as it is doing on HEAD. > I'm not sure how common a build without openssl is in the real world though. > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably > don't want to make it mandatory, no... I don't think that it is this much common to have an enterprise-class build of Postgres without SSL, but each company has always its own reasons, so things could exist. And I continue to move on... Thanks for the feedback. -- Michael
On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> OK, after hacking that for a bit I have finished with option 2 and the
> set of PG-like set of routines, the use of USE_SSL in the file
> containing all the SHA functions of OpenBSD has proved to be really
> ugly, but with a split things are really clear to the eye. The stuff I
> got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
> libpgcommon.a, so I am making it compile directly with the source
> files, as it is doing on HEAD.

Btw, attached is the patch I did for this part if there is any interest in it.

Also, while working on the rest, I am not adding a new column to
pg_authid to identify the password verifier type. That's just to keep
the patch at a bare minimum size. Are there issues with that?
--
Michael
* Michael Paquier (michael.paquier@gmail.com) wrote: > On Tue, Jul 5, 2016 at 5:50 PM, Magnus Hagander <magnus@hagander.net> wrote: > > On Tue, Jul 5, 2016 at 10:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > > However, is there something that's fundamentally better with the OpenSSL > > implementation? Or should we just keep *just* the #else branch in the code, > > the part we've imported from OpenBSD? > > Good question. I think that we want both, giving priority to OpenSSL > if it is there. Usually their things prove to have more entropy, but I > didn't look at their code to be honest. If we only use the OpenBSD > stuff, it would be a good idea to refresh the in-core code. This is > from OpenBSD of 2002. I agree that we definitely want to use the OpenSSL functions when they are available. > > I'm not sure how common a build without openssl is in the real world though. > > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably > > don't want to make it mandatory, no... > > I don't think that it is this much common to have an enterprise-class > build of Postgres without SSL, but each company has always its own > reasons, so things could exist. I agree that it's useful to have the support if PG isn't built with OpenSSL for some reason. Thanks! Stephen
On Thu, Jul 7, 2016 at 7:51 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> > I'm not sure how common a build without openssl is in the real world though.
>> > RPMs, DEBs, Windows installers etc all build with OpenSSL. But we probably
>> > don't want to make it mandatory, no...
>>
>> I don't think that it is this much common to have an enterprise-class
>> build of Postgres without SSL, but each company has always its own
>> reasons, so things could exist.
>
> I agree that it's useful to have the support if PG isn't built with
> OpenSSL for some reason.

OK, I am doing that at the end.

And also while moving on...

On another topic, here are some ideas to extend CREATE/ALTER ROLE to
support SCRAM password directly:
1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving:
   CREATE ROLE foorole SCRAM PASSWORD value;
2) PASSWORD (protocol) value.
3) Just add SCRAM PASSWORD
My mind is thinking about 1) as being the cleanest solution as this
does not touch the defaults, which may change a couple of releases
later. Other opinions?

Note that I am also switching password_encryption to an enum, able to
use as values on, off, md5, plain, scram. Of course, on => md5, off =>
plain to preserve the default.

Other things that I am making conservative:
- ENCRYPTED PASSWORD still implies MD5-encrypted password
- UNENCRYPTED PASSWORD still implies plain text password
- PASSWORD used alone depends on the value of password_encryption
So it would be possible to move to scram by default by setting
password_encryption to 'scram'.

Objections are welcome, I am moving into something respecting the
default behavior as much as possible.
--
Michael
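Editor's sketch of the password_encryption mapping described in the message above (on => md5, off => plain, plus the explicit md5, plain and scram values). The type and function names are hypothetical and chosen only for this illustration; the real setting would go through PostgreSQL's GUC machinery, which this stand-alone lookup does not try to reproduce (for one thing, it compares case-sensitively to stay short).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the patch may spell these differently. */
typedef enum PasswordType
{
    PASSWORD_TYPE_PLAINTEXT,
    PASSWORD_TYPE_MD5,
    PASSWORD_TYPE_SCRAM
} PasswordType;

typedef struct PasswordEncryptionOption
{
    const char  *name;
    PasswordType type;
} PasswordEncryptionOption;

/* Accepted spellings; "on" and "off" are kept for backward compatibility. */
static const PasswordEncryptionOption password_encryption_options[] = {
    {"plain", PASSWORD_TYPE_PLAINTEXT},
    {"md5", PASSWORD_TYPE_MD5},
    {"scram", PASSWORD_TYPE_SCRAM},
    {"on", PASSWORD_TYPE_MD5},
    {"off", PASSWORD_TYPE_PLAINTEXT}
};

/*
 * Look up a setting string. Returns true and stores the password type in
 * *result when the value is recognized, false otherwise.
 */
bool
parse_password_encryption(const char *value, PasswordType *result)
{
    size_t n = sizeof(password_encryption_options) /
               sizeof(password_encryption_options[0]);
    size_t i;

    for (i = 0; i < n; i++)
    {
        if (strcmp(value, password_encryption_options[i].name) == 0)
        {
            *result = password_encryption_options[i].type;
            return true;
        }
    }
    return false;
}
```

With a table like this, changing the default to SCRAM later means changing the setting's default value, while existing configurations that say on or off keep their current meaning.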
On Fri, Jul 15, 2016 at 9:30 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > OK, I am doing that at the end. > > And also while moving on... > > On another topic, here are some ideas to extend CREATE/ALTER ROLE to > support SCRAM password directly: > 1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving: > CREATE ROLE foorole SCRAM PASSWORD value; > 2) PASSWORD (protocol) value. > 3) Just add SCRAM PASSWORD > My mind is thinking about 1) as being the cleanest solution as this > does not touch the defaults, which may change a couple of releases > later. Other opinions? I can't really understand what you are saying here, but I'm going to be -1 on adding SCRAM as a parser keyword. Let's pick a syntax like "PASSWORD SConst USING SConst" or "PASSWORD SConst ENCRYPTED WITH SConst". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Michael Paquier wrote: > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: > > OK, after hacking that for a bit I have finished with option 2 and the > > set of PG-like set of routines, the use of USE_SSL in the file > > containing all the SHA functions of OpenBSD has proved to be really > > ugly, but with a split things are really clear to the eye. The stuff I > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to > > libpgcommon.a, so I am making it compile directly with the source > > files, as it is doing on HEAD. > > Btw, attached is the patch I did for this part if there is any interest in it. After quickly eyeballing your patch, I agree with the decision of going with (2), even if my gut initially told me that (1) would be better because it'd require less makefile trickery. I'm surprised that you say pgcrypto cannot link libpgcommon directly. Is there some insurmountable problem there? I notice your MSVC patch uses libpgcommon while the Makefile symlinks the files. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jul 20, 2016 at 02:12:57PM -0400, Alvaro Herrera wrote: > Michael Paquier wrote: > > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier > > <michael.paquier@gmail.com> wrote: > > > OK, after hacking that for a bit I have finished with option 2 and the > > > set of PG-like set of routines, the use of USE_SSL in the file > > > containing all the SHA functions of OpenBSD has proved to be really > > > ugly, but with a split things are really clear to the eye. The stuff I > > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to > > > libpgcommon.a, so I am making it compile directly with the source > > > files, as it is doing on HEAD. > > > > Btw, attached is the patch I did for this part if there is any interest in it. > > After quickly eyeballing your patch, I agree with the decision of going > with (2), even if my gut initially told me that (1) would be better > because it'd require less makefile trickery. > > I'm surprised that you say pgcrypto cannot link libpgcommon directly. > Is there some insurmountable problem there? I notice your MSVC patch > uses libpgcommon while the Makefile symlinks the files. People have, in the past, expressed concerns about linking in pgcrypto. Apparently, in some countries, it's a legal problem. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Jul 21, 2016 at 12:15 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 15, 2016 at 9:30 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> OK, I am doing that at the end. >> >> And also while moving on... >> >> On another topic, here are some ideas to extend CREATE/ALTER ROLE to >> support SCRAM password directly: >> 1) protocol PASSWORD value, where protocol is { MD5 | PLAIN | SCRAM }, giving: >> CREATE ROLE foorole SCRAM PASSWORD value; >> 2) PASSWORD (protocol) value. >> 3) Just add SCRAM PASSWORD >> My mind is thinking about 1) as being the cleanest solution as this >> does not touch the defaults, which may change a couple of releases >> later. Other opinions? > > I can't really understand what you are saying here, but I'm going to > be -1 on adding SCRAM as a parser keyword. Let's pick a syntax like > "PASSWORD SConst USING SConst" or "PASSWORD SConst ENCRYPTED WITH > SConst". No, I do not mean to make SCRAM or MD5 keywords. While hacking that, I got at some point in the mood of using "PASSWORD Sconst Sconst" but that's ugly. Sticking a keyword in between makes more sense, and USING is a good idea. I haven't thought of this one. By the way, the core patch does not have any grammar extension. The grammar extension will be on top of it and the core patch can just activate scram passwords using password_encryption. That's user unfriendly, but as the patch is large I try to cut it in as many pieces as necessary. -- Michael
On Thu, Jul 21, 2016 at 5:25 AM, David Fetter <david@fetter.org> wrote:
> On Wed, Jul 20, 2016 at 02:12:57PM -0400, Alvaro Herrera wrote:
>> Michael Paquier wrote:
>> > On Wed, Jul 6, 2016 at 4:18 PM, Michael Paquier
>> > <michael.paquier@gmail.com> wrote:
>> > > OK, after hacking that for a bit I have finished with option 2 and the
>> > > set of PG-like set of routines, the use of USE_SSL in the file
>> > > containing all the SHA functions of OpenBSD has proved to be really
>> > > ugly, but with a split things are really clear to the eye. The stuff I
>> > > got builds on OSX, Linux and MSVC. pgcrypto cannot link directly to
>> > > libpgcommon.a, so I am making it compile directly with the source
>> > > files, as it is doing on HEAD.
>> >
>> > Btw, attached is the patch I did for this part if there is any interest in it.
>>
>> After quickly eyeballing your patch, I agree with the decision of going
>> with (2), even if my gut initially told me that (1) would be better
>> because it'd require less makefile trickery.

Yeah, I thought the same thing as well when putting my hands in the
dirt... But in the end (2) is really less ugly.

>> I'm surprised that you say pgcrypto cannot link libpgcommon directly.
>> Is there some insurmountable problem there? I notice your MSVC patch
>> uses libpgcommon while the Makefile symlinks the files.

I am running into some weird things when linking both on OSX... But I
am not done with it completely yet. I'll adjust that a bit more when
producing the set of patches that will be published. So let's see.

> People have, in the past, expressed concerns about linking in
> pgcrypto. Apparently, in some countries, it's a legal problem.

Do you have any references? I don't see that as a problem.
--
Michael
On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> People have, in the past, expressed concerns about linking in >> pgcrypto. Apparently, in some countries, it's a legal problem. > > Do you have any references? I don't see that as a problem. I don't have a link to previous discussion handy, but I definitely recall that it's been discussed. I don't think that would mean that libpgcrypto couldn't depend on libpgcommon, but the reverse direction would make libpgcrypto essentially mandatory which I don't think is a direction we want to go for both technical and legal reasons. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 7/21/16 12:19 PM, Robert Haas wrote: > On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >>> People have, in the past, expressed concerns about linking in >>> pgcrypto. Apparently, in some countries, it's a legal problem. >> >> Do you have any references? I don't see that as a problem. > > I don't have a link to previous discussion handy, but I definitely > recall that it's been discussed. I don't think that would mean that > libpgcrypto couldn't depend on libpgcommon, but the reverse direction > would make libpgcrypto essentially mandatory which I don't think is a > direction we want to go for both technical and legal reasons. I searched a few different ways and finally came up with this post from Tom: https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us It's the only thing I could find, but thought it might jog something loose for somebody else. I know that export controls have been an issue for crypto in the past but have no idea what the current state of that is. -- -David david@pgmasters.net
David Steele <david@pgmasters.net> writes: > On 7/21/16 12:19 PM, Robert Haas wrote: >> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>>> People have, in the past, expressed concerns about linking in >>>> pgcrypto. Apparently, in some countries, it's a legal problem. >>> Do you have any references? I don't see that as a problem. >> I don't have a link to previous discussion handy, but I definitely >> recall that it's been discussed. I don't think that would mean that >> libpgcrypto couldn't depend on libpgcommon, but the reverse direction >> would make libpgcrypto essentially mandatory which I don't think is a >> direction we want to go for both technical and legal reasons. > I searched a few different ways and finally came up with this post from Tom: > https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us > It's the only thing I could find, but thought it might jog something > loose for somebody else. Way back when, like fifteen years ago, there absolutely were US export control restrictions on software containing crypto. I believe the US has figured out that that was silly, but I'm not sure everyplace else has. (And if you've been reading the news you will notice that legal restrictions on crypto are back in vogue, so it would not be wise to assume that the question is dead and buried.) So our project policy since at least the turn of the century has been that any crypto facility has to be in a separable extension, where it would be fairly easy for a packager to delete it if they need to ship a crypto-free version. Note that "crypto" for this purpose generally means reversible encryption; I've never heard that one-way hashes are illegal anywhere. So password hashing such as md5 is fine in core, and a stronger hash would be too. But pulling in pgcrypto lock, stock, and barrel is not OK. regards, tom lane
On Fri, Jul 22, 2016 at 2:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Way back when, like fifteen years ago, there absolutely were US export
> control restrictions on software containing crypto. I believe the US has
> figured out that that was silly, but I'm not sure everyplace else has.

England is these days legally running a battle against data
encryption. I have not heard how this is evolving these days.

> (And if you've been reading the news you will notice that legal
> restrictions on crypto are back in vogue, so it would not be wise to
> assume that the question is dead and buried.) So our project policy
> since at least the turn of the century has been that any crypto facility
> has to be in a separable extension, where it would be fairly easy for
> a packager to delete it if they need to ship a crypto-free version.
> Note that "crypto" for this purpose generally means reversible encryption;
> I've never heard that one-way hashes are illegal anywhere. So password
> hashing such as md5 is fine in core, and a stronger hash would be too.
> But pulling in pgcrypto lock, stock, and barrel is not OK.

So it would be an issue if pgcrypto.so links directly to libpgcommon?
Because that's not what I am doing now, perhaps fortunately. I moved
the sha functions to src/common. But actually, thinking more about
that, I don't need to do so because the routines of SCRAM shared
between the frontend and the backend just need to be part of libpq so
they could just be part of backend/libpq like md5.

Tom, if I get it correctly, it would not be an issue if the SHA
functions are directly part of the compiled backend like md5, right?
Because I would like to just change my set of patches to have the SHA
and the encoding functions in src/backend/libpq instead of src/common,
and then have pgcrypto be compiled with a link to those files. That's
a cleaner design btw, more in line with what is done for md5..
--
Michael
Michael Paquier <michael.paquier@gmail.com> writes:
> On Fri, Jul 22, 2016 at 2:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Note that "crypto" for this purpose generally means reversible encryption;
>> I've never heard that one-way hashes are illegal anywhere. So password
>> hashing such as md5 is fine in core, and a stronger hash would be too.
>> But pulling in pgcrypto lock, stock, and barrel is not OK.

> So it would be an issue if pgcrypto.so links directly to libpgcommon?

No, I don't see why that'd be an issue. What we can't do is have
libpgcommon depending on pgcrypto.so, or containing anything more
than one-way-hash functionality itself.

> Because I would like to just change my set of patches to have the SHA
> and the encoding functions in src/backend/libpq instead of src/common,
> and then have pgcrypto be compiled with a link to those files. That's
> a cleaner design btw, more in line with what is done for md5..

I'm confused. We need that code in both libpq and backend, no?
src/common is the place for stuff of that description.

regards, tom lane
On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> Because I would like to just change my set of patches to have the SHA
>> and the encoding functions in src/backend/libpq instead of src/common,
>> and then have pgcrypto be compiled with a link to those files. That's
>> a cleaner design btw, more in line with what is done for md5..
>
> I'm confused. We need that code in both libpq and backend, no?
> src/common is the place for stuff of that description.

Not necessarily. src/interfaces/libpq/Makefile uses a set of files
like md5.c which is located in the backend code and directly compiles
libpq.so with them, so one possibility would be to do the same for
sha.c: locate the file in src/backend/libpq/ and then fetch the file
directly when compiling libpq's shared library.

One thing about my current set of patches is that I have begun adding
files from src/common/ to libpq's list of files. As that would be new
I am wondering if I should avoid doing so. Here is what I mean:

--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -43,6 +43,14 @@ OBJS += $(filter crypt.o getaddrinfo.o getpeereid.o inet_aton.o open.o system.o
 OBJS += ip.o md5.o
 # utils/mb
 OBJS += encnames.o wchar.o

+# common/
+OBJS += encode.o scram-common.o
+
--
Michael
Michael Paquier <michael.paquier@gmail.com> writes: > On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm confused. We need that code in both libpq and backend, no? >> src/common is the place for stuff of that description. > Not necessarily. src/interfaces/libpq/Makefile uses a set of files > like md5.c which is located in the backend code and directly compiles > libpq.so with them, so one possibility would be to do the same for > sha.c: locate the file in src/backend/libpq/ and then fetch the file > directly when compiling libpq's shared library. Meh. That seems like a hack left over from before we had src/common. Having said that, src/interfaces/libpq/ does have some special requirements, because it needs the code compiled with -fpic (on most hardware), which means it can't just use the client-side libpgcommon.a builds. So maybe it's not worth improving this. > One thing about my current set of patches is that I have begun adding > files from src/common/ to libpq's list of files. As that would be new > I am wondering if I should avoid doing so. Well, it could link source files from there just as easily as from the backend. Not object files, though. regards, tom lane
On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> One thing about my current set of patches is that I have begun adding >> files from src/common/ to libpq's list of files. As that would be new >> I am wondering if I should avoid doing so. > > Well, it could link source files from there just as easily as from the > backend. Not object files, though. OK. I'll just keep things the current way then :) -- Michael
On 22 July 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Steele <david@pgmasters.net> writes:
> On 7/21/16 12:19 PM, Robert Haas wrote:
>> On Wed, Jul 20, 2016 at 7:42 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>>> People have, in the past, expressed concerns about linking in
>>>> pgcrypto. Apparently, in some countries, it's a legal problem.
>>> Do you have any references? I don't see that as a problem.
>> I don't have a link to previous discussion handy, but I definitely
>> recall that it's been discussed. I don't think that would mean that
>> libpgcrypto couldn't depend on libpgcommon, but the reverse direction
>> would make libpgcrypto essentially mandatory which I don't think is a
>> direction we want to go for both technical and legal reasons.
> I searched a few different ways and finally came up with this post from Tom:
> https://www.postgresql.org/message-id/11392.1389991321@sss.pgh.pa.us
> It's the only thing I could find, but thought it might jog something
> loose for somebody else.
Way back when, like fifteen years ago, there absolutely were US export
control restrictions on software containing crypto. I believe the US has
figured out that that was silly, but I'm not sure everyplace else has.
Australia has recently enacted laws that are reminiscent of the US's defunct crypto export control laws, but they add penalties for *teaching* encryption too. Yup, you can be charged for talking about it. Of course they'll only actually USE those new powers to Stop The Terrorist Threat, they promise...
Unless recently amended, they even failed to exclude academic institutions. I haven't been following it closely because, frankly, it's too ridiculous to pay much attention to, and I don't work directly with crypto anyway. But it's far from the only such colossally ignorant and idiotic law floating around.
Despite the technical frustrations involved, we should keep crypto implementations in a separate library. I agree with Tom that one-way hashes are not a practical concern, even if the laws are probably written too poorly to draw a distinction.
On Fri, Jul 22, 2016 at 9:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Michael Paquier <michael.paquier@gmail.com> writes: >>> One thing about my current set of patches is that I have begun adding >>> files from src/common/ to libpq's list of files. As that would be new >>> I am wondering if I should avoid doing so. >> >> Well, it could link source files from there just as easily as from the >> backend. Not object files, though. > > OK. I'll just keep things the current way then :) Note: I have put more energy into that and I think that I will be able to publish a new patch set pretty soon, like at the beginning of next week. -- Michael
On Fri, Jul 22, 2016 at 3:43 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Jul 22, 2016 at 9:06 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Fri, Jul 22, 2016 at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Michael Paquier <michael.paquier@gmail.com> writes:
>>>> One thing about my current set of patches is that I have begun adding
>>>> files from src/common/ to libpq's list of files. As that would be new
>>>> I am wondering if I should avoid doing so.
>>>
>>> Well, it could link source files from there just as easily as from the
>>> backend. Not object files, though.
>>
>> OK. I'll just keep things the current way then :)
>
> Note: I have put more energy into that and I think that I will be able
> to publish a new patch set pretty soon, like at the beginning of next
> week.

Ok, here is the real deal. As discussed at PGCon, I have shaved off the following things from the set of patches:
- No separate catalog pg_auth_verifier.
- No additional column in pg_authid to determine the password type. All the logic checks whether the password string has the wanted format. We do that for MD5 now, and this set does it for SCRAM.
- Removal of the pg_upgrade stuff.
- Removal of password_protocols, so we don't care anymore about protocol aging.

In short, the SCRAM verifiers get stored in rolpassword.

And here is what this set of patches does:
- Implementation of SCRAM-SHA-256, and not SHA1. I have moved to the one that makes the most sense considering the current situation, based on RFC 5802 and 7677.
- No channel binding support. I guess that this could be added later on.
- password_encryption is now an enum, and gains three values: md5, plain and scram. true => md5, false => plain for backward compatibility.
- Grammar of CREATE/ALTER ROLE is extended with PASSWORD val USING protocol; that's a separate patch applying on top of the core patch for SASL.

I have noticed as well a couple of bugs in the previous set(s) of patches:
- valid_until was not checked for SCRAM.
- When using ENCRYPTED or UNENCRYPTED, an already-encrypted password should be used as-is. The same is applied to PASSWORD USING protocol to ease dump and reload. That's actually what is done for MD5.

And here is a detail of the patches:
- 0001, refactoring of SHA functions into src/common.
- 0002, refactoring for sendAuthRequest
- 0003, refactoring for RandomSalt to accommodate the salt used by SCRAM (length of 10 bytes, md5 is 4).
- 0004, move encoding routines to src/common/
- 0005, make password_encryption an enum
- 0006, refactor some code in CREATE/ALTER role code paths related to the use of password_encryption
- 0007, refactor some code to have a single routine to fetch password and valid_until from pg_authid
- 0008, the core implementation of SCRAM-SHA-256, with the SASL communication protocol. If you want to use SCRAM with that, things go with password_encryption = 'scram'.
- 0009, addition of PASSWORD val USING protocol
- 0010, regression tests for passwords. Not sure how useful they would be. But they helped me a bit.

I am adding an entry in the next CF. Comments are welcome.
-- Michael
Attachment
- 0004-Move-encoding-routines-to-src-common.patch
- 0005-Switch-password_encryption-to-a-enum.patch
- 0006-Refactor-decision-making-of-password-encryption-into.patch
- 0007-Create-generic-routine-to-fetch-password-and-valid-u.patch
- 0008-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0009-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0010-Add-regression-tests-for-passwords.patch
- 0001-Refactor-SHA-functions-and-move-them-to-src-common.patch
- 0002-Refactor-sendAuthRequest.patch
- 0003-Refactor-RandomSalt-to-handle-salts-of-different-len.patch
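The "check whether the password string has the wanted format" logic and the "already encrypted password is used as-is" rule described in this message can be sketched as follows. This is a minimal Python illustration, not code from the patch set. The function names are hypothetical. The only format assumed is the existing MD5 one: the literal prefix "md5" followed by the 32 hex digits of md5(password || username). The hex-digit check here may be stricter than what the server actually does.

```python
import hashlib
import re

# "md5" prefix followed by exactly 32 lowercase hex digits (35 characters total).
MD5_RE = re.compile(r"md5[0-9a-f]{32}")


def pg_md5_hash(password: str, username: str) -> str:
    """Build an MD5-style stored password: 'md5' + md5(password || username)."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


def looks_like_md5(value: str) -> bool:
    """Guess the password type from the string format alone."""
    return MD5_RE.fullmatch(value) is not None


def value_to_store(value: str, username: str) -> str:
    """Hypothetical helper: keep an already-hashed value as-is, hash anything else."""
    return value if looks_like_md5(value) else pg_md5_hash(value, username)
```

Because the type is inferred from the string itself, a dump that contains the hashed form can be reloaded without being hashed a second time, which is the dump-and-reload point made above.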
On 07/22/2016 03:02 AM, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Fri, Jul 22, 2016 at 8:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm confused. We need that code in both libpq and backend, no?
>>> src/common is the place for stuff of that description.
>
>> Not necessarily. src/interfaces/libpq/Makefile uses a set of files
>> like md5.c which is located in the backend code and directly compiles
>> libpq.so with them, so one possibility would be to do the same for
>> sha.c: locate the file in src/backend/libpq/ and then fetch the file
>> directly when compiling libpq's shared library.
>
> Meh. That seems like a hack left over from before we had src/common.
>
> Having said that, src/interfaces/libpq/ does have some special
> requirements, because it needs the code compiled with -fpic (on most
> hardware), which means it can't just use the client-side libpgcommon.a
> builds. So maybe it's not worth improving this.

src/common/Makefile says:

> # This makefile generates two outputs:
> #
> # libpgcommon.a - contains object files with FRONTEND defined,
> # for use by client application and libraries
> #
> # libpgcommon_srv.a - contains object files without FRONTEND defined,
> # for use only by the backend binaries

It claims that libpgcommon.a can be used by libraries, but without -fPIC, that's a lie.

>> One thing about my current set of patches is that I have begun adding
>> files from src/common/ to libpq's list of files. As that would be new
>> I am wondering if I should avoid doing so.
>
> Well, it could link source files from there just as easily as from the
> backend. Not object files, though.

I think that's the way to go (and that's what Michael's latest patch did). But let's update the comment in the Makefile, explaining that you can also copy or symlink source files directly from src/common as needed, for instance for shared libraries.

Let's take the opportunity and also move src/backend/libpq/ip.c and md5.c into src/common. It would be weird to have sha.c in src/common, but md5.c in src/backend/libpq. Looking at ip.c, it could be split into two: some of the functions in ip.c are clearly not needed in the client, like enumerating all interfaces.

- Heikki
On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> # This makefile generates two outputs: >> # >> # libpgcommon.a - contains object files with FRONTEND defined, >> # for use by client application and libraries >> # >> # libpgcommon_srv.a - contains object files without FRONTEND >> defined, >> # for use only by the backend binaries > > > It claims that libpcommon.a can be used by libraries, but without -fPIC, > that's a lie. Yes. >>> One thing about my current set of patches is that I have begun adding >>> files from src/common/ to libpq's list of files. As that would be new >>> I am wondering if I should avoid doing so. >> >> >> Well, it could link source files from there just as easily as from the >> backend. Not object files, though. > > > I think that's the way to go (and that's what Michael's latest patch did). > But let's update the comment in the Makefile, explaining that you can also > copy or symlink source files directly from src/common as needed, for > instance for shared libraries. Updating that is a good idea. > Let's take the opportunity and also move src/backend/libpq/ip.c and md5.c > into src/common. It would be weird to have sha.c in src/common, but md5.c in > src/backend/libpq. Looking at ip.c, it could be split into two: some of the > functions in ip.c are clearly not needed in the client, like enumerating all > interfaces. It would be definitely better to do all that before even moving sha.c. For the current ip.c, I don't have a better idea than putting in src/common/ip.c the set of routines used by both the frontend and backend, and have fe_ip.c the new file that has the frontend-only things. Need a patch? -- Michael
On 08/18/2016 03:45 PM, Michael Paquier wrote: > On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Let's take the opportunity and also move src/backend/libpq/ip.c and md5.c >> into src/common. It would be weird to have sha.c in src/common, but md5.c in >> src/backend/libpq. Looking at ip.c, it could be split into two: some of the >> functions in ip.c are clearly not needed in the client, like enumerating all >> interfaces. > > It would be definitely better to do all that before even moving sha.c. Agreed. > For the current ip.c, I don't have a better idea than putting in > src/common/ip.c the set of routines used by both the frontend and > backend, and have fe_ip.c the new file that has the frontend-only > things. Need a patch? Yes, please. I don't think there's anything there that's needed by only the frontend, but some of the functions are needed by only the backend. So I think we'll end up with src/common/ip.c, and src/backend/libpq/be-ip.c. (Not sure about those names, pick something that makes sense, given what's left in the files.) - Heikki
On Fri, Aug 19, 2016 at 1:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 08/18/2016 03:45 PM, Michael Paquier wrote: >> >> On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi> >> wrote: >> For the current ip.c, I don't have a better idea than putting in >> src/common/ip.c the set of routines used by both the frontend and >> backend, and have fe_ip.c the new file that has the frontend-only >> things. Need a patch? > > > Yes, please. I don't think there's anything there that's needed by only the > frontend, but some of the functions are needed by only the backend. So I > think we'll end up with src/common/ip.c, and src/backend/libpq/be-ip.c. (Not > sure about those names, pick something that makes sense, given what's left > in the files.) OK, so let's do that first correctly. Attached are two patches: - 0001 moves md5 to src/common - 0002 that does the same for ip.c. By the way, it seems to me that having be-ip.c is not that much worth it. I am noticing that only pg_range_sockaddr could be marked as backend-only. pg_foreach_ifaddr is being used as well by tools/ifaddrs/, and this one calls as well pg_sockaddr_cidr_mask. Or is there still some utility in having src/tools/ifaddrs? If not we could move pg_sockaddr_cidr_mask and pg_foreach_ifaddr to be backend-only. With pg_range_sockaddr that would make half the routines to be marked as backend-only. I have not rebased the whole series yet of SCRAM... I'll do that after we agree on those two patches with the two commits you have already done cleaned up of course (thanks btw for those ones!). -- Michael
On 08/19/2016 09:46 AM, Michael Paquier wrote:
> On Fri, Aug 19, 2016 at 1:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 08/18/2016 03:45 PM, Michael Paquier wrote:
>>>
>>> On Thu, Aug 18, 2016 at 9:28 PM, Heikki Linnakangas <hlinnaka@iki.fi>
>>> wrote:
>>> For the current ip.c, I don't have a better idea than putting in
>>> src/common/ip.c the set of routines used by both the frontend and
>>> backend, and have fe_ip.c the new file that has the frontend-only
>>> things. Need a patch?
>>
>> Yes, please. I don't think there's anything there that's needed by only the
>> frontend, but some of the functions are needed by only the backend. So I
>> think we'll end up with src/common/ip.c, and src/backend/libpq/be-ip.c. (Not
>> sure about those names, pick something that makes sense, given what's left
>> in the files.)
>
> OK, so let's do that first correctly. Attached are two patches:
> - 0001 moves md5 to src/common
> - 0002 that does the same for ip.c.
> By the way, it seems to me that having be-ip.c is not that much worth
> it. I am noticing that only pg_range_sockaddr could be marked as
> backend-only. pg_foreach_ifaddr is being used as well by
> tools/ifaddrs/, and this one calls as well pg_sockaddr_cidr_mask. Or
> is there still some utility in having src/tools/ifaddrs? If not we
> could move pg_sockaddr_cidr_mask and pg_foreach_ifaddr to be
> backend-only. With pg_range_sockaddr that would make half the routines
> to be marked as backend-only.

I decided to split ip.c anyway. I'd like to keep the files in src/common/ip.c as small as possible, so I think it makes sense to be quite surgical when moving things there. I kept the pg_foreach_ifaddr() function in src/backend/libpq/ifaddr.c (I renamed the file to avoid confusion with the ip.c that got moved), even though it means that test_ifaddrs will have to continue to copy the file directly from src/backend/libpq. I'm OK with that, because test_ifaddrs is just a little test program that mimics the backend's behaviour of enumerating interfaces. I don't consider it to be a "real" frontend application.

Pushed, after splitting. Thanks! Now let's move on to the more substantial patches.

- Heikki
On Fri, Sep 2, 2016 at 7:57 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I decided to split ip.c anyway. I'd like to keep the files in
> src/common/ip.c as small as possible, so I think it makes sense to be quite
> surgical when moving things there. I kept the pg_foreach_ifaddr() function
> in src/backend/libpq/ifaddr.c (I renamed the file to avoid confusion with
> the ip.c that got moved), even though it means that test_ifaddr will have to
> continue to copy the file directly from src/backend/libpq. I'm OK with that,
> because test_ifaddrs is just a little test program that mimics the backend's
> behaviour of enumerating interfaces. I don't consider it to be a "real"
> frontend application.
>
> Pushed, after splitting. Thanks! Now let's move on to the more substantial
> patches.

Before I send a new series of patches... There is one thing that I am still troubled with: the compilation of pgcrypto. First from contrib/pgcrypto/Makefile I am noticing the following issue with this block:

CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS))
CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS))
CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST))

How is that correct if src/Makefile.global is not loaded first? Variables like with_openssl are still not loaded at that point.

Then, as per patch 0001 there are two files holding the SHA routines: sha.c with the interface taken from OpenBSD, and sha_openssl.c that uses the interface of OpenSSL. And when compiling pgcrypto, the choice of file is made depending on the value of $(with_openssl). As far as I know, the list of OBJS needs to be completely defined before loading contrib-global.mk, but I fail to see how we can do that with USE_PGXS=1... Or would it be fine to error if pgcrypto is compiled with USE_PGXS?
-- Michael
Michael Paquier <michael.paquier@gmail.com> writes: > Before I send a new series of patches... There is one thing that I am > still troubled with: the compilation of pgcrypto. First from > contrib/pgcrypto/Makefile I am noticing the following issue with this > block: > CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS)) > CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS)) > CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST)) > How is that correct if src/Makefile.global is not loaded first? > Variables like with_openssl are still not loaded at that point. Um, you do know that Make treats "=" definitions of variables as, essentially, macro definitions? The fact that with_openssl isn't set yet doesn't necessarily mean these definitions are wrong. Is it actually not working for you, or are you just not understanding why it works? regards, tom lane
On Fri, Sep 2, 2016 at 10:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> Before I send a new series of patches... There is one thing that I am >> still troubled with: the compilation of pgcrypto. First from >> contrib/pgcrypto/Makefile I am noticing the following issue with this >> block: >> CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS)) >> CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS)) >> CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST)) >> How is that correct if src/Makefile.global is not loaded first? >> Variables like with_openssl are still not loaded at that point. > > Um, you do know that Make treats "=" definitions of variables as, > essentially, macro definitions? The fact that with_openssl isn't > set yet doesn't necessarily mean these definitions are wrong. > Is it actually not working for you, or are you just not understanding > why it works? Oops right. I was trying to use an ifeq on $with_openssl, and that did not work but just using that would go correctly... -- Michael
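Tom's point about "=" definitions can be illustrated with a small Python analogy. This is not Make, and the class and method names are hypothetical. It models only plain $(name) references, not $(if ...) or $(subst ...). A "=" definition stores the unexpanded text and resolves references each time the variable is used, while ":=" expands once at definition time. GNU Make also evaluates ifeq conditionals while the makefile is being read, which is why a test on a variable that is only set later behaves like the eager case.

```python
import re

REF = re.compile(r"\$\((\w+)\)")


class MakeVars:
    """Toy model of Make's deferred ('=') and immediate (':=') definitions."""

    def __init__(self):
        self.raw = {}

    def define(self, name, text):
        # Like "name = text": keep the raw text, expand only when used.
        self.raw[name] = text

    def define_now(self, name, text):
        # Like "name := text": expand right away with what is known so far.
        self.raw[name] = self.expand(text)

    def expand(self, text):
        # Replace each $(name) with the expansion of its current definition;
        # an undefined variable expands to the empty string, as in Make.
        return REF.sub(lambda m: self.expand(self.raw.get(m.group(1), "")), text)


mk = MakeVars()
mk.define("CF_SRCS", "srcs-for-openssl-$(with_openssl)")         # deferred
mk.define_now("EAGER_SRCS", "srcs-for-openssl-$(with_openssl)")  # immediate
mk.define("with_openssl", "yes")  # set later, as if by Makefile.global
```

Here mk.expand("$(CF_SRCS)") sees the late value of with_openssl, while EAGER_SRCS was frozen while the variable was still empty.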
On Fri, Sep 2, 2016 at 10:23 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Fri, Sep 2, 2016 at 7:57 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> I decided to split ip.c anyway. I'd like to keep the files in >> src/common/ip.c as small as possible, so I think it makes sense to be quite >> surgical when moving things there. I kept the pg_foreach_ifaddr() function >> in src/backend/libpq/ifaddr.c (I renamed the file to avoid confusion with >> the ip.c that got moved), even though it means that test_ifaddr will have to >> continue to copy the file directly from src/backend/libpq. I'm OK with that, >> because test_ifaddrs is just a little test program that mimics the backend's >> behaviour of enumerating interfaces. I don't consider it to be a "real" >> frontend application. >> >> Pushed, after splitting. Thanks! Now let's move on to the more substantial >> patches. Thanks for the push. > Before I send a new series of patches... There is one thing that I am > still troubled with: the compilation of pgcrypto. First from > contrib/pgcrypto/Makefile I am noticing the following issue with this > block: > CF_SRCS = $(if $(subst no,,$(with_openssl)), $(OSSL_SRCS), $(INT_SRCS)) > CF_TESTS = $(if $(subst no,,$(with_openssl)), $(OSSL_TESTS), $(INT_TESTS)) > CF_PGP_TESTS = $(if $(subst no,,$(with_zlib)), $(ZLIB_TST), $(ZLIB_OFF_TST)) > How is that correct if src/Makefile.global is not loaded first? > Variables like with_openssl are still not loaded at that point. > > Then, as per patch 0001 there are two files holding the SHA routines: > sha.c with the interface taken from OpenBSD, and sha_openssl.c that > uses the interface of OpenSSL. And when compiling pgcrypto, the choice > of file is made depending on the value of $(with_openssl). So I have solved my identity crisis here by just using INT_SRCS and OSSL_SRCS to list the correct files holding the SHA files. Thanks Tom for the hint. I need to study more my Makefile-fu. 
Attached is a new series:
- 0001, refactoring of SHA functions into src/common.
- 0002, move encoding routines to src/common/
- 0003, make password_encryption an enum
- 0004, refactor some code in CREATE/ALTER role code paths related to the use of password_encryption
- 0005, refactor some code to have a single routine to fetch password and valid_until from pg_authid
- 0006, the core implementation of SCRAM-SHA-256, with the SASL communication protocol. If you want to use SCRAM with that, things go with password_encryption = 'scram'. I have spotted here a bug with the MSVC build on the way.
- 0007, addition of PASSWORD val USING protocol
- 0008, regression tests for passwords. Those do not trigger the internal sha routines, which lead to inconsistent results.
-- Michael
Attachment
- 0001-Refactor-SHA-functions-and-move-them-to-src-common.patch
- 0002-Move-encoding-routines-to-src-common.patch
- 0003-Switch-password_encryption-to-a-enum.patch
- 0004-Refactor-decision-making-of-password-encryption-into.patch
- 0005-Create-generic-routine-to-fetch-password-and-valid-u.patch
- 0006-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0007-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0008-Add-regression-tests-for-passwords.patch
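For readers unfamiliar with what a SCRAM-SHA-256 verifier contains, the key derivation from RFC 5802 (with SHA-256 as in RFC 7677) can be sketched in Python as below. This is an illustration of the RFC formulas only, not the patch's C code. The function names are hypothetical, SASLprep normalization of the password is omitted, and nothing here reflects how the patch serializes the verifier into rolpassword.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def salted_password(password: str, salt: bytes, iterations: int) -> bytes:
    # Hi() in RFC 5802 is PBKDF2 with HMAC as the PRF and a hash-sized output.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def build_verifier(password: str, salt: bytes, iterations: int):
    """Return (StoredKey, ServerKey), the values a server keeps instead of the password."""
    salted = salted_password(password, salt, iterations)
    client_key = hmac_sha256(salted, b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac_sha256(salted, b"Server Key")
    return stored_key, server_key


def client_proof(password: str, salt: bytes, iterations: int, auth_message: bytes) -> bytes:
    """ClientProof = ClientKey XOR HMAC(StoredKey, AuthMessage)."""
    client_key = hmac_sha256(salted_password(password, salt, iterations), b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    signature = hmac_sha256(stored_key, auth_message)
    return bytes(a ^ b for a, b in zip(client_key, signature))


def server_accepts(stored_key: bytes, auth_message: bytes, proof: bytes) -> bool:
    """Recover ClientKey from the proof and check that its hash matches StoredKey."""
    signature = hmac_sha256(stored_key, auth_message)
    recovered = bytes(a ^ b for a, b in zip(proof, signature))
    return hmac.compare_digest(hashlib.sha256(recovered).digest(), stored_key)
```

The server never needs the password or ClientKey at rest. It stores only the salt, the iteration count, StoredKey and ServerKey.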
On 9/3/16 8:36 AM, Michael Paquier wrote:
>
> Attached is a new series:

* [PATCH 1/8] Refactor SHA functions and move them to src/common/

I'd like to see more code comments in sha.c (though I realize this was copied directly from pgcrypto.)

I tested by building with and without --with-openssl and running make check for the project as a whole and the pgcrypto extension.

I notice that the copyright from pgcrypto/sha1.c was carried over but not the copyright from pgcrypto/sha2.c. I'm no expert on how this works, but I believe the copyright from sha2.c must be copied over.

Also, are there any plans to expose these functions directly to the user without loading pgcrypto? Now that the functionality is in core it seems that would be useful. In addition, it would make this patch stand on its own rather than just being a building block.

* [PATCH 2/8] Move encoding routines to src/common/

I wonder if it is confusing to have two of encode.h/encode.c. Perhaps they should be renamed to make them distinct?

* [PATCH 3/8] Switch password_encryption to a enum

Does not apply on HEAD (98c2d3332):

error: patch failed: src/backend/commands/user.c:139
error: src/backend/commands/user.c: patch does not apply
error: patch failed: src/include/commands/user.h:15
error: src/include/commands/user.h: patch does not apply

From here on I used 39b691f251 for review and testing.

It seems you are keeping on/off for backwards compatibility, shouldn't the default now be "md5"?

-#password_encryption = on
+#password_encryption = on # on, off, md5 or plain

* [PATCH 4/8] Refactor decision-making of password encryption into a single routine

+++ b/src/backend/commands/user.c
+ new_record[Anum_pg_authid_rolpassword - 1] =
+ CStringGetTextDatum(encrypted_passwd);

pfree(encrypted_passwd) here or let it get freed with the context?

* [PATCH 5/8] Create generic routine to fetch password and valid until values for a role

Couldn't md5_crypt_verify() be made more general and take the hash type? For instance, password_crypt_verify() with the last param as the new password type enum.

* [PATCH 6/8] Support for SCRAM-SHA-256 authentication

+++ b/contrib/passwordcheck/passwordcheck.c
+ case PASSWORD_TYPE_SCRAM:
+ /* unfortunately not much can be done here */
+ break;

Why can't we at least do the same check as md5 to make sure the username was not used as the password?

+++ b/src/backend/libpq/auth.c
+ * without relying on the length word, but we hardly care about protocol
+ * version or older anymore.)

Do you mean protocol version 2 or older?

+++ b/src/backend/libpq/crypt.c
 return STATUS_ERROR; /* empty password */
+

Looks like a stray LF.

+++ b/src/backend/parser/gram.y
+ SAVEPOINT SCHEMA SCRAM SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE

Doesn't this belong in patch 7? Even in patch 7 it doesn't appear that SCRAM is a keyword since the protocol specified after USING is quoted.

I tested this patch using both md5 and scram and was able to get both of them working separately. However, it doesn't look like they can be used in conjunction since the pg_hba.conf entry must specify either md5 or scram (though the database can easily contain a mixture). This would probably make a migration very unpleasant.

Is there any chance of a mixed mode that will allow new passwords to be set as scram while still honoring the old md5 passwords? Or does that cause too many complications with the protocol?

* [PATCH 7/8] Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE

+++ b/doc/src/sgml/ref/create_role.sgml
+ Sets the role's password using the wanted protocol.

How about "Sets the role's password using the requested protocol."

+ an unencrypted password. If the presented password string is already
+ in MD5-encrypted or SCRAM-encrypted format, then it is stored encrypted
+ as-is.

How about, "If the password string is..."

* [PATCH 8/8] Add regression tests for passwords

OK.

On the whole I find this patch set easier to digest than what was submitted for 9.6. It is more targeted but still provides very valuable functionality.

I'm a bit concerned that a mixture of md5/scram could cause confusion and think this may warrant discussion somewhere in the documentation since the idea is for users to migrate from md5 to scram.

-- -David david@pgmasters.net
On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
> On 9/3/16 8:36 AM, Michael Paquier wrote:
>>
>> Attached is a new series:

Thanks for the review and the comments!

> * [PATCH 1/8] Refactor SHA functions and move them to src/common/
>
> I'd like to see more code comments in sha.c (though I realize this was
> copied directly from pgcrypto.)

OK... I have added some comments for the user-facing routines, as well as the private routines that are doing step-by-step random calculations.

> I notice that the copyright from pgcrypto/sha1.c was carried over but
> not the copyright from pgcrypto/sha2.c. I'm no expert on how this
> works, but I believe the copyright from sha2.c must be copied over.

Right, those copyright bits are missing:
- * AUTHOR: Aaron D. Gifford <me@aarongifford.com>
[...]
- * Copyright (c) 2000-2001, Aaron D. Gifford
The license block being the same, it seems to me that there is no need to copy it over. The copyright should be enough.

> Also, are there any plans to expose these functions directly to the user
> without loading pgcrypto? Now that the functionality is in core it
> seems that would be useful. In addition, it would make this patch stand
> on its own rather than just being a building block.

There have been discussions about avoiding enabling those functions by default in the distribution. We'd rather not do that...

> * [PATCH 2/8] Move encoding routines to src/common/
>
> I wonder if it is confusing to have two of encode.h/encode.c. Perhaps
> they should be renamed to make them distinct?

Yes, it may be a good idea to rename that, like encode_utils.[c|h] for the new files.

> * [PATCH 3/8] Switch password_encryption to a enum
>
> Does not apply on HEAD (98c2d3332):

Interesting, it works for me on da6c4f6.

> For here on I used 39b691f251 for review and testing.
> I seems you are keeping on/off for backwards compatibility, shouldn't
> the default now be "md5"?
>
> -#password_encryption = on
> +#password_encryption = on # on, off, md5 or plain

That sounds like a good idea, so switched this way.

> * [PATCH 4/8] Refactor decision-making of password encryption into a
> single routine
>
> +++ b/src/backend/commands/user.c
> + new_record[Anum_pg_authid_rolpassword - 1] =
> + CStringGetTextDatum(encrypted_passwd);
>
> pfree(encrypted_passwd) here or let it get freed with the context?

Calling encrypt_password did not guarantee that the result needs to be freed, so I guess that when I coded that I just relied on the context. But reading it again now, let's do this cleanly and have encrypt_password return a palloc'ed string. That's more consistent.

> * [PATCH 5/8] Create generic routine to fetch password and valid until
> values for a role
>
> Couldn't md5_crypt_verify() be made more general and take the hash type?
> For instance, password_crypt_verify() with the last param as the new
> password type enum.

This would mean incorporating the whole SASL message exchange into this routine because the password string is part of the scram initialization context, and it seems to me that it is better to do the lookup of the entry in pg_authid just once. So we'd end up with more confusing code, I am afraid. At least that's the conclusion I came to when doing that. md5_crypt_verify only does the work on a received password.

> * [PATCH 6/8] Support for SCRAM-SHA-256 authentication
>
> +++ b/contrib/passwordcheck/passwordcheck.c
> + case PASSWORD_TYPE_SCRAM:
> + /* unfortunately not much can be done here */
> + break;
>
> Why can't we at least do the same check as md5 to make sure the username
> was not used as the password?

You are right. We could at least check that, so changed the way you suggest.

> +++ b/src/backend/libpq/auth.c
> + * without relying on the length word, but we hardly care about protocol
> + * version or older anymore.)
>
> Do you mean protocol version 2 or older?
>
> +++ b/src/backend/libpq/crypt.c
> return STATUS_ERROR; /* empty password */
> +
>
> Looks like a stray LF.

Fixed.

> +++ b/src/backend/parser/gram.y
> + SAVEPOINT SCHEMA SCRAM SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE
>
> Doesn't this belong in patch 7? Even in patch 7 it doesn't appear that
> SCRAM is a keyword since the protocol specified after USING is quoted.

This is some garbage from a past version. Fixed.

> However, it doesn't look like they can be used in conjunction since the
> pg_hba.conf entry must specify either m5 or scram (though the database
> can easily contain a mixture). This would probably make a migration
> very unpleasant.

Yep, it uses a given auth-method once user and database match. This is partially related to the problem of supporting multiple password verifiers per user, which was submitted last CF but got rejected because of a lack of interest, and removed to simplify this patch. You need as well to think about other things like password and protocol aging. But well, it is a problem that we don't have to tackle with this patch...

> Is there any chance of a mixed mode that will allow new passwords to be
> set as scram while still honoring the old md5 passwords? Or does that
> cause too many complications with the protocol?

Hm. That looks complicated to me. This sounds to me like retry logic for multiple authentication methods, and a different feature. What you'd be looking for here is a connection parameter to specify a list of protocols and try them all, no?

And that:
+ * multiple messags sent in both directions. First message is always from

> * [PATCH 7/8] Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE
>
> +++ b/doc/src/sgml/ref/create_role.sgml
> + Sets the role's password using the wanted protocol.
>
> How about "Sets the role's password using the requested procotol."

Done.

> + an unencrypted password. If the presented password string is
> already
> + in MD5-encrypted or SCRAM-encrypted format, then it is stored
> encrypted
> + as-is.
>
> How about, "If the password string is..."

OK.

> On the whole I find this patch set easier to digest than what was
> submitted for 9.6. It is more targeted but still provides very valuable
> functionality.

Thanks.

> I'm a bit concerned that a mixture of md5/scram could cause confusion
> and think this may warrant discussion somewhere in the documentation
> since the idea is for users to migrate from md5 to scram.

We could end up with a red warning in the docs to say that users are recommended to use SCRAM instead of MD5. Just an idea, perhaps that's not mandatory for the first shot though.
-- Michael
Attachment
- 0001-Refactor-SHA-functions-and-move-them-to-src-common.patch
- 0002-Move-encoding-routines-to-src-common.patch
- 0003-Switch-password_encryption-to-a-enum.patch
- 0004-Refactor-decision-making-of-password-encryption-into.patch
- 0005-Create-generic-routine-to-fetch-password-and-valid-u.patch
- 0006-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0007-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0008-Add-regression-tests-for-passwords.patch
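The passwordcheck point discussed above (rejecting a password equal to the username even when only a hash is supplied) can be sketched in Python as follows. For MD5 the stored value is "md5" followed by md5(password || username), so the check is to hash the username against itself and compare. For SCRAM the same idea works if the salt and iteration count are available, by recomputing StoredKey with the username as the password. The function names are hypothetical, and how the patch extracts salt, iterations and StoredKey from the stored verifier is not shown here.

```python
import hashlib
import hmac


def md5_password_is_username(stored: str, username: str) -> bool:
    """True if an MD5-style stored password was derived from password == username."""
    candidate = "md5" + hashlib.md5((username + username).encode("utf-8")).hexdigest()
    return stored == candidate


def scram_stored_key(password: str, salt: bytes, iterations: int) -> bytes:
    """StoredKey = H(HMAC(Hi(password, salt, i), "Client Key")) per RFC 5802."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    return hashlib.sha256(client_key).digest()


def scram_password_is_username(stored_key: bytes, salt: bytes,
                               iterations: int, username: str) -> bool:
    """Assumes salt, iterations and StoredKey were already parsed out of the verifier."""
    return stored_key == scram_stored_key(username, salt, iterations)
```

The SCRAM variant costs one full PBKDF2 run per check, which is the price of the salted, iterated derivation.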
On 09/26/2016 09:02 AM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote:
>> However, it doesn't look like they can be used in conjunction since the
>> pg_hba.conf entry must specify either m5 or scram (though the database
>> can easily contain a mixture). This would probably make a migration
>> very unpleasant.
>
> Yep, it uses a given auth-method once user and database match. This is
> partially related to the problem to support multiple password
> verifiers per users, which was submitted last CF but got rejected
> because of a lack of interest, and removed to simplify this patch. You
> need as well to think about other things like password and protocol
> aging. But well, it is a problem that we don't have to tackle with
> this patch...
>
>> Is there any chance of a mixed mode that will allow new passwords to be
>> set as scram while still honoring the old md5 passwords? Or does that
>> cause too many complications with the protocol?
>
> Hm. That looks complicated to me. This sounds to me like a retry logic
> if for multiple authentication methods, and a different feature. What
> you'd be looking for here is a connection parameter to specify a list
> of protocols and try them all, no?

It would be possible to have a "md5-or-scram" authentication method in pg_hba.conf, such that the server would look up the pg_authid row of the user when it receives the startup message, and send an MD5 or SCRAM challenge depending on which one the user's password is encrypted with. It has one drawback though: it allows an unauthenticated user to probe if there is a role with a given name in the system, because if a user doesn't exist, we'd have to still send an MD5 or SCRAM challenge, or a "user does not exist" error without a challenge. If we send a SCRAM challenge for a non-existent user, and the attacker knows that most users still have an MD5 password, that reveals that the username most likely doesn't exist.

Hmm. The server could send a SCRAM challenge first, and if the client gives an incorrect response, or the username doesn't exist, or the user's password is actually MD5-encrypted, the server could then send an MD5 challenge. It would add one round-trip to the authentication of MD5 passwords, but that seems acceptable.

We can do this as a follow-up patch though. Let's try to keep this patch series small.

- Heikki
On 9/26/16 4:54 AM, Heikki Linnakangas wrote: > On 09/26/2016 09:02 AM, Michael Paquier wrote: >> On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> >> wrote: >>> However, it doesn't look like they can be used in conjunction since the >>> pg_hba.conf entry must specify either m5 or scram (though the database >>> can easily contain a mixture). This would probably make a migration >>> very unpleasant. >> >> Yep, it uses a given auth-method once user and database match. This is >> partially related to the problem to support multiple password >> verifiers per users, which was submitted last CF but got rejected >> because of a lack of interest, and removed to simplify this patch. You >> need as well to think about other things like password and protocol >> aging. But well, it is a problem that we don't have to tackle with >> this patch... >> >>> Is there any chance of a mixed mode that will allow new passwords to be >>> set as scram while still honoring the old md5 passwords? Or does that >>> cause too many complications with the protocol? >> >> Hm. That looks complicated to me. This sounds to me like a retry logic >> if for multiple authentication methods, and a different feature. What >> you'd be looking for here is a connection parameter to specify a list >> of protocols and try them all, no? > > It would be possible to have a "md5-or-scram" authentication method in > pg_hba.conf, such that the server would look up the pg_authid row of the > user when it receives startup message, and send an MD5 or SCRAM > challenge depending on which one the user's password is encrypted with. > It has one drawback though: it allows an unauthenticated user to probe > if there is a role with a given name in the system, because if a user > doesn't exist, we'd have to still send an MD5 or SCRAM challenge, or a > "user does not exist" error without a challenge. 
If we send a SCRAM > challenge for a non-existent user, and the attacker knows that most > users still have a MD5 password, that reveals that the username doesn't > most likely doesn't exist. > > Hmm. The server could send a SCRAM challenge first, and if the client > gives an incorrect response, or the username doesn't exist, or the > user's password is actually MD5-encrypted, the server could then send an > MD5 challenge. It would add one round-trip to the authentication of MD5 > passwords, but that seems acceptable. > > We can do this as a follow-up patch though. Let's try to keep this patch > series small. Fair enough. I'm not even 100% sure we should do it, but wanted to raise it as a possible issue. -- -David david@pgmasters.net
On Mon, Sep 26, 2016 at 9:22 PM, David Steele <david@pgmasters.net> wrote: > On 9/26/16 4:54 AM, Heikki Linnakangas wrote: >> Hmm. The server could send a SCRAM challenge first, and if the client >> gives an incorrect response, or the username doesn't exist, or the >> user's password is actually MD5-encrypted, the server could then send an >> MD5 challenge. It would add one round-trip to the authentication of MD5 >> passwords, but that seems acceptable. I don't think that this applies just to md5 or scram. Could we for example use a connection parameter, like expected_auth_methods to do that? We include that in the startup packet if the caller has defined it, then the backend checks for matching entries in pg_hba.conf using the username, database and the expected auth method if specified. -- Michael
On 09/26/2016 09:02 AM, Michael Paquier wrote: > On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote: >> On 9/3/16 8:36 AM, Michael Paquier wrote: >>> >>> Attached is a new series: > > Thanks for the review and the comments! I read through this again, and did a bunch of little fixes: * Added error-handling for OOM and other errors in libpq * In libpq, added check that the server sent back the same client-nonce * Turned ERRORs into COMMERRORs and removed DEBUG4 lines (they could reveal useful information to an attacker) * Improved comments Some things that need to be resolved (I also added FIXME comments for some of this): * A source of random values. This currently uses PostmasterRandom() similarly to how the MD5 salt is generated, in the server, but plain old random() in the client. If built with OpenSSL, we should probably use RAND_bytes(). But what if the client is built without OpenSSL? I believe the protocol doesn't require cryptographically strong randomness for the nonces, i.e. it's OK if they're predictable, but they should be different for each session. * Nonce and salt lengths. The patch currently uses 10 bytes for both, but I think I just pulled that number out of thin air. The spec doesn't say anything about nonce and salt lengths AFAICS. What do other implementations use? Is 10 bytes enough? * The spec defines a final "server-error" message that the server sends on authentication failure, or e.g. if a required extension is not supported. The patch just uses FATAL for those. Should we try to send a server-error message instead of, or before, the elog(FATAL)? I'll continue hacking this later, but need a little break for now. >> I'm a bit concerned that a mixture of md5/scram could cause confusion >> and think this may warrant discussion somewhere in the documentation >> since the idea is for users to migrate from md5 to scram.
> > We could finish with a red warning in the docs to say that users are > recommended to use SCRAM instead of MD5. Just an idea, perhaps that's > not mandatory for the first shot though. Some sort of Migration Guide would certainly be in order. There isn't any easy migration path with this patch series alone, so perhaps that should be part of the follow-up patches that add the "MD5 or SCRAM" authentication method to pg_hba.conf, or support for having both verifiers for the same user in pg_authid. - Heikki
Attachment
- 0001-Refactor-SHA-functions-and-move-them-to-src-common.patch
- 0002-Move-encoding-routines-to-src-common.patch
- 0003-Switch-password_encryption-to-a-enum.patch
- 0004-Refactor-decision-making-of-password-encryption-into.patch
- 0005-Create-generic-routine-to-fetch-password-and-valid-u.patch
- 0006-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0007-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0008-Add-regression-tests-for-passwords.patch
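For readers following along, the client-nonce check listed among the libpq fixes above can be sketched roughly as below. This is only an illustration and not the code from the patch: the function name sketch_nonce_has_client_prefix is hypothetical, and requiring the server to append at least one character of its own is an assumption made here. The idea, per RFC 5802, is that the nonce in the server's first message must begin with the nonce the client originally sent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch.
 *
 * client_nonce:   the nonce the client sent in its first message
 * combined_nonce: the r= value received in the server's first message
 *
 * Returns true only if combined_nonce starts with client_nonce and the
 * server appended something of its own (the latter is an assumption of
 * this sketch).
 */
bool
sketch_nonce_has_client_prefix(const char *client_nonce,
							   const char *combined_nonce)
{
	size_t		clen = strlen(client_nonce);

	/* An empty client nonce would make the prefix check meaningless. */
	if (clen == 0)
		return false;

	/* The server must echo back exactly what the client sent. */
	if (strncmp(combined_nonce, client_nonce, clen) != 0)
		return false;

	/* Reject a reply that carries no server-side contribution. */
	return combined_nonce[clen] != '\0';
}
```

On a false result the client would be expected to drop the connection and report an error rather than continue the exchange.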
On Tue, Sep 27, 2016 at 9:01 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > * Added error-handling for OOM and other errors in libpq > * In libpq, added check that the server sent back the same client-nonce > * Turned ERRORs into COMMERRORs and removed DEBUG4 lines (they could reveal > useful information to an attacker) > * Improved comments Thanks! > * A source of random values. This currently uses PostmasterRandom() > similarly to how the MD5 salt is generated, in the server, but plain old > random() in the client. If built with OpenSSL, we should probably use > RAND_bytes(). But what if the client is built without OpenSSL? I believe the > protocol doesn't require cryptographically strong randomness for the nonces, > i.e. it's OK if they're predictable, but they should be different for each > session. And what if we just replace PostmasterRandom()? pgcrypto is a useful source of inspiration here. If the server is built with OpenSSL we use RAND_bytes all the time. If not, let's use /dev/urandom. If urandom is not there, we fall back to /dev/random. For WIN32, there is CryptGenRandom(). This could just be done as an independent patch with a routine in src/common/ for example to allow both frontend and backend to use it. Do you think that this is a requirement for this patch? I think not really for the first shot. > * Nonce and salt lengths. The patch currently uses 10 bytes for both, but I > think I just pulled that number out of thin air. The spec doesn't say > anything about nonce and salt lengths AFAICS. What do other implementations > use? Is 10 bytes enough? Good question, but that seems rather short to me now that you mention it. Mongo has already implemented SCRAM-SHA-1 and they are using 3 uint64 so that's 24 bytes (sasl_scramsha1_client_conversation.cpp for example). For the salt I am seeing a reference to a string "salt" only, which is too short.
> * The spec defines a final "server-error" message that the server sends on > authentication failure, or e.g. if a required extension is not supported. > The patch just uses FATAL for those. Should we try to send a server-error > message instead, or before, the elog(FATAL) ? It seems to me that sending back the error while the context is still alive, aka before the FATAL would be the way to go. That could be nicely done with an error callback while the exchange is happening. I missed that while going through the spec. -- Michael
On 09/27/2016 04:19 PM, Michael Paquier wrote: > On Tue, Sep 27, 2016 at 9:01 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> * A source of random values. This currently uses PostmasterRandom() >> similarly to how the MD5 salt is generated, in the server, but plain old >> random() in the client. If built with OpenSSL, we should probably use >> RAND_bytes(). But what if the client is built without OpenSSL? I believe the >> protocol doesn't require cryptographically strong randomness for the nonces, >> i.e. it's OK if they're predictable, but they should be different for each >> session. > > And what if we just replace PostmasterRandom()? pgcrypto is a useful > source of inspiration here. If the server is built with OpenSSL we use > RAND_bytes all the time. If not, let's use /dev/urandom. If urandom is > not there, we fall back to /dev/random. For WIN32, there is > CryptGenRandom(). This could just be done as an independent patch with > a routine in src/common/ for example to allow both frontend and > backend to use it. Yeah, if built with OpenSSL, we probably should just always use RAND_bytes(). Without OpenSSL, we have to think a bit harder. The server-side code in the patch is probably good enough. After all, we use the same mechanism for the MD5 salt today. The libpq side is not. Just calling random() won't do. We haven't needed random numbers in libpq before, but now we do. Is the pgcrypto solution portable enough that we can use it in libpq? > Do you think that this is a requirement for this > patch? I think not really for the first shot. We need something for libpq. We can't just call random(), as that's not random unless you also do srandom(), and we don't want to do that because the application might have a different idea of what the seed should be. - Heikki
On Tue, Sep 27, 2016 at 10:42 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > The libpq-side is not. Just calling random() won't do. We haven't needed for > random numbers in libpq before, but now we do. Is the pgcrypto solution > portable enough that we can use it in libpq? Do you think that urandom would be enough then? The last time I took a look at that, I saw urandom on all modern platforms even those ones: OpenBSD, NetBSD, Solaris, SunOS. For Windows the CryptGen stuff would be nice enough I guess.. -- Michael
On 9/26/16 2:02 AM, Michael Paquier wrote: > On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote: > > Thanks for the review and the comments! > >> I notice that the copyright from pgcrypto/sha1.c was carried over but >> not the copyright from pgcrypto/sha2.c. I'm no expert on how this >> works, but I believe the copyright from sha2.c must be copied over. > > Right, those copyright bits are missing: > - * AUTHOR: Aaron D. Gifford <me@aarongifford.com> > [...] > - * Copyright (c) 2000-2001, Aaron D. Gifford > The license block being the same, it seems to me that there is no need > to copy it over. The copyright should be enough. Looks fine to me. >> Also, are there any plans to expose these functions directly to the user >> without loading pgcrypto? Now that the functionality is in core it >> seems that would be useful. In addition, it would make this patch stand >> on its own rather than just being a building block. > > There have been discussions about avoiding enabling those functions by > default in the distribution. We'd rather not do that... OK. >> * [PATCH 2/8] Move encoding routines to src/common/ >> >> I wonder if it is confusing to have two of encode.h/encode.c. Perhaps >> they should be renamed to make them distinct? > > Yes it may be a good idea to rename that, like encode_utils.[c|h] for > the new files. I like that better. >> Couldn't md5_crypt_verify() be made more general and take the hash type? >> For instance, password_crypt_verify() with the last param as the new >> password type enum. > > This would mean incorporating the whole SASL message exchange into > this routine because the password string is part of the scram > initialization context, and it seems to me that it is better to just > do once a lookup at the entry in pg_authid. So we'd finish with a more > confusing code I am afraid. At least that's the conclusion I came up > with when doing that.. md5_crypt_verify does only the work on a > received password. 
Ah, yes, I see now. I missed that when I reviewed patch 6. -- -David david@pgmasters.net
On 09/26/2016 09:02 AM, Michael Paquier wrote: > On Mon, Sep 26, 2016 at 2:15 AM, David Steele <david@pgmasters.net> wrote: >> * [PATCH 3/8] Switch password_encryption to a enum >> >> Does not apply on HEAD (98c2d3332): > > Interesting, it works for me on da6c4f6. > >> For here on I used 39b691f251 for review and testing. >> I seems you are keeping on/off for backwards compatibility, shouldn't >> the default now be "md5"? >> >> -#password_encryption = on >> +#password_encryption = on # on, off, md5 or plain > > That sounds like a good idea, so switched this way. Committed this patch in the series, to turn password_encryption GUC into an enum. There was one bug in the patch: if a plaintext password was given with CREATE/ALTER USER foo PASSWORD 'bar', but password_encryption was 'md5', it would incorrectly pass PASSWORD_TYPE_MD5 to the check-password hook. That would limit the amount of checking that the hook can do. Fixed that. Also edited the docs and comments a little bit, hopefully for the better. Once we get the main SCRAM patch in, we may want to remove the "on" alias altogether. We don't promise backwards-compatibility of config files or GUC values, and not many people set password_encryption=on explicitly anyway, since it's the default. But I kept it now, as there's no ambiguity on what "on" means, yet. - Heikki
On 09/26/2016 09:02 AM, Michael Paquier wrote: >> * [PATCH 2/8] Move encoding routines to src/common/ >> > >> > I wonder if it is confusing to have two of encode.h/encode.c. Perhaps >> > they should be renamed to make them distinct? > Yes it may be a good idea to rename that, like encode_utils.[c|h] for > the new files. Looking at these encoding functions, the SCRAM protocol actually uses base64 for everything. The hex encoding is only used in the server, to encode the StoredKey and ServerKey in pg_authid. So we don't need that in the client. It would actually make sense to use base64 for the fields in pg_authid, too. Takes less space, and seems more natural for SCRAM anyway. libpq actually has its own implementation of hex encoding and decoding already, in fe-exec.c. So if we wanted to use hex-encoding for something, we could use that, or if we moved the routines from src/backend/utils/encode.c, then we should try to reuse them for the purposes of fe-exec.c, too. And libpq already has an implementation of the 'escape' encoding, too, in fe-exec.c. But as I said above, I don't think we need to touch any of that. In summary, I think we only need to move the base64 routines to src/common. I'd prefer to be quite surgical in what we put in src/common, and avoid moving stuff that's not strictly required by both the server and the client. - Heikki
On 09/28/2016 12:53 PM, Heikki Linnakangas wrote: > On 09/26/2016 09:02 AM, Michael Paquier wrote: >>> * [PATCH 2/8] Move encoding routines to src/common/ >>>> >>>> I wonder if it is confusing to have two of encode.h/encode.c. Perhaps >>>> they should be renamed to make them distinct? >> Yes it may be a good idea to rename that, like encode_utils.[c|h] for >> the new files. > > Looking at these encoding functions, the SCRAM protocol actually uses > base64 for everything. Oh, one more thing. The SCRAM spec says: > The use of base64 in SCRAM is restricted to the canonical form with > no whitespace. Our b64_encode routine does use whitespace, so we can't use it as is for SCRAM. As the patch stands, we might never output anything long enough to create linefeeds, but let's be tidy. The base64 implementation is about 100 lines of code, so perhaps we should just leave src/backend/utils/encode.c alone, and make a new copy of the base64 routines in src/common. - Heikki
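To make the whitespace point concrete, here is a minimal sketch of a base64 encoder that only ever emits the canonical form, with '=' padding and no line feeds. It is an illustration only, not the routine from the patch or from encode.c; the names sketch_b64_enc_len and sketch_b64_encode are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard base64 alphabet. */
static const char sketch_b64_alphabet[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encoded length for srclen input bytes, excluding the terminating NUL. */
size_t
sketch_b64_enc_len(size_t srclen)
{
	return ((srclen + 2) / 3) * 4;
}

/*
 * Encode len bytes of src into dst as canonical base64.  No whitespace
 * or line feeds are ever written.  dst must have room for
 * sketch_b64_enc_len(len) + 1 bytes.  Returns the number of characters
 * written, excluding the terminating NUL.
 */
size_t
sketch_b64_encode(const unsigned char *src, size_t len, char *dst)
{
	char	   *p = dst;
	size_t		i;

	/* Full 3-byte groups become 4 output characters each. */
	for (i = 0; i + 2 < len; i += 3)
	{
		uint32_t	v = ((uint32_t) src[i] << 16) |
			((uint32_t) src[i + 1] << 8) |
			(uint32_t) src[i + 2];

		*p++ = sketch_b64_alphabet[(v >> 18) & 0x3F];
		*p++ = sketch_b64_alphabet[(v >> 12) & 0x3F];
		*p++ = sketch_b64_alphabet[(v >> 6) & 0x3F];
		*p++ = sketch_b64_alphabet[v & 0x3F];
	}

	/* One or two trailing bytes are padded with '='. */
	if (i < len)
	{
		uint32_t	v = (uint32_t) src[i] << 16;

		if (i + 1 < len)
			v |= (uint32_t) src[i + 1] << 8;

		*p++ = sketch_b64_alphabet[(v >> 18) & 0x3F];
		*p++ = sketch_b64_alphabet[(v >> 12) & 0x3F];
		*p++ = (i + 1 < len) ? sketch_b64_alphabet[(v >> 6) & 0x3F] : '=';
		*p++ = '=';
	}

	*p = '\0';
	return (size_t) (p - dst);
}
```

Keeping a separate copy like this, rather than adding a flag to the backend routine, matches the suggestion above of leaving src/backend/utils/encode.c alone.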
On Wed, Sep 28, 2016 at 7:03 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 09/28/2016 12:53 PM, Heikki Linnakangas wrote: >> >> On 09/26/2016 09:02 AM, Michael Paquier wrote: >>>> >>>> * [PATCH 2/8] Move encoding routines to src/common/ >>>>> >>>>> >>>>> I wonder if it is confusing to have two of encode.h/encode.c. Perhaps >>>>> they should be renamed to make them distinct? >>> >>> Yes it may be a good idea to rename that, like encode_utils.[c|h] for >>> the new files. >> >> >> Looking at these encoding functions, the SCRAM protocol actually uses >> base64 for everything. OK, I thought that moving everything made more sense for consistency but let's keep src/common/ as small as possible. > Oh, one more thing. The SCRAM spec says: > >> The use of base64 in SCRAM is restricted to the canonical form with >> no whitespace. > > Our b64_encode routine does use whitespace, so we can't use it as is for > SCRAM. As the patch stands, we might never output anything long enough to > create linefeeds, but let's be tidy. The base64 implementation is about 100 > lines of code, so perhaps we should just leave src/backend/utils/encode.c > alone, and make a new copy of the base64 routines in src/common. OK, I'll refresh that tomorrow with the rest. Thanks for the commit to extend password_encryption. -- Michael
On 9/28/16 5:25 AM, Heikki Linnakangas wrote: > > Once we get the main SCRAM patch in, we may want to remove the "on" > alias altogether. We don't promise backwards-compatibility of config > files or GUC values, and not many people set password_encryption=on > explicitly anyway, since it's the default. +1. -- -David david@pgmasters.net
Heikki, Michael, Magnus, * Michael Paquier (michael.paquier@gmail.com) wrote: > On Tue, Sep 27, 2016 at 10:42 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > The libpq-side is not. Just calling random() won't do. We haven't needed for > > random numbers in libpq before, but now we do. Is the pgcrypto solution > > portable enough that we can use it in libpq? > > Do you think that urandom would be enough then? The last time I took a > look at that, I saw urandom on all modern platforms even those ones: > OpenBSD, NetBSD, Solaris, SunOS. For Windows the CryptGen stuff would > be nice enough I guess.. Magnus had been working on a patch that, as I recall, he thought was portable and I believe could be used on both sides. Magnus, would what you were working on be helpful here...? Thanks! Stephen
On Wed, Sep 28, 2016 at 8:55 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> Our b64_encode routine does use whitespace, so we can't use it as is for >> SCRAM. As the patch stands, we might never output anything long enough to >> create linefeeds, but let's be tidy. The base64 implementation is about 100 >> lines of code, so perhaps we should just leave src/backend/utils/encode.c >> alone, and make a new copy of the base64 routines in src/common. > > OK, I'll refresh that tomorrow with the rest. Thanks for the commit to > extend password_encryption. OK, so after more chatting with Heikki, here is a list of TODO items and a summary of the state of things: - base64 encoding routines should drop whitespace (' ', \r, \t), and it would be better to just copy those from the backend's encode.c to src/common/. No need to move escape and binary things, nor touch backend's base64 routines. - No need to move sha1.c to src/common/. Better to just get sha2.c into src/common/ as we aim at SCRAM-SHA-256. - random() called in the client is no good. We need something better here. - The error handling needs to be reworked and should follow the protocol presented by RFC5802, by sending back e= messages. This needs a bit of work, not much I think though as the infra is in place in the core patch. - Let's discard the md5-or-scram optional thing in pg_hba.conf. This complicates the error handling protocol. I am marking this patch as returned with feedback for current CF and will post a new set soon, moving it to the next CF once I have the new set of patches ready for posting. -- Michael
On Thu, Sep 29, 2016 at 12:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Sep 28, 2016 at 8:55 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >>> Our b64_encode routine does use whitespace, so we can't use it as is for >>> SCRAM. As the patch stands, we might never output anything long enough to >>> create linefeeds, but let's be tidy. The base64 implementation is about 100 >>> lines of code, so perhaps we should just leave src/backend/utils/encode.c >>> alone, and make a new copy of the base64 routines in src/common. >> >> OK, I'll refresh that tomorrow with the rest. Thanks for the commit to >> extend password_encryption. > > OK, so after more chatting with Heikki, here is a list of TODO items > and a summary of the state of things: > - base64 encoding routines should drop whitespace (' ', \r, \t), and > it would be better to just copy those from the backend's encode.c to > src/common/. No need to move escape and binary things, nor touch > backend's base64 routines. > - No need to move sha1.c to src/common/. Better to just get sha2.c > into src/common/ as we aim at SCRAM-SHA-256. > - random() called in the client is no good. We need something better here. > - The error handling needs to be reworked and should follow the > protocol presented by RFC5802, by sending back e= messages. This needs > a bit of work, not much I think though as the infra is in place in the > core patch. > - Let's discard the md5-or-scram optional thing in pg_hba.conf. This > complicates the error handling protocol. > > I am marking this patch as returned with feedback for current CF and > will post a new set soon, moving it to the next CF once I have the new > set of patches ready for posting. And so we are back on that, with a new set: - 0001, introducing pg_strong_random() in src/port/ to have the backend portion of SCRAM use it instead of random(). This patch is from Magnus who has kindly sent it to me, so the authorship goes to him.
This patch replaces at the same time PostmasterRandom() with it, this way once SCRAM gets integrated both the frontend and the backend end up using the same facility. I think that's good for consistency. Compared to the version Magnus has sent me, I have changed two things: -- Reading from /dev/urandom and /dev/random is not influenced by EINTR. read() handling is also made better in case of partial reads from a given source. -- Win32 Crypto routines use MS_DEF_PROV instead of NULL. I think it's a better idea not to leave the user the choice of the encryption source here. - 0002, moving all the SHA2 functions to src/common/. As mentioned upthread, this keeps the amount of code moved to src/common/ to a minimum. I have been careful to get the header files and copyright mentions into a correct shape at the same time. I have rearranged a couple of code blocks into a shape that makes a bit more sense, not sure how you feel about that, Heikki. - 0003, creating a set of base64 routines without whitespace handling. That's more or less a copy of what is in encode.c, simplified for SCRAM. At the same time I have prefixed the routines with pg_ to distinguish them from what is in encode.c. - 0004 does some refactoring regarding encrypted passwords in user.c - 0005 creates a generic routine to fetch password and valid until values for a role - 0006 adds support for SCRAM-SHA-256. I have not yet addressed the concerns regarding the handling of e= messages. I have fixed the nonce generation with random() though. - 0007 adds the extension for CREATE ROLE .. PASSWORD foo USING protocol - 0008 is a basic set of regression tests to test passwords. To be honest, I have now put some love into 0001~0004, but less in the rest. The first refactoring patches are going to be subject to enough comments I guess :) I'll put more love into 0005~ in the next couple of days though while reworking the message interface. Thanks, -- Michael
Attachment
- 0001-Introduce-pg_strong_random.patch
- 0002-Refactor-SHA2-functions-and-move-them-to-src-common.patch
- 0003-Add-support-for-base64-encoding-decoding-without-whi.patch
- 0004-Refactor-decision-making-of-password-encryption-into.patch
- 0005-Create-generic-routine-to-fetch-password-and-valid-u.patch
- 0006-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0007-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0008-Add-regression-tests-for-passwords.patch
On 10/12/2016 11:11 AM, Michael Paquier wrote: > And so we are back on that, with a new set: Great! I'm looking at this first one for now: > - 0001, introducing pg_strong_random() in src/port/ to have the > backend portion of SCRAM use it instead of random(). This patch is > from Magnus who has kindly sent it to me, so the authorship goes to > him. This patch replaces at the same time PostmasterRandom() with it, > this way once SCRAM gets integrated both the frontend and the backend > end up using the same facility. I think that's good for consistency. > Compared to the version Magnus has sent me, I have changed two things: > -- Reading from /dev/urandom and /dev/random is not influenced by > EINTR. read() handling is also made better in case of partial reads > from a given source. > -- Win32 Crypto routines use MS_DEF_PROV instead of NULL. I think > it's a better idea not to leave the user the choice of the encryption > source here. I spent some time whacking that around: * Renamed the file to src/port/pg_strong_random.c. "pgsrandom" makes me think of srandom(), which this isn't. * Changed pg_strong_random() to return false on error, and let the callers handle errors. That's more error-prone than throwing an error in the function itself, as it's an easy mistake to forget to check for the return value, but we can't just "exit(1)" if called in the frontend. If it gets called from libpq during authentication, as it will with SCRAM, we want to close the connection and report an error, not exit the whole user application. Likewise, in postmaster, if we fail to generate a query cancel key when forking a backend, we don't want to FATAL and shut down the whole postmaster. * There used to be this: > /* > - * Precompute password salt values to use for this connection. It's > - * slightly annoying to do this long in advance of knowing whether we'll > - * need 'em or not, but we must do the random() calls before we fork, not > - * after.
Else the postmaster's random sequence won't get advanced, and > - * all backends would end up using the same salt... > - */ > - RandomSalt(port->md5Salt, sizeof(port->md5Salt)); But that whole business of advancing postmaster's random sequence is moot now. So I moved the generation of md5 salt from postmaster to where MD5 authentication is performed. * This comment in postmaster.c was wrong: > @@ -581,7 +571,7 @@ PostmasterMain(int argc, char *argv[]) > * Note: the seed is pretty predictable from externally-visible facts such > * as postmaster start time, so avoid using random() for security-critical > * random values during postmaster startup. At the time of first > - * connection, PostmasterRandom will select a hopefully-more-random seed. > + * connection, pg_strong_random will select a hopefully-more-random seed. > */ > srandom((unsigned int) (MyProcPid ^ MyStartTime)); We don't use pg_strong_random() for that, the same PID+timestamp method is still used as before. Adjusted the comment to reflect reality. * Added "#include <Wincrypt.h>", for the CryptAcquireContext and CryptGenRandom functions? It compiled OK without that, so I guess it got pulled in via some other header file, but seems more clear and future-proof to #include it directly. * random comment kibitzing (no pun intended). This is pretty much ready for commit now, IMO, but please do review one more time. And I do have some small questions still: * We now open and close /dev/(u)random on every pg_strong_random() call. Should we be worried about performance of that? * Now that we don't call random() in postmaster anymore, is there any point in calling srandom() there (i.e. where the above incorrect comment was)? Should we remove it? random() might be used by pre-loaded extensions, though. (Hopefully not for cryptographic purposes.) * Should we backport this? Sorry if we discussed that already, but I don't remember. - Heikki
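As an illustration of the interface discussed above (return false on error and let the caller decide what to do), here is a rough sketch of the non-OpenSSL, non-Windows path: read from /dev/urandom, fall back to /dev/random, retry on EINTR and loop over partial reads. This is not the committed pg_strong_random() code; the names sketch_random_from_file and sketch_strong_random are hypothetical, and treating an unexpected end-of-file as failure is an assumption of this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill buf with len bytes read from path.
 * Returns false on any failure instead of exiting, so that a caller
 * such as a client library can close the connection and report an
 * error without taking down the whole application.
 */
bool
sketch_random_from_file(const char *path, void *buf, size_t len)
{
	unsigned char *p = buf;
	int			fd = open(path, O_RDONLY);

	if (fd < 0)
		return false;

	while (len > 0)
	{
		ssize_t		n = read(fd, p, len);

		if (n < 0)
		{
			/* Interrupted by a signal: simply try again. */
			if (errno == EINTR)
				continue;
			close(fd);
			return false;
		}
		if (n == 0)
		{
			/* Unexpected EOF: treated as failure in this sketch. */
			close(fd);
			return false;
		}

		/* Partial read: advance and keep going until len is satisfied. */
		p += n;
		len -= (size_t) n;
	}

	close(fd);
	return true;
}

/* Try /dev/urandom first, then /dev/random, as proposed upthread. */
bool
sketch_strong_random(void *buf, size_t len)
{
	if (sketch_random_from_file("/dev/urandom", buf, len))
		return true;
	return sketch_random_from_file("/dev/random", buf, len);
}
```

A caller generating a nonce or a cancel key would check the boolean result and take its own failure path, which is exactly the part that is easy to forget with this style of interface.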
On 10/14/2016 03:08 PM, Heikki Linnakangas wrote: > I spent some time whacking that around: Sigh, forgot attachment. Here you go. - Heikki
Attachment
On Fri, Oct 14, 2016 at 9:08 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 10/12/2016 11:11 AM, Michael Paquier wrote: > * Changed pg_strong_random() to return false on error, and let the callers > handle errors. That's more error-prone than throwing an error in the > function itself, as it's an easy mistake to forget to check for the return > value, but we can't just "exit(1)" if called in the frontend. If it gets > called from libpq during authentication, as it will with SCRAM, we want to > close the connection and report an error, not exit the whole user > application. Likewise, in postmaster, if we fail to generate a query cancel > key when forking a backend, we don't want to FATAL and shut down the whole > postmaster. Okay for this one. Indeed that's a cleaner interface. > This is pretty much ready for commit now, IMO, but please do review one more > time. OK, I had another look and the patch looks in pretty good shape seen from here.
- MyCancelKey = PostmasterRandom();
+ if (!pg_strong_random(&MyCancelKey, sizeof(MyCancelKey)))
+ {
+     rw->rw_crashed_at = GetCurrentTimestamp();
+     return false;
+ }
It would be nice to LOG an entry here for bgworkers.
+ /*
+  * fork failed, fall through to report -- actual error message was
+  * logged by StartAutoVacWorker
+  */
Since you created a new block, the first line gets longer than 80 characters. > * We now open and close /dev/(u)random on every pg_strong_random() call. > Should we be worried about performance of that? Actually I have hacked up a small program that can be used to compare using /dev/urandom with random() calls (this emulates RandomSalt), and opening/closing /dev/urandom causes a performance hit, but the difference only becomes noticeable with more than 10k calls in a loop on my Linux laptop. I recall that /dev/urandom is quite slow on Linux compared to other platforms still... So for a single call per connection attempt we won't actually notice it much.
I am just attaching that if you want to play with it, and you can use it as follows:

./calc [dev|random] nbytes loops

That's really a quick hack but it does the job if you worry about the performance.

> * Now that we don't call random() in postmaster anymore, is there any point
> in calling srandom() there (i.e. where the above incorrect comment was)?
> Should we remove it? random() might be used by pre-loaded extensions,
> though. (Hopefully not for cryptographic purposes.)

That's the business of the maintainers of such modules, so my heart is telling me to rip it off, but my mind tells me that there is no point in making them unhappy either if they rely on it. I'd trust my mind on this one, other opinions are welcome.

> * Should we backport this? Sorry if we discussed that already, but I don't
> remember.

I think that we discussed quickly the point at last PGCon during the SCRAM-committee-unofficial meeting, and that we talked about doing that only for HEAD.

--
Michael
Attachment
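The interface discussed in this message (return false on failure and let each caller decide how to react) can be illustrated with a minimal sketch. This is not the committed code: sketch_strong_random() and sketch_make_cancel_key() are hypothetical names, and the sketch assumes a platform where /dev/urandom exists and is the only randomness source tried.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for pg_strong_random(): fill buf with len random
 * bytes read from /dev/urandom.  Returns false on any failure instead of
 * exiting, so that the caller chooses how to report the problem.
 */
static bool
sketch_strong_random(void *buf, size_t len)
{
	char	   *p = buf;
	int			fd = open("/dev/urandom", O_RDONLY);

	if (fd < 0)
		return false;

	while (len > 0)
	{
		ssize_t		n = read(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;			/* interrupted by a signal, retry */
		if (n <= 0)
		{
			close(fd);
			return false;		/* read error or unexpected end of file */
		}
		p += n;
		len -= (size_t) n;
	}

	close(fd);
	return true;
}

/*
 * Caller side: the failure is logged and propagated rather than being
 * turned into a process exit.
 */
static bool
sketch_make_cancel_key(unsigned int *key)
{
	if (!sketch_strong_random(key, sizeof(*key)))
	{
		fprintf(stderr, "could not generate random cancel key\n");
		return false;
	}
	return true;
}
```

The point of the design is visible in the caller: a frontend library can close its connection and a postmaster-like process can refuse one fork, and neither is forced to terminate.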
On 10/15/2016 04:26 PM, Michael Paquier wrote: >> * Now that we don't call random() in postmaster anymore, is there any point >> in calling srandom() there (i.e. where the above incorrect comment was)? >> Should we remove it? random() might be used by pre-loaded extensions, >> though. (Hopefully not for cryptographic purposes.) > > That's the business of the maintainers such modules, so my heart is > telling me to rip it off, but my mind tells me that there is no point > in making them unhappy either if they rely on it. I'd trust my mind on > this one, other opinions are welcome. I kept it for now. Doesn't do any harm either, even if it's unnecessary. >> * Should we backport this? Sorry if we discussed that already, but I don't >> remember. > > I think that we discussed quickly the point at last PGCon during the > SCRAM-committee-unofficial meeting, and that we talked about doing > that only for HEAD. Ok, committed to HEAD. Thanks! - Heikki
On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 10/15/2016 04:26 PM, Michael Paquier wrote: >>> >>> * Now that we don't call random() in postmaster anymore, is there any >>> point >>> in calling srandom() there (i.e. where the above incorrect comment was)? >>> Should we remove it? random() might be used by pre-loaded extensions, >>> though. (Hopefully not for cryptographic purposes.) >> >> >> That's the business of the maintainers such modules, so my heart is >> telling me to rip it off, but my mind tells me that there is no point >> in making them unhappy either if they rely on it. I'd trust my mind on >> this one, other opinions are welcome. > > > I kept it for now. Doesn't do any harm either, even if it's unnecessary. > >>> * Should we backport this? Sorry if we discussed that already, but I >>> don't >>> remember. >> >> >> I think that we discussed quickly the point at last PGCon during the >> SCRAM-committee-unofficial meeting, and that we talked about doing >> that only for HEAD. > > > Ok, committed to HEAD. You removed the part of pgcrypto in charge of randomness, nice move. I was wondering about how to do with the perfc and the unix_std at some point, and ripping them off as you did is fine for me. -- Michael
On 10/17/2016 12:18 PM, Michael Paquier wrote: > You removed the part of pgcrypto in charge of randomness, nice move. I > was wondering about how to do with the perfc and the unix_std at some > point, and ripping them off as you did is fine for me. Yeah. I didn't understand the need for the perfc stuff. Are there Windows systems that don't have the Crypto APIs? I doubt it, but the buildfarm will tell us in a moment if there are. And if we don't have a good source of randomness like /dev/random, I think it's better to fail, than try to collect entropy ourselves (which is what unix_std did). If there's a platform where that doesn't work, someone will hopefully send us a patch, rather than silently fall back to an iffy implementation. - Heikki
On 10/17/2016 12:27 PM, Heikki Linnakangas wrote: > On 10/17/2016 12:18 PM, Michael Paquier wrote: >> You removed the part of pgcrypto in charge of randomness, nice move. I >> was wondering about how to do with the perfc and the unix_std at some >> point, and ripping them off as you did is fine for me. > > Yeah. I didn't understand the need for the perfc stuff. Are there > Windows systems that don't have the Crypto APIs? I doubt it, but the > buildfarm will tell us in a moment if there are. > > And if we don't have a good source of randomness like /dev/random, I > think it's better to fail, than try to collect entropy ourselves (which > is what unix_std did). If there's a platform where that doesn't work, > someone will hopefully send us a patch, rather than silently fall back > to an iffy implementation. Looks like Tom's old HP-UX box, pademelon, is not happy about this. Does (that version of) HP-UX not have /dev/urandom? I think we're going to need a bit more logging if no randomness source is available. What we have now is just "could not generate random query cancel key", which isn't very informative. Perhaps we should also call pg_strong_random() once at postmaster startup, to check that it works, instead of starting up but not accepting any connections. - Heikki
On Mon, Oct 17, 2016 at 6:18 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Ok, committed to HEAD. Attached is a rebased patch set for SCRAM, with the following things: - 0001, moving all the SHA2 functions to src/common/ and introducing a PG-like interface. No actual changes here. - 0002, creating a set of base64 routines without whitespace handling. Previous version sent had a bug: I missed the point that the backend version of base64 was adding a newline every 76 characters. So this is removed to make the encoding not using any whitespace. Also the routines are reworked so as they return -1 in the event of an error instead of generating an elog by themselves. That will be useful for SCRAM that needs to do its own error handling with the e= messages from the server. I think that's cleaner this way. Encoding does not have any error code paths, but decoding has, so one possible improvement would be to add in arguments a string to store an error message to make things easier for callers to debug. - 0003 does some refactoring regarding encrypted passwords in user.c. I am pretty happy with this one as well. - 0004 adds the extension for CREATE ROLE .. PASSWORD foo USING protocol. I found a bug in this one when using CREATE|ALTER ROLE .. PASSWORD missing to update the given password correctly using password_encryption. This one I am happy with it. Even if it depends on 0005 in this patch set it is possible to make it independent of it to introduce the grammar just for 'plain' and 'md5' first. In previous sets it was located after SCRAM, but it looks cleaner to get that first. I don't think I am going to change that much more now. - 0005 adds support for SCRAM-SHA-256. There is still some work to do here, particularly the error handling that requires to be extended with the e= messages sent back to the client before moving to a PG-like error code path. 
Those need to be set in the context of the SASL message exchange. I noticed as well that this is missing a hell lot of error checks when building the exchange messages, and when doing encoding and decoding of base64 strings. I'll address that in the next couple of days. - 0006 is the basic set of regression tests for passwords. Nothing new here, they are useful as basic tests when checking the patch. I don't think that they are worth having committed at the end. -- Michael
Attachment
- 0001-Refactor-SHA2-functions-and-move-them-to-src-common.patch
- 0002-Add-encoding-routines-for-base64-without-whitespace-.patch
- 0003-Refactor-decision-making-of-password-encryption-into.patch
- 0004-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0005-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0006-Add-regression-tests-for-passwords.patch
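As an illustration of what 0002 describes (base64 routines that emit no whitespace and report bad input with -1 instead of raising an error themselves), here is a small self-contained sketch. It is not the code from the patch: the names sketch_b64_encode() and sketch_b64_decode() are hypothetical, and the decoder is deliberately strict, so any whitespace in the input is treated as an error.

```c
#include <stddef.h>
#include <string.h>

static const char b64_alphabet[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes of src into dst without inserting any newline.
 * dst must have room for 4 * ((len + 2) / 3) bytes; the result is not
 * NUL-terminated.  Returns the number of bytes written.
 */
static int
sketch_b64_encode(const unsigned char *src, int len, char *dst)
{
	char	   *p = dst;
	int			i;

	for (i = 0; i < len; i += 3)
	{
		int			remaining = len - i;
		unsigned int buf = (unsigned int) src[i] << 16;

		if (remaining > 1)
			buf |= (unsigned int) src[i + 1] << 8;
		if (remaining > 2)
			buf |= src[i + 2];

		*p++ = b64_alphabet[(buf >> 18) & 0x3f];
		*p++ = b64_alphabet[(buf >> 12) & 0x3f];
		*p++ = (remaining > 1) ? b64_alphabet[(buf >> 6) & 0x3f] : '=';
		*p++ = (remaining > 2) ? b64_alphabet[buf & 0x3f] : '=';
	}
	return (int) (p - dst);
}

/*
 * Decode len bytes of src into dst.  Returns the number of decoded bytes,
 * or -1 if the input is not valid base64 (including any whitespace), so
 * that the caller can build its own error message.
 */
static int
sketch_b64_decode(const char *src, int len, unsigned char *dst)
{
	unsigned char *p = dst;
	int			i;

	if (len % 4 != 0)
		return -1;

	for (i = 0; i < len; i += 4)
	{
		unsigned int buf = 0;
		int			pad = 0;
		int			j;

		for (j = 0; j < 4; j++)
		{
			char		c = src[i + j];
			const char *pos;

			if (c == '=')
			{
				/* padding may only end the last group */
				if (i + 4 != len || j < 2)
					return -1;
				pad++;
				buf <<= 6;
				continue;
			}
			if (pad > 0)
				return -1;		/* data after padding */
			pos = (c != '\0') ? strchr(b64_alphabet, c) : NULL;
			if (pos == NULL)
				return -1;		/* not in the alphabet */
			buf = (buf << 6) | (unsigned int) (pos - b64_alphabet);
		}

		*p++ = (buf >> 16) & 0xff;
		if (pad < 2)
			*p++ = (buf >> 8) & 0xff;
		if (pad < 1)
			*p++ = buf & 0xff;
	}
	return (int) (p - dst);
}
```

A caller in the SCRAM exchange could then map a -1 result to whatever e= message it wants to send, which is the motivation given above for not calling elog inside the routines.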
On Tue, Oct 18, 2016 at 4:35 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Mon, Oct 17, 2016 at 6:18 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Mon, Oct 17, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >>> Ok, committed to HEAD. > > Attached is a rebased patch set for SCRAM, with the following things: > [...] And as the PostmasterRandom() patch has been reverted, here is once again a new set: - 0001, moving all the SHA2 functions to src/common/ and introducing a PG-like interface. No actual changes here. - 0002, replacing PostmasterRandom by pg_strong_random(), with a fix for the cancel key problem. - 0003, adding for pg_strong_random() a fallback for any nix platform not having /dev/random. This should be grouped with 0002, but I split it for clarity. - 0004, Add encoding routines for base64 without whitespace in src/common/. I improved the error handling here by making them return -1 in case of error and let the caller handle the error. - 0005, Refactor decision-making of password encryption into a single routine. - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE. - 0007, the SCRAM implementation. I have reworked the error handling on both the frontend and the backend. In the frontend, there were many code paths that did not bother much about many sanity checks like OOMs, so I addressed that as a whole thing. For the backend, in the event of an error, the backend sends back to the client a e= message with an error string corresponding to what happened per RFC5802. Sanity checks of the user data on the server (get the SCRAM verifier, its validuntil, empty password and the user name itself), are made part of the message exchange as in case of errors we need to return errors like e=unknown-user, e=other-errors and stuff similar to that. This makes the code in auth.c slightly cleaner btw. - 0008 is a set of regression tests. 
The PostmasterRandom() patch sent in this set contains the fix for cancel keys that were previously broken. I have also implemented a fallback method in 0003 inspired by pgcrypto's try_unix_std. It simply uses gettimeofday() (should be put in the upper loop actually now that I think about it!), getpid() and random() to generate some randomness, and then processes the whole through a SHA-256 hash, generating chunks of random data worth of SHA256_DIGEST_LENGTH bytes. I have not added a ./configure switch for it, but there were voices in favor of that. And this is not available on Windows (no need to care anyway as there are crypto APIs). A requirement of this patch is to have the SHA-256 routines in src/common/ first, and this will allow any platform without /dev/random to generate random numbers like pademelon. The fallback method for the pg_strong_random() is clearly not ready for commit, one reason is that libpgport should stand at a level lower than libpgcommon as far as I understand. But this patch makes pg_strong_random() in src/port depend on the SHA2 routines in src/common so it would make more sense if pg_strong_random() is moved as well to src/common instead of src/port. Honestly I think that we'd get away better with something like that than trying for example to reimplement a dependency with PRNG knowing that OpenSSL does it already, and perhaps better than we could do it. Thoughts welcome. A lot of bits are independent of that part in the patch set anyway. -- Michael
Attachment
- 0001-Refactor-SHA2-functions-and-move-them-to-src-common.patch
- 0002-Replace-PostmasterRandom-with-a-stronger-way-of-gene.patch
- 0003-Implement-last-resort-method-in-pg_strong_random.patch
- 0004-Add-encoding-routines-for-base64-without-whitespace-.patch
- 0005-Refactor-decision-making-of-password-encryption-into.patch
- 0006-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0007-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0008-Add-regression-tests-for-passwords.patch
The organization of these patches makes sense to me. On 10/20/16 1:14 AM, Michael Paquier wrote: > - 0001, moving all the SHA2 functions to src/common/ and introducing a > PG-like interface. No actual changes here. That's probably alright, although the patch contains a lot more changes than I would imagine for a simple file move. I'll still have to review that in detail. > - 0002, replacing PostmasterRandom by pg_strong_random(), with a fix > for the cancel key problem. > - 0003, adding for pg_strong_random() a fallback for any nix platform > not having /dev/random. This should be grouped with 0002, but I split > it for clarity. Also makes sense, but will need more detailed review. I did not follow the previous PostmasterRandom issues closely. > - 0004, Add encoding routines for base64 without whitespace in > src/common/. I improved the error handling here by making them return > -1 in case of error and let the caller handle the error. I don't think we want to have two different copies of base64 routines. Surely we can make the existing routines do what we want with a parameter or two about whitespace and line length. > - 0005, Refactor decision-making of password encryption into a single routine. It makes sense to factor this out. We probably don't need the pstrdup if we just keep the string as is. (You could make an argument for it if the input values were const char *.) We probably also don't need the pfree. The Assert(0) can probably be done better. We usually use elog() in such cases. > - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE. "protocol" is a weird choice here. Maybe something like "method" is better. The way the USING clause is placed can be confusing. It's not clear that it belongs to PASSWORD. If someone wants to augment another clause in CREATE ROLE with a secondary argument, then it could get really confusing. I'd suggest something to group things together, like PASSWORD (val USING method). 
The method could be an identifier instead of a string. Please add an example to the documentation and explain better how this interacts with the existing ENCRYPTED PASSWORD clause. > - 0007, the SCRAM implementation. No documentation about pg_hba.conf changes, so I don't know how to use this. ;-) This implements SASL and SCRAM and SHA256. We need to be clear about which term we advertise to users. An explanation in the missing documentation would probably be a good start. I would also like to see a test suite that covers the authentication specifically. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > The organization of these patches makes sense to me. > > On 10/20/16 1:14 AM, Michael Paquier wrote: >> - 0001, moving all the SHA2 functions to src/common/ and introducing a >> PG-like interface. No actual changes here. > > That's probably alright, although the patch contains a lot more changes > than I would imagine for a simple file move. I'll still have to review > that in detail. The main point is to know if people are happy of having an interface of the type pg_sha256_[init|update|finish] to tackle the fact that core code contains a set of routines that map with some of the OpenSSL APIs... >> - 0002, replacing PostmasterRandom by pg_strong_random(), with a fix >> for the cancel key problem. >> - 0003, adding for pg_strong_random() a fallback for any nix platform >> not having /dev/random. This should be grouped with 0002, but I split >> it for clarity. > > Also makes sense, but will need more detailed review. I did not follow > the previous PostmasterRandom issues closely. pademelon does not have /dev/random and /dev/urandom, so the issue is related to having a fallback method... But Heikki feels that having a method producing potentially weak keys should not be in pg_strong_random(). I'd suggest to control that with a ./configure switch and call it a day. Platforms without any of the four randomness methods pg_strong_random includes play a dangerous game but... >> - 0004, Add encoding routines for base64 without whitespace in >> src/common/. I improved the error handling here by making them return >> -1 in case of error and let the caller handle the error. > > I don't think we want to have two different copies of base64 routines. > Surely we can make the existing routines do what we want with a > parameter or two about whitespace and line length. We could. 
Though after hacking on that I find cleaner copying the code from encoding.c after removing the whitespace handling, as Heikki has suggested. >> - 0005, Refactor decision-making of password encryption into a single routine. > > It makes sense to factor this out. We probably don't need the pstrdup > if we just keep the string as is. (You could make an argument for it if > the input values were const char *.) We probably also don't need the > pfree. The Assert(0) can probably be done better. We usually use > elog() in such cases. Hm, OK. Agreed with that. >> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE. > > "protocol" is a weird choice here. Maybe something like "method" is > better. The way the USING clause is placed can be confusing. It's not > clear that it belongs to PASSWORD. If someone wants to augment another > clause in CREATE ROLE with a secondary argument, then it could get > really confusing. I'd suggest something to group things together, like > PASSWORD (val USING method). The method could be an identifier instead > of a string. Why not. > Please add an example to the documentation and explain better how this > interacts with the existing ENCRYPTED PASSWORD clause. Sure. >> - 0007, the SCRAM implementation. > > No documentation about pg_hba.conf changes, so I don't know how to use > this. ;-) Oops. I have focused on the code a lot during last rewrite of the patch and forgot that. I'll think about something. > This implements SASL and SCRAM and SHA256. We need to be clear about > which term we advertise to users. An explanation in the missing > documentation would probably be a good start. pg_hba.conf uses "scram" as keyword, but scram refers to a family of authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256 (what this patch does). Hence wouldn't it make sense to use scram_sha256 in pg_hba.conf instead? 
If for example in the future there is a SHA-512 version of SCRAM we could switch easily to that and define scram_sha512. There is also the channel binding to think about... So we could have a list of keywords perhaps associated with SASL? Imagine for example:

sasl $algo,$channel_binding

Giving potentially:

sasl scram_sha256
sasl scram_sha256,channel
sasl scram_sha512
sasl scram_sha512,channel

In the case of the patch of this thread just the first entry would make sense, once channel binding support is added a second keyword/option could be added. And there are of course other methods that could replace SCRAM..

> I would also like to see a test suite that covers the authentication
> specifically.

What you have in mind is a TAP test with a couple of roles and pg_hba.conf getting rewritten then reloaded? Adding it in src/test/recovery/ is the first place that comes in mind but that's not really something related to recovery... Any ideas?

--
Michael
On Tue, 18 Oct 2016 16:35:27 +0900 Michael Paquier <michael.paquier@gmail.com> wrote:

Hi

> Attached is a rebased patch set for SCRAM, with the following things:
> - 0001, moving all the SHA2 functions to src/common/ and introducing a
> PG-like interface. No actual changes here.

It seems that client nonce generation in this patch is not RFC-compliant. RFC 5802 states that a SCRAM nonce should be a sequence of random printable ASCII characters excluding ',', while this patch uses a sequence of random bytes from the pg_strong_random function with a zero byte appended. This could cause the following problems:

1. If a zero byte occurs inside the random sequence, the nonce would be shorter than expected, or even empty.
2. If one of the bytes happens to be the ASCII code of a comma, then the server's reply to the client-first message, which includes a copy of the client nonce followed by the server nonce as one of its unquoted comma-separated fields, would be parsed incorrectly.

Regards, Victor --
On Wed, Nov 9, 2016 at 3:13 PM, Victor Wagner <vitus@wagner.pp.ru> wrote: > On Tue, 18 Oct 2016 16:35:27 +0900 > Michael Paquier <michael.paquier@gmail.com> wrote: > > Hi >> Attached is a rebased patch set for SCRAM, with the following things: >> - 0001, moving all the SHA2 functions to src/common/ and introducing a >> PG-like interface. No actual changes here. > > It seems, that client nonce generation in this patch is not > RFC-compliant. > > RFC 5802 states that SCRAM nonce should be > > a sequence of random printable ASCII > characters excluding ',' > > while this patch uses sequence of random bytes from pg_strong_random > function with zero byte appended. (This is about patch 0007, not 0001) Thanks, you are right. That's not good as-is. So this basically means that the characters here should be from 32 to 127 included. generate_nonce needs just to be made smarter in the way it selects the character bytes. -- Michael
On Wed, 9 Nov 2016 15:23:11 +0900 Michael Paquier <michael.paquier@gmail.com> wrote:
>
> (This is about patch 0007, not 0001)
> Thanks, you are right. That's not good as-is. So this basically means
> that the characters here should be from 32 to 127 included.

Really, the most important thing is to exclude the comma from the list of allowed characters. And this prevents us from using a range. I'd do something like:

    const char printables[] = "0123456789ABCDE...xyz!@#*&+";
    unsigned int r;

    for (i = 0; i < SCRAM_NONCE_SIZE; i++)
    {
        pg_strong_random(&r, sizeof(unsigned int));
        /* -1 is here to exclude the terminating zero byte */
        nonce[i] = printables[r % (sizeof(printables) - 1)];
    }

> generate_nonce needs just to be made smarter in the way it selects the
> character bytes.
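For readers following the nonce discussion, here is a self-contained sketch of one way to produce a nonce made only of printable ASCII characters other than ','. It is an illustration under stated assumptions, not the code of patch 0007: sketch_generate_nonce() and sketch_nonce_is_valid() are hypothetical names, /dev/urandom stands in for pg_strong_random(), and the step that discards bytes of 186 and above is an addition of this sketch to avoid the small bias a plain modulo would introduce.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Fill nonce with len random characters taken from '!'..'~' with ','
 * excluded, then NUL-terminate it.  nonce must have room for len + 1
 * bytes.  Returns false if no random data could be read.
 */
static bool
sketch_generate_nonce(char *nonce, size_t len)
{
	/* 94 characters in '!'..'~', minus the comma, leaves 93 */
	const int	nchars = '~' - '!';
	/* largest multiple of nchars that fits in a byte: 186 */
	const int	limit = 256 - (256 % nchars);
	FILE	   *f = fopen("/dev/urandom", "rb");
	size_t		i = 0;

	if (f == NULL)
		return false;

	while (i < len)
	{
		int			r = fgetc(f);
		char		c;

		if (r == EOF)
		{
			fclose(f);
			return false;
		}
		if (r >= limit)
			continue;			/* reject to keep the distribution uniform */

		c = (char) ('!' + r % nchars);
		if (c >= ',')
			c++;				/* skip over the comma */
		nonce[i++] = c;
	}

	fclose(f);
	nonce[len] = '\0';
	return true;
}

/*
 * Check that the first len characters of nonce are printable ASCII other
 * than ',' and that the string ends right after them.
 */
static bool
sketch_nonce_is_valid(const char *nonce, size_t len)
{
	size_t		i;

	for (i = 0; i < len; i++)
	{
		if (nonce[i] < '!' || nonce[i] > '~' || nonce[i] == ',')
			return false;
	}
	return nonce[len] == '\0';
}
```

Because no zero byte and no comma can appear, neither of the two failure modes raised earlier in the thread (a truncated nonce, a mis-split message field) can occur with a nonce built this way.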
On Sat, Nov 5, 2016 at 9:36 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> The organization of these patches makes sense to me.
>>
>> On 10/20/16 1:14 AM, Michael Paquier wrote:
>>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>>> PG-like interface. No actual changes here.
>>
>> That's probably alright, although the patch contains a lot more changes
>> than I would imagine for a simple file move. I'll still have to review
>> that in detail.
>
> The main point is to know if people are happy of having an interface
> of the type pg_sha256_[init|update|finish] to tackle the fact that
> core code contains a set of routines that map with some of the OpenSSL
> APIs...

Or in short that:

+extern void pg_sha256_init(pg_sha256_ctx *ctx);
+extern void pg_sha256_update(pg_sha256_ctx *ctx,
+                             const uint8 *input0, size_t len);
+extern void pg_sha256_final(pg_sha256_ctx *ctx, uint8 *dest);

>>> - 0005, Refactor decision-making of password encryption into a single routine.
>>
>> It makes sense to factor this out. We probably don't need the pstrdup
>> if we just keep the string as is. (You could make an argument for it if
>> the input values were const char *.) We probably also don't need the
>> pfree. The Assert(0) can probably be done better. We usually use
>> elog() in such cases.
>
> Hm, OK. Agreed with that.

I have replaced the Assert(0) with an elog(ERROR). OK for the additional palloc and pfree calls. I just made that for consistency in the routine for all the password types, but I have changed it your way.

>>> - 0006, Add clause PASSWORD val USING protocol to CREATE/ALTER ROLE.
>>
>> "protocol" is a weird choice here. Maybe something like "method" is
>> better. The way the USING clause is placed can be confusing. It's not
>> clear that it belongs to PASSWORD. If someone wants to augment another
>> clause in CREATE ROLE with a secondary argument, then it could get
>> really confusing. I'd suggest something to group things together, like
>> PASSWORD (val USING method). The method could be an identifier instead
>> of a string.
>
> Why not.

Done.

>> Please add an example to the documentation and explain better how this
>> interacts with the existing ENCRYPTED PASSWORD clause.
>
> Sure.

Done.

>>> - 0007, the SCRAM implementation.
>>
>> No documentation about pg_hba.conf changes, so I don't know how to use
>> this. ;-)
>
> Oops. I have focused on the code a lot during last rewrite of the
> patch and forgot that. I'll think about something.
>
>> This implements SASL and SCRAM and SHA256. We need to be clear about
>> which term we advertise to users. An explanation in the missing
>> documentation would probably be a good start.
>
> pg_hba.conf uses "scram" as keyword, but scram refers to a family of
> authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256
> (what this patch does). Hence wouldn't it make sense to use
> scram_sha256 in pg_hba.conf instead? If for example in the future
> there is a SHA-512 version of SCRAM we could switch easily to that and
> define scram_sha512.

OK, I have added more docs regarding the use of scram in pg_hba.conf, particularly in client-auth.sgml to describe how scram is better than md5 in terms of protection, and also completed the data of pg_hba.conf about the new keyword used in it.

>> I would also like to see a test suite that covers the authentication
>> specifically.
>
> What you have in mind is a TAP test with a couple of roles and
> pg_hba.conf getting rewritten then reloaded? Adding it in
> src/test/recovery/ is the first place that comes in mind but that's
> not really something related to recovery... Any ideas?

OK, hearing no complaints I have done exactly that and added a test in src/test/recovery/ with patch 0009. This place may not be the best fit though, but it looks like an overkill to add a new module in src/test/modules just for that and that's a pretty compact test.

On Wed, Nov 9, 2016 at 3:13 PM, Victor Wagner <vitus@wagner.pp.ru> wrote:
> On Tue, 18 Oct 2016 16:35:27 +0900
> Michael Paquier <michael.paquier@gmail.com> wrote:
>> Attached is a rebased patch set for SCRAM, with the following things:
>> - 0001, moving all the SHA2 functions to src/common/ and introducing a
>> PG-like interface. No actual changes here.
>
> It seems, that client nonce generation in this patch is not
> RFC-compliant.
>
> RFC 5802 states that SCRAM nonce should be
>
> a sequence of random printable ASCII
> characters excluding ','
>
> while this patch uses sequence of random bytes from pg_strong_random
> function with zero byte appended.

Right, I have fixed that in 0007 with a solution less exotic than what you suggested upthread by scanning the ASCII characters between '!' and '~', ignoring comma if selected.

--
Michael
Attachment
- 0001-Refactor-SHA2-functions-and-move-them-to-src-common.patch
- 0002-Replace-PostmasterRandom-with-a-stronger-way-of-gene.patch
- 0003-Implement-last-resort-method-in-pg_strong_random.patch
- 0004-Add-encoding-routines-for-base64-without-whitespace-.patch
- 0005-Refactor-decision-making-of-password-encryption-into.patch
- 0006-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0007-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0008-Add-regression-tests-for-passwords.patch
- 0009-Add-TAP-tests-for-authentication-methods.patch
On Fri, Nov 4, 2016 at 11:58 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > The organization of these patches makes sense to me. > > On 10/20/16 1:14 AM, Michael Paquier wrote: >> - 0001, moving all the SHA2 functions to src/common/ and introducing a >> PG-like interface. No actual changes here. > > That's probably alright, although the patch contains a lot more changes > than I would imagine for a simple file move. I'll still have to review > that in detail. Even with git diff -M, reviewing 0001 is very difficult. It does things that are considerably in excess of what is needed to move these files from point A to point B, such as: - Renaming static functions to have a "pg" prefix. - Changing the order of the functions in the file. - Renaming an argument called "context" to "cxt". I think that is a bad plan. I think we should insist that 0001 content itself with a minimal move of the files changing no more than is absolutely necessary. If refactoring is needed, those changes can be submitted separately, which will be much easier to review. My preliminary judgement is that most of this change is pointless and should be reverted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 15, 2016 at 10:40 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Nov 4, 2016 at 11:58 AM, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> The organization of these patches makes sense to me. >> >> On 10/20/16 1:14 AM, Michael Paquier wrote: >>> - 0001, moving all the SHA2 functions to src/common/ and introducing a >>> PG-like interface. No actual changes here. >> >> That's probably alright, although the patch contains a lot more changes >> than I would imagine for a simple file move. I'll still have to review >> that in detail. > > Even with git diff -M, reviewing 0001 is very difficult. It does > things that are considerably in excess of what is needed to move these > files from point A to point B, such as: > > - Renaming static functions to have a "pg" prefix. > - Changing the order of the functions in the file. > - Renaming an argument called "context" to "cxt". > > I think that is a bad plan. I think we should insist that 0001 > content itself with a minimal move of the files changing no more than > is absolutely necessary. If refactoring is needed, those changes can > be submitted separately, which will be much easier to review. My > preliminary judgement is that most of this change is pointless and > should be reverted. How do you plug in that with OpenSSL? Are you suggesting to use a set of undef definitions in the new header in the same way as pgcrypto is doing, which is rather ugly? Because that's what the deal is about in this patch. -- Michael
On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > How do you plug in that with OpenSSL? Are you suggesting to use a set > of undef definitions in the new header in the same way as pgcrypto is > doing, which is rather ugly? Because that's what the deal is about in > this patch. Perhaps that justifies renaming them -- although I would think the fact that they are static would prevent conflicts -- but why reorder them and change variable names? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> How do you plug in that with OpenSSL? Are you suggesting to use a set >> of undef definitions in the new header in the same way as pgcrypto is >> doing, which is rather ugly? Because that's what the deal is about in >> this patch. > > Perhaps that justifies renaming them -- although I would think the > fact that they are static would prevent conflicts -- but why reorder > them and change variable names? Yeah... Perhaps I should not have done that, which was just for consistency's sake, and even if the new reordering makes more sense actually... -- Michael
On Tue, Nov 15, 2016 at 5:12 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> How do you plug in that with OpenSSL? Are you suggesting to use a set >>> of undef definitions in the new header in the same way as pgcrypto is >>> doing, which is rather ugly? Because that's what the deal is about in >>> this patch. >> >> Perhaps that justifies renaming them -- although I would think the >> fact that they are static would prevent conflicts -- but why reorder >> them and change variable names? > > Yeah... Perhaps I should not have done that, which was just for > consistency's sake, and even if the new reordering makes more sense > actually... Yeah, I don't see a point to that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Nov 16, 2016 at 4:46 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Nov 15, 2016 at 5:12 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Tue, Nov 15, 2016 at 12:40 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Tue, Nov 15, 2016 at 2:24 PM, Michael Paquier >>> <michael.paquier@gmail.com> wrote: >>>> How do you plug in that with OpenSSL? Are you suggesting to use a set >>>> of undef definitions in the new header in the same way as pgcrypto is >>>> doing, which is rather ugly? Because that's what the deal is about in >>>> this patch. >>> >>> Perhaps that justifies renaming them -- although I would think the >>> fact that they are static would prevent conflicts -- but why reorder >>> them and change variable names? >> >> Yeah... Perhaps I should not have done that, which was just for >> consistency's sake, and even if the new reordering makes more sense >> actually... > > Yeah, I don't see a point to that. OK, by doing so here is what I have. The patch generated by format-patch, as well as diffs generated by git diff -M are reduced and the patch gets half in size. They could be reduced more by adding at the top of sha2.c a couple of defined to map the old SHAXXX_YYY variables with their PG_ equivalents, but that does not seem worth it to me, and diffs are listed line by line. -- Michael
Attachment
On Wed, Nov 16, 2016 at 1:53 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> Yeah, I don't see a point to that. > > OK, by doing so here is what I have. The patch generated by > format-patch, as well as diffs generated by git diff -M are reduced > and the patch gets half in size. They could be reduced more by adding > at the top of sha2.c a couple of defined to map the old SHAXXX_YYY > variables with their PG_ equivalents, but that does not seem worth it > to me, and diffs are listed line by line. All right, this version is much easier to review. I am a bit puzzled, though. It looks like src/common will include sha2.o if built without OpenSSL and sha2_openssl.o if built with OpenSSL. So far, so good. One would think, then, that pgcrypto would not need to worry about these functions any more because libpgcommon_srv.a is linked into the server, so any references to those symbols would presumably just work. However, that's not what you did. On Windows, you added a dependency on libpgcommon which I think is unnecessary because that stuff is already linked into the server. On non-Windows systems, however, you have instead taught pgcrypto to copy the source file it needs from src/common and recompile it. I don't understand why you need to do any of that, or why it should be different on Windows vs. non-Windows. 
So I think that the changes for the pgcrypto Makefile could just look like this:

diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
index 805db76..ddb0183 100644
--- a/contrib/pgcrypto/Makefile
+++ b/contrib/pgcrypto/Makefile
@@ -1,6 +1,6 @@
 # contrib/pgcrypto/Makefile

-INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \
+INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \
     fortuna.c random.c pgp-mpi-internal.c imath.c
 INT_TESTS = sha2

And for Mkvcbuild.pm I think you could just do this:

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index de764dd..1993764 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -114,6 +114,15 @@ sub mkvcbuild
     md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c
     string.c username.c wait_error.c);

+  if ($solution->{options}->{openssl})
+  {
+    push(@pgcommonallfiles, 'sha2_openssl.c');
+  }
+  else
+  {
+    push(@pgcommonallfiles, 'sha2.c');
+  }
+
   our @pgcommonfrontendfiles = (
     @pgcommonallfiles, qw(fe_memutils.c file_utils.c
     restricted_token.c));
@@ -422,7 +431,7 @@ sub mkvcbuild
   {
     $pgcrypto->AddFiles(
       'contrib/pgcrypto', 'md5.c',
-      'sha1.c', 'sha2.c',
+      'sha1.c',
       'internal.c', 'internal-sha2.c',
       'blf.c', 'rijndael.c',
       'fortuna.c', 'random.c',

Is there some reason that won't work?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote: > diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile > index 805db76..ddb0183 100644 > --- a/contrib/pgcrypto/Makefile > +++ b/contrib/pgcrypto/Makefile > @@ -1,6 +1,6 @@ > # contrib/pgcrypto/Makefile > > -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \ > +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \ > fortuna.c random.c pgp-mpi-internal.c imath.c > INT_TESTS = sha2 I would like to do so. And while Linux is happy with that, macOS is not, this results in linking resolution errors when compiling the library. > And for Mkvcbuild.pm I think you could just do this: > > diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm > index de764dd..1993764 100644 > --- a/src/tools/msvc/Mkvcbuild.pm > +++ b/src/tools/msvc/Mkvcbuild.pm > @@ -114,6 +114,15 @@ sub mkvcbuild > md5.c pg_lzcompress.c pgfnames.c psprintf.c relpath.c rmtree.c > string.c username.c wait_error.c); > > + if ($solution->{options}->{openssl}) > + { > + push(@pgcommonallfiles, 'sha2_openssl.c'); > + } > + else > + { > + push(@pgcommonallfiles, 'sha2.c'); > + } > + > our @pgcommonfrontendfiles = ( > @pgcommonallfiles, qw(fe_memutils.c file_utils.c > restricted_token.c)); > @@ -422,7 +431,7 @@ sub mkvcbuild > { > $pgcrypto->AddFiles( > 'contrib/pgcrypto', 'md5.c', > - 'sha1.c', 'sha2.c', > + 'sha1.c', > 'internal.c', 'internal-sha2.c', > 'blf.c', 'rijndael.c', > 'fortuna.c', 'random.c', > > Is there some reason that won't work? Yes we could do that for consistency with the other nix platforms. But is that really necessary as libpgcommon already has those objects? -- Michael
On Wed, Nov 16, 2016 at 6:56 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile >> index 805db76..ddb0183 100644 >> --- a/contrib/pgcrypto/Makefile >> +++ b/contrib/pgcrypto/Makefile >> @@ -1,6 +1,6 @@ >> # contrib/pgcrypto/Makefile >> >> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \ >> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \ >> fortuna.c random.c pgp-mpi-internal.c imath.c >> INT_TESTS = sha2 > > I would like to do so. And while Linux is happy with that, macOS is > not, this results in linking resolution errors when compiling the > library. Well, I'm running macOS and it worked for me. TBH, I don't even quite understand how it could NOT work. What makes the symbols provided by libpgcommon any different from any other symbols that are part of the binary? How could one set work and the other set fail? I can understand how there might be some problem if the backend were dynamically linked libpgcommon, but it's not. It's doing this: gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2 -Wall -Werror -L../../src/port -L../../src/common -Wl,-dead_strip_dylibs -Wall -Werror access/brin/brin.o [many more .o files omitted for brevity] utils/fmgrtab.o ../../src/timezone/localtime.o ../../src/timezone/strftime.o ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a ../../src/common/libpgcommon_srv.a -lm -o postgres As I understand it, listing the .a file on the linker command line like that is exactly equivalent to listing out each individual .o file that is part of that static library. There shouldn't be any difference in how a symbol that's provided by one of the .o files looks vs. how a symbol that's provided by one of the .a files looks. 
Let's test it. [rhaas pgsql]$ nm src/backend/postgres | grep -E 'GetUserIdAndContext|psprintf' 00000001003d71d0 T _GetUserIdAndContext 000000010040f160 T _psprintf So... how would the dynamic loader know that it was supposed to find the first one and fail to find the second one? More to the point, it's clear that it DOES find the second one on every platform in the buildfarm, because adminpack, dblink, pageinspect, and pgstattuple all use psprintf without the push-ups you are proposing to undertake here. pg_md5_encrypt is used by passwordcheck, and forkname_to_number is used by pageinspect and pg_prewarm. It all just works. No special magic required. > Yes we could do that for consistency with the other nix platforms. But > is that really necessary as libpgcommon already has those objects? The point is that *postgres* already has those objects. You don't need to include them twice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
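The nm check above can be scripted so that it is easy to repeat for other symbols. The following is only a minimal sketch: the helper names are hypothetical, the sample text is the nm output quoted in the message, and the leading-underscore handling assumes the macOS symbol naming visible in that output.

```python
def defined_symbols(nm_output):
    """Collect the names that nm reports as defined in a binary."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols are printed as "address type name".
        # Undefined ones ("U name") have only two fields and are skipped.
        if len(fields) == 3 and fields[1].upper() != "U":
            symbols.add(fields[2])
    return symbols


def provides(nm_output, name):
    """True if the binary defines `name`, with or without a leading underscore."""
    symbols = defined_symbols(nm_output)
    return name in symbols or "_" + name in symbols


# nm output as quoted in the message above.
SAMPLE = """\
00000001003d71d0 T _GetUserIdAndContext
000000010040f160 T _psprintf
"""
```

With real output from `nm src/backend/postgres`, the same `provides()` call would show whether a given src/common routine actually ended up in the binary.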
Hi, On 2016-11-16 19:29:41 -0500, Robert Haas wrote: > On Wed, Nov 16, 2016 at 6:56 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: > > On Wed, Nov 16, 2016 at 11:24 AM, Robert Haas <robertmhaas@gmail.com> wrote: > >> diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile > >> index 805db76..ddb0183 100644 > >> --- a/contrib/pgcrypto/Makefile > >> +++ b/contrib/pgcrypto/Makefile > >> @@ -1,6 +1,6 @@ > >> # contrib/pgcrypto/Makefile > >> > >> -INT_SRCS = md5.c sha1.c sha2.c internal.c internal-sha2.c blf.c rijndael.c \ > >> +INT_SRCS = md5.c sha1.c internal.c internal-sha2.c blf.c rijndael.c \ > >> fortuna.c random.c pgp-mpi-internal.c imath.c > >> INT_TESTS = sha2 > > > > I would like to do so. And while Linux is happy with that, macOS is > > not, this results in linking resolution errors when compiling the > > library. > > Well, I'm running macOS and it worked for me. TBH, I don't even quite > understand how it could NOT work. What makes the symbols provided by > libpgcommon any different from any other symbols that are part of the > binary? How could one set work and the other set fail? I can > understand how there might be some problem if the backend were > dynamically linked libpgcommon, but it's not. It's doing this: With -Wl,--as-needed the linker will dismiss unused symbols found in a static library. Maybe that's the difference? Andres
On Wed, Nov 16, 2016 at 7:36 PM, Andres Freund <andres@anarazel.de> wrote:
> With -Wl,--as-needed the linker will dismiss unused symbols found in a
> static library. Maybe that's the difference?

The man page for --as-needed says that --as-needed modifies the behavior of dynamic libraries, not static ones. If there is any such effect, it is undocumented. Here is the text:

LD> This option affects ELF DT_NEEDED tags for dynamic libraries mentioned
LD> on the command line after the --as-needed option. Normally the linker will
LD> add a DT_NEEDED tag for each dynamic library mentioned on the
LD> command line, regardless of whether the library is actually needed or not.
LD> --as-needed causes a DT_NEEDED tag to only be emitted for a library
LD> that at that point in the link satisfies a non-weak undefined symbol reference
LD> from a regular object file or, if the library is not found in the DT_NEEDED
LD> lists of other needed libraries, a non-weak undefined symbol reference
LD> from another needed dynamic library. Object files or libraries appearing
LD> on the command line after the library in question do not affect whether the
LD> library is seen as needed. This is similar to the rules for extraction of object
LD> files from archives. --no-as-needed restores the default behaviour.

Some experimentation on my Mac reveals that my previous statement about how this works was incorrect. See attached patch for what I tried. What I find is:

1. If I create an additional source file in src/common containing a completely unused symbol (wunk) it appears in the nm output for libpgcommon_srv.a but not in the nm output for the postgres binary.

2. If I add an additional function to an existing source file in src/common containing a completely unused symbol (quux) it appears in the nm output for both libpgcommon_srv.a and also in the nm output for the postgres binary.

3. If I create an additional source file in src/backend containing a completely unused symbol (blarfle) it appears in the nm output for the postgres binary.

So, it seems that the linker is willing to drop archive members if the entire .o file is unused, but not individual symbols. That explains why Michael thinks we need to do something special here, because with his 0001 patch, nothing in the new sha2(_openssl).c file would immediately be used in the backend. And indeed I see now that my earlier testing was done incorrectly, and pgcrypto does in fact fail to build under my proposal. Oops.

But I think that's a temporary thing. As soon as the backend is using the sha2 routines for anything (which is the point, right?) the build changes become unnecessary. For example, if I apply this patch:

--- a/src/backend/lib/binaryheap.c
+++ b/src/backend/lib/binaryheap.c
@@ -305,3 +305,7 @@ sift_down(binaryheap *heap, int node_off)
     node_off = swap_off;
   }
 }
+
+#include "common/sha2.h"
+extern void ugh(void);
+void ugh(void) { pg_sha224_init(NULL); }

...then the backend ends up sucking in everything in sha2.c and the pgcrypto build works again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
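The three observations above fit a simple rule: object files named directly on the link line are always included, while an archive member is extracted only if it defines a symbol that is still undefined. Below is a toy model of that rule, not a description of any real linker's implementation. The object names `backend.o` and `psprintf.o`, the placement of `quux` in `psprintf.o`, and the symbol `pg_sha256_init` are assumptions made for illustration; `wunk`, `quux`, `ugh` and `pg_sha224_init` come from the message.

```python
def link(objects, archive):
    """Toy static link: return the set of symbols defined in the output.

    `objects` and `archive` map a file name to (defined, referenced) symbol sets.
    Plain objects are always linked; archive members only on demand.
    """
    linked = dict(objects)
    while True:
        defined, referenced = set(), set()
        for defs, refs in linked.values():
            defined |= defs
            referenced |= refs
        undefined = referenced - defined
        # Extract every not-yet-linked member that satisfies an undefined symbol.
        pulled = [name for name, (defs, _) in archive.items()
                  if name not in linked and defs & undefined]
        if not pulled:
            return defined
        for name in pulled:
            linked[name] = archive[name]


BACKEND = {"backend.o": ({"main"}, {"psprintf"})}
ARCHIVE = {
    "psprintf.o": ({"psprintf", "quux"}, set()),   # quux rides along with psprintf
    "wunk.o": ({"wunk"}, set()),                   # never referenced, so dropped
    "sha2.o": ({"pg_sha224_init", "pg_sha256_init"}, set()),
}

# Adding one reference, as in the binaryheap.c patch, pulls in all of sha2.o.
BACKEND_WITH_REF = dict(BACKEND)
BACKEND_WITH_REF["binaryheap.o"] = ({"ugh"}, {"pg_sha224_init"})
```

In this model `link(BACKEND, ARCHIVE)` contains `quux` but neither `wunk` nor the sha2 symbols, while `link(BACKEND_WITH_REF, ARCHIVE)` also contains `pg_sha256_init`, which is why a loadable module such as pgcrypto could then resolve it against the server binary.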
Attachment
On Wed, Nov 16, 2016 at 6:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, it seems that the linker is willing to drop archive members if the
> entire .o file is unused, but not individual symbols. That explains why
> Michael thinks we need to do something special here, because with his
> 0001 patch, nothing in the new sha2(_openssl).c file would immediately
> be used in the backend. And indeed I see now that my earlier testing
> was done incorrectly, and pgcrypto does in fact fail to build under my
> proposal. Oops.

Ah, thanks! I did not notice that before in configure.in:

if test "$PORTNAME" = "darwin"; then
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,-dead_strip_dylibs], $link_test_func)
elif test "$PORTNAME" = "openbsd"; then
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,-Bdynamic], $link_test_func)
else
  PGAC_PROG_CC_LDFLAGS_OPT([-Wl,--as-needed], $link_test_func)
fi

In the current set of patches, the sha2 functions would not get used until the main patch for SCRAM gets committed so that's a couple of steps and many months ahead. And --as-needed/--no-as-needed are not supported on macOS. So I would believe that the best route is just to use this patch with the way it does things, and once SCRAM gets in we could switch the build into more appropriate linking. At least that's far less ugly than having fake objects in the backend code. Of course a comment in pgcrypto's Makefile would be appropriate.

--
Michael
On Wed, Nov 16, 2016 at 8:04 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > In the current set of patches, the sha2 functions would not get used > until the main patch for SCRAM gets committed so that's a couple of > steps and many months ahead. And --as-needed/--no-as-needed are not > supported on macOS. So I would believe that the best route is just to > use this patch with the way it does things, and once SCRAM gets in we > could switch the build into more appropriate linking. At least that's > far less ugly than having fake objects in the backend code. Of course > a comment in pgcrypto's Makefile would be appropriate. Or a comment with an "ifeq ($(PORTNAME), darwin)" block containing the additional objects, to make clear that this is specific to OS X. Other ideas are welcome. -- Michael
On Wed, Nov 16, 2016 at 11:28 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Wed, Nov 16, 2016 at 8:04 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> In the current set of patches, the sha2 functions would not get used
>> until the main patch for SCRAM gets committed so that's a couple of
>> steps and many months ahead. And --as-needed/--no-as-needed are not
>> supported on macOS. So I would believe that the best route is just to
>> use this patch with the way it does things, and once SCRAM gets in we
>> could switch the build into more appropriate linking. At least that's
>> far less ugly than having fake objects in the backend code. Of course
>> a comment in pgcrypto's Makefile would be appropriate.
>
> Or a comment with an "ifeq ($(PORTNAME), darwin)" block containing the
> additional objects, to make clear that this is specific to OS X.
> Other ideas are welcome.

So, the problem isn't Darwin-specific. I experimented with this on Linux and found Linux does the same thing with libpgcommon_srv.a that macOS does: a file in the archive that is totally unused is omitted from the postgres binary. In Linux, however, that doesn't prevent pgcrypto from compiling anyway. It does, however, prevent it from working. Instead of failing at compile time with a complaint about missing symbols, it fails at load time. I think that's because macOS has -bundle-loader and we use it; without that, I think we'd get the same behavior on macOS that we get on Windows.

The fundamental problem here is that the archive-member-dropping behavior that we're getting here is not really what we want, and I think that's going to happen on most or all architectures. For GNU ld, we could add -Wl,--whole-archive, and macOS has -all_load, but I think that this is just a nest of portability problems waiting to happen. I think there are two things we can do here that are far simpler:

1. Rejigger things so that we don't build libpgcommon_srv.a in the first place, and instead add $(top_builddir)/src/common to src/backend/Makefile's value of SUBDIRS. With appropriate adjustments to src/common/Makefile, this should allow us to include all of the object files on the linker command line individually instead of building an archive library that is then used only for the postgres binary itself anyway. Then, things wouldn't get dropped.

2. Just postpone committing this patch until we're ready to use the new code in the backend someplace (or add a dummy reference to it someplace).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Nov 17, 2016 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote: > So, the problem isn't Darwin-specific. I experimented with this on > Linux and found Linux does the same thing with libpgcommon_srv.a that > macOS does: a file in the archive that is totally unused is omitted > from the postgres binary. In Linux, however, that doesn't prevent > pgcrypto from compiling anyway. It does, however, prevent it from > working. Instead of failing at compile time with a complaint about > missing symbols, it fails at load time. I think that's because macOS > has -bundle-loader and we use it; without that, I think we'd get the > same behavior on macOS that we get on Windows. Yes, right. I recall seeing the regression tests failing with pgcrypto when doing that. Though I did not recall if this was specific to macos or Linux when I looked again at this patch yesterday. When testing again yesterday I was able to make the tests of pgcrypto to pass, but perhaps my build was not in a clean state... > 1. Rejigger things so that we don't build libpgcommon_srv.a in the > first place, and instead add $(top_builddir)/src/common to > src/backend/Makefile's value of SUBDIRS. With appropriate adjustments > to src/common/Makefile, this should allow us to include all of the > object files on the linker command line individually instead of > building an archive library that is then used only for the postgres > binary itself anyway. Then, things wouldn't get dropped. > > 2. Just postpone committing this patch until we're ready to use the > new code in the backend someplace (or add a dummy reference to it > someplace). At the end this refactoring makes sense because it will be used in the backend with the SCRAM engine, so we could just wait for 2 instead of having some workarounds. This is dropping the ball for later and there will be already a lot of work for the SCRAM core part, though I don't think that the SHA2 refactoring will change much going forward. 
Option 3 would be to do things the patch does it, aka just compiling pgcrypto using the source files directly and put a comment to revert that once the APIs are used in the backend. I can guess that you don't like that. -- Michael
On Fri, Nov 18, 2016 at 2:51 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Thu, Nov 17, 2016 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> So, the problem isn't Darwin-specific. I experimented with this on >> Linux and found Linux does the same thing with libpgcommon_srv.a that >> macOS does: a file in the archive that is totally unused is omitted >> from the postgres binary. In Linux, however, that doesn't prevent >> pgcrypto from compiling anyway. It does, however, prevent it from >> working. Instead of failing at compile time with a complaint about >> missing symbols, it fails at load time. I think that's because macOS >> has -bundle-loader and we use it; without that, I think we'd get the >> same behavior on macOS that we get on Windows. > > Yes, right. I recall seeing the regression tests failing with pgcrypto > when doing that. Though I did not recall if this was specific to macos > or Linux when I looked again at this patch yesterday. When testing > again yesterday I was able to make the tests of pgcrypto to pass, but > perhaps my build was not in a clean state... > >> 1. Rejigger things so that we don't build libpgcommon_srv.a in the >> first place, and instead add $(top_builddir)/src/common to >> src/backend/Makefile's value of SUBDIRS. With appropriate adjustments >> to src/common/Makefile, this should allow us to include all of the >> object files on the linker command line individually instead of >> building an archive library that is then used only for the postgres >> binary itself anyway. Then, things wouldn't get dropped. >> >> 2. Just postpone committing this patch until we're ready to use the >> new code in the backend someplace (or add a dummy reference to it >> someplace). > > At the end this refactoring makes sense because it will be used in the > backend with the SCRAM engine, so we could just wait for 2 instead of > having some workarounds. 
This is dropping the ball for later and there > will be already a lot of work for the SCRAM core part, though I don't > think that the SHA2 refactoring will change much going forward. > > Option 3 would be to do things the patch does it, aka just compiling > pgcrypto using the source files directly and put a comment to revert > that once the APIs are used in the backend. I can guess that you don't > like that. Nothing more will likely happen in this CF, so I have moved it to 2017-01 with the same status of "Needs Review". -- Michael
On Tue, Nov 29, 2016 at 1:36 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > Nothing more will likely happen in this CF, so I have moved it to > 2017-01 with the same status of "Needs Review". Attached is a new set of patches using the new routines pg_backend_random() and pg_strong_random() to handle the randomness in SCRAM: - 0001 refactors the SHA2 routines. pgcrypto uses raw files from src/common when compiling with this patch. That works on any platform, and this is the simplified version of upthread. - 0002 adds base64 routines to src/common. - 0003 does some refactoring regarding the password encryption in ALTER/CREATE USER queries. - 0004 adds the clause PASSWORD (val USING method) in CREATE/ALTER USER. - 0005 is the code patch for SCRAM. Note that this switches pgcrypto to link to libpgcommon as SHA2 routines are used by the backend. - 0006 adds some regression tests for passwords. - 0007 adds some TAP tests for authentication. This is added to the upcoming CF. Thanks, -- Michael
Attachment
- 0001-Refactor-SHA2-functions-and-move-them-to-src-common.patch
- 0002-Add-encoding-routines-for-base64-without-whitespace-.patch
- 0003-Refactor-decision-making-of-password-encryption-into.patch
- 0004-Add-clause-PASSWORD-val-USING-protocol-to-CREATE-ALT.patch
- 0005-Support-for-SCRAM-SHA-256-authentication-RFC-5802-an.patch
- 0006-Add-regression-tests-for-passwords.patch
- 0007-Add-TAP-tests-for-authentication-methods.patch
On 12/07/2016 08:39 AM, Michael Paquier wrote: > On Tue, Nov 29, 2016 at 1:36 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> Nothing more will likely happen in this CF, so I have moved it to >> 2017-01 with the same status of "Needs Review". > > Attached is a new set of patches using the new routines > pg_backend_random() and pg_strong_random() to handle the randomness in > SCRAM: > - 0001 refactors the SHA2 routines. pgcrypto uses raw files from > src/common when compiling with this patch. That works on any platform, > and this is the simplified version of upthread. > - 0002 adds base64 routines to src/common. > - 0003 does some refactoring regarding the password encryption in > ALTER/CREATE USER queries. > - 0004 adds the clause PASSWORD (val USING method) in CREATE/ALTER USER. > - 0005 is the code patch for SCRAM. Note that this switches pgcrypto > to link to libpgcommon as SHA2 routines are used by the backend. > - 0006 adds some regression tests for passwords. > - 0007 adds some TAP tests for authentication. > This is added to the upcoming CF. I spent a little time reading through this once again. Steady progress, did some small fixes: * Rewrote the nonce generation. In the server-side, it first generated a string of ascii-printable characters, then base64-encoded them, which is superfluous. Also, avoid calling pg_strong_random() one byte at a time, for performance reasons. * Added a more sophisticated fallback implementation in libpq, for the --disable-strong-random cases, similar to pg_backend_random(). * No need to disallow SCRAM with db_user_namespace. It doesn't include the username in the salt like MD5 does. Attached those here, as add-on patches to your latest patch set. I'll continue reviewing, but a couple of things caught my eye that you may want to jump on, in the meanwhile: On error messages, the spec says: > o e: This attribute specifies an error that occurred during > authentication exchange. 
It is sent by the server in its final > message and can help diagnose the reason for the authentication > exchange failure. On failed authentication, the entire server- > final-message is OPTIONAL; specifically, a server implementation > MAY conclude the SASL exchange with a failure without sending the > server-final-message. This results in an application-level error > response without an extra round-trip. If the server-final-message > is sent on authentication failure, then the "e" attribute MUST be > included. Note that it says that the server can send the error message with the e= attribute, in the *final message*. It's not a valid response in the earlier state, before sending server-first-message. I think we need to change the INIT state handling in pg_be_scram_exchange() to not send e= messages to the client. On an error in that state, it needs to just bail out without a message. The spec allows that. We can always log the detailed reason in the server log, anyway. As Peter E pointed out earlier, the documentation is lacking on how to configure MD5 and/or SCRAM. If you put "scram" as the authentication method in pg_hba.conf, what does it mean? If you have a line for both "scram" and "md5" in pg_hba.conf, with the same database/user/hostname combo, what does that mean? Answer: The first one takes effect, the second one has no effect. Yet the example in the docs now has that, which is nonsense :-). Hopefully we'll have some kind of a "both" option, before the release, but in the meantime, we need to describe how this works now in the docs. - Heikki
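On the nonce rewrite mentioned at the top of this message, the idea of fetching all the random bytes in one call and base64-encoding them once can be sketched as below. This is only an illustration in Python, not the patch's C code: `secrets.token_bytes()` stands in for pg_strong_random(), and the raw length of 18 bytes is an assumption, not a value taken from the patch.

```python
import base64
import secrets

RAW_NONCE_LEN = 18  # assumed raw length in bytes; not taken from the patch


def make_nonce(raw_len=RAW_NONCE_LEN):
    """Return a printable nonce built from raw_len random bytes."""
    # One call for the whole buffer instead of one call per byte.
    raw = secrets.token_bytes(raw_len)
    # Base64 output is printable ASCII and never contains ',' which
    # SCRAM does not allow inside a nonce.
    return base64.b64encode(raw).decode("ascii")
```

Encoding raw bytes directly avoids the superfluous step of first producing printable characters and then base64-encoding them again.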
Attachment
On Thu, Dec 8, 2016 at 5:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Attached those here, as add-on patches to your latest patch set. Thanks for looking at it! > I'll continue reviewing, but a couple of things caught my eye that you may want > to jump on, in the meanwhile: > > On error messages, the spec says: > >> o e: This attribute specifies an error that occurred during >> authentication exchange. It is sent by the server in its final >> message and can help diagnose the reason for the authentication >> exchange failure. On failed authentication, the entire server- >> final-message is OPTIONAL; specifically, a server implementation >> MAY conclude the SASL exchange with a failure without sending the >> server-final-message. This results in an application-level error >> response without an extra round-trip. If the server-final-message >> is sent on authentication failure, then the "e" attribute MUST be >> included. > > > Note that it says that the server can send the error message with the e= > attribute, in the *final message*. It's not a valid response in the earlier > state, before sending server-first-message. I think we need to change the > INIT state handling in pg_be_scram_exchange() to not send e= messages to the > client. On an error at that state, it needs to just bail out without a > message. The spec allows that. We can always log the detailed reason in the > server log, anyway. Hmmm. How do we handle the case where the user name does not match then? The spec gives an error message e= specifically for this case. If this is taken into account we need to perform sanity checks at initialization phase I am afraid as the number of iterations and the salt are part of the verifier. So you mean that just sending out a normal ERROR message is fine at an earlier step (with *logdetails filled for the backend)? I just want to be sure I understand what you mean here. 
> As Peter E pointed out earlier, the documentation is lacking, on how to > configure MD5 and/or SCRAM. If you put "scram" as the authentication method > in pg_hba.conf, what does it mean? If you have a line for both "scram" and > "md5" in pg_hba.conf, with the same database/user/hostname combo, what does > that mean? Answer: The first one takes effect, the second one has no effect. > Yet the example in the docs now has that, which is nonsense :-). Hopefully > we'll have some kind of a "both" option, before the release, but in the > meanwhile, we need describe how this works now in the docs. OK, it would be better to add a paragraph in client-auth.sgml regarding the mapping of the two settings. For the example of file in postgresql.conf, I would have really thought that adding directly a line with "scram" listed was enough though. Perhaps a comment to say that if md5 and scram are specified the first one wins where a user and database name map? -- Michael
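The "first one wins" behaviour discussed here can be shown with a much simplified model of pg_hba.conf matching. It is a sketch under stated assumptions: rules are plain tuples, only the literal `all` and exact names are understood, addresses are CIDR only, and the function name is hypothetical. Real pg_hba.conf matching has more cases (host names, groups, replication, and so on).

```python
import ipaddress


def pick_method(rules, database, user, client_ip):
    """Return the METHOD of the first rule matching the connection, or None."""
    addr = ipaddress.ip_address(client_ip)
    for rule_db, rule_user, rule_net, method in rules:
        if rule_db not in ("all", database):
            continue
        if rule_user not in ("all", user):
            continue
        if addr in ipaddress.ip_network(rule_net):
            return method  # first match wins; later rules are never consulted
    return None


# Two rules with the same database/user/address and different methods.
RULES = [
    ("postgres", "all", "192.168.12.10/32", "md5"),
    ("postgres", "all", "192.168.12.10/32", "scram"),  # never reached
]
```

With these rules every matching connection gets `md5`; the `scram` line only takes effect if the order is reversed, which is the point the documentation needs to spell out.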
On 12/08/2016 10:18 AM, Michael Paquier wrote: > On Thu, Dec 8, 2016 at 5:54 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Attached those here, as add-on patches to your latest patch set. > > Thanks for looking at it! > >> I'll continue reviewing, but a couple of things caught my eye that you may want >> to jump on, in the meanwhile: >> >> On error messages, the spec says: >> >>> o e: This attribute specifies an error that occurred during >>> authentication exchange. It is sent by the server in its final >>> message and can help diagnose the reason for the authentication >>> exchange failure. On failed authentication, the entire server- >>> final-message is OPTIONAL; specifically, a server implementation >>> MAY conclude the SASL exchange with a failure without sending the >>> server-final-message. This results in an application-level error >>> response without an extra round-trip. If the server-final-message >>> is sent on authentication failure, then the "e" attribute MUST be >>> included. >> >> >> Note that it says that the server can send the error message with the e= >> attribute, in the *final message*. It's not a valid response in the earlier >> state, before sending server-first-message. I think we need to change the >> INIT state handling in pg_be_scram_exchange() to not send e= messages to the >> client. On an error at that state, it needs to just bail out without a >> message. The spec allows that. We can always log the detailed reason in the >> server log, anyway. > > Hmmm. How do we handle the case where the user name does not match > then? The spec gives an error message e= specifically for this case. Hmm, interesting. I wonder how/when they imagine that error message to be used. I suppose you could send a dummy server-first message, with a made-up salt and iteration count, if the user is not found, so that you can report that in the server-final message. But that seems unnecessarily complicated, compared to just sending the error immediately. 
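A minimal sketch of a made-up salt for a user that is not found, assuming the salt should stay the same across attempts for the same name so that repeated connections cannot reveal whether the role exists. It derives the salt with HMAC-SHA-256 over the username, keyed by a secret known only to the server. HMAC is a swapped-in choice for illustration, and the function name, salt length and iteration count are all hypothetical, not taken from the patch.

```python
import hashlib
import hmac

MOCK_SALT_LEN = 10      # assumed salt length in bytes, not from the patch
MOCK_ITERATIONS = 4096  # assumed fixed iteration count


def mock_scram_params(cluster_secret, username):
    """Return (salt, iterations) to send for a role that does not exist."""
    # Same username and secret always give the same salt, and without the
    # secret the result cannot be told apart from a random value.
    digest = hmac.new(cluster_secret, username.encode("utf-8"),
                      hashlib.sha256).digest()
    return digest[:MOCK_SALT_LEN], MOCK_ITERATIONS
```

This covers only the salt derivation. It does nothing about timing differences between a dummy exchange and a real one.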
I could imagine using a dummy server-first message to hide whether the user exists, but that argument doesn't hold water if you're going to report an "unknown-user" error, anyway.

Actually, we don't give away that information currently. If you try to log in with password or MD5 authentication, and the user doesn't exist, you get the same error as with an incorrect password. So, I think we do need to give the client a made-up salt and iteration count in that case, to hide the fact that the user doesn't exist. Furthermore, you can't just generate random salt and iteration count, because then you could simply try connecting twice, and see if you get the same salt and iteration count. We need to deterministically derive the salt from the username, so that you get the same salt/iteration count every time you try connecting with that username. But it needs to be indistinguishable from a random salt, to the client. Perhaps a SHA hash of the username and some per-cluster secret value, created by initdb. There must be research papers out there on how to do this.

To be really pedantic about that, we should also ward off timing attacks, by making sure that the dummy authentication is no faster/slower than a real one.

> If this is taken into account we need to perform sanity checks at
> initialization phase I am afraid as the number of iterations and the
> salt are part of the verifier. So you mean that just sending out a
> normal ERROR message is fine at an earlier step (with *logdetails
> filled for the backend)? I just want to be sure I understand what you
> mean here.

That's right, we can send a normal ERROR message. (But not for the "user-not-found" case, as discussed above.)

>> As Peter E pointed out earlier, the documentation is lacking, on how to
>> configure MD5 and/or SCRAM. If you put "scram" as the authentication method
>> in pg_hba.conf, what does it mean? If you have a line for both "scram" and
>> "md5" in pg_hba.conf, with the same database/user/hostname combo, what does
>> that mean? Answer: The first one takes effect, the second one has no effect.
>> Yet the example in the docs now has that, which is nonsense :-). Hopefully
>> we'll have some kind of a "both" option, before the release, but in the
>> meanwhile, we need to describe how this works now in the docs.
>
> OK, it would be better to add a paragraph in client-auth.sgml
> regarding the mapping of the two settings. For the example file in
> pg_hba.conf, I would have really thought that directly adding a
> line with "scram" listed was enough though. Perhaps a comment to say
> that if md5 and scram are specified the first one wins where a user
> and database name map?

So, I think this makes no sense:

> # Allow any user from host 192.168.12.10 to connect to database
> -# "postgres" if the user's password is correctly supplied.
> +# "postgres" if the user's password is correctly supplied and is
> +# using the correct password method.
> #
> # TYPE  DATABASE        USER            ADDRESS                 METHOD
> host    postgres        all             192.168.12.10/32        md5
> +host    postgres        all             192.168.12.10/32        scram

But this is OK:

> +# Same as previous entry, except that the supplied password must be
> +# encrypted with SCRAM-SHA-256.
> +host    all             all             .example.com            scram
> +

Although, currently, the whole pg_hba.conf file in that example is a valid file that someone might have on a real server. With the above addition, it would not be. You would never have the two lines with the same host/database/user combination in pg_hba.conf.

Overall, I think something like this would make sense in the example:

# Allow any user from hosts in the example.com domain to connect to
# any database, if the user's password is correctly supplied.
#
# Most users use SCRAM authentication, but some users use older clients
# that don't support SCRAM authentication, and need to be able to log
# in using MD5 authentication. Such users are put in the @md5users
# group, everyone else must use SCRAM.
#
# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    all             @md5users       .example.com            md5
host    all             all             .example.com            scram

- Heikki
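The "first matching line wins" behavior described in the message above can be illustrated with a small model. This is only a sketch in Python, not PostgreSQL's actual pg_hba.conf matching code: the `Rule` type, the `pick_method` helper, the suffix-based host matching and the set-based group membership are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Rule:
    """One simplified pg_hba.conf-like line (hypothetical model)."""
    database: str  # database name, or "all"
    user: str      # user name, "all", or "@md5users" (group marker in this sketch)
    address: str   # host name suffix such as ".example.com"
    method: str    # authentication method, e.g. "md5" or "scram"


def rule_matches(rule: Rule, database: str, user: str, host: str,
                 md5users: Set[str]) -> bool:
    """Return True if the connection attributes match the rule."""
    db_ok = rule.database in ("all", database)
    if rule.user == "all":
        user_ok = True
    elif rule.user == "@md5users":
        user_ok = user in md5users
    else:
        user_ok = rule.user == user
    return db_ok and user_ok and host.endswith(rule.address)


def pick_method(rules: List[Rule], database: str, user: str, host: str,
                md5users: Set[str]) -> Optional[str]:
    """Scan rules top to bottom; the first match decides, later ones are ignored."""
    for rule in rules:
        if rule_matches(rule, database, user, host, md5users):
            return rule.method
    return None


# Two lines with the same database/user/host: the second can never be reached.
SHADOWED = [
    Rule("all", "all", ".example.com", "md5"),
    Rule("all", "all", ".example.com", "scram"),
]

# The layout suggested above: a narrower md5 line first, SCRAM for everyone else.
SPLIT_BY_GROUP = [
    Rule("all", "@md5users", ".example.com", "md5"),
    Rule("all", "all", ".example.com", "scram"),
]
```

With `SHADOWED`, every user from `.example.com` ends up with `md5`; with `SPLIT_BY_GROUP`, only members of the hypothetical `md5users` set do, and everyone else falls through to `scram`.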
On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 12/08/2016 10:18 AM, Michael Paquier wrote: >> Hmmm. How do we handle the case where the user name does not match >> then? The spec gives an error message e= specifically for this case. > > Hmm, interesting. I wonder how/when they imagine that error message to be > used. I suppose you could send a dummy server-first message, with a made-up > salt and iteration count, if the user is not found, so that you can report > that in the server-final message. But that seems unnecessarily complicated, > compared to just sending the error immediately. I could imagine using a > dummy server-first messaage to hide whether the user exists, but that > argument doesn't hold water if you're going to report an "unknown-user" > error, anyway. Using directly an error message would map with MD5 and plain, but that's definitely a new protocol piece so I'd rather think that using e= once the client has sent its first message in the exchange should be answered with an appropriate SASL error... > Actually, we don't give away that information currently. If you try to log > in with password or MD5 authentication, and the user doesn't exist, you get > the same error as with an incorrect password. So, I think we do need to give > the client a made-up salt and iteration count in that case, to hide the fact > that the user doesn't exist. Furthermore, you can't just generate random > salt and iteration count, because then you could simply try connecting > twice, and see if you get the same salt and iteration count. We need to > deterministically derive the salt from the username, so that you get the > same salt/iteration count every time you try connecting with that username. > But it needs indistinguishable from a random salt, to the client. Perhaps a > SHA hash of the username and some per-cluster secret value, created by > initdb. There must be research papers out there on how to do this.. 
A simple idea would be to use the system ID when generating this fake salt? That's generated by initdb, once per cluster. I am wondering if it would be risky to use it for the salt. For the number of iterations the default number could be used. > To be really pedantic about that, we should also ward off timing attacks, by > making sure that the dummy authentication is no faster/slower than a real > one.. There is one catalog lookup when extracting the verifier from pg_authid, I'd guess that if we generate a fake verifier things should get pretty close. >> If this is taken into account we need to perform sanity checks at >> initialization phase I am afraid as the number of iterations and the >> salt are part of the verifier. So you mean that just sending out a >> normal ERROR message is fine at an earlier step (with *logdetails >> filled for the backend)? I just want to be sure I understand what you >> mean here. > > That's right, we can send a normal ERROR message. (But not for the > "user-not-found" case, as discussed above.) I'd think that the cases where the password is empty and the password has passed valid duration should be returned with e=other-error. If the caller sends a SCRAM request that would be impolite (?) to just throw up an error once the exchange has begun. > Although, currently, the whole pg_hba.conf file in that example is a valid > file that someone might have on a real server. With the above addition, it > would not be. You would never have the two lines with the same > host/database/user combination in pg_hba.conf. Okay. -- Michael
On Thu, Dec 8, 2016 at 10:05 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> On 12/08/2016 10:18 AM, Michael Paquier wrote: >>> Hmmm. How do we handle the case where the user name does not match >>> then? The spec gives an error message e= specifically for this case. >> >> Hmm, interesting. I wonder how/when they imagine that error message to be >> used. I suppose you could send a dummy server-first message, with a made-up >> salt and iteration count, if the user is not found, so that you can report >> that in the server-final message. But that seems unnecessarily complicated, >> compared to just sending the error immediately. I could imagine using a >> dummy server-first message to hide whether the user exists, but that >> argument doesn't hold water if you're going to report an "unknown-user" >> error, anyway. > > Using directly an error message would map with MD5 and plain, but > that's definitely a new protocol piece so I'd rather think that using > e= once the client has sent its first message in the exchange should > be answered with an appropriate SASL error... > >> Actually, we don't give away that information currently. If you try to log >> in with password or MD5 authentication, and the user doesn't exist, you get >> the same error as with an incorrect password. So, I think we do need to give >> the client a made-up salt and iteration count in that case, to hide the fact >> that the user doesn't exist. Furthermore, you can't just generate random >> salt and iteration count, because then you could simply try connecting >> twice, and see if you get the same salt and iteration count. We need to >> deterministically derive the salt from the username, so that you get the >> same salt/iteration count every time you try connecting with that username. >> But it needs indistinguishable from a random salt, to the client. 
Perhaps a
>> SHA hash of the username and some per-cluster secret value, created by
>> initdb. There must be research papers out there on how to do this..
>
> A simple idea would be to use the system ID when generating this fake
> salt? That's generated by initdb, once per cluster. I am wondering if
> it would be risky to use it for the salt. For the number of iterations
> the default number could be used.

I have been thinking about this part quite a bit more, and here is the simplest thing that we could do while respecting the protocol. That's more or less what I think you have in mind, from re-reading upthread, but it does not hurt to rewrite the whole flow to be clear:
1) Server gets the startup packet, maps pg_hba.conf and moves on to the scram authentication code path.
2) Server sends back sendAuthRequest() to request user to provide a password. This maps to the plain/md5 behavior as no errors would be issued to user until he has provided a password.
3) Client sends back the password, and the first message with the user name.
4) Server receives it, and checks the data. If a failure happens at this stage, just ERROR on PG-side without sending back an e= message. This includes the username-mismatch, empty password and end of password validity. So we would never use e=unknown-user. This sticks with what you quoted upthread that the server may end the exchange before sending the final message.
5) Server sends back the challenge, and client answers back with its reply to it.
Then enters the final stage of the exchange, at which point the server would issue its final message that would be e= in case of errors. If something like an OOM happens, no message would be sent so failing on an OOM ERROR on PG side would be fine as well.
6) Read final message from client and validate.
7) Issue final message of server.

On failure at steps 6) or 7), an e= message is returned instead of the final message. Does that look right?

One thing is: when do we look up pg_authid? After receiving the first message from client or before beginning the exchange? As the first message from client has the user name, it would make sense to do the lookup after receiving it, but from PG's perspective it would just make sense to use the data already present in the startup packet. The current patch does the latter. What do you think?

By the way, I have pushed the extra patches you sent into this branch:
https://github.com/michaelpq/postgres/tree/scram
--
Michael
On 12/09/2016 05:58 AM, Michael Paquier wrote: > On Thu, Dec 8, 2016 at 10:05 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Thu, Dec 8, 2016 at 5:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >>> Actually, we don't give away that information currently. If you try to log >>> in with password or MD5 authentication, and the user doesn't exist, you get >>> the same error as with an incorrect password. So, I think we do need to give >>> the client a made-up salt and iteration count in that case, to hide the fact >>> that the user doesn't exist. Furthermore, you can't just generate random >>> salt and iteration count, because then you could simply try connecting >>> twice, and see if you get the same salt and iteration count. We need to >>> deterministically derive the salt from the username, so that you get the >>> same salt/iteration count every time you try connecting with that username. >>> But it needs indistinguishable from a random salt, to the client. Perhaps a >>> SHA hash of the username and some per-cluster secret value, created by >>> initdb. There must be research papers out there on how to do this.. >> >> A simple idea would be to use the system ID when generating this fake >> salt? That's generated by initdb, once per cluster. I am wondering if >> it would be risky to use it for the salt. For the number of iterations >> the default number could be used. I think I'd feel better with a completely separate randomly-generated value for this. System ID is not too difficult to guess, and there's no need to skimp on this. Yes, default number of iterations makes sense. We cannot completely avoid leaking information through this, unfortunately. For example, if you have a user with a non-default number of iterations, and an attacker probes that, he'll know that the username was valid, because he got back a non-default number of iterations. But let's do our best. 
> I have been thinking more about this part quite a bit, and here is the > most simple thing that we could do while respecting the protocol. > That's more or less what I think you have in mind by re-reading > upthread, but it does not hurt to rewrite the whole flow to be clear: > 1) Server gets the startup packet, maps pg_hba.conf and moves on to > the scram authentication code path. > 2) Server sends back sendAuthRequest() to request user to provide a > password. This maps to the plain/md5 behavior as no errors would be > issued to user until he has provided a password. > 3) Client sends back the password, and the first message with the user name. > 4) Server receives it, and checks the data. If a failure happens at > this stage, just ERROR on PG-side without sending back a e= message. > This includes the username-mismatch, empty password and end of > password validity. So we would never use e=unknown-user. This sticks > with what you quoted upthread that the server may end the exchange > before sending the final message. If we want to mimic the current behavior with MD5 authentication, I think we need to follow through with the challenge, and only fail in the last step, even if we know the password was empty or expired. MD5 authentication doesn't currently give away that information to the user. But it's OK to bail out early on OOM, or if the client sends an outright broken message. Those don't give away any information on the user account. > 5) Server sends back the challenge, and client answers back with its > reply to it. > Then enters the final stage of the exchange, at which point the server > would issue its final message that would be e= in case of errors. If > something like an OOM happens, no message would be sent so failing on > an OOM ERROR on PG side would be fine as well. > 6) Read final message from client and validate. > 7) issue final message of server. > > On failure at steps 6) or 7), an e= message is returned instead of the > final message. 
Does that look right? Yep. > One thing is: when do we look up at pg_authid? After receiving the > first message from client or before beginning the exchange? As the > first message from client has the user name, it would make sense to do > the lookup after receiving it, but from PG prospective it would just > make sense to use the data already present in the startup packet. The > current patch does the latter. What do you think? Let's see what fits the program flow best. Probably best to do it before beginning the exchange. I'm hacking on this right now... > By the way, I have pushed the extra patches you sent into this branch: > https://github.com/michaelpq/postgres/tree/scram Thanks! We had a quick chat with Michael, and agreed that we'd hack together on that github repository, to avoid stepping on each other's toes, and cut rebased patch sets from there to pgsql-hackers every now and then. - Heikki
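The idea discussed in this message, deriving the made-up salt deterministically from the username and a separate per-cluster secret, might look roughly like the sketch below. It is written in Python for brevity and is not the patch's code. Several things here are assumptions: HMAC-SHA-256 is used as one possible way to combine the secret and the username (the thread only says "a SHA hash"), the 10-byte salt length and the 4096 default iteration count are placeholder values, and the names `mock_salt` and `mock_challenge` are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple

SALT_LEN = 10              # assumed salt length in bytes (placeholder)
DEFAULT_ITERATIONS = 4096  # assumed default iteration count (placeholder)


def mock_salt(cluster_secret: bytes, username: str) -> bytes:
    """Derive a stable, random-looking salt for a (possibly unknown) user.

    The same secret and username always give the same salt, so probing a
    non-existent user twice returns identical values, as it would for a
    real user. Without the secret the output cannot be predicted.
    """
    digest = hmac.new(cluster_secret, username.encode("utf-8"),
                      hashlib.sha256).digest()
    return digest[:SALT_LEN]


def mock_challenge(cluster_secret: bytes, username: str) -> Tuple[bytes, int]:
    """Return the (salt, iteration count) pair to send for an unknown user."""
    return mock_salt(cluster_secret, username), DEFAULT_ITERATIONS
```

As noted above, this does not remove every leak: a real user with a non-default iteration count is still distinguishable, and the sketch does nothing about timing differences between the mock path and a real lookup.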
Couple of things I should write down before I forget:

1. It's a bit cumbersome that the scram verifiers stored in pg_authid.rolpassword don't have any clear indication that they're scram verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think we should use a "scram-sha-256:" prefix for scram verifiers.

Actually, I think it'd be awfully nice to also prefix plaintext passwords with "plain:", but I'm not sure it's worth breaking the compatibility, if there are tools out there that peek into rolpassword. Thoughts?

2. It's currently not possible to use the plaintext "password" authentication method, for a user that has a SCRAM verifier in rolpassword. That seems like an oversight. We can't do MD5 authentication with a SCRAM verifier, but "password" we could.

- Heikki
On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Couple of things I should write down before I forget:
>
> 1. It's a bit cumbersome that the scram verifiers stored in
> pg_authid.rolpassword don't have any clear indication that they're scram
> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
> we should use a "scram-sha-256:" for scram verifiers.

scram-sha-256 would make the most sense to me.

> Actually, I think it'd be awfully nice to also prefix plaintext passwords
> with "plain:", but I'm not sure it's worth breaking the compatibility, if
> there are tools out there that peek into rolpassword. Thoughts?

pgbouncer is the only thing that comes to mind. It looks at pg_shadow for password values. pg_dump'ing data from pre-10 instances will also need to adapt. Compatibility with the existing CREATE USER PASSWORD command looks tricky though, so I am wondering if that's worth the complication.

> 2. It's currently not possible to use the plaintext "password"
> authentication method, for a user that has a SCRAM verifier in rolpassword.
> That seems like an oversight. We can't do MD5 authentication with a SCRAM
> verifier, but "password" we could.

Yeah, that should be possible...
--
Michael
On 12/09/2016 05:58 AM, Michael Paquier wrote:
>
> One thing is: when do we look up at pg_authid? After receiving the
> first message from client or before beginning the exchange? As the
> first message from client has the user name, it would make sense to do
> the lookup after receiving it, but from PG prospective it would just
> make sense to use the data already present in the startup packet. The
> current patch does the latter. What do you think?

While hacking on this, I came up with the attached refactoring, against current master. I think it makes the current code more readable, anyway, and it provides a get_role_password() function that SCRAM can use, to look up the stored password. (This is essentially the same refactoring that was included in the SCRAM patch set, that introduced the get_role_details() function.)

Barring objections, I'll go ahead and commit this first.

- Heikki
On Fri, Dec 09, 2016 at 11:51:45AM +0200, Heikki Linnakangas wrote:
> On 12/09/2016 05:58 AM, Michael Paquier wrote:
> >
> > One thing is: when do we look up at pg_authid? After receiving the
> > first message from client or before beginning the exchange? As the
> > first message from client has the user name, it would make sense to do
> > the lookup after receiving it, but from PG prospective it would just
> > make sense to use the data already present in the startup packet. The
> > current patch does the latter. What do you think?
>
> While hacking on this, I came up with the attached refactoring, against
> current master. I think it makes the current code more readable, anyway, and
> it provides a get_role_password() function that SCRAM can use, to look up
> the stored password. (This is essentially the same refactoring that was
> included in the SCRAM patch set, that introduced the get_role_details()
> function.)
>
> Barring objections, I'll go ahead and commit this first.

Here are some comments.

> @@ -720,12 +721,16 @@ CheckMD5Auth(Port *port, char **logdetail)
>     sendAuthRequest(port, AUTH_REQ_MD5, md5Salt, 4);
>
>     passwd = recv_password_packet(port);
> -
>     if (passwd == NULL)
>         return STATUS_EOF; /* client wouldn't send password */

This looks like useless noise.

> -   shadow_pass = TextDatumGetCString(datum);
> +   *shadow_pass = TextDatumGetCString(datum);
>
>     datum = SysCacheGetAttr(AUTHNAME, roleTup,
>                             Anum_pg_authid_rolvaliduntil, &isnull);
> @@ -83,100 +83,146 @@ md5_crypt_verify(const char *role, char *client_pass,
>     {
>         *logdetail = psprintf(_("User \"%s\" has an empty password."),
>                               role);
> +       *shadow_pass = NULL;
>         return STATUS_ERROR; /* empty password */
>     }

Here the password is allocated by text_to_cstring(), that's only 1 byte but it should be free()'d.
--
Michael
On 12/09/2016 01:10 PM, Michael Paquier wrote: > On Fri, Dec 09, 2016 at 11:51:45AM +0200, Heikki Linnakangas wrote: >> On 12/09/2016 05:58 AM, Michael Paquier wrote: >>> >>> One thing is: when do we look up at pg_authid? After receiving the >>> first message from client or before beginning the exchange? As the >>> first message from client has the user name, it would make sense to do >>> the lookup after receiving it, but from PG prospective it would just >>> make sense to use the data already present in the startup packet. The >>> current patch does the latter. What do you think? >> >> While hacking on this, I came up with the attached refactoring, against >> current master. I think it makes the current code more readable, anyway, and >> it provides a get_role_password() function that SCRAM can use, to look up >> the stored password. (This is essentially the same refactoring that was >> included in the SCRAM patch set, that introduced the get_role_details() >> function.) >> >> Barring objections, I'll go ahead and commit this first. Ok, committed. >> - shadow_pass = TextDatumGetCString(datum); >> + *shadow_pass = TextDatumGetCString(datum); >> >> datum = SysCacheGetAttr(AUTHNAME, roleTup, >> Anum_pg_authid_rolvaliduntil, &isnull); >> @@ -83,100 +83,146 @@ md5_crypt_verify(const char *role, char *client_pass, >> { >> *logdetail = psprintf(_("User \"%s\" has an empty password."), >> role); >> + *shadow_pass = NULL; >> return STATUS_ERROR; /* empty password */ >> } > > Here the password is allocated by text_to_cstring(), that's only 1 byte > but it should be free()'d. Fixed. Thanks, good catch! It doesn't matter in practice as we'll disconnect shortly afterwards anyway, but given that the callers pfree() other things on error, let's be tidy. - Heikki
A couple more things that caught my eye while hacking on this:

1. We don't use SASLPrep to scrub usernames and passwords. That's by choice, for usernames, because historically in PostgreSQL usernames can be stored in any encoding, but SASLPrep assumes UTF-8. We dodge that by passing an empty username in the authentication exchange anyway, because we always use the username we got from the startup packet. But for passwords, I think we need to fix that. The spec is very clear on that:

> Note that implementations MUST either implement SASLprep or disallow
> use of non US-ASCII Unicode codepoints in "str".

2. I think we should check nonces, etc. more carefully, to not contain invalid characters. For example, in the server, we use the read_attr_value() function to read the client's nonce. Per the spec, the nonce should consist of ASCII printable characters, but we will accept anything except the comma. That's no trouble to the server, but let's be strict.

To summarize, here's the overall TODO list so far:

* Use SASLPrep for passwords.

* Check nonces, etc. to not contain invalid characters.

* Derive mock SCRAM verifier for non-existent users deterministically from username.

* Allow plain 'password' authentication for users with a SCRAM verifier in rolpassword.

* Throw an error if an "authorization identity" is given. ATM, we just ignore it, but seems better to reject the attempt than do something that might not be what the client expects.

* Add "scram-sha-256" prefix to SCRAM verifiers stored in pg_authid.rolpassword.

Anything else I'm missing?

I've created a wiki page, mostly to host that TODO list, while we hack this to completion: https://wiki.postgresql.org/wiki/SCRAM_authentication. Feel free to add stuff that comes to mind, and remove stuff as you push patches to the branch on github.

- Heikki
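Two of the TODO items above, the stricter nonce check and rejecting an authorization identity, can be sketched as follows. This is an illustrative Python sketch and not the server's C code; the helper names are hypothetical. It assumes the RFC 5802 grammar, where a nonce is made of printable ASCII characters (0x21 to 0x7E) other than the comma, and where the client-first-message starts with a header of the form `cbind-flag,[a=authzid],`.

```python
def is_valid_scram_value(value: str) -> bool:
    """True if value is non-empty printable ASCII (0x21-0x7E) with no comma."""
    return len(value) > 0 and all(
        0x21 <= ord(ch) <= 0x7E and ch != "," for ch in value
    )


def authzid_field(client_first_message: str) -> str:
    """Return the raw authzid field of the header (empty string if absent)."""
    parts = client_first_message.split(",", 2)
    if len(parts) < 3:
        raise ValueError("malformed client-first-message")
    return parts[1]


def reject_authzid(client_first_message: str) -> None:
    """Raise instead of silently ignoring a client-supplied authorization identity."""
    if authzid_field(client_first_message) != "":
        raise ValueError("authorization identity is not supported")
```

For example, `reject_authzid("n,,n=,r=abc")` passes silently, while `reject_authzid("n,a=admin,n=,r=abc")` raises.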
On 12 December 2016 at 22:39, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > * Throw an error if an "authorization identity" is given. ATM, we just > ignore it, but seems better to reject the attempt than do something that > might not be what the client expects. Yeah. That might be an opportunity to make admins' and connection poolers' lives much happier down the track, but first we'd need a way of specifying a mapping for the other users a given user is permitted to masquerade as (like we have for roles and role membership). We have SET SESSION AUTHORIZATION already, which has all the same benefits and security problems as allowing connect-time selection of authorization identity without such a framework. And we have SET ROLE. ERRORing is the right thing to do here, so we can safely use this protocol functionality later if we want to allow user masquerading. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Dec 12, 2016 at 11:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > A few couple more things that caught my eye while hacking on this: > > 1. We don't use SASLPrep to scrub username's and passwords. That's by > choice, for usernames, because historically in PostgreSQL usernames can be > stored in any encoding, but SASLPrep assumes UTF-8. We dodge that by passing > an empty username in the authentication exchange anyway, because we always > use the username we got from the startup packet. But for passwords, I think > we need to fix that. The spec is very clear on that: > >> Note that implementations MUST either implement SASLprep or disallow >> use of non US-ASCII Unicode codepoints in "str". > > 2. I think we should check nonces, etc. more carefully, to not contain > invalid characters. For example, in the server, we use the read_attr_value() > function to read the client's nonce. Per the spec, the nonce should consist > of ASCII printable characters, but we will accept anything except the comma. > That's no trouble to the server, but let's be strict. > > To summarize, here's the overall TODO list so far: > > * Use SASLPrep for passwords. > > * Check nonces, etc. to not contain invalid characters. > > * Derive mock SCRAM verifier for non-existent users deterministically from > username. > > * Allow plain 'password' authentication for users with a SCRAM verifier in > rolpassword. > > * Throw an error if an "authorization identity" is given. ATM, we just > ignore it, but seems better to reject the attempt than do something that > might not be what the client expects. > > * Add "scram-sha-256" prefix to SCRAM verifiers stored in > pg_authid.rolpassword. > > Anything else I'm missing? > > I've created a wiki page, mostly to host that TODO list, while we hack this > to completion: https://wiki.postgresql.org/wiki/SCRAM_authentication. Feel > free to add stuff that comes to mind, and remove stuff as you push patches > to the branch on github. 
Based on the current code, I think you have the whole list. I'll try to look once again at the code to see if I have anything else in mind. Improving the TAP regression tests is also an item, with SCRAM authentication support when a plain password is stored.
--
Michael
On Tue, Dec 13, 2016 at 10:43 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Dec 12, 2016 at 11:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> A few couple more things that caught my eye while hacking on this:

Looking at what we have now, in the branch...

>> * Use SASLPrep for passwords.

SASLPrep is defined here:
https://tools.ietf.org/html/rfc4013
And stringprep is here:
https://tools.ietf.org/html/rfc3454
So that's roughly applying a conversion from the mapping table, taking into account prohibited, bi-directional, mapping characters, etc. The spec says that the password should be in Unicode. But we cannot be sure of that, right? Those mapping tables should likely be a separate thing (Perl has Unicode::Stringprep::Mapping for example).

>> * Check nonces, etc. to not contain invalid characters.

Fixed this one.

>> * Derive mock SCRAM verifier for non-existent users deterministically from
>> username.

You have put in place the facility to allow that. The only thing that comes to mind to generate something per-cluster is to have BootStrapXLOG() generate an "authentication secret identifier" with a uint64 and add that in the control file. Using pg_backend_random() would be a good idea here.

>> * Allow plain 'password' authentication for users with a SCRAM verifier in
>> rolpassword.

Done.

>> * Throw an error if an "authorization identity" is given. ATM, we just
>> ignore it, but seems better to reject the attempt than do something that
>> might not be what the client expects.

Done.

>> * Add "scram-sha-256" prefix to SCRAM verifiers stored in
>> pg_authid.rolpassword.

You did it.
--
Michael
pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From: Heikki Linnakangas
On 12/09/2016 10:19 AM, Michael Paquier wrote: > On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Couple of things I should write down before I forget: >> >> 1. It's a bit cumbersome that the scram verifiers stored in >> pg_authid.rolpassword don't have any clear indication that they're scram >> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think >> we should use a "scram-sha-256:" for scram verifiers. > > scram-sha-256 would make the most sense to me. > >> Actually, I think it'd be awfully nice to also prefix plaintext passwords >> with "plain:", but I'm not sure it's worth breaking the compatibility, if >> there are tools out there that peek into rolpassword. Thoughts? > > pgbouncer is the only thing coming up in mind. It looks at pg_shadow > for password values. pg_dump'ing data from pre-10 instances will also > need to adapt. I see tricky the compatibility with the exiting CREATE > USER PASSWORD command though, so I am wondering if that's worth the > complication. > >> 2. It's currently not possible to use the plaintext "password" >> authentication method, for a user that has a SCRAM verifier in rolpassword. >> That seems like an oversight. We can't do MD5 authentication with a SCRAM >> verifier, but "password" we could. > > Yeah, that should be possible... The tip of the work branch can now do SCRAM authentication, when a user has a plaintext password in pg_authid.rolpassword. The reverse doesn't work, however: you cannot do plain "password" authentication, when the user has a SCRAM verifier in pg_authid.rolpassword. 
It gets worse: plain "password" authentication doesn't check if the string stored in pg_authid.rolpassword is a SCRAM authenticator, and treats it as a plaintext password, so you can do this:

PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user

I think we're going to have more bugs like this, if we don't start to explicitly label plaintext passwords as such.

So, let's add a "plain:" prefix to plaintext passwords, in pg_authid.rolpassword. With that, these would be valid values in pg_authid.rolpassword:

plain:foo
md55a962ce7a24371a10e85627a484cac28
scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f

But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:" would be invalid. You shouldn't have invalid values in the column, but if you do, all the authentication mechanisms would reject it.

It would be nice to also change the format of MD5 passwords to have a colon, as in "md5:<hash>", but that's probably not worth breaking compatibility for. Almost no-one stores passwords in plaintext, so changing the format of that wouldn't affect many people, but there might well be tools out there that peek into MD5 hashes.

- Heikki
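The prefix rule proposed above, and why it closes the "use the stored verifier as the password" hole, can be sketched like this. It is a hypothetical Python illustration, not the server code: the function names are made up, the classification checks only the prefixes named in the message (a real implementation would presumably validate the rest of the string too), and this sketch simply refuses plain comparison for anything not labeled "plain:".

```python
import hmac


def classify_rolpassword(stored: str) -> str:
    """Classify a stored rolpassword value by its prefix, per the proposal above."""
    if stored.startswith("plain:"):
        return "plain"
    if stored.startswith("scram-sha-256:"):
        return "scram-sha-256"
    if stored.startswith("md5"):
        return "md5"
    return "invalid"


def check_plain_password(stored: str, supplied: str) -> bool:
    """Compare a supplied password against a stored value labeled as plaintext.

    A SCRAM verifier or MD5 hash is never treated as a plaintext password,
    so sending the stored verifier string itself no longer authenticates.
    """
    if classify_rolpassword(stored) != "plain":
        return False
    expected = stored[len("plain:"):]
    # Constant-time comparison, to avoid leaking how many bytes matched.
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))
```

With explicit labels, a stored value such as `scram-sha-256:...` is rejected by the plain comparison even when the client sends that exact string as its password.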
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Wed, Dec 14, 2016 at 5:51 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The tip of the work branch can now do SCRAM authentication, when a user has
> a plaintext password in pg_authid.rolpassword. The reverse doesn't work,
> however: you cannot do plain "password" authentication, when the user has a
> SCRAM verifier in pg_authid.rolpassword. It gets worse: plain "password"
> authentication doesn't check if the string stored in pg_authid.rolpassword
> is a SCRAM authenticator, and treats it as a plaintext password, so you can
> do this:
>
> PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f"
> psql postgres -h localhost -U scram_user

This one's fun.

> I think we're going to have more bugs like this, if we don't start to
> explicitly label plaintext passwords as such.
>
> So, let's add "plain:" prefix to plaintext passwords, in
> pg_authid.rolpassword. With that, these would be valid values in
> pg_authid.rolpassword:
>
> [...]
>
> But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:"
> would be invalid. You shouldn't have invalid values in the column, but if
> you do, all the authentication mechanisms would reject it.

I would be tempted to suggest adding the verifier type as a new column of pg_authid, but as CREATE USER PASSWORD has accepted strings with the md5 prefix as-is for ages, using the "plain:" prefix is definitely a better plan. My opinion on the matter has changed compared to a couple of months back.

> It would be nice to also change the format of MD5 passwords to have a colon,
> as in "md5:<hash>", but that's probably not worth breaking compatibility
> for. Almost no-one stores passwords in plaintext, so changing the format of
> that wouldn't affect many people, but there might well be tools out there
> that peek into MD5 hashes.

Yes, let's not take this road.

This work is definitely something that should be done before anything else. Need a patch or are you on it?
--
Michael
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Magnus Hagander
Date:
On Wed, Dec 14, 2016 at 9:51 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/09/2016 10:19 AM, Michael Paquier wrote:
>> On Fri, Dec 9, 2016 at 5:11 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Couple of things I should write down before I forget:
>>>
>>> 1. It's a bit cumbersome that the scram verifiers stored in
>>> pg_authid.rolpassword don't have any clear indication that they're scram
>>> verifiers. MD5 hashes are readily identifiable by the "md5" prefix. I think
>>> we should use a "scram-sha-256:" for scram verifiers.
>>
>> scram-sha-256 would make the most sense to me.
>>
>>> Actually, I think it'd be awfully nice to also prefix plaintext passwords
>>> with "plain:", but I'm not sure it's worth breaking the compatibility, if
>>> there are tools out there that peek into rolpassword. Thoughts?
>>
>> pgbouncer is the only thing coming up in mind. It looks at pg_shadow
>> for password values. pg_dump'ing data from pre-10 instances will also
>> need to adapt. I see tricky the compatibility with the existing CREATE
>> USER PASSWORD command though, so I am wondering if that's worth the
>> complication.
>>
>>> 2. It's currently not possible to use the plaintext "password"
>>> authentication method, for a user that has a SCRAM verifier in rolpassword.
>>> That seems like an oversight. We can't do MD5 authentication with a SCRAM
>>> verifier, but "password" we could.
>>
>> Yeah, that should be possible...
>
> The tip of the work branch can now do SCRAM authentication, when a user has a plaintext password in pg_authid.rolpassword. The reverse doesn't work, however: you cannot do plain "password" authentication, when the user has a SCRAM verifier in pg_authid.rolpassword. It gets worse: plain "password" authentication doesn't check if the string stored in pg_authid.rolpassword is a SCRAM authenticator, and treats it as a plaintext password, so you can do this:
>
> PGPASSWORD="scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f" psql postgres -h localhost -U scram_user
>
> I think we're going to have more bugs like this, if we don't start to explicitly label plaintext passwords as such.
>
> So, let's add "plain:" prefix to plaintext passwords, in pg_authid.rolpassword. With that, these would be valid values in pg_authid.rolpassword:
>
> plain:foo
> md55a962ce7a24371a10e85627a484cac28
> scram-sha-256:mDBuqO1mEekieg==:4096:17dc259499c1a184c26ee5b19715173d9354195f510b4d3af8be585acb39ae33:d3d713149c6becbbe56bae259aafe4e95b79ab7e3b50f2fbd850ea7d7b7c114f
I would so like to just drop support for plain passwords completely :) But there's a backwards compatibility issue to think about of course.
But -- is there any actual usecase for them anymore?
If not, another option could be to just specifically check that it's *not* "md5<something>" or "scram-<something>:<something>". That would invalidate plaintext passwords that have those texts in them of course, but what's the likelihood of that in reality?
Though I guess that might at least in theory be more bug-prone, so going with a "plain:" prefix seems like a good idea as well.
> But anything that doesn't begin with "plain:", "md5", or "scram-sha-256:" would be invalid. You shouldn't have invalid values in the column, but if you do, all the authentication mechanisms would reject it.
>
> It would be nice to also change the format of MD5 passwords to have a colon, as in "md5:<hash>", but that's probably not worth breaking compatibility for. Almost no-one stores passwords in plaintext, so changing the format of that wouldn't affect many people, but there might well be tools out there that peek into MD5 hashes.
There are definitely tools that do that, so +1 on leaving that alone.
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/14/2016 12:15 PM, Michael Paquier wrote: > This work is definitely something that should be done before anything > else. Need a patch or are you on it? I'm on it.. - Heikki
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/14/2016 12:27 PM, Magnus Hagander wrote:
> I would so like to just drop support for plain passwords completely :) But
> there's a backwards compatibility issue to think about of course.
>
> But -- is there any actual usecase for them anymore?

Hmm. At the moment, I don't think there is.

But, a password stored in plaintext works with either MD5 or SCRAM, or any future authentication mechanism. So as soon as we have SCRAM authentication, it becomes somewhat useful again.

In a nutshell:

auth / stored    MD5    SCRAM    plaintext
-----------------------------------------
password         Y      Y        Y
md5              Y      N        Y
scram            N      Y        Y

If a password is stored in plaintext, it can be used with any authentication mechanism. And the plaintext 'password' authentication mechanism works with any kind of a stored password. But an MD5 hash cannot be used with SCRAM authentication, or vice versa.

I just noticed that the manual for CREATE ROLE says:

> Note that older clients might lack support for the MD5 authentication
> mechanism that is needed to work with passwords that are stored
> encrypted.

That is incorrect. The alternative to MD5 authentication is plain 'password' authentication, and that works just fine with MD5-hashed passwords. I think that sentence is a leftover from when we still supported "crypt" authentication (so I actually get to blame you for that ;-), commit 53a5026b). Back then, it was true that if an MD5 hash was stored in pg_authid, you couldn't do "crypt" authentication. That might have left old clients out in the cold.

Now that we're getting SCRAM authentication, we'll need a similar notice there again, for the incompatibility of a SCRAM verifier with MD5 authentication and vice versa.

> If not, another option could be to just specifically check that it's *not*
> "md5<something>" or "scram-<something>:<something>". That would invalidate
> plaintext passwords that have those texts in them of course, but what's the
> likelihood of that in reality?

Hmm, we have dismissed that risk for the MD5 hashes (and we also have a length check for them), but as we get new hash formats, the risk increases. Someone might well want to use "plain:of:jars" as password. Perhaps we should use a more complicated pattern.

I googled around for how others store SCRAM and other password hashes. Many other systems seem to have similar naming schemes. The closest thing to a standard I could find was:

https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md

Perhaps we should also use something like "$plain$<password>" or "$scram-sha-256$<iterations>$<salt>$<key>$"?

There's also https://tools.ietf.org/html/rfc5803, which specifies how to store SCRAM verifiers in LDAP. I don't understand enough of LDAP to understand what those actually look like, though, and there were no examples in the RFC.

I wonder if we should also worry about storing multiple verifiers in rolpassword? We don't support that now, but we might in the future. It might come in handy, if you could easily store multiple hashes in a single string, separated by commas for example.

- Heikki
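The compatibility table in this message follows from a single rule, which a short sketch makes explicit. This is an illustration only; `TABLE` transcribes the table from the message, while the names `TABLE` and `can_authenticate` are hypothetical and not part of any patch.

```python
# The table as given in the message: rows are authentication methods,
# columns are the format in which the password is stored.
TABLE = {
    "password": {"md5": True,  "scram": True,  "plaintext": True},
    "md5":      {"md5": True,  "scram": False, "plaintext": True},
    "scram":    {"md5": False, "scram": True,  "plaintext": True},
}


def can_authenticate(auth_method: str, stored_kind: str) -> bool:
    """Rule behind the table (hypothetical helper)."""
    # The server sees the cleartext password, so it can check it
    # against any stored format.
    if auth_method == "password":
        return True
    # A plaintext stored password can be turned into whatever the
    # chosen mechanism needs.
    if stored_kind == "plaintext":
        return True
    # Otherwise the stored hash must match the mechanism.
    return auth_method == stored_kind


# The rule reproduces every cell of the table.
assert all(
    can_authenticate(auth, kind) == TABLE[auth][kind]
    for auth in TABLE
    for kind in TABLE[auth]
)
```

Written this way, the only failing combinations are the two cross cases between MD5 and SCRAM.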
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Peter Eisentraut
Date:
On 12/14/16 5:15 AM, Michael Paquier wrote:
> I would be tempted to suggest adding the verifier type as a new column
> of pg_authid

Yes please.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 12/14/16 5:15 AM, Michael Paquier wrote:
> > I would be tempted to suggest adding the verifier type as a new column
> > of pg_authid
>
> Yes please.

This discussion seems to continue to come up and I don't entirely understand why we keep trying to shove more things into pg_authid, or worse, into rolpassword.

We should have an independent table for the verifiers, which has a different column for the verifier type, and either starts off supporting multiple verifiers per role or at least gives us the ability to add that easily later. We should also move rolvaliduntil to that new table.

No, I am specifically *not* concerned with "backwards compatibility" of that table: we continually add to it and change it and applications which are so closely tied to PG that they look at pg_authid need to be updated with nearly every release anyway. What we *do* need to make sure we get correct is what pg_dump/pg_upgrade do, but that's entirely within our control to manage and shouldn't be that much of an issue to implement.

Thanks!

Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Bruce Momjian
Date:
On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
> I would so like to just drop support for plain passwords completely :) But
> there's a backwards compatibility issue to think about of course.
>
> But -- is there any actual usecase for them anymore?

I thought we recommended 'password' for SSL connections because if you use MD5 passwords the password text layout is known and that simplifies cryptanalysis.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
>On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
>> I would so like to just drop support for plain passwords completely :) But
>> there's a backwards compatibility issue to think about of course.
>>
>> But -- is there any actual usecase for them anymore?
>
>I thought we recommended 'password' for SSL connections because if you
>use MD5 passwords the password text layout is known and that simplifies
>cryptanalysis.

No, that makes no sense. And whether you use 'password' or 'md5' authentication is a different question than whether you store passwords in plaintext or as md5 hashes. Magnus was asking whether it ever makes sense to *store* passwords in plaintext.

Since you brought it up, there is a legitimate argument to be made that 'password' authentication is more secure than 'md5', when SSL is used. Namely, if an attacker can acquire contents of pg_authid e.g. by stealing a backup tape, with 'md5' authentication he can log in as any user, using just the stolen hashes. But with 'password', he needs to reverse the hash first. It's not a great difference, but it's something.

- Heikki
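The stolen-hash argument can be made concrete with a small sketch of how the 'md5' method derives its response. It assumes the usual PostgreSQL scheme, in which the stored value is "md5" followed by md5(password || username) and the client answers a salt challenge with "md5" followed by md5(inner hash || salt). The function names are hypothetical and the salt value is made up for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def stored_md5_hash(password: str, username: str) -> str:
    # What ends up in pg_authid.rolpassword for an MD5-hashed password.
    return "md5" + md5_hex((password + username).encode("utf-8"))


def response_from_password(password: str, username: str, salt: bytes) -> str:
    # What an honest client computes when it knows the password.
    inner = md5_hex((password + username).encode("utf-8"))
    return "md5" + md5_hex(inner.encode("ascii") + salt)


def response_from_stolen_hash(stolen: str, salt: bytes) -> str:
    # What an attacker can compute from the stored hash alone:
    # the inner hash is exactly the stored value minus its prefix.
    inner = stolen[len("md5"):]
    return "md5" + md5_hex(inner.encode("ascii") + salt)


salt = b"\x01\x02\x03\x04"  # illustrative challenge sent by the server
stolen = stored_md5_hash("secret", "alice")
assert response_from_stolen_hash(stolen, salt) == response_from_password(
    "secret", "alice", salt
)
```

The two responses are identical, which is why a leaked MD5 hash is enough to log in with 'md5' authentication, whereas 'password' authentication would require the attacker to recover the original password first.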
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
> >On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:
> >> I would so like to just drop support for plain passwords completely :) But
> >> there's a backwards compatibility issue to think about of course.
> >>
> >> But -- is there any actual usecase for them anymore?
> >
> >I thought we recommended 'password' for SSL connections because if you
> >use MD5 passwords the password text layout is known and that simplifies
> >cryptanalysis.
>
> No, that makes no sense. And whether you use 'password' or 'md5' authentication is a different question than whether you store passwords in plaintext or as md5 hashes. Magnus was asking whether it ever makes sense to *store* passwords in plaintext.

Right.

> Since you brought it up, there is a legitimate argument to be made that 'password' authentication is more secure than 'md5', when SSL is used. Namely, if an attacker can acquire contents of pg_authid e.g. by stealing a backup tape, with 'md5' authentication he can log in as any user, using just the stolen hashes. But with 'password', he needs to reverse the hash first. It's not a great difference, but it's something.

Tunnelling passwords which are stored as hashes is also well understood and comparable to SSH with passwords in /etc/passwd.

Storing plaintext passwords has been bad form for just about forever and I wouldn't be sad to see our support of it go. At the least, as was discussed somewhere, but I'm not sure where it ended up, we should give administrators the ability to control what ways a password can be stored. In particular, once a user has migrated all of their users to SCRAM, they should be able to say "don't let new passwords be in any format other than SCRAM-SHA-256".

Thanks!

Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
"Joshua D. Drake"
Date:
On 12/14/2016 11:41 AM, Stephen Frost wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>> On 14 December 2016 20:12:05 EET, Bruce Momjian <bruce@momjian.us> wrote:
>>> On Wed, Dec 14, 2016 at 11:27:15AM +0100, Magnus Hagander wrote:

> Storing plaintext passwords has been bad form for just about forever and
> I wouldn't be sad to see our support of it go. At the least, as was
> discussed somewhere, but I'm not sure where it ended up, we should give
> administrators the ability to control what ways a password can be
> stored. In particular, once a user has migrated all of their users to
> SCRAM, they should be able to say "don't let new passwords be in any
> format other than SCRAM-SHA-256".

It isn't as bad as it used to be. I remember when PASSWORD was the default. I agree that we should be able to set a policy that says, "we only allow X for password storage".

JD

>
> Thanks!
>
> Stephen
>

--
Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Wed, Dec 14, 2016 at 8:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> But, a password stored in plaintext works with either MD5 or SCRAM, or any
> future authentication mechanism. So as soon as we have SCRAM authentication,
> it becomes somewhat useful again.
>
> In a nutshell:
>
> auth / stored    MD5    SCRAM    plaintext
> -----------------------------------------
> password         Y      Y        Y
> md5              Y      N        Y
> scram            N      Y        Y
>
> If a password is stored in plaintext, it can be used with any authentication
> mechanism. And the plaintext 'password' authentication mechanism works with
> any kind of a stored password. But an MD5 hash cannot be used with SCRAM
> authentication, or vice versa.

So.. I have been thinking about this portion of the thread. And what I find the most scary is not the fact that we use plain passwords for SCRAM authentication, it is the fact that we would need to do a catalog lookup earlier in the connection workflow to decide what is the connection protocol to use depending on the username provided in the startup packet if the pg_hba.conf entry matching the user and database names uses "password".

And, honestly, why do we actually need to have a support table that broad? SCRAM is designed to be secure, so it seems to me that it would on the contrary be a bad idea to encourage the use of plain passwords if we actually think that they should never be used (they are actually useful for local development instances, not production ones). So what I would suggest would be to have a support table like that:

auth / stored    MD5    SCRAM    plaintext
-----------------------------------------
password         Y      Y        N
md5              Y      N        Y
scram            N      N        Y

So here is an idea for things to do now:
1) do not change the format of the existing passwords
2) do not change pg_authid
3) block access to instances if "password" or "md5" are used in pg_hba.conf if the user has a SCRAM verifier.
4) block access if "scram" is used and if user has a plain or md5 verifier.
5) Allow access if "scram" is used and if user has a SCRAM verifier.

We had a similar discussion regarding verifier/password formats last year but that did not end well. It would be sad to fall back again into this discussion and get no result. If somebody wants to support access to SCRAM with plain password entries, why not. But that would gain a -1 from me regarding the earlier lookup of pg_authid needed to do the decision making on the protocol to use. And I think that we want SCRAM to be designed to be as stable and secure as possible.
--
Michael
On Tue, Dec 13, 2016 at 2:44 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> SASLPrep is defined here:
> https://tools.ietf.org/html/rfc4013
> And stringprep is here:
> https://tools.ietf.org/html/rfc3454
> So that's roughly applying a conversion from the mapping table, taking
> into account prohibited, bi-directional, mapping characters, etc. The
> spec says that the password should be in unicode. But we cannot be
> sure of that, right? Those mapping tables should be likely a separated
> thing.. (perl has Unicode::Stringprep::Mapping for example).

OK. I have looked at that and bumped into libidn, which offers a couple of APIs that could be used directly for this purpose. In particular, what caught my eye is stringprep_profile():
https://www.gnu.org/software/libidn/manual/html_node/Stringprep-Functions.html

res = stringprep_profile (input, output, "SASLprep", STRINGPREP_NO_UNASSIGNED);

libidn can be installed on Windows, and I have found packages for cygwin, mingw, linux, freebsd and macos via brew. In the case where libidn is not installed, I think that the safest path would be to check if the input string has any high bits set (0x80) and bail out because that would mean that it is a UTF-8 string that we cannot change.

Any thoughts about using libidn?

Also, after discussion with Heikki, here are the things that we need to do:
1) In libpq, we need to check if the string is valid utf-8. If that's valid utf-8, apply SASLprep. If not, copy the string as-is. We could error as well in this case... Perhaps a WARNING would be more appropriate; that's the most tricky case, and if the client does not use utf-8 that may lead to unexpected behavior.
2) In server, when the password verifier is created. If client_encoding is utf-8, but not server_encoding, convert the password to utf-8 and build the verifier after applying SASLprep.

In the case where the binaries are *not* built with libidn, I think that we had better reject valid UTF-8 strings directly and just allow ASCII? SASLprep is a no-op on ASCII characters.

Thoughts about this approach?
--
Michael
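The client-side decision described in this message can be sketched as follows. This is an illustration under stated assumptions, not libpq code: `prepare_password` is a hypothetical name, and the optional `saslprep` callable stands in for a real SASLprep implementation such as the libidn call mentioned above (passing `None` models a build without libidn). The treatment of ASCII as unchanged follows the claim in the message that SASLprep is a no-op there.

```python
from typing import Callable, Optional


def prepare_password(
    raw: bytes, saslprep: Optional[Callable[[str], str]] = None
) -> bytes:
    """Decide how to normalize a password before building a SCRAM proof."""
    # No byte has the high bit (0x80) set: pure ASCII, passed through
    # unchanged, as the message describes.
    if all(byte < 0x80 for byte in raw):
        return raw
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: copy the string as-is (the message notes that
        # raising an error or a warning here would also be an option).
        return raw
    if saslprep is None:
        # Built without a SASLprep implementation: only ASCII is allowed.
        raise ValueError("non-ASCII password requires SASLprep support")
    return saslprep(text).encode("utf-8")
```

For example, `prepare_password(b"secret")` returns the input unchanged, while a valid non-ASCII UTF-8 password raises `ValueError` unless a `saslprep` callable is supplied.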
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/14/2016 04:57 PM, Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 12/14/16 5:15 AM, Michael Paquier wrote:
>>> I would be tempted to suggest adding the verifier type as a new column
>>> of pg_authid
>>
>> Yes please.
>
> This discussion seems to continue to come up and I don't entirely
> understand why we keep trying to shove more things into pg_authid, or
> worse, into rolpassword.

I understand the relational beauty of having a separate column for the verifier type, but I don't think it would be practical. For starters, we'd still like to have a self-identifying string format like "scram-sha-256:<stuff>", so that you can conveniently pass the verifier as a string to CREATE USER. I think it'll be much better to stick to one format, than try to split the verifier into type and the string, when it enters the catalog table.

> We should have an independent table for the verifiers, which has a
> different column for the verifier type, and either starts off supporting
> multiple verifiers per role or at least gives us the ability to add that
> easily later. We should also move rolvaliduntil to that new table.

I agree we'll probably need a new table for verifiers. Or turn rolpassword into an array or something. We discussed that before, however, and it didn't really go anywhere, so right now I'd like to get SCRAM in with minimal changes to the rest of the system. There is a lot of room for improvement once it's in.

- Heikki
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/15/2016 03:00 AM, Michael Paquier wrote:
> On Wed, Dec 14, 2016 at 8:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> But, a password stored in plaintext works with either MD5 or SCRAM, or any
>> future authentication mechanism. So as soon as we have SCRAM authentication,
>> it becomes somewhat useful again.
>>
>> In a nutshell:
>>
>> auth / stored    MD5    SCRAM    plaintext
>> -----------------------------------------
>> password         Y      Y        Y
>> md5              Y      N        Y
>> scram            N      Y        Y
>>
>> If a password is stored in plaintext, it can be used with any authentication
>> mechanism. And the plaintext 'password' authentication mechanism works with
>> any kind of a stored password. But an MD5 hash cannot be used with SCRAM
>> authentication, or vice versa.
>
> So.. I have been thinking about this portion of the thread. And what I
> find the most scary is not the fact that we use plain passwords for
> SCRAM authentication, it is the fact that we would need to do a
> catalog lookup earlier in the connection workflow to decide what is
> the connection protocol to use depending on the username provided in
> the startup packet if the pg_hba.conf entry matching the user and
> database names uses "password".

I don't see why we would need to do a catalog lookup any earlier. With "password" authentication, the server can simply request the client to send its password. When it receives it, it performs the catalog lookup to get pg_authid.rolpassword. If it's in plaintext, just compare it, if it's an MD5 hash, hash the client's password and compare, and if it's a SCRAM verifier, build a verifier with the same salt and iteration count and compare.

> And, honestly, why do we actually need to have a support table that
> broad? SCRAM is designed to be secure, so it seems to me that it
> would on the contrary be a bad idea to encourage the use of plain
> passwords if we actually think that they should never be used (they
> are actually useful for local development instances, not production
> ones).

I agree we should not encourage bad password practices. But as long as we support passwords to be stored in plaintext at all, it makes no sense to not allow them to be used with SCRAM. The fact that you can use a password stored in plaintext with both MD5 and SCRAM is literally the only reason you would store a password in plaintext, so if we don't want to allow that, we should disallow storing passwords in plaintext altogether.

> So what I would suggest would be to have a support table like
> that:
> auth / stored    MD5    SCRAM    plaintext
> -----------------------------------------
> password         Y      Y        N
> md5              Y      N        Y
> scram            N      N        Y

I was using 'Y' to indicate that the combination works, and 'N' to indicate that it does not. Assuming you're using the same notation, the above doesn't make any sense.

> So here is an idea for things to do now:
> 1) do not change the format of the existing passwords
> 2) do not change pg_authid
> 3) block access to instances if "password" or "md5" are used in
> pg_hba.conf if the user has a SCRAM verifier.
> 4) block access if "scram" is used and if user has a plain or md5 verifier.
> 5) Allow access if "scram" is used and if user has a SCRAM verifier.
> We had a similar discussion regarding verifier/password formats last
> year but that did not end well. It would be sad to fall back again
> into this discussion and get no result. If somebody wants to support
> access to SCRAM with plain password entries, why not. But that would
> gain a -1 from me regarding the earlier lookup of pg_authid needed to
> do the decision making on the protocol to use. And I think that we
> want SCRAM to be designed to be as stable and secure as possible.

The bottom line is that at the moment, when plaintext passwords are stored as is, without any indicator that it's a plaintext password, it's ambiguous whether a password is a SCRAM verifier, or if it's a plaintext password that just happens to begin with the word "scram:". That is completely unrelated to which combinations of stored passwords and authentication mechanisms we actually support or allow to work.

The only way to distinguish, is to know about every verifier kind there is, and check whether rolpassword looks valid as anything else than a plaintext password. And we already got tripped by a bug-of-omission on that once. If we add more verifier formats in the future, it's bound to happen again. Let's nip that source of bugs in the bud. Attached is a patch to implement what I have in mind.

Alternatively, you could argue that we should forbid storing passwords in plaintext altogether. I'm OK with that, too, if that's what people prefer. Then you cannot have a user that can log in with both MD5 and SCRAM authentication, but it's certainly more secure, and it's easier to document.

- Heikki
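The server-side check for "password" authentication described in this message (compare plaintext directly, hash and compare for MD5, rebuild and compare for SCRAM) can be sketched as below. This is an illustration, not the attached patch. The key derivation follows RFC 5802, but the field order assumed for the work-in-progress `scram-sha-256:<salt>:<iterations>:<stored key>:<server key>` string is a guess based on the example value shown earlier in the thread, and both function names are hypothetical.

```python
import base64
import hashlib
import hmac


def build_scram_verifier(password: str, salt: bytes, iterations: int) -> str:
    # RFC 5802: SaltedPassword = Hi(password, salt, i), implemented by PBKDF2.
    salted = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()
    # Assumed layout of the work-in-progress verifier string.
    return "scram-sha-256:%s:%d:%s:%s" % (
        base64.b64encode(salt).decode("ascii"),
        iterations,
        stored_key.hex(),
        server_key.hex(),
    )


def check_plain_password(username: str, supplied: str, stored: str) -> bool:
    """Check a cleartext password sent by the client against rolpassword."""

    def same(left: str, right: str) -> bool:
        return hmac.compare_digest(left.encode("utf-8"), right.encode("utf-8"))

    if stored.startswith("scram-sha-256:"):
        parts = stored.split(":")
        if len(parts) != 5:
            return False
        # Rebuild a verifier with the same salt and iteration count.
        salt = base64.b64decode(parts[1])
        return same(build_scram_verifier(supplied, salt, int(parts[2])), stored)
    if stored.startswith("md5") and len(stored) == 35:
        digest = hashlib.md5((supplied + username).encode("utf-8")).hexdigest()
        return same("md5" + digest, stored)
    if stored.startswith("plain:"):
        return same(supplied, stored[len("plain:"):])
    # Unlabeled or unknown values are rejected by every mechanism.
    return False
```

Because every stored value is labeled, sending the stored SCRAM verifier itself as the password no longer succeeds: it is treated as a candidate password, run through the key derivation, and fails to match.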
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 12/14/2016 04:57 PM, Stephen Frost wrote:
> >* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> >>On 12/14/16 5:15 AM, Michael Paquier wrote:
> >>>I would be tempted to suggest adding the verifier type as a new column
> >>>of pg_authid
> >>
> >>Yes please.
> >
> >This discussion seems to continue to come up and I don't entirely
> >understand why we keep trying to shove more things into pg_authid, or
> >worse, into rolpassword.
>
> I understand the relational beauty of having a separate column for
> the verifier type, but I don't think it would be practical.

I disagree.

> For starters, we'd still like to have a self-identifying string format
> like "scram-sha-256:<stuff>", so that you can conveniently pass the
> verifier as a string to CREATE USER.

I don't follow why we can't change the syntax for CREATE USER to allow specifying the verifier type independently. Generally speaking, I don't expect *users* to be providing actual encoded *verifiers* very often, so it seems like a bit of extra syntax that pg_dump has to use isn't that big of a deal.

> I think it'll be much better to
> stick to one format, than try to split the verifier into type and
> the string, when it enters the catalog table.

Apparently, multiple people disagree with this approach. I don't think history is really on your side here either.

> >We should have an independent table for the verifiers, which has a
> >different column for the verifier type, and either starts off supporting
> >multiple verifiers per role or at least gives us the ability to add that
> >easily later. We should also move rolvaliduntil to that new table.
>
> I agree we'll probably need a new table for verifiers. Or turn
> rolpassword into an array or something. We discussed that before,
> however, and it didn't really go anywhere, so right now I'd like to
> get SCRAM in with minimal changes to the rest of the system. There
> is a lot of room for improvement once it's in.

Using an array strikes me as an absolutely terrible idea: how are you going to handle having different valid_until times then?

I do agree with trying to get SCRAM in without changing too much of the rest of the system, but I wanted to make it clear that it's the only point that I agree with for continuing down this path and that we should absolutely be looking to change the CREATE USER syntax to specify the verifier independently, plan to use a different table for the verifiers with an independent column for the verifier type, support multiple verifiers per role, etc, in the (hopefully very near...) future.

Thanks!

Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Thu, Dec 15, 2016 at 9:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The only way to distinguish, is to know about every verifier kind there is,
> and check whether rolpassword looks valid as anything else than a plaintext
> password. And we already got tripped by a bug-of-omission on that once. If
> we add more verifier formats in the future, it's bound to happen again.
> Let's nip that source of bugs in the bud. Attached is a patch to implement
> what I have in mind.

OK, I had a look at the patch proposed.

-    if (!pg_md5_encrypt(username, username, namelen, encrypted))
-        elog(ERROR, "password encryption failed");
-    if (strcmp(password, encrypted) == 0)
-        ereport(ERROR,
-                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-                 errmsg("password must not contain user name")));

This patch removes the only check on MD5 hashes that has ever been possible in passwordcheck. It may be fine to remove it, but I would think that it is a good example of what can be done with MD5 hashes, though limited. So it seems to me that this check should still run pg_md5_encrypt on the username and compare the result with the MD5 hash given by the caller. The new code is careful about passing down a plain password, but it is possible to load MD5 hashes directly as well, as pg_dumpall does.

A simple ALTER USER role PASSWORD 'foo' causes a crash:

#0  0x00000000004764d7 in heap_compute_data_size (tupleDesc=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:106
106         VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
(gdb) bt
#0  0x00000000004764d7 in heap_compute_data_size (tupleDesc=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:106
#1  0x00000000004781e9 in heap_form_tuple (tupleDescriptor=0x277f090, values=0x27504b8, isnull=0x2750550 "") at heaptuple.c:736
#2  0x00000000004784d0 in heap_modify_tuple (tuple=0x277adc8, tupleDesc=0x277f090, replValues=0x7fff1369d030, replIsnull=0x7fff1369d020 "", doReplace=0x7fff1369d010 "") at heaptuple.c:833
#3  0x0000000000673788 in AlterRole (stmt=0x27a4f78) at user.c:845
#4  0x000000000082aa49 in standard_ProcessUtility (parsetree=0x27a4f78, queryString=0x27a43e8 "alter role ioltas password 'toto';", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x27a5300, completionTag=0x7fff1369d5b0 "") at utility.c:711

+       case PASSWORD_TYPE_PLAINTEXT:
+           shadow_pass = &shadow_pass[strlen("plain:")];
+           break;

It would be a good idea to have a generic routine able to get the plain password value. In short, I think that we should reduce the number of locations where the "plain:" prefix is hardcoded.

> Alternatively, you could argue that we should forbid storing passwords in
> plaintext altogether. I'm OK with that, too, if that's what people prefer.
> Then you cannot have a user that can log in with both MD5 and SCRAM
> authentication, but it's certainly more secure, and it's easier to document.

In the end this may prove to be a bad idea for some developers. In local deployments, when working on an application with Postgres as backend, it is actually useful to have plain passwords. At least I have found that useful in some stuff I did many years ago.
--
Michael
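The MD5 check discussed in this message can be sketched outside the backend. This is a minimal illustration in Python, not the passwordcheck code: it assumes the classic verifier layout of the literal "md5" followed by the hex MD5 digest of the password concatenated with the role name, and both function names are hypothetical.

```python
import hashlib


def pg_md5_verifier(password: str, username: str) -> str:
    """Build an MD5 verifier: 'md5' + hex(md5(password || username))."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


def md5_verifier_uses_username(verifier: str, username: str) -> bool:
    """True if the given MD5 verifier was computed from a password equal
    to the role name, which is the only thing that can be checked once
    the caller hands over a hash instead of a plaintext password."""
    return verifier == pg_md5_verifier(username, username)
```

Nothing else about the original password can be recovered from the hash, which is why the check is described above as limited.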
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Robert Haas
Date:
On Thu, Dec 15, 2016 at 8:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>> On 12/14/2016 04:57 PM, Stephen Frost wrote:
>> >* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> >>On 12/14/16 5:15 AM, Michael Paquier wrote:
>> >>>I would be tempted to suggest adding the verifier type as a new column
>> >>>of pg_authid
>> >>
>> >>Yes please.
>> >
>> >This discussion seems to continue to come up and I don't entirely
>> >understand why we keep trying to shove more things into pg_authid, or
>> >worse, into rolpassword.
>>
>> I understand the relational beauty of having a separate column for
>> the verifier type, but I don't think it would be practical.
>
> I disagree.

Me, too. I think the idea of moving everything into a separate table that allows multiple verifiers is probably not a good thing to do just right now, because that introduces a bunch of additional issues above and beyond what we need to do to get SCRAM implemented. There are administration and policy decisions to be made there that we should not conflate with SCRAM proper.

However, Heikki's proposal seems to be that it's reasonable to force rolpassword to be of the form 'type:verifier' in all cases but not reasonable to have separate columns for type and verifier. Eh?

>> For
>> starters, we'd still like to have a self-identifying string format
>> like "scram-sha-256:<stuff>", so that you can conveniently pass the
>> verifier as a string to CREATE USER.
>
> I don't follow why we can't change the syntax for CREATE USER to allow
> specifying the verifier type independently. Generally speaking, I don't
> expect *users* to be providing actual encoded *verifiers* very often, so
> it seems like a bit of extra syntax that pg_dump has to use isn't that
> big of a deal.

We don't have to change the CREATE USER syntax at all. It could just split on the first colon and put the two halves of the string in different places.

Of course, changing the syntax might be a good idea anyway -- or not -- but the point is, right now, when you look at rolpassword, there's not a clear rule for what kind of thing you've got in there. That's absolutely terrible design and has got to be fixed. Heikki's proposal of prefixing every entry with a type and a ':' will solve that problem and I'm not going to roll over in my grave if we do it that way, but there is such a thing as normalization and that technique could be applied here.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
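The "split on the first colon" idea can be sketched as follows. This is only an illustration in Python, not backend code: the function name `split_rolpassword` is hypothetical and the sample verifier bodies are placeholders, not real verifiers.

```python
from typing import Tuple


def split_rolpassword(stored: str) -> Tuple[str, str]:
    """Split 'type:verifier' on the FIRST colon only, so that colons
    inside the verifier body are left untouched."""
    vtype, sep, verifier = stored.partition(":")
    if not sep or not vtype:
        raise ValueError("expected a string of the form 'type:verifier'")
    return vtype, verifier
```

With this, the same user-facing string could feed either a single prefixed column or two separate columns, which is the point being made: the storage layout does not have to dictate the CREATE USER syntax.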
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Peter Eisentraut
Date:
On 12/15/16 8:40 AM, Stephen Frost wrote: > I don't follow why we can't change the syntax for CREATE USER to allow > specifying the verifier type independently. That's what the last patch set I looked at actually does. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: > On 12/15/16 8:40 AM, Stephen Frost wrote: > > I don't follow why we can't change the syntax for CREATE USER to allow > > specifying the verifier type independently. > > That's what the last patch set I looked at actually does. Well, same here, but it was quite a while ago and things have progressed since then wrt SCRAM, as I understand it... Thanks! Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 12/15/16 8:40 AM, Stephen Frost wrote:
>> > I don't follow why we can't change the syntax for CREATE USER to allow
>> > specifying the verifier type independently.
>>
>> That's what the last patch set I looked at actually does.
>
> Well, same here, but it was quite a while ago and things have progressed
> since then wrt SCRAM, as I understand it...

From the discussions of last year on -hackers, it was decided to *not* have an additional column, per complaints from a couple of hackers (Robert, you were in this set at this point), and the same thing was concluded during the informal lunch meeting at PGCon. The point is, the existing SCRAM patch set can survive without touching the format of pg_authid at *all*. We could block SCRAM authentication when "password" is used in pg_hba.conf, as well as when "scram" is used with a plain password stored in pg_authid. Or look at the format of the string in the catalog if "password" is defined and decide the authentication protocol to follow based on that.
--
Michael
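"Look at the format of the string in the catalog" amounts to shape detection on an unprefixed value. A minimal sketch in Python, assuming an MD5 verifier looks like "md5" followed by 32 lowercase hex digits; the function name is hypothetical and no SCRAM shape is modelled, since the thread does not fix one here.

```python
import re

# Assumed shape of an MD5 verifier: "md5" + 32 lowercase hex digits.
_MD5_SHAPE = re.compile(r"md5[0-9a-f]{32}")


def guess_verifier_type(rolpassword: str) -> str:
    """Guess the kind of an unprefixed rolpassword value by its shape.
    Anything that is not recognised falls through to 'plain'."""
    if _MD5_SHAPE.fullmatch(rolpassword):
        return "md5"
    return "plain"
```

The fall-through is where the bug-of-omission risk raised earlier in the thread comes from: any format this function does not know about is silently treated as a plaintext password, and a plaintext password that happens to have the MD5 shape is treated as a hash.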
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
Michael,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> >> On 12/15/16 8:40 AM, Stephen Frost wrote:
> >> > I don't follow why we can't change the syntax for CREATE USER to allow
> >> > specifying the verifier type independently.
> >>
> >> That's what the last patch set I looked at actually does.
> >
> > Well, same here, but it was quite a while ago and things have progressed
> > since then wrt SCRAM, as I understand it...
>
> From the discussions of last year on -hackers, it was decided to *not*
> have an additional column per complains from a couple of hackers

It seems that, at best, we didn't have consensus on it. Hopefully we are moving in a direction of consensus.

> (Robert you were in this set at this point), and the same thing was
> concluded during the informal lunch meeting at PGcon. The point is,
> the existing SCRAM patch set can survive without touching at *all* the
> format of pg_authid. We could block SCRAM authentication when
> "password" is used in pg_hba.conf and as well as when "scram" is used
> with a plain password stored in pg_authid. Or look at the format of
> the string in the catalog if "password" is defined and decide the
> authentication protocol to follow based on that.

As I mentioned up-thread, moving forward with minimal changes to get SCRAM in certainly makes sense, but I do think we should be open to (and, ideally, encouraging people to work towards) having a separate table for verifiers with independent columns for type and verifier.

Thanks!

Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Sat, Dec 17, 2016 at 10:23 AM, Stephen Frost <sfrost@snowman.net> wrote: > * Michael Paquier (michael.paquier@gmail.com) wrote: >> (Robert you were in this set at this point), and the same thing was >> concluded during the informal lunch meeting at PGcon. The point is, >> the existing SCRAM patch set can survive without touching at *all* the >> format of pg_authid. We could block SCRAM authentication when >> "password" is used in pg_hba.conf and as well as when "scram" is used >> with a plain password stored in pg_authid. Or look at the format of >> the string in the catalog if "password" is defined and decide the >> authentication protocol to follow based on that. > > As I mentioned up-thread, moving forward with minimal changes to get > SCRAM in certainly makes sense, but I do think we should be open to > (and, ideally, encouraging people to work towards) having a seperate > table for verifiers with independent columns for type and verifier. Definitely, and you know my position on the matter or I would not have written last year's patch series. Both things are just orthogonal IMO at this point. And it would be good to focus just on one problem at the moment to get it out. -- Michael
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Robert Haas
Date:
On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sat, Dec 17, 2016 at 5:42 AM, Stephen Frost <sfrost@snowman.net> wrote: >> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: >>> On 12/15/16 8:40 AM, Stephen Frost wrote: >>> > I don't follow why we can't change the syntax for CREATE USER to allow >>> > specifying the verifier type independently. >>> >>> That's what the last patch set I looked at actually does. >> >> Well, same here, but it was quite a while ago and things have progressed >> since then wrt SCRAM, as I understand it... > > From the discussions of last year on -hackers, it was decided to *not* > have an additional column per complains from a couple of hackers > (Robert you were in this set at this point), ... Hmm, I don't recall taking that position, but then there are a lot of things that I ought to recall and don't. (Ask my wife!) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Sun, Dec 18, 2016 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> From the discussions of last year on -hackers, it was decided to *not*
>> have an additional column per complains from a couple of hackers
>> (Robert you were in this set at this point), ...
>
> Hmm, I don't recall taking that position, but then there are a lot of
> things that I ought to recall and don't. (Ask my wife!)

[... digging objects of the past ...]
From the past thread:
https://www.postgresql.org/message-id/CA+TgmoY790rphHBogXMbTG6MzSeNdoxdBXebEkAet9ZpZ8gvtw@mail.gmail.com
The complaint there was directed at multiple verifiers per user though, not at having the type in a separate column.
--
Michael
On Thu, Dec 15, 2016 at 3:17 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> In the case where the binaries are *not* built with libidn, I think
> that we had better reject valid UTF-8 string directly and just allow
> ASCII? SASLprep is a no-op on ASCII characters.
>
> Thoughts about this approach?

And Heikki has mentioned to me that he'd prefer not having an extra dependency for the normalization, which is LGPL-licensed by the way. So I have looked at the SASLprep business to see what should be done to get a complete implementation in core, completely independent of anything known.

The first thing is to be able to understand in the SCRAM code if a string is UTF-8 or not, and this code is in src/common/. pg_wchar.c offers a set of routines exactly for this purpose, which is built with libpq but is not available for src/common/. So instead of moving the whole file, I'd like to create a new file src/common/utf8.c which includes pg_utf_mblen() and pg_utf8_islegal(). On top of that I think that having a routine able to check a full string would be useful for many users, as pg_utf8_islegal() can only check one set of characters. If the password string is found to be of UTF-8 format, SASLprep is applied. If not, the string is copied as-is, with perhaps unexpected effects for the client. But he's in trouble already if the client is not using UTF-8.

Then comes the real business... Note that this is my first time touching encoding, particularly UTF-8, in depth, so please be nice. I may write things that are incorrect or sound so from here :)

The second thing is the normalization itself. Per RFC4013, NFKC needs to be applied to the string. The operation is described in [1] completely, and it is named as doing 1) a compatibility decomposition of the bytes of the string, followed by 2) a canonical composition.

About 1). The compatibility decomposition is defined in [2], "by recursively applying the canonical and compatibility mappings, then applying the canonical reordering algorithm". Canonical and compatibility mappings are data available in UnicodeData.txt, the 6th column of the set defined in [3] to be precise. The meaning of the decomposition mappings is defined in [2] as well. The canonical decomposition is basically to look for a given UTF-8 character, and then apply the multiple characters resulting in its new shape. The compatibility mapping should be applied as well, but [5], a perl tool called charlint.pl doing this normalization work, does not care about this phase... Do we?

About 2)... Once the decomposition has been applied, those bytes need to be recomposed using the Canonical_Combining_Class field of UnicodeData.txt in [3], which is the 4th column of the set. Its values are defined in [4]. Another interesting thing: charlint.pl [5] does not care about this phase. I am wondering if we should not just drop this part as well...

Once 1) and 2) are done, NFKC is complete, and so is SASLprep.

So what we need from the Postgres side is a mapping table having the following fields:
1) Hexa sequence of UTF8 character.
2) Its canonical combining class.
3) The kind of decomposition mapping if defined.
4) The decomposition mapping, in hexadecimal format.

Based on what I looked at, either perl or python could be used to process UnicodeData.txt and to generate a header file that would be included in the tree. There are 30k entries in UnicodeData.txt, 5k of them have a mapping, so that will result in many tables. One thing to improve performance would be to store the length of the table in a static variable, order the entries by their hexadecimal keys and do a binary search (dichotomy) lookup to find an entry. We could as well use fancier things, like a set of tables forming a radix tree decomposed by bytes. We should end up doing just one lookup of the table for each character anyway.

In conclusion, at this point I am looking for feedback regarding the following items:
1) Where to put the UTF8 check routines and what to move.
2) How to generate the mapping table using UnicodeData.txt. I'd think that using perl would be better.
3) The shape of the mapping table, which depends on how many operations we want to support in the normalization of the strings.
The decisions for those items will drive the implementation in one sense or another.

[1]: http://www.unicode.org/reports/tr15/#Description_Norm
[2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings
[3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
[4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values
[5]: https://www.w3.org/International/charlint/

Heikki, others, thoughts?
--
Michael
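The pieces described in this message can be prototyped before writing any C. The sketch below is in Python and leans on the standard `unicodedata` module for NFKC rather than a generated table, so it only shows the expected behaviour, not the in-core implementation. All function names are hypothetical, and the table layout (code point, combining class, raw decomposition field) is an assumption based on the field list above, reading fields 0, 3 and 5 of each semicolon-separated UnicodeData.txt line.

```python
import bisect
import unicodedata


def is_valid_utf8(raw: bytes) -> bool:
    """Check a whole byte string, not just one character."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def normalize_password(raw: bytes) -> bytes:
    """Apply NFKC if the input is valid UTF-8, else copy it as-is."""
    if not is_valid_utf8(raw):
        return raw
    text = raw.decode("utf-8")
    return unicodedata.normalize("NFKC", text).encode("utf-8")


def build_table(lines):
    """Build a sorted table of (codepoint, combining_class, mapping)
    from lines in the UnicodeData.txt layout, keeping only entries
    that carry a decomposition mapping."""
    table = []
    for line in lines:
        fields = line.split(";")
        if fields[5]:
            table.append((int(fields[0], 16), int(fields[3]), fields[5]))
    table.sort()
    return table


def lookup(table, codepoint):
    """Binary search on the sorted table; one lookup per character."""
    i = bisect.bisect_left(table, (codepoint,))
    if i < len(table) and table[i][0] == codepoint:
        return table[i]
    return None
```

A generator script along these lines could emit the sorted entries as a C array, with the lookup then done by binary search on the backend side. Note that full SASLprep as specified in RFC 4013 covers more than NFKC, so this is a starting point for experiments only.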
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Robert Haas
Date:
On Sat, Dec 17, 2016 at 5:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sun, Dec 18, 2016 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Dec 16, 2016 at 5:30 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >>> From the discussions of last year on -hackers, it was decided to *not* >>> have an additional column per complains from a couple of hackers >>> (Robert you were in this set at this point), ... >> >> Hmm, I don't recall taking that position, but then there are a lot of >> things that I ought to recall and don't. (Ask my wife!) > > [... digging objects of the past ...] > From the past thread: > https://www.postgresql.org/message-id/CA+TgmoY790rphHBogXMbTG6MzSeNdoxdBXebEkAet9ZpZ8gvtw@mail.gmail.com > The complain is directed directly to multiple verifiers per users > though, not to have the type in a separate column. Yes, I rather like the separate column. But since Heikki is doing the work (or if he is) I'm not going to gripe too much. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/16/2016 05:48 PM, Robert Haas wrote:
> On Thu, Dec 15, 2016 at 8:40 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
>>> On 12/14/2016 04:57 PM, Stephen Frost wrote:
>>>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>>>> On 12/14/16 5:15 AM, Michael Paquier wrote:
>>>>>> I would be tempted to suggest adding the verifier type as a new column
>>>>>> of pg_authid
>>>>>
>>>>> Yes please.
>>>>
>>>> This discussion seems to continue to come up and I don't entirely
>>>> understand why we keep trying to shove more things into pg_authid, or
>>>> worse, into rolpassword.
>>>
>>> I understand the relational beauty of having a separate column for
>>> the verifier type, but I don't think it would be practical.
>>
>> I disagree.
>
> Me, too. I think the idea of moving everything into a separate table
> that allows multiple verifiers is probably not a good thing to do just
> right now, because that introduces a bunch of additional issues above
> and beyond what we need to do to get SCRAM implemented. There are
> administration and policy decisions to be made there that we should
> not conflate with SCRAM proper.
>
> However, Heikki's proposal seems to be that it's reasonable to force
> rolpassword to be of the form 'type:verifier' in all cases but not
> reasonable to have separate columns for type and verifier. Eh?

I fear we'll just have to agree to disagree here, but I'll try to explain myself one more time.

Even if you have a separate "verifier type" column, it's not fully normalized, because there's still a dependency between the verifier and verifier type columns. You will always need to look at the verifier type to make sense of the verifier itself.

It's more convenient to carry the type information with the verifier itself, in backend code, in pg_dump, etc. Sure, you could have a separate "transfer" text format that has the prefix, and strip it out when the datum enters the system. But it is even simpler to have only one format, with the prefix, and use that everywhere.

It might make sense to add a separate column, e.g. to make it easier to query for users that have an MD5 verifier. You could do "WHERE rolverifiertype = 'md5'", instead of "WHERE rolpassword LIKE 'md5%'". It's not a big difference, though. But even if we did that, I would still love to have the type information *also* included with the verifier itself, for convenience. And if we include it in the verifier itself, adding a separate type column seems more trouble than it's worth.

For comparison, imagine that we added a column to pg_authid for a picture of the user, stored as a bytea. The picture can be in JPEG or PNG format. Looking at the first few bytes of the image, you can tell which one it is. Would it make sense to add a separate "type" column, to tell what format the image is in? I think it would be more convenient and robust to rely on the first bytes of the image data instead.

- Heikki
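The image analogy relies on formats that identify themselves through their leading bytes. A minimal sketch in Python of that kind of sniffing, using the standard PNG signature and the usual JPEG start-of-image marker; the function name is hypothetical.

```python
# Leading bytes that identify each format.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def sniff_image_type(data: bytes) -> str:
    """Tell the image format from its first bytes, with no type column."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    return "unknown"
```

A "type:" prefix on a verifier plays the same role as these magic bytes: the datum can be interpreted without consulting anything stored next to it.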
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/16/2016 03:31 AM, Michael Paquier wrote: > On Thu, Dec 15, 2016 at 9:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> The only way to distinguish, is to know about every verifier kind there is, >> and check whether rolpassword looks valid as anything else than a plaintext >> password. And we already got tripped by a bug-of-omission on that once. If >> we add more verifier formats in the future, it's bound to happen again. >> Let's nip that source of bugs in the bud. Attached is a patch to implement >> what I have in mind. > > OK, I had a look at the patch proposed. > > - if (!pg_md5_encrypt(username, username, namelen, encrypted)) > - elog(ERROR, "password encryption failed"); > - if (strcmp(password, encrypted) == 0) > - ereport(ERROR, > - (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > - errmsg("password must not contain user name"))); > > This patch removes the only possible check for MD5 hashes that it has > never been done in passwordcheck. It may be fine to remove it, but I would > think that it is a good source of example regarding what could be done with > MD5 hashes, though limited. So it seems to me that this check should involve > as well pg_md5_encrypt on the username and compare if with the MD5 hash > given by the caller. Actually, it does still perform that check. There's a new function, plain_crypt_verify, that passwordcheck uses now. plain_crypt_verify() is intended to work with any future hash formats we might introduce in the future (including SCRAM), so that passwordcheck doesn't need to know about all the hash formats. > A simple ALTER USER role PASSWORD 'foo' causes a crash: Ah, fixed. > + case PASSWORD_TYPE_PLAINTEXT: > + shadow_pass = &shadow_pass[strlen("plain:")]; > + break; > It would be a good idea to have a generic routine able to get the plain > password value. In short I think that we should reduce the amount of > locations where "plain:" prefix is hardcoded. 
There is such a function included in the patch, get_plain_password(char *shadow_pass), actually. Contrib/passwordcheck uses it. I figured that in crypt.c itself, it's OK to do the above directly, but get_plain_password() is intended to be used elsewhere. Thanks for having a look! Attached is a new version, with that bug fixed. - Heikki
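The point about keeping the "plain:" prefix in one place can be illustrated with a small sketch. This is Python, not the patch: `get_plain_password` here is only a loose analogue of the C function named in the message, and `plain_password_offset` is a hypothetical helper that returns a position instead of a copy.

```python
# Single definition of the prefix; nothing else spells it out.
PLAIN_PREFIX = "plain:"


def plain_password_offset(shadow_pass: str) -> int:
    """Return the index where the plaintext starts, or -1 if the stored
    value is not a plaintext entry. No new string is built."""
    if shadow_pass.startswith(PLAIN_PREFIX):
        return len(PLAIN_PREFIX)
    return -1


def get_plain_password(shadow_pass: str) -> str:
    """Return the plaintext part, built on the offset helper above."""
    offset = plain_password_offset(shadow_pass)
    if offset < 0:
        raise ValueError("not a plaintext password entry")
    return shadow_pass[offset:]
```

Callers that only need to compare or hash the password can work from the offset, while callers that need their own copy go through the second function.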
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Robert Haas
Date:
On Tue, Dec 20, 2016 at 6:37 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > It's more convenient to carry the type information with the verifier itself, > in backend code, in pg_dump, etc. Sure, you could have a separate "transfer" > text format that has the prefix, and strip it out when the datum enters the > system. But it is even simpler to have only one format, with the prefix, and > use that everywhere. I see your point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
Heikki,

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> Even if you have a separate "verifier type" column, it's not fully
> normalized, because there's still a dependency between the verifier
> and verifier type columns. You will always need to look at the
> verifier type to make sense of the verifier itself.

That's true- but you don't need to look at the verifier, or even have *access* to the verifier, to look at the verifier type. That is actually very useful when you start thinking about the downstream side of this- what about the monitoring tool which will want to check and make sure there are only certain verifier types being used? It'll have to be a superuser, or have access to some superuser security definer function, and that really sucks. I'm not saying that we would necessarily want the verifier type to be publicly visible, but being able to see it without being a superuser would be good, imv.

> It's more convenient to carry the type information with the verifier
> itself, in backend code, in pg_dump, etc. Sure, you could have a
> separate "transfer" text format that has the prefix, and strip it
> out when the datum enters the system. But it is even simpler to have
> only one format, with the prefix, and use that everywhere.

It's more convenient when you need to look at both- it's not more convenient when you only wish to look at the verifier type. Further, it means that we have to have a construct that assumes things about the verifier type and verifier- what if a verifier type came along that used a colon? We'd have to do some special magic to handle that correctly, and that just sucks, and anyone who is writing code to generically deal with these fields will end up writing that same code (or forgetting to, and not handling the case correctly).

> It might make sense to add a separate column, to e.g. make it easier
> to e.g. query for users that have an MD5 verifier. You could do
> "WHERE rolverifiertype = 'md5'", instead of "WHERE rolpassword LIKE
> 'md5%'". It's not a big difference, though. But even if we did that,
> I would still love to have the type information *also* included with
> the verifier itself, for convenience. And if we include it in the
> verifier itself, adding a separate type column seems more trouble
> than it's worth.

I don't agree that it's "not a big difference." As I argue above- your approach also assumes that anyone who would like to investigate the verifier type should have access to the verifier itself, which I do not agree with. I also have a hard time buying the argument that it's really so much more convenient to have the verifier type included in the same string as the verifier that we should duplicate that information and then run the risk that we end up with the two not matching or that we won't ever run into complications down the road when our chosen separator causes us difficulties.

Thanks!

Stephen
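Both risks raised in this message, a separator appearing inside a type name and a type column drifting away from an embedded prefix, are easy to reproduce. A minimal sketch in Python; all function names are hypothetical, and "sasl:v2" is an invented type name used only to show the failure.

```python
def encode_prefixed(vtype: str, verifier: str) -> str:
    """Store type and verifier in one string, colon-separated."""
    return vtype + ":" + verifier


def decode_prefixed(stored: str):
    """Split on the first colon, as a generic reader would."""
    vtype, _, verifier = stored.partition(":")
    return vtype, verifier


def round_trips(vtype: str, verifier: str) -> bool:
    """True if encoding then decoding gives back the same pair."""
    return decode_prefixed(encode_prefixed(vtype, verifier)) == (vtype, verifier)


def columns_agree(type_column: str, rolpassword: str) -> bool:
    """Check a separate type column against the embedded prefix, the
    duplicated-information case that can end up not matching."""
    return decode_prefixed(rolpassword)[0] == type_column
```

A colon inside the verifier body survives the round trip, but a colon inside the type name does not, which is the "special magic" every generic consumer of the field would have to get right.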
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
David Fetter
Date:
On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote: > Heikki, > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote: > > Even if you have a separate "verifier type" column, it's not fully > > normalized, because there's still a dependency between the > > verifier and verifier type columns. You will always need to look > > at the verifier type to make sense of the verifier itself. > > That's true- but you don't need to look at the verifier, or even > have *access* to the verifier, to look at the verifier type. Would a view that shows only what's to the left of the first semicolon suit this purpose? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Wed, Dec 21, 2016 at 1:08 AM, David Fetter <david@fetter.org> wrote: > Would a view that shows only what's to the left of the first semicolon > suit this purpose? Of course it would, you would just need to make the routines now checking the shape of MD5 and SCRAM identifiers available at SQL level and feed the strings into them. Now I am not sure that it's worth having a new superuser view for that. pg_roles and pg_shadow hide the information about verifiers. -- Michael
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
David, * David Fetter (david@fetter.org) wrote: > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote: > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote: > > > Even if you have a separate "verifier type" column, it's not fully > > > normalized, because there's still a dependency between the > > > verifier and verifier type columns. You will always need to look > > > at the verifier type to make sense of the verifier itself. > > > > That's true- but you don't need to look at the verifier, or even > > have *access* to the verifier, to look at the verifier type. > > Would a view that shows only what's to the left of the first semicolon > suit this purpose? Obviously a (security barrier...) view or a (security definer) function could be used, but I don't believe either is actually a good idea. Thanks! Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
David Fetter
Date:
On Tue, Dec 20, 2016 at 06:14:40PM -0500, Stephen Frost wrote: > David, > > * David Fetter (david@fetter.org) wrote: > > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote: > > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote: > > > > Even if you have a separate "verifier type" column, it's not fully > > > > normalized, because there's still a dependency between the > > > > verifier and verifier type columns. You will always need to look > > > > at the verifier type to make sense of the verifier itself. > > > > > > That's true- but you don't need to look at the verifier, or even > > > have *access* to the verifier, to look at the verifier type. > > > > Would a view that shows only what's to the left of the first semicolon > > suit this purpose? > > Obviously a (security barrier...) view or a (security definer) function > could be used, but I don't believe either is actually a good idea. Would you be so kind as to help me understand what's wrong with that idea? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)
From
Stephen Frost
Date:
David, * David Fetter (david@fetter.org) wrote: > On Tue, Dec 20, 2016 at 06:14:40PM -0500, Stephen Frost wrote: > > * David Fetter (david@fetter.org) wrote: > > > On Tue, Dec 20, 2016 at 08:34:19AM -0500, Stephen Frost wrote: > > > > * Heikki Linnakangas (hlinnaka@iki.fi) wrote: > > > > > Even if you have a separate "verifier type" column, it's not fully > > > > > normalized, because there's still a dependency between the > > > > > verifier and verifier type columns. You will always need to look > > > > > at the verifier type to make sense of the verifier itself. > > > > > > > > That's true- but you don't need to look at the verifier, or even > > > > have *access* to the verifier, to look at the verifier type. > > > > > > Would a view that shows only what's to the left of the first semicolon > > > suit this purpose? > > > > Obviously a (security barrier...) view or a (security definer) function > > could be used, but I don't believe either is actually a good idea. > > Would you be so kind as to help me understand what's wrong with that idea? For starters, it doubles-down on the assumption that we'll always be happy with that particular separator and implies to anyone watching that they'll be able to trust it. Further, it's additional complication which, at least to my eyes, is entirely in the wrong direction. We could push everything in pg_authid into a single colon-separated text field and call it simpler because we don't have to deal with those silly column things, and we'd have something a lot closer to a unix passwd file too!, but it wouldn't make it a terribly smart thing to do. We aren't a bunch of individual C programs having to parse out things out of flat text files, after all. Thanks! Stephen
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Tue, Dec 20, 2016 at 9:23 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 12/16/2016 03:31 AM, Michael Paquier wrote:
> Actually, it does still perform that check. There's a new function,
> plain_crypt_verify, that passwordcheck uses now. plain_crypt_verify() is
> intended to work with any future hash formats we might introduce in the
> future (including SCRAM), so that passwordcheck doesn't need to know about
> all the hash formats.

Bah. I have misread the first version of the patch, and it is indeed keeping the username checks. Now that things don't crash, it behaves as expected:
=# load 'passwordcheck';
LOAD
=# alter role mpaquier password 'mpaquier';
ERROR: 22023: password must not contain user name
LOCATION: check_password, passwordcheck.c:101
=# alter role mpaquier password 'md58349d3a1bc8f4f7399b1ff9dea493b15';
ERROR: 22023: password must not contain user name
LOCATION: check_password, passwordcheck.c:82

With the patch:
>> + case PASSWORD_TYPE_PLAINTEXT:
>> + shadow_pass = &shadow_pass[strlen("plain:")];
>> + break;
>> It would be a good idea to have a generic routine able to get the plain
>> password value. In short I think that we should reduce the amount of
>> locations where "plain:" prefix is hardcoded.
>
> There is such a function included in the patch, get_plain_password(char
> *shadow_pass), actually. Contrib/passwordcheck uses it. I figured that in
> crypt.c itself, it's OK to do the above directly, but get_plain_password()
> is intended to be used elsewhere.

The idea would be to have the function not return an allocated string, just a position to it. That would be useful in plain_crypt_verify() for example, for a total of 4 places, including get_plain_password() where the new string allocation is done. Well, it's not like this prefix "plain:" would change anyway in the future nor that it is going to spread much.

> Thanks for having a look! Attached is a new version, with that bug fixed.
I have been able to do more advanced testing without the crash and things seem to work properly. The attached set of tests is also able to pass for all the combinations of hba configurations and password formats. And looking at the code I don't have more comments.
--
Michael
Attachment
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/14/2016 01:33 PM, Heikki Linnakangas wrote:
> I just noticed that the manual for CREATE ROLE says:
>
>> Note that older clients might lack support for the MD5 authentication
>> mechanism that is needed to work with passwords that are stored
>> encrypted.
>
> That is incorrect. The alternative to MD5 authentication is plain
> 'password' authentication, and that works just fine with MD5-hashed
> passwords. I think that sentence is a leftover from when we still
> supported "crypt" authentication (so I actually get to blame you for
> that ;-), commit 53a5026b). Back then, it was true that if an MD5 hash
> was stored in pg_authid, you couldn't do "crypt" authentication. That
> might have left old clients out in the cold.
>
> Now that we're getting SCRAM authentication, we'll need a similar notice
> there again, for the incompatibility of a SCRAM verifier with MD5
> authentication and vice versa.

I went ahead and removed the current bogus notice from the docs. We might need to put back something like it, with the SCRAM patch, but it needs to be rewritten anyway.

- Heikki
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 12/21/2016 04:09 AM, Michael Paquier wrote:
>> Thanks for having a look! Attached is a new version, with that bug fixed.
>
> I have been able to do more advanced testing without the crash and things
> seem to work properly. The attached set of tests is also able to pass
> for all the combinations of hba configurations and password formats.
> And looking at the code I don't have more comments.

Thanks!

Since not everyone agrees with this approach, I split this patch into two. The first patch refactors things, replacing the isMD5() function with get_password_type(), without changing the representation of pg_authid.rolpassword. That is hopefully uncontroversial. And the second patch adds the "plain:" prefix, which not everyone agrees on.

Barring objections I'm going to at least commit the first patch. I think we should commit the second one too, but it's not as critical, and the first patch matters more for the SCRAM patch, too.

- Heikki
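To illustrate the refactoring described above, here is a minimal Python sketch of a classifier that returns a password type instead of a yes/no MD5 answer. It is not the committed C code: the `PasswordType` enum, the `MD5_PASSWD_LEN` constant name and the hex-digit check are assumptions of this sketch, based on the usual stored MD5 form ("md5" followed by 32 hexadecimal digits).

```python
import enum
import string

# Hypothetical names for illustration; the real code lives in C.
class PasswordType(enum.Enum):
    PLAINTEXT = "plaintext"
    MD5 = "md5"

MD5_PASSWD_LEN = 35  # "md5" prefix plus 32 hexadecimal digits

def get_password_type(shadow_pass: str) -> PasswordType:
    """Classify a stored rolpassword value without changing its format."""
    if (len(shadow_pass) == MD5_PASSWD_LEN
            and shadow_pass.startswith("md5")
            and all(c in string.hexdigits for c in shadow_pass[3:])):
        return PasswordType.MD5
    # Anything else is treated as plaintext in this sketch.
    return PasswordType.PLAINTEXT
```

A caller can then switch on the returned type, which leaves room for adding a SCRAM member later without touching every call site that used to ask "is this MD5?".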
Attachment
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Tue, Jan 3, 2017 at 11:09 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Since not everyone agrees with this approach, I split this patch into two. > The first patch refactors things, replacing the isMD5() function with > get_password_type(), without changing the representation of > pg_authid.rolpassword. That is hopefully uncontroversial. And the second > patch adds the "plain:" prefix, which not everyone agrees on. > > Barring objections I'm going to at least commit the first patch. I think we > should commit the second one too, but it's not as critical, and the first > patch matters more for the SCRAM patch, too. The split does not look correct to me. 0001 has references to the prefix "plain:". -- Michael
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Peter Eisentraut
Date:
On 1/3/17 9:09 AM, Heikki Linnakangas wrote: > Since not everyone agrees with this approach, I split this patch into > two. The first patch refactors things, replacing the isMD5() function > with get_password_type(), without changing the representation of > pg_authid.rolpassword. That is hopefully uncontroversial. And the second > patch adds the "plain:" prefix, which not everyone agrees on. > > Barring objections I'm going to at least commit the first patch. I think > we should commit the second one too, but it's not as critical, and the > first patch matters more for the SCRAM patch, too. Is there currently anything to review here for the commit fest? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Michael Paquier
Date:
On Thu, Jan 5, 2017 at 10:31 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 1/3/17 9:09 AM, Heikki Linnakangas wrote: >> Since not everyone agrees with this approach, I split this patch into >> two. The first patch refactors things, replacing the isMD5() function >> with get_password_type(), without changing the representation of >> pg_authid.rolpassword. That is hopefully uncontroversial. And the second >> patch adds the "plain:" prefix, which not everyone agrees on. >> >> Barring objections I'm going to at least commit the first patch. I think >> we should commit the second one too, but it's not as critical, and the >> first patch matters more for the SCRAM patch, too. > > Is there currently anything to review here for the commit fest? The patches sent here make sense as part of the SCRAM set: https://www.postgresql.org/message-id/6831df67-7641-1a66-4985-268609a4821f@iki.fi I was just waiting for Heikki to fix the split of the patches before moving on with an extra lookup though. -- Michael
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Peter Eisentraut
Date:
On 1/3/17 9:09 AM, Heikki Linnakangas wrote:
> Since not everyone agrees with this approach, I split this patch into
> two. The first patch refactors things, replacing the isMD5() function
> with get_password_type(), without changing the representation of
> pg_authid.rolpassword. That is hopefully uncontroversial.

I have checked these patches.

The refactoring in the first patch seems sensible. As Michael pointed out, there is still a reference to "plain:" in the first patch.

The commit message needs to be updated, because the function plain_crypt_verify() was already added in a previous patch.

I'm not fond of this kind of coding

    password = encrypt_password(password_type, stmt->role, password);

where the 'password' variable has a different meaning before and after.

This error message might be a mistake:

    elog(ERROR, "unrecognized password type conversion");

I think some pieces from the second patch could be included in the first patch, e.g., the parts for passwordcheck.c and user.c.

> And the second
> patch adds the "plain:" prefix, which not everyone agrees on.

The code also gets a little bit dubious, as it introduces an "unknown" password type, which is sometimes treated as plaintext and sometimes as an error. I think this is going to be messy.

I would skip this patch for now at least. Too much controversy, and we don't know what the rest of the patches for this feature will look like to be able to know if it's worth it.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Nov 15, 2016 at 07:52:06AM +0900, Michael Paquier wrote:
> On Sat, Nov 5, 2016 at 9:36 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> > On Sat, Nov 5, 2016 at 12:58 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> > pg_hba.conf uses "scram" as keyword, but scram refers to a family of
> > authentication methods. There is as well SCRAM-SHA-1, SCRAM-SHA-256
> > (what this patch does). Hence wouldn't it make sense to use
> > scram_sha256 in pg_hba.conf instead? If for example in the future
> > there is a SHA-512 version of SCRAM we could switch easily to that and
> > define scram_sha512.
>
> OK, I have added more docs regarding the use of scram in pg_hba.conf,
> particularly in client-auth.sgml to describe how scram is better than
> md5 in terms of protection, and also completed the data of pg_hba.conf
> about the new keyword used in it.

The latest versions document this precisely, but I agree with Peter's concern about plain "scram". Suppose it's 2025 and PostgreSQL supports SASL mechanisms OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512. What should the pg_hba.conf options look like at that time? I don't think having a single "scram" option fits in such a world. I see two strategies that fit:

1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the mechanisms to offer.
2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.
On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch <noah@leadboat.com> wrote:
> The latest versions document this precisely, but I agree with Peter's concern
> about plain "scram". Suppose it's 2025 and PostgreSQL supports SASL mechanisms
> OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512. What
> should the pg_hba.conf options look like at that time? I don't think having a
> single "scram" option fits in such a world.

Sure.

> I see two strategies that fit:
>
> 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
> mechanisms to offer.
> 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.

Or we could have a sasl option, with a mandatory array of mechanisms to define one or more items, so method entries in pg_hba.conf would look like that:
sasl mechanism=scram_sha_256,scram_sha3_512
Users could then define different mechanisms in each hba line, depending on which user and database the line matches. I am not sure if many people would care about that though.
--
Michael
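As a rough illustration of the "sasl mechanism=..." idea floated above, the following Python sketch parses such a method entry into a list of mechanisms. The syntax is only a proposal in this thread, so the function name `parse_sasl_method` and the `KNOWN_MECHANISMS` set are hypothetical and do not describe any existing pg_hba.conf parser.

```python
from typing import List

# Hypothetical mechanism names, taken from the examples in the discussion.
KNOWN_MECHANISMS = {"scram_sha_256", "scram_sha3_512", "oauthbearer"}

def parse_sasl_method(entry: str) -> List[str]:
    """Parse 'sasl mechanism=a,b' and return the mechanisms in order."""
    method, _, option = entry.strip().partition(" ")
    if method != "sasl":
        raise ValueError("not a sasl method entry: %r" % entry)
    key, sep, value = option.strip().partition("=")
    # The proposal makes the mechanism list mandatory.
    if key != "mechanism" or not sep or not value:
        raise ValueError("sasl requires mechanism=<list>")
    mechanisms = value.split(",")
    unknown = [m for m in mechanisms if m not in KNOWN_MECHANISMS]
    if unknown:
        raise ValueError("unknown mechanism(s): %s" % ", ".join(unknown))
    return mechanisms
```

Keeping the list ordered would let a server offer mechanisms in the administrator's preferred order, which is one possible reading of the proposal rather than something the thread has settled.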
On Tue, Dec 20, 2016 at 10:47 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > And Heikki has mentioned me that he'd prefer not having an extra > dependency for the normalization, which is LGPL-licensed by the way. > So I have looked at the SASLprep business to see what should be done > to get a complete implementation in core, completely independent of > anything known. > > The first thing is to be able to understand in the SCRAM code if a > string is UTF-8 or not, and this code is in src/common/. pg_wchar.c > offers a set of routines exactly for this purpose, which is built with > libpq but that's not available for src/common/. So instead of moving > all the file, I'd like to create a new file in src/common/utf8.c which > includes pg_utf_mblen() and pg_utf8_islegal(). On top of that I think > that having a routine able to check a full string would be useful for > many users, as pg_utf8_islegal() can only check one set of characters. > If the password string is found to be of UTF-8 format, SASLprepare is > applied. If not, the string is copied as-is with perhaps unexpected > effects for the client But he's in trouble already if client is not > using UTF-8. > > Then comes the real business... Note that's my first time touching > encoding, particularly UTF-8 in depth, so please be nice. I may write > things that are incorrect or sound so from here :) > > The second thing is the normalization itself. Per RFC4013, NFKC needs > to be applied to the string. The operation is described in [1] > completely, and it is named as doing 1) a compatibility decomposition > of the bytes of the string, followed by 2) a canonical composition. > > About 1). The compatibility decomposition is defined in [2], "by > recursively applying the canonical and compatibility mappings, then > applying the canonical reordering algorithm". Canonical and > compatibility mapping are some data available in UnicodeData.txt, the > 6th column of the set defined in [3] to be precise. 
The meaning of the > decomposition mappings is defined in [2] as well. The canonical > decomposition is basically to look for a given UTF-8 character, and > then apply the multiple characters resulting in its new shape. The > compatibility mapping should as well be applied, but [5], a perl tool > called charlint.pl doing this normalization work, does not care about > this phase... Do we? > > About 2)... Once the decomposition has been applied, those bytes need > to be recomposed using the Canonical_Combining_Class field of > UnicodeData.txt in [3], which is the 4th column of the set. Its values > are defined in [4]. Another interesting thing, charlint.pl [5] does > not care about this phase. I am wondering if we should not > just drop this part as well... > > Once 1) and 2) are done, NFKC is complete, and so is SASLPrepare. > > So what we need from Postgres side is a mapping table having the > following fields: > 1) Hexadecimal sequence of UTF8 character. > 2) Its canonical combining class. > 3) The kind of decomposition mapping if defined. > 4) The decomposition mapping, in hexadecimal format. > Based on what I looked at, either perl or python could be used to > process UnicodeData.txt and to generate a header file that would be > included in the tree. There are 30k entries in UnicodeData.txt, 5k of > them have a mapping, so that will result in many tables. One thing to > improve performance would be to store the length of the table in a > static variable, order the entries by their hexadecimal keys and do a > dichotomy lookup to find an entry. We could as well use more fancy > things like a set of tables using a radix tree decomposed by > bytes. We should finish by just doing one lookup of the table for each > character anyway. > > In conclusion, at this point I am looking for feedback regarding the > following items: > 1) Where to put the UTF8 check routines and what to move. > 2) How to generate the mapping table using UnicodeData.txt. 
I'd think > that using perl would be better. > 3) The shape of the mapping table, which depends on how many > operations we want to support in the normalization of the strings. > The decisions for those items will drive the implementation in one > sense or another. > > [1]: http://www.unicode.org/reports/tr15/#Description_Norm > [2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings > [3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt > [4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values > [5]: https://www.w3.org/International/charlint/ > > Heikki, others, thoughts? FWIW, this patch is on a "waiting on author" state and that's right. As the discussion on SASLprepare() and the decisions regarding the way to implement it, or at least have it, are still pending, I am not planning to move on with any implementation until we have a plan about what to do. Just using libidn (LGPL) for a first shot is rather painless but... I am not alone here. -- Michael
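The flow described in the quoted message (check that the password is valid UTF-8, then apply a compatibility decomposition followed by a canonical composition, otherwise pass the bytes through untouched) can be sketched with Python's standard `unicodedata` module. This is only an illustration of the NFKC step under that assumption: the function name `prepare_password` is hypothetical, and the full SASLprep profile in RFC 4013 also maps and prohibits certain characters, which is not shown here.

```python
import unicodedata

def prepare_password(raw: bytes) -> bytes:
    """Apply the NFKC step to a UTF-8 password; copy other input as-is."""
    try:
        text = raw.decode("utf-8")  # strict decoding rejects invalid sequences
    except UnicodeDecodeError:
        return raw  # not UTF-8: left untouched, as discussed in the thread
    # Step 1: compatibility decomposition (includes canonical reordering).
    decomposed = unicodedata.normalize("NFKD", text)
    # Step 2: canonical composition of the decomposed string.
    composed = unicodedata.normalize("NFC", decomposed)
    return composed.encode("utf-8")
```

For example, the ligature U+FB01 comes out as the two letters "fi", and "e" followed by the combining acute accent U+0301 comes out as the single character U+00E9, so two passwords that look identical to the user normalize to the same bytes.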
On Wed, Jan 18, 2017 at 02:30:38PM +0900, Michael Paquier wrote: > On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch <noah@leadboat.com> wrote: > > The latest versions document this precisely, but I agree with Peter's concern > > about plain "scram". Suppose it's 2025 and PostgreSQL support SASL mechanisms > > OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512. What > > should the pg_hba.conf options look like at that time? I don't think having a > > single "scram" option fits in such a world. > > Sure. > > > I see two strategies that fit: > > > > 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the > > mechanisms to offer. > > 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc. > > Or we could have a sasl option, with a mandatory array of mechanisms > to define one or more items, so method entries in pg_hba.conf would > look llke that: > sasl mechanism=scram_sha_256,scram_sha3_512 I like that.
On 19 January 2017 at 06:32, Noah Misch <noah@leadboat.com> wrote:
> On Wed, Jan 18, 2017 at 02:30:38PM +0900, Michael Paquier wrote:
>> On Wed, Jan 18, 2017 at 2:23 PM, Noah Misch <noah@leadboat.com> wrote:
>> > The latest versions document this precisely, but I agree with Peter's concern
>> > about plain "scram". Suppose it's 2025 and PostgreSQL supports SASL mechanisms
>> > OAUTHBEARER, SCRAM-SHA-256, SCRAM-SHA-256-PLUS, and SCRAM-SHA3-512. What
>> > should the pg_hba.conf options look like at that time? I don't think having a
>> > single "scram" option fits in such a world.
>>
>> Sure.
>>
>> > I see two strategies that fit:
>> >
>> > 1. Single "sasl" option, with a GUC, similar to ssl_ciphers, controlling the
>> > mechanisms to offer.
>> > 2. Separate options "scram_sha_256", "scram_sha3_512", "oauthbearer", etc.
>>
>> Or we could have a sasl option, with a mandatory array of mechanisms
>> to define one or more items, so method entries in pg_hba.conf would
>> look like that:
>> sasl mechanism=scram_sha_256,scram_sha3_512
>
> I like that.

Michael, I support your good work on this patch and it's certainly shaping up. Noah's general point is that we need to have a general, futureproof design for the UI and I agree.

We seem to be caught between adding lots of new things as parameters and adding new detail into pg_hba.conf.

Parameters like password_encryption are difficult here because they essentially repeat what has already been said in the pg_hba.conf. If we have two entries in pg_hba.conf, one saying md5 and the other saying "scram" (or whatever), what would we set password_encryption to? It seems clear to me that if the pg_hba.conf says md5 then password_encryption should be md5 and if pg_hba.conf says scram then it should be scram.

I'd like to float another idea, as a way of finding a way forwards that will last over time

* pg_hba.conf entry would say sasl='methodX' (no spaces)
* we have a new catalog called pg_sasl that allows us to add new methods, with appropriate function calls
* remove password_encryption parameter and always use default encryption as specified for that session in pg_hba.conf

Which sounds nice, but many users will wish to upgrade their current mechanisms from using md5 to scram. How will we update passwords slowly, so that different users change from md5 to scram at different times? Having to specify the mechanism in the pg_hba.conf makes that almost impossible, forcing a big bang approach which subsequently may never happen.

As a way of solving that problem, another idea would be to make the mechanism session specific depending upon what is stored for a particular user. That allows us to have a single pg_hba.conf entry of "sasl", and then use md5, scram-256 or future-mechanism on a per user basis.

I'm not sure I see a clear way forwards yet, these are just ideas and questions to help the discussion.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jan 19, 2017 at 6:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> We seem to be caught between adding lots of new things as parameters
> and adding new detail into pg_hba.conf.
>
> Parameters like password_encryption are difficult here because they
> essentially repeat what has already been said in the pg_hba.conf. If
> we have two entries in pg_hba.conf, one saying md5 and the other
> saying "scram" (or whatever), what would we set password_encryption
> to? It seems clear to me that if the pg_hba.conf says md5 then
> password_encryption should be md5 and if pg_hba.conf says scram then
> it should be scram.
>
> I'd like to float another idea, as a way of finding a way forwards
> that will last over time
>
> * pg_hba.conf entry would say sasl='methodX' (no spaces)
> * we have a new catalog called pg_sasl that allows us to add new
> methods, with appropriate function calls

This would make sense if we support a mountain of protocols and want to have a handler with a set of APIs used for authentication. This is a grade higher than simple SCRAM, and this basically requires designing a set of generic routines that are fine for covering *any* protocol with this handler. I'd think this is rather hard given the slight differences in SASL exchanges for different protocols.

> * remove password_encryption parameter and always use default
> encryption as specified for that session in pg_hba.conf

So if user X creates user Y with a password (defined by CREATE USER PASSWORD) it should by default follow what pg_hba.conf dictates, which could be pam or gss? That does not look very intuitive to me. The advantage with the current system is that password creation and protocol allowed for an authentication are two separate, independent things, password_encryption being basically a wrapper for CREATE USER. Mixing both makes things more confusing.

If you are willing to move away from password_encryption, one thing that could be used is just to extend CREATE USER to be able to enforce the password protocol associated, that's what the patches on this thread do with PASSWORD (val USING protocol).

> Which sounds nice, but many users will wish to upgrade their current
> mechanisms from using md5 to scram. How will we update passwords
> slowly, so that different users change from md5 to scram at different
> times? Having to specify the mechanism in the pg_hba.conf makes that
> almost impossible, forcing a big bang approach which subsequently may
> never happen.

At this point comes the possibility to define multiple password types for one single user instead of rolling multiple roles and renaming them.

> As a way of solving that problem, another idea would be to make the
> mechanism session specific depending upon what is stored for a
> particular user. That allows us to have a single pg_hba.conf entry of
> "sasl", and then use md5, scram-256 or future-mechanism on a per user
> basis.

Isn't that specifying multiple users in a single sasl entry in pg_hba.conf? Once a user is updated, you could just move him from one line to the other of pg_hba.conf, or use a @file in the hba entry.

> I'm not sure I see a clear way forwards yet, these are just ideas and
> questions to help the discussion.

Thanks, I find the catalog idea interesting. That's hard though, given the potential range of SASL protocols that likely have different needs in the way messages are exchanged.
--
Michael
On Wed, Jan 18, 2017 at 2:46 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > FWIW, this patch is on a "waiting on author" state and that's right. > As the discussion on SASLprepare() and the decisions regarding the way > to implement it, or at least have it, are still pending, I am not > planning to move on with any implementation until we have a plan about > what to do. Just using libidn (LGPL) for a first shot is rather > painless but... I am not alone here. With decisions on this matter pending, I am marking this patch as "returned with feedback". If there is a consensus on what to do, I'll be happy to do the implementation with the last CF in March in sight. If no, that would mean that this feature will not be part of PG 10. -- Michael
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 01/17/2017 11:51 PM, Peter Eisentraut wrote: > On 1/3/17 9:09 AM, Heikki Linnakangas wrote: >> Since not everyone agrees with this approach, I split this patch into >> two. The first patch refactors things, replacing the isMD5() function >> with get_password_type(), without changing the representation of >> pg_authid.rolpassword. That is hopefully uncontroversial. > > I have checked these patches. > > The refactoring in the first patch seems sensible. As Michael pointed > out, there is still a reference to "plain:" in the first patch. Fixed. > The commit message needs to be updated, because the function > plain_crypt_verify() was already added in a previous patch. Fixed. > I'm not fond of this kind of coding > > password = encrypt_password(password_type, stmt->role, password); > > where the 'password' variable has a different meaning before and after. Added a new local variable to avoid the confusion. > This error message might be a mistake: > > elog(ERROR, "unrecognized password type conversion"); I rephrased the error as "cannot encrypt password to requested type", and added a comment explaining that it cannot happen. I hope that helped, I'm not sure why you thought it might've been a mistake. > I think some pieces from the second patch could be included in the first > patch, e.g., the parts for passwordcheck.c and user.c. I refrained from doing that for now. It would've changed the passwordcheck hook API in an incompatible way. Breaking the API explicitly would be a good thing, if we added the "plain:" prefix, because modules would need to deal with the prefix anyway. But until we do that, better to not break the API for no good reason. >> And the second >> patch adds the "plain:" prefix, which not everyone agrees on. > > The code also gets a little bit dubious, as it introduces an "unknown" > password type, which is sometimes treated as plaintext and sometimes as > an error. I think this is going be messy. > > I would skip this patch for now at least. 
Too much controversy, and we > don't know how the rest of the patches for this feature will look like > to be able to know if it's worth it. Ok, I'll drop the second patch for now. I committed the first patch after fixing the things you and Michael pointed out. Thanks for the review! - Heikki
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
David Rowley
Date:
On 2 February 2017 at 00:13, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Ok, I'll drop the second patch for now. I committed the first patch after
> fixing the things you and Michael pointed out. Thanks for the review!

dbd69118 caused a small compiler warning for me.

The attached fixes it.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
From
Heikki Linnakangas
Date:
On 02/02/2017 05:50 AM, David Rowley wrote: > On 2 February 2017 at 00:13, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Ok, I'll drop the second patch for now. I committed the first patch after >> fixing the things you and Michael pointed out. Thanks for the review! > > dbd69118 caused small compiler warning for me. > > The attached fixed it. Fixed, thanks! - Heikki
On 12/20/2016 03:47 AM, Michael Paquier wrote: > The first thing is to be able to understand in the SCRAM code if a > string is UTF-8 or not, and this code is in src/common/. pg_wchar.c > offers a set of routines exactly for this purpose, which is built with > libpq but that's not available for src/common/. So instead of moving > all the file, I'd like to create a new file in src/common/utf8.c which > includes pg_utf_mblen() and pg_utf8_islegal(). Sounds reasonable. They're short functions, might also be ok to just copy-paste them to scram-common.c. > On top of that I think that having a routine able to check a full > string would be useful for many users, as pg_utf8_islegal() can only > check one set of characters. If the password string is found to be of > UTF-8 format, SASLprepare is applied. If not, the string is copied > as-is with perhaps unexpected effects for the client But he's in > trouble already if client is not using UTF-8. Yeah. > The second thing is the normalization itself. Per RFC4013, NFKC needs > to be applied to the string. The operation is described in [1] > completely, and it is named as doing 1) a compatibility decomposition > of the bytes of the string, followed by 2) a canonical composition. > > About 1). The compatibility decomposition is defined in [2], "by > recursively applying the canonical and compatibility mappings, then > applying the canonical reordering algorithm". Canonical and > compatibility mapping are some data available in UnicodeData.txt, the > 6th column of the set defined in [3] to be precise. The meaning of the > decomposition mappings is defined in [2] as well. The canonical > decomposition is basically to look for a given UTF-8 character, and > then apply the multiple characters resulting in its new shape. The > compatibility mapping should as well be applied, but [5], a perl tool > called charlint.pl doing this normalization work, does not care about > this phase... Do we? Not sure. 
We need to do whatever the "right thing" is, according to the RFC. I would assume that the spec is not ambiguous on this, but I haven't looked into the details. If it's ambiguous, then I think we need to look at some popular implementations to see what they do.

> About 2)... Once the decomposition has been applied, those bytes need
> to be recomposed using the Canonical_Combining_Class field of
> UnicodeData.txt in [3], which is the 4th column of the set. Its values
> are defined in [4]. Another interesting thing, charlint.pl [5] does
> not care about this phase. I am wondering if we should not
> just drop this part as well...
>
> Once 1) and 2) are done, NFKC is complete, and so is SASLPrepare.

Ok.

> So what we need from Postgres side is a mapping table having the
> following fields:
> 1) Hexadecimal sequence of UTF8 character.
> 2) Its canonical combining class.
> 3) The kind of decomposition mapping if defined.
> 4) The decomposition mapping, in hexadecimal format.
> Based on what I looked at, either perl or python could be used to
> process UnicodeData.txt and to generate a header file that would be
> included in the tree. There are 30k entries in UnicodeData.txt, 5k of
> them have a mapping, so that will result in many tables. One thing to
> improve performance would be to store the length of the table in a
> static variable, order the entries by their hexadecimal keys and do a
> dichotomy lookup to find an entry. We could as well use more fancy
> things like a set of tables using a radix tree decomposed by
> bytes. We should finish by just doing one lookup of the table for each
> character anyway.

Ok. I'm not too worried about the performance of this. It's only used for passwords, which are not that long, and it's only done when connecting. I'm more worried about the disk/memory usage. How small can we pack the tables? 10kB? 100kB? Even a few MB would probably not be too bad in practice, but I'd hate to bloat up libpq just for this.
> In conclusion, at this point I am looking for feedback regarding the > following items: > 1) Where to put the UTF8 check routines and what to move. Covered that above. > 2) How to generate the mapping table using UnicodeData.txt. I'd think > that using perl would be better. Agreed, it needs to be in Perl. That's what we require to be present when building PostgreSQL, it's what we use for generating other tables and functions. > 3) The shape of the mapping table, which depends on how many > operations we want to support in the normalization of the strings. > The decisions for those items will drive the implementation in one > sense or another. Let's aim for small disk/memory footprint. - Heikki > [1]: http://www.unicode.org/reports/tr15/#Description_Norm > [2]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Character_Decomposition_Mappings > [3]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt > [4]: http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Canonical_Combining_Class_Values > [5]: https://www.w3.org/International/charlint/
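For the first item, a routine that checks a full string rather than a single character could look roughly like the Python sketch below. This is an illustration only: the function names are hypothetical, and the per-character rules are written from the general UTF-8 definition (sequence length from the lead byte, continuation bytes, rejection of overlong forms, surrogates and values above U+10FFFF), not copied from pg_utf_mblen() or pg_utf8_islegal().

```python
def utf8_mblen(first_byte):
    """Length of the sequence announced by a lead byte (1 to 4).
    Invalid lead bytes report 1 and are rejected by the legality check."""
    if first_byte & 0x80 == 0x00:
        return 1
    if first_byte & 0xE0 == 0xC0:
        return 2
    if first_byte & 0xF0 == 0xE0:
        return 3
    if first_byte & 0xF8 == 0xF0:
        return 4
    return 1


def utf8_char_is_legal(seq):
    """Check one complete UTF-8 sequence (a single character)."""
    n = len(seq)
    if n == 0 or n > 4:
        return False
    first = seq[0]
    if n == 1:
        return first < 0x80
    if first < 0xC2 or first > 0xF4:
        return False  # stray continuation byte, overlong 2-byte form, or too large
    if n != utf8_mblen(first):
        return False
    if any(b & 0xC0 != 0x80 for b in seq[1:]):
        return False  # every trailing byte must be 10xxxxxx
    second = seq[1]
    if first == 0xE0 and second < 0xA0:
        return False  # overlong 3-byte form
    if first == 0xED and second > 0x9F:
        return False  # UTF-16 surrogate range
    if first == 0xF0 and second < 0x90:
        return False  # overlong 4-byte form
    if first == 0xF4 and second > 0x8F:
        return False  # above U+10FFFF
    return True


def utf8_string_is_legal(data):
    """Walk a whole byte string, one character at a time."""
    i = 0
    while i < len(data):
        n = utf8_mblen(data[i])
        chunk = data[i:i + n]
        if len(chunk) < n or not utf8_char_is_legal(chunk):
            return False
        i += n
    return True
```

A caller would apply the normalization only when `utf8_string_is_legal(password_bytes)` returns true and pass the bytes through unchanged otherwise, which matches the fallback behaviour described earlier in the thread.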
On Fri, Feb 3, 2017 at 9:52 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On 12/20/2016 03:47 AM, Michael Paquier wrote: >> >> The first thing is to be able to understand in the SCRAM code if a >> string is UTF-8 or not, and this code is in src/common/. pg_wchar.c >> offers a set of routines exactly for this purpose, which is built with >> libpq but that's not available for src/common/. So instead of moving >> all the file, I'd like to create a new file in src/common/utf8.c which >> includes pg_utf_mblen() and pg_utf8_islegal(). > > Sounds reasonable. They're short functions, might also be ok to just > copy-paste them to scram-common.c. Having a separate file makes the most sense to me, I think; if we can avoid code duplication, that's better. >> The second thing is the normalization itself. Per RFC4013, NFKC needs >> to be applied to the string. The operation is described in [1] >> completely, and it is named as doing 1) a compatibility decomposition >> of the bytes of the string, followed by 2) a canonical composition. >> >> About 1). The compatibility decomposition is defined in [2], "by >> recursively applying the canonical and compatibility mappings, then >> applying the canonical reordering algorithm". Canonical and >> compatibility mapping are some data available in UnicodeData.txt, the >> 6th column of the set defined in [3] to be precise. The meaning of the >> decomposition mappings is defined in [2] as well. The canonical >> decomposition is basically to look for a given UTF-8 character, and >> then apply the multiple characters resulting in its new shape. The >> compatibility mapping should as well be applied, but [5], a perl tool >> called charlint.pl doing this normalization work, does not care about > > Not sure. We need to do whatever the "right thing" is, according to the RFC. > I would assume that the spec is not ambiguous on this, but I haven't looked > into the details.
If it's ambiguous, then I think we need to look at some > popular implementations to see what they do. The spec defines quite correctly what should be done. The implementations are sometimes quite loose on some points though (see charlint.pl). >> So what we need from Postgres side is a mapping table, having the >> following fields: >> 1) Hexa sequence of UTF8 character. >> 2) Its canonical combining class. >> 3) The kind of decomposition mapping if defined. >> 4) The decomposition mapping, in hexadecimal format. >> Based on what I looked at, either perl or python could be used to >> process UnicodeData.txt and to generate a header file that would be >> included in the tree. There are 30k entries in UnicodeData.txt, 5k of >> them have a mapping, so that will result in many tables. One thing to >> improve performance would be to store the length of the table in a >> static variable, order the entries by their hexadecimal keys and do a >> dichotomy lookup to find an entry. We could as well use more fancy >> things like a set of tables using a Radix tree using decomposed by >> bytes. We should finish by just doing one lookup of the table for each >> character sets anyway. > > Ok. I'm not too worried about the performance of this. It's only used for > passwords, which are not that long, and it's only done when connecting. I'm > more worried about the disk/memory usage. How small can we pack the tables? > 10kB? 100kB? Even a few MB would probably not be too bad in practice, but > I'd hate to bloat up libpq just for this. Indeed. I think I'll first develop a small utility able to do this operation. There is likely some knowledge in mb/Unicode that we can use here. The radix tree patch would perhaps help? >> 3) The shape of the mapping table, which depends on how many >> operations we want to support in the normalization of the strings. >> The decisions for those items will drive the implementation in one >> sense or another. > > Let's aim for small disk/memory footprint.
OK, I'll try to give it a shot in a couple of days in the shape of an extension or something like that. Thanks for the feedback. -- Michael
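On the question raised in this exchange of whether either phase could be skipped, the effect of each one is easy to observe with Python's standard `unicodedata` module. The sketch below is only a demonstration with hypothetical sample strings; it says nothing about how charlint.pl behaves. It applies NFKC as the two steps described in the thread (compatibility decomposition, then canonical composition) and compares the result with what happens when one of the steps is left out.

```python
import unicodedata


def nfkc_in_two_steps(s):
    """NFKC expressed as the two phases discussed in the thread."""
    # Phase 1: compatibility decomposition (includes canonical reordering).
    decomposed = unicodedata.normalize("NFKD", s)
    # Phase 2: canonical composition. The input is already fully decomposed,
    # so NFC only has the composition left to do.
    return unicodedata.normalize("NFC", decomposed)


def without_compat_mapping(s):
    """Canonical mappings only: this is NFC, not NFKC."""
    return unicodedata.normalize("NFC", s)


def without_composition(s):
    """Decomposition only: this is NFKD, not NFKC."""
    return unicodedata.normalize("NFKD", s)


SAMPLES = [
    "\ufb01",   # LATIN SMALL LIGATURE FI, has a compatibility mapping
    "\u2168",   # ROMAN NUMERAL NINE, has a compatibility mapping
    "e\u0301",  # e followed by COMBINING ACUTE ACCENT
    "\u00e9",   # precomposed e with acute
]

if __name__ == "__main__":
    for s in SAMPLES:
        print(
            ascii(s),
            ascii(nfkc_in_two_steps(s)),
            ascii(without_compat_mapping(s)),
            ascii(without_composition(s)),
        )
```

Two passwords that a user would consider identical, such as the precomposed and decomposed forms of "é", only compare equal byte for byte when both phases are applied, which is an argument for following the specification in full.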