Thread: Add SQL function for SHA1
Hi all, SHA-1 is now an option available for cryptohashes, and like the existing set of functions of SHA-2, I don't really see a reason why we should not have a SQL function for SHA1. Attached is a patch doing that. The same code pattern was repeated 4 times on HEAD for the SHA-2 functions for the bytea -> bytea hashing, so I have refactored the whole thing while integrating the new function, shaving some code from cryptohashfuncs.c. Thoughts? -- Michael
Attachment
On Mon, Jan 25, 2021 at 10:12:28PM +0900, Michael Paquier wrote: > SHA-1 is now an option available for cryptohashes, and like the > existing set of functions of SHA-2, I don't really see a reason why we > should not have a SQL function for SHA1. NIST deprecated SHA1 over ten years ago. It's too late to be adding this.
+1 to adding a SHA1 SQL function. Even if it's deprecated, there's plenty of historical usage that I can see it being useful.
Either way, the rest of the refactor can be improved a bit to perform a single palloc() and remove the memcpy(). Attached is a diff for cryptohashfuncs.c that does that by writing the digest final directly to the result. It also removes the digest length arg and determines it in the switch block. There's only one correct digest length for each type so there's no reason to give callers the option to give the wrong one.
Attachment
On Mon, Jan 25, 2021 at 10:42:25PM -0500, Sehrope Sarkuni wrote: > +1 to adding a SHA1 SQL function. Even if it's deprecated, there's plenty > of historical usage that I can see it being useful. Let's wait for more opinions to see if we agree that this addition is helpful or not. Even if this is not added, I think that there is still value in refactoring the code anyway for the SHA-2 functions. > Either way, the rest of the refactor can be improved a bit to perform a > single palloc() and remove the memcpy(). Attached is a diff for > cryptohashfuncs.c that does that by writing the digest final directly to > the result. It also removes the digest length arg and determines it in the > switch block. There's only one correct digest length for each type so > there's no reason to give callers the option to give the wrong one. Yeah, what you have here is better. + default: + elog(ERROR, "unsupported digest type %d", type); Not using a default clause is the purpose here, as it would generate a compilation warning if a value in the enum is forgotten. Hence, if a new option is added to pg_cryptohash_type in the future, people won't miss that they could add a SQL function for the new option. If we decide that MD5 and SHA1 have no need to use this code path, I'd rather just use elog(ERROR) instead. -- Michael
Attachment
On Tue, Jan 26, 2021 at 01:06:29PM +0900, Michael Paquier wrote: > On Mon, Jan 25, 2021 at 10:42:25PM -0500, Sehrope Sarkuni wrote: > > +1 to adding a SHA1 SQL function. Even if it's deprecated, there's plenty > > of historical usage that I can see it being useful. > > Let's wait for more opinions to see if we agree that this addition is > helpful or not. Even if this is not added, I think that there is > still value in refactoring the code anyway for the SHA-2 functions. > +1 I know that it has been deprecated, but it can be very useful when working with data from pre-deprecation. :) It is annoying to have to resort to plperl or plpython because it is not available. The lack or orthogonality is painful. Regards, Ken
On Mon, Jan 25, 2021 at 10:23:30PM -0600, Kenneth Marshall wrote: > On Tue, Jan 26, 2021 at 01:06:29PM +0900, Michael Paquier wrote: > > On Mon, Jan 25, 2021 at 10:42:25PM -0500, Sehrope Sarkuni wrote: > > > +1 to adding a SHA1 SQL function. Even if it's deprecated, there's plenty > > > of historical usage that I can see it being useful. > > > > Let's wait for more opinions to see if we agree that this addition is > > helpful or not. Even if this is not added, I think that there is > > still value in refactoring the code anyway for the SHA-2 functions. > > > > +1 I know that it has been deprecated, but it can be very useful when > working with data from pre-deprecation. :) It is annoying to have to > resort to plperl or plpython because it is not available. The lack or > orthogonality is painful. Yes, I think having SHA1 makes sense --- there are probably still valid uses for it. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
On Mon, Jan 25, 2021 at 10:12:28PM +0900, Michael Paquier wrote: > Hi all, > > SHA-1 is now an option available for cryptohashes, and like the > existing set of functions of SHA-2, I don't really see a reason why we > should not have a SQL function for SHA1. Attached is a patch doing > that. Thanks for doing this! While there are applications SHA1 is no longer good for, there are plenty where it's still in play. One I use frequently is git. While there are plans for creating an upgrade path to more cryptographically secure hashes, it will take some years before repositories have converted over. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Jan 25, 2021 at 11:27:28PM -0500, Bruce Momjian wrote: > On Mon, Jan 25, 2021 at 10:23:30PM -0600, Kenneth Marshall wrote: >> +1 I know that it has been deprecated, but it can be very useful when >> working with data from pre-deprecation. :) It is annoying to have to >> resort to plperl or plpython because it is not available. The lack or >> orthogonality is painful. plperl and plpython can be annoying to require if you have strong security requirements as these are untrusted languages, but I don't completely agree with this argument because pgcrypto gives the option to use SHA1 with digest(), and this one is fine to have even in environments under STIG or equally-constrained environments. > Yes, I think having SHA1 makes sense --- there are probably still valid > uses for it. Consistency with the existing in-core SQL functions for cryptohashes and the possibility to not need pgcrypto are my only arguments at hand. ;) -- Michael
Attachment
> On 26 Jan 2021, at 04:28, Noah Misch <noah@leadboat.com> wrote: > > On Mon, Jan 25, 2021 at 10:12:28PM +0900, Michael Paquier wrote: >> SHA-1 is now an option available for cryptohashes, and like the >> existing set of functions of SHA-2, I don't really see a reason why we >> should not have a SQL function for SHA1. > > NIST deprecated SHA1 over ten years ago. It's too late to be adding this. Agreed, and pgcrypto already allows for using sha1. It seems like any legitimate need for sha1 could be better served by an extension rather than supplying it in-core. -- Daniel Gustafsson https://vmware.com/
On Tue, Jan 26, 2021 at 10:38:43AM +0100, Daniel Gustafsson wrote: > Agreed, and pgcrypto already allows for using sha1. > > It seems like any legitimate need for sha1 could be better served by an > extension rather than supplying it in-core. Both of you telling the same thing is enough for me to discard this new stuff. I'd like to refactor the code anyway as that's a nice cleanup, and this would have the advantage to make people look at cryptohashfuncs.c if introducing a new type. After sleeping about it, I think that I would just make MD5 and SHA1 issue an elog(ERROR) if the internal routine is taken in those cases, like in the attached. If there are any comments or objections to the refactoring piece, please let me know. -- Michael
Attachment
On Tue, Jan 26, 2021 at 8:53 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jan 26, 2021 at 10:38:43AM +0100, Daniel Gustafsson wrote:
> Agreed, and pgcrypto already allows for using sha1.
>
> It seems like any legitimate need for sha1 could be better served by an
> extension rather than supplying it in-core.
Both of you telling the same thing is enough for me to discard this
new stuff. I'd like to refactor the code anyway as that's a nice
cleanup, and this would have the advantage to make people look at
cryptohashfuncs.c if introducing a new type. After sleeping about it,
I think that I would just make MD5 and SHA1 issue an elog(ERROR) if
the internal routine is taken in those cases, like in the attached.
Thanks for the enum explanation too.
On Tue, Jan 26, 2021 at 09:53:52PM -0500, Sehrope Sarkuni wrote: > The refactor patch looks good. It builds and passes make check. Thanks for double-checking! The refactoring has been just done as of f854c69. -- Michael