Re: [GENERAL] SHA1 on postgres 8.3 - Mailing list pgsql-hackers

From Mark Mielke
Subject Re: [GENERAL] SHA1 on postgres 8.3
Date
Msg-id 47F4F895.5010106@mark.mielke.cc
In response to Re: [GENERAL] SHA1 on postgres 8.3  ("Greg Sabino Mullane" <greg@turnstep.com>)
Responses Re: [GENERAL] SHA1 on postgres 8.3  (Svenne Krap <svenne@krap.dk>)
List pgsql-hackers
Greg Sabino Mullane wrote:
> 4) We're also encouraging the use of md5() by making it the only option.
> Yes, we can talk about why people *shouldn't* use it for this purpose
> or that, but they will.
>   

There is always the Java route - internal classes have package-scope 
constructors to specifically prevent them from being accidentally used 
(and relied on).

I prefer the "let them use it, but warn them not to have expectations" 
route, which is what PostgreSQL is doing today. The above is not a 
legitimate reason to provide additional functions in the core.

> 5) It seems unwise to go through the trouble of just adding sha1(), when
> we could easily add some better hashes, which has the nice side effect
> of making us stand out more and push the envelope, rather than play follow
> the leader, as was mentioned at PGCon East

This presumes that better hashes truly exist. It is basic math (the
pigeonhole principle) to show that any hash with a fixed-size output
will have collisions. Ignoring the possibility that one hash has a
theoretically better distribution for real documents, the real
"benefit" of SHA-1 over MD5 is that it has more bits (160 versus 128).
The "ultimate" solution here is to store the original itself, the
"full copy" hash technique, with 0 chance of collision. This extreme
defeats the purpose of a hash to start with.
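
To make the collision argument concrete, here is a rough sketch in Python
(not PostgreSQL code). It prints nothing; it only computes the two digest
widths and then finds a collision by brute force on a hash cut down to 16
bits. The helper names truncated_md5 and find_collision are made up for
this illustration, and truncating MD5 is only an assumption used to keep
the search small. The same counting argument applies to the full 128-bit
and 160-bit outputs, just with a far larger search.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, nbytes: int = 2) -> bytes:
    """Return the first nbytes of the MD5 digest (a deliberately tiny hash)."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Hash b"0", b"1", b"2", ... until two inputs share a truncated digest.

    With nbytes=2 there are only 65536 possible outputs, so by the
    pigeonhole principle a collision must appear within 65537 inputs.
    """
    seen = {}  # truncated digest -> first input that produced it
    for i in count():
        msg = str(i).encode()
        h = truncated_md5(msg, nbytes)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg


# Digest widths in bits: this is the "more bits" difference.
md5_bits = hashlib.md5(b"").digest_size * 8
sha1_bits = hashlib.sha1(b"").digest_size * 8

# Two distinct inputs with the same (truncated) hash value.
first, second, digest = find_collision()
```

More bits only make the search longer; they do not make collisions
impossible.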

Why does PostgreSQL need something better than md5 as part of core? 
Bragging rights?

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>


