Hash Function: MD5 or other? - Mailing list pgsql-general

From Peter Fein
Subject Hash Function: MD5 or other?
Msg-id 20050613174959.6fb1df80@layout.pfein.org
List pgsql-general
Hi-

I wanted to use a partial unique index (conditional on a flag) on a TEXT
column, but the index row size exceeded the btree maximum.  See the thread
"index row size 2728 exceeds btree maximum, 2713" from the beginning of
this month for someone with a similar problem.  In that thread, someone
suggests indexing on a hash of the text instead.  I'm fine with this, as
the texts in question are similar enough to each other to make collisions
unlikely, and a collision won't really cause any serious problems.
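
To make the idea concrete, here is a rough sketch in Python rather than a tested recipe.  It only shows that an MD5 hex digest stays at 32 characters however long the text is, which is far below the 2713-byte limit from the error message.  The table name "docs", the columns "body" and "is_current", and the index name are made up for illustration, and the assumption that hashlib matches the server's md5() holds only if the database encoding is UTF-8.

```python
import hashlib

# Limit quoted in the error message from the earlier thread.
BTREE_MAX_ROW = 2713  # bytes


def md5_hex(text: str) -> str:
    """Return the hex MD5 of the UTF-8 bytes of `text`.

    Assumption: this matches PostgreSQL's md5(text) when the
    database encoding is UTF-8.
    """
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical names: docs, body, is_current, docs_body_md5_key.
# A partial unique index over the hash instead of the raw text.
INDEX_SQL = (
    "CREATE UNIQUE INDEX docs_body_md5_key "
    "ON docs (md5(body)) WHERE is_current"
)

if __name__ == "__main__":
    for size in (1024, 8192):
        key = md5_hex("x" * size)
        # The indexed value has a fixed length, independent of the text size.
        print(size, len(key), len(key) < BTREE_MAX_ROW)
    print(INDEX_SQL)
```

The uniqueness check then applies to the digest, so two different texts with the same MD5 would be rejected as duplicates.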

My question is: is the built-in md5() appropriate for this use, or should
I be using a function written in pl/something?  Figures on collision
rates would be nice as well; the typical chunk of text is probably 1k-8k.
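
For a ballpark on collision rates, the usual birthday approximation can be sketched as below.  This assumes the 128-bit digests behave like uniformly random values, so it covers accidental collisions only and says nothing about deliberately constructed ones.  The function name and the row counts are just examples.

```python
import math


def collision_probability(n_rows: int, hash_bits: int = 128) -> float:
    """Approximate chance that at least two of n_rows hashes collide.

    Birthday approximation: p ~ 1 - exp(-n(n-1)/2 / 2**bits),
    assuming uniformly distributed hash values.
    """
    pairs = n_rows * (n_rows - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>18,d} rows: p ~ {collision_probability(n):.3e}")
```

Under that assumption the probability depends on the number of rows and the digest width, not on whether each text is 1k or 8k.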

Thanks!

--
Peter Fein                 pfein@pobox.com                 773-575-0694

Basically, if you're not a utopianist, you're a schmuck. -J. Feldman
