Thread: Which is faster: md5() or hashtext()?

Which is faster: md5() or hashtext()?

From
"Henry C."
Date:
G'day,

I need to do a mass update on about 550 million rows (I will be breaking it up
into chunks based on id value so I can monitor progress).

Hashing one of the columns is part of the process and I was wondering which is
more efficient/faster:  md5() or hashtext()?

hashtext() produces a nice tight integer value, whereas md5() produces a fixed
string.  My instinct says hashtext(), but there may be a lot more to hashtext()
than meets the eye.

Any ideas?

Thanks
Henry
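
For illustration, one chunk of the update described above might look like the
sketch below. The table name big_table, the columns id, payload and
payload_hash, and the chunk size of one million ids are all hypothetical; none
of them come from the original post.

```sql
-- Hypothetical schema: big_table(id bigint, payload text, payload_hash integer).
-- One chunk of the mass update; repeat with the next id range until done,
-- so progress can be monitored between chunks.
UPDATE big_table
   SET payload_hash = hashtext(payload)
 WHERE id >= 1
   AND id <  1000001;

-- The md5() variant needs a text (or char(32)) target column instead,
-- here the equally hypothetical payload_md5:
-- UPDATE big_table SET payload_md5 = md5(payload) WHERE id >= 1 AND id < 1000001;
```

Half-open id ranges (>= lower bound, < upper bound) keep consecutive chunks
from overlapping.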


Re: Which is faster: md5() or hashtext()?

From
Grzegorz Jaśkiewicz
Date:
Timing is on.
psql (9.1devel)
Type "help" for help.

# select count(hashtext(a::text)) FROM generate_series(1,10000) a;
 count
-------
 10000
(1 row)

Time: 106.637 ms
# select count(hashtext(a::text)) FROM generate_series(1,1000000) a;
  count
---------
 1000000
(1 row)

Time: 770.823 ms
# select count(md5(a::text)) FROM generate_series(1,1000000) a;
  count
---------
 1000000
(1 row)

Time: 1238.453 ms
# select count(hashtext(a::text)) FROM generate_series(1,1000000) a;
  count
---------
 1000000
(1 row)

Time: 763.169 ms
# select count(md5(a::text)) FROM generate_series(1,1000000) a;
  count
---------
 1000000
(1 row)

Time: 1258.958 ms


I would say hashtext is consistently beating md5 in terms of performance here.

Just remember that it returns an integer, unlike md5, which returns text.
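
A quick way to see the difference in return types is the query below. It is a
minimal sketch using the arbitrary literal 'abc'; pg_typeof() reports the type
of each result.

```sql
-- Compare what the two functions return for the same input.
SELECT pg_typeof(hashtext('abc')) AS hashtext_type,  -- integer
       pg_typeof(md5('abc'))      AS md5_type,       -- text
       length(md5('abc'))         AS md5_length;     -- 32 hex characters
```

An integer is 4 bytes, so hashtext() can produce only about 4.3 billion
distinct values. Whether that is enough depends on what the hash is used for.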

Re: Which is faster: md5() or hashtext()?

From
"Henry C."
Date:
On Fri, November 5, 2010 09:52, Grzegorz Jaśkiewicz wrote:
> Timing is on.
> I would say hashtext is consistently beating md5 in terms of performance
> here.

Nice, concise answer. Thanks, Grzegorz.