Which is faster: md5() or hashtext()?

Started by Henry C.over 15 years ago3 messagesgeneral
Jump to latest
#1Henry C.
henka@cityweb.co.za

G'day,

I need to do a mass update on about 550 million rows (I will be breaking it up
into chunks based on id value so I can monitor progress).

Hashing one of the columns is part of the process and I was wondering which is
more efficient/faster: md5() or hashtext()?

hashtext() produces a nice tight integer value, whereas md5() produces a fixed
string. My instinct says hashtext(), but there may be a lot more to hashext()
than meets the eye.

Any ideas?

Thanks
Henry

#2Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Henry C. (#1)
Re: Which is faster: md5() or hashtext()?

Timing is on.
psql (9.1devel)
Type "help" for help.

# select count(hashtext(a::text)) FROM generate_series(1,10000) a;
count
-------
10000
(1 row)

Time: 106.637 ms
# select count(hashtext(a::text)) FROM generate_series(1,1000000) a;
count
---------
1000000
(1 row)

Time: 770.823 ms
# select count(md5(a::text)) FROM generate_series(1,1000000) a;
count
---------
1000000
(1 row)

Time: 1238.453 ms
# select count(hashtext(a::text)) FROM generate_series(1,1000000) a;
count
---------
1000000
(1 row)

Time: 763.169 ms
# select count(md5(a::text)) FROM generate_series(1,1000000) a;
count
---------
1000000
(1 row)

Time: 1258.958 ms

I would say hashtext is consequently beating md5 in terms of performance here.

Just remember, that it returns integer, unlike md5 that returns text.

#3Henry C.
henka@cityweb.co.za
In reply to: Grzegorz Jaśkiewicz (#2)
Re: Which is faster: md5() or hashtext()?

On Fri, November 5, 2010 09:52, Grzegorz Jaśkiewicz wrote:

Timing is on.
I would say hashtext is consequently beating md5 in terms of performance
here.

nice concise answer, thanks Grzegorz.