Storing a file hash as primary key

Started by Eduardo Pérez Ureta almost 22 years ago, 3 messages, general
#1Eduardo Pérez Ureta
eperez@it.uc3m.es

I was wondering what the best way is to store a file hash (MD5 or SHA1)
and use it as an indexed primary key.
I have seen some people store the hex-encoded MD5 in a CHAR(32). It
might be better to store the raw, unencoded digest in a CHAR(16), but
that may cause some problems.

What do you recommend?
Do you have any experience storing file hashes in a database?
Do you know of any good open-source software that stores file hashes in
a database (so I can take a look)?

#2Bruce Momjian
bruce@momjian.us
In reply to: Eduardo Pérez Ureta (#1)
Re: Storing a file hash as primary key

Eduardo Pérez Ureta <eperez@it.uc3m.es> writes:

I was wondering what the best way is to store a file hash (MD5 or SHA1)
and use it as an indexed primary key.
I have seen some people store the hex-encoded MD5 in a CHAR(32). It
might be better to store the raw, unencoded digest in a CHAR(16), but
that may cause some problems.

I would say either char(32) or bytea(16), but not char(16): you don't want
the raw binary data interpreted under any particular character encoding or
sorted according to locale-specific rules.
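To make the size and encoding trade-off concrete, here is a minimal Python sketch (not from the thread; the empty input is just a stand-in for real file contents). It compares the raw and hex-encoded forms of a digest and checks whether the raw bytes would survive being treated as text.

```python
import hashlib

# Stand-in for the contents of a file (assumption for illustration).
data = b""

raw = hashlib.md5(data).digest()          # raw binary digest
hex_text = hashlib.md5(data).hexdigest()  # hex-encoded digest

# MD5: 16 raw bytes vs. 32 hex characters. SHA-1: 20 vs. 40.
print(len(raw), len(hex_text))
print(len(hashlib.sha1(data).digest()), len(hashlib.sha1(data).hexdigest()))

# A raw digest is arbitrary binary data: it can contain a zero byte ...
has_nul = 0 in raw

# ... and is generally not valid text in any given character encoding.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

print(has_nul, is_valid_utf8)

# The hex form is plain ASCII and converts back to the raw bytes losslessly.
assert bytes.fromhex(hex_text) == raw
```

The hex form costs twice the space but is safe in any character column; the raw form is compact but only belongs in a binary type.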

Personally I would have preferred bytea(16) but for some reason the PHP
drivers seem to just drop NULL there when I try to store raw binary MD5
hashes. So for now I just declared it bytea with no length specification
and store the hex-encoded hash.

If anyone knows how to get Pear::DB to store binary data in a bytea column,
by all means, let me know.

--
greg

#3Joe Conway
mail@joeconway.com
In reply to: Bruce Momjian (#2)
Re: Storing a file hash as primary key

Greg Stark wrote:

Personally I would have preferred bytea(16) but for some reason the PHP
drivers seem to just drop NULL there when I try to store raw binary MD5
hashes. So for now I just declared it bytea with no length specification
and store the hex-encoded hash.

If anyone knows how to get Pear::DB to store binary data in a bytea column,
by all means, let me know.

Did you try using pg_escape_bytea()?
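For readers unfamiliar with what such escaping does, the following Python sketch illustrates the general idea behind bytea's octal escape text format: non-printable bytes become a backslash plus three octal digits, and a backslash is doubled. The helper name escape_bytea_octal is hypothetical, and this is a simplified illustration, not PHP's implementation. The real pg_escape_bytea() also deals with SQL literal quoting, and its exact output depends on the server version and settings.

```python
import hashlib


def escape_bytea_octal(data: bytes) -> str:
    """Hypothetical helper: render bytes in an octal-escape text form.

    Simplified sketch only; it does not quote the result for use inside
    an SQL string literal.
    """
    out = []
    for b in data:
        if b == 0x5C:                  # backslash is doubled
            out.append("\\\\")
        elif 32 <= b <= 126:           # printable ASCII passes through
            out.append(chr(b))
        else:                          # everything else: \ooo (octal)
            out.append("\\%03o" % b)
    return "".join(out)


# Raw MD5 digest of an empty input (stand-in for file contents).
digest = hashlib.md5(b"").digest()
escaped = escape_bytea_octal(digest)
print(escaped)
```

The point is that zero bytes and other non-text bytes are turned into plain ASCII before they reach the driver, so nothing gets truncated or dropped on the way to the bytea column.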

Joe