SERIAL or INT8 / Unique BLOB's

Started by Richard Teviotdaleover 24 years ago7 messagesgeneral
Jump to latest
#1Richard Teviotdale
richard@satcomresources.com

Using SERIAL or INT8

I have been using SERIAL a great deal when creating table primary keys.
The type of integer that this creates is an INT4, but I noticed that there is a larger INT8 available.

Would it be advisable to create a sequence manually and use the INT8 if the table size might be expected to grow beyond the 2 billion limit imposed by a INT4 (INTEGER) field type?

Unique BLOB's

I want to allow users to put blob's into a database. But I would like the database to recognise when a user is tying to insert a file that is identical to an existing file, as opposed to duplicating the blob. Either way I have to record the oid of the blob in a table so I can "locate" it later. I call this table blob_oid.

My plan is:

Add a field ( blob_md5 VARCHAR(32) ) to the blob_oid table and make a UNIQUE INDEX on this field to prevent duplicate values.
Before adding any blobs do a md5sum of the file and try inserting this value into blob_md5. If the file already exists we should get an error because it would have the same md5sum, otherwise upload the blob and update the blob_oid record created in the previous step.

I think this would work, but perhaps there is a better way?

#2Jeffrey W. Baker
jwbaker@acm.org
In reply to: Richard Teviotdale (#1)
Re: SERIAL or INT8 / Unique BLOB's

On Wed, 2 Jan 2002, Dave Trombley wrote:

Also, FYI, 7.2 looks-like-it-will/does have a
'bigserial'/'serial8' type.

yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
machine)?

-jwb

#3Dave Trombley
dtrom@bumba.net
In reply to: Richard Teviotdale (#1)
Re: SERIAL or INT8 / Unique BLOB's

Richard Teviotdale wrote:

Using SERIAL or INT8

I have been using SERIAL a great deal when creating table primary keys.

The type of integer that this creates is an INT4, but I noticed that
there is a larger INT8 available.

Would it be advisable to create a sequence manually and use the INT8
if the table size might be expected to grow beyond the 2 billion limit
imposed by a INT4 (INTEGER) field type?

Currently, I don't think sequences can return values larger than
the max int4. You could of course emulate a sequence, but if you don't
have a large number of rows at any given time you may want to take
advantage of the fact that sequences can cycle around, optionally, since
they do cache values for performance gain.

(http://www2.us.postgresql.org/users-lounge/docs/7.1/reference/sql-createsequence.html
)

Also, FYI, 7.2 looks-like-it-will/does have a
'bigserial'/'serial8' type.

-dj trombley
<dtrom@bumba.net>

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeffrey W. Baker (#2)
Re: SERIAL or INT8 / Unique BLOB's

"Jeffrey W. Baker" <jwbaker@acm.org> writes:

yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
machine)?

bigint index support has been there for quite awhile. Possibly you
are missing the need to cast the constant to bigint:

WHERE bigintcol = 42::bigint
or
WHERE bigintcol = '42'
but not just
WHERE bigintcol = 42
since int8-eq-int4 is not an indexable operator for an int8 index.

int2, float4, and numeric columns suffer from variants of this issue,
as do OID and some other types.

There have been various proposals to fix this class of annoyance,
but none have got past the assembled complainants yet ;-). See the
pghackers archives, eg mid-May 2000, for past arguments.

regards, tom lane

#5Dave Trombley
dtrom@bumba.net
In reply to: Jeffrey W. Baker (#2)
Re: SERIAL or INT8 / Unique BLOB's

Jeffrey W. Baker wrote:

yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
machine)?

I don't see why it wouldn't! gcc has supported 8-byte longs for a
(*sigh*) long, long, time now. =)
This is on my i686-class linux box, build of yesterday's CVS snapshot:

test=# create table ttab(a int, b serial8);
NOTICE: CREATE TABLE will create implicit sequence 'ttab_b_seq' for
SERIAL column 'ttab.b'
NOTICE: CREATE TABLE / UNIQUE will create implicit index 'ttab_b_key'
for table 'ttab'
CREATE
test=# CREATE INDEX ttab_b_idx ON ttab (a);
CREATE
test=# CREATE FUNCTION populate(int) RETURNS bool AS '
test'# BEGIN
test'# FOR i IN 1..$1 LOOP
test'# INSERT INTO ttab(a) VALUES((i*i)%133);
test'# END LOOP;
test'# RETURN true;
test'# END; ' language 'plpgsql';
CREATE
test=# select populate(90000);
test=# EXPLAIN SELECT * FROM ttab WHERE a = 30 LIMIT 18;
NOTICE: QUERY PLAN:

Limit (cost=0.00..17.07 rows=5 width=12)
-> Index Scan using ttab_b_idx on ttab (cost=0.00..17.07 rows=5
width=12)

EXPLAIN
test=# select * from ttab where a = 30 limit 10;
a | b
----+--------
30 | 136329
30 | 136315
30 | 136253
30 | 136239
30 | 136196
30 | 136182
30 | 136120
30 | 136106
30 | 136063
30 | 136049
(10 rows)

-dj

#6Jeffrey W. Baker
jwbaker@acm.org
In reply to: Dave Trombley (#5)
Re: SERIAL or INT8 / Unique BLOB's

On Wed, 2 Jan 2002, Dave Trombley wrote:

Jeffrey W. Baker wrote:

yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
machine)?

I don't see why it wouldn't! gcc has supported 8-byte longs for a
(*sigh*) long, long, time now. =)
This is on my i686-class linux box, build of yesterday's CVS snapshot:

I only asked because 7.1 won't use a bigint index under any circumstances.
I discovered this *after* creating some tremendously large tables :)

-jwb

#7Jeffrey W. Baker
jwbaker@acm.org
In reply to: Tom Lane (#4)
Re: SERIAL or INT8 / Unique BLOB's

On Wed, 2 Jan 2002, Tom Lane wrote:

"Jeffrey W. Baker" <jwbaker@acm.org> writes:

yabbut, has anyone seen yet if 7.2 can use bigint in an index (on a 32-bit
machine)?

bigint index support has been there for quite awhile. Possibly you
are missing the need to cast the constant to bigint:

Of course. I was already told that once but apparently I forget it
already. -jwb