unique row identifier data type exhausted . . .

Started by Frank Joerdensalmost 26 years ago15 messagesgeneral
Jump to latest
#1Frank Joerdens
frank@joerdens.de

It feels like there should be some *really* obvious answer to this
question, and I'll find myself whacking my forehead in self-abasement
and out of sheer relief to have found the answer to a problem that
should not have bothered me in the first place since the answer is too
self-evident . . . however, it is bothering me: what happens if the data
type that you've chosen to uniquely identify a row is exhausted? If, for
instance you use int4 and you've had your couple billion deletes and
inserts on the table and the next nextval('seq') . . . well, what
exactly happens and how do they do it? Admittedly, 10^9 is a big number
but it is far from out of the question that you'd reach it on a really
busy database (can't think of a real-world example but that ought to be
a moot point), not to mention oids since they are unique across an
entire database.

Cheers
Frank

--
frank joerdens

joerdens new media
heinrich-roller str. 16/17
10405 berlin
germany

e: frank@joerdens.de
t: +49 (0)30 44055471
f: +49 (0)30 44055475
h: http://www.joerdens.de

pgp public key: http://www.joerdens.de/pgp/frank_joerdens.asc

#2Andrew Snow
als@fl.net.au
In reply to: Frank Joerdens (#1)
RE: unique row identifier data type exhausted . . .

It feels like there should be some *really* obvious answer to this
question, and I'll find myself whacking my forehead in self-abasement
and out of sheer relief to have found the answer to a problem that
should not have bothered me in the first place since the answer is too
self-evident . . . however, it is bothering me: what happens if the data
type that you've chosen to uniquely identify a row is exhausted? If, for
instance you use int4 and you've had your couple billion deletes and
inserts on the table and the next nextval('seq') . . . well, what
exactly happens and how do they do it? Admittedly, 10^9 is a big number
but it is far from out of the question that you'd reach it on a really
busy database (can't think of a real-world example but that ought to be
a moot point), not to mention oids since they are unique across an
entire database.

I am curious to know how difficult it would be (if at all) to change the
type that oid represents, to a 64 bit number. C'mon guys, this isn't the 90s
any more!

- Andrew

#3Jurgen Defurne
defurnj@glo.be
In reply to: Frank Joerdens (#1)
Re: unique row identifier data type exhausted . . .

Frank Joerdens wrote:

It feels like there should be some *really* obvious answer to this
question, and I'll find myself whacking my forehead in self-abasement
and out of sheer relief to have found the answer to a problem that
should not have bothered me in the first place since the answer is too
self-evident . . . however, it is bothering me: what happens if the data
type that you've chosen to uniquely identify a row is exhausted? If, for
instance you use int4 and you've had your couple billion deletes and
inserts on the table and the next nextval('seq') . . . well, what
exactly happens and how do they do it? Admittedly, 10^9 is a big number
but it is far from out of the question that you'd reach it on a really
busy database (can't think of a real-world example but that ought to be
a moot point), not to mention oids since they are unique across an
entire database.

This is the command to create a sequence :

CREATE SEQUENCE seqname
[ INCREMENT increment ]
[ MINVALUE minvalue ]
[ MAXVALUE maxvalue ]
[ START start ]
[ CACHE cache ]
[ CYCLE ]

The CYCLE option restarts the sequence automatically upon overflow. Of
course, your analysis of the database should point out that it is possible
to do this.

Jurgen Defurne
defurnj@glo.be

#4Bruce Momjian
bruce@momjian.us
In reply to: Andrew Snow (#2)
Re: unique row identifier data type exhausted . . .

It feels like there should be some *really* obvious answer to this
question, and I'll find myself whacking my forehead in self-abasement
and out of sheer relief to have found the answer to a problem that
should not have bothered me in the first place since the answer is too
self-evident . . . however, it is bothering me: what happens if the data
type that you've chosen to uniquely identify a row is exhausted? If, for
instance you use int4 and you've had your couple billion deletes and
inserts on the table and the next nextval('seq') . . . well, what
exactly happens and how do they do it? Admittedly, 10^9 is a big number
but it is far from out of the question that you'd reach it on a really
busy database (can't think of a real-world example but that ought to be
a moot point), not to mention oids since they are unique across an
entire database.

I am curious to know how difficult it would be (if at all) to change the
type that oid represents, to a 64 bit number. C'mon guys, this isn't the 90s
any more!

When we are sure all platforms support 64-bit int's, we will move in
that direction.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Andrew Snow
als@fl.net.au
In reply to: Bruce Momjian (#4)
RE: unique row identifier data type exhausted . . .

When we are sure all platforms support 64-bit int's, we will move in
that direction.

Sorry if this is a stupid question, but couldn't you fairly easily make it
an option at compile time? To use either 32 or 64 bit OID's.
(And, less importantly, for sequences)

- Andrew

#6Bruce Momjian
bruce@momjian.us
In reply to: Andrew Snow (#5)
Re: unique row identifier data type exhausted . . .

When we are sure all platforms support 64-bit int's, we will move in
that direction.

Sorry if this is a stupid question, but couldn't you fairly easily make it
an option at compile time? To use either 32 or 64 bit OID's.
(And, less importantly, for sequences)

Well, we could but then all the API would change, and internally there
would be changes. No one has really asked for it vs. the work in doing
it.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#7Tom Cook
tcook@lisa.com.au
In reply to: Andrew Snow (#5)
RE: unique row identifier data type exhausted . . .

On Mon, 24 Apr 2000, Andrew Snow wrote:

When we are sure all platforms support 64-bit int's, we will move in
that direction.

Sorry if this is a stupid question, but couldn't you fairly easily make it
an option at compile time? To use either 32 or 64 bit OID's.
(And, less importantly, for sequences)

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

Admittedly, ~2x10^20 is a very large number, but that's what they thought
about 2000, also...

What I'm saying is, is there a better way of doing this?

Cheers
--
Tom Cook - Software Engineer

"Christ died for sin, the righteous for the unrighteous, to bring you to God."
- 1 Peter 3:18

LISAcorp - www.lisa.com.au

--------------------------------------------------
38 Greenhill Rd. Level 3, 228 Pitt Street
Wayville, SA, 5034 Sydney, NSW, 2000

Phone: +61 8 8272 1555 Phone: +61 2 9283 0877
Fax: +61 8 8271 1199 Fax: +61 2 9283 0866
--------------------------------------------------

#8Franck Martin
Franck@sopac.org.fj
In reply to: Tom Cook (#7)
RE: unique row identifier data type exhausted . . .

For me it seems more a design problem than the length of internal number....

Why not create a table with 2 field containing int, setting the primary key
on both of them and running a sequencing scheme on both as if it was a
single number....

There are no more limitations anymore, as if you know you will need a lot of
record you may decide to use 1, 2, or 3 numbers...

Question, does postgress is able to run sequence on 2 combined numbers...

Cheers...

Franck Martin
Network and Database Development Officer
SOPAC South Pacific Applied Geoscience Commission
Fiji
E-mail: franck@sopac.org.fj <mailto:franck@sopac.org.fj>
Web site: www.sopac.org.fj <http://www.sopac.org.fj&gt;

-----Original Message-----
From: Tom Cook [mailto:tcook@lisa.com.au]
Sent: Wednesday, April 26, 2000 11:49 AM
To: Pgsql-General@Postgresql. Org
Subject: RE: [GENERAL] unique row identifier data
type exhausted . . .

On Mon, 24 Apr 2000, Andrew Snow wrote:

When we are sure all platforms support 64-bit int's, we

will move in

that direction.

Sorry if this is a stupid question, but couldn't you

fairly easily make it

an option at compile time? To use either 32 or 64 bit

OID's.

(And, less importantly, for sequences)

Is this necessarily a good solution? If you use 64-bit OIDs,
some joker
will just hook up a several-terra-byte disk array to his
machine, try to
store the location of every molecule in the universe and
break it.

Admittedly, ~2x10^20 is a very large number, but that's what
they thought
about 2000, also...

What I'm saying is, is there a better way of doing this?

#9Andrew Snow
als@fl.net.au
In reply to: Tom Cook (#7)
RE: unique row identifier data type exhausted . . .

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

If you have to have OIDs at all, its a lot better than a 32 bit number. I
think it would be easier to switch to 64 bit OIDs than ditch them
completely.
The "serial" type should definitely be 64 bit. To make matters worse I
believe its really only a 31 bit number as the plus/minus symbol is
discarded. But I think moving to 64 bit will take place soon enough, when
it needs to, and it should shut everyone up.

"Christ died for sin, the righteous for the unrighteous, to bring
you to God."
- 1 Peter 3:18

Is it just me, or are there a lot of Christian PostgreSQL users? ;-) (I'm
not knocking it, I'm one too..)

- Andrew

#10Bruce Momjian
bruce@momjian.us
In reply to: Andrew Snow (#9)
Re: unique row identifier data type exhausted . . .

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

If you have to have OIDs at all, its a lot better than a 32 bit number. I
think it would be easier to switch to 64 bit OIDs than ditch them
completely.
The "serial" type should definitely be 64 bit. To make matters worse I
believe its really only a 31 bit number as the plus/minus symbol is
discarded. But I think moving to 64 bit will take place soon enough, when
it needs to, and it should shut everyone up.

If you look at that TODO list, oid's flowing over 32-bits is not
something we are losing sleep over. In fact, the first fix would be to
make sure oid's are truly treated as unsigned int's, thereby doubling
their range. I have done some of those myself, but I am sure there are
more areas that need fixing.

Illustra's solution was to use two int32's, making the upper 32-bit
value represent the site, so oid's remain unique as they move between
sites. If we picked a random 32-bit oid on initdb startup, that would
pretty much make them unique all the time.

"Christ died for sin, the righteous for the unrighteous, to bring
you to God."
- 1 Peter 3:18

Is it just me, or are there a lot of Christian PostgreSQL users? ;-) (I'm
not knocking it, I'm one too..)

Good question. :-)

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Cook (#7)
RE: unique row identifier data type exhausted . . .

On Wed, 26 Apr 2000, Tom Cook wrote:

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

That's not going to work anyway. To store information about a molecule you
need at least one such molecule to hold that state, barring major
revolutions in storage technology. :-)

Admittedly, ~2x10^20 is a very large number, but that's what they thought
about 2000, also...

A while ago I said that in order to exhaust the oid space you need to add
1 million new records a day for more than 10 years. Then someone said, ok,
what if I have an email service with 1 million users that each get 10
emails a day. Then you're talking about 1 year. But in order to exhaust 64
bits, you can have 10^9 users (i.e., everyone), getting two million emails
a day for 1000 years. That seems pretty safe for as long as I care.

Of course to store all molecules you really need more like 384 bits.

What I'm saying is, is there a better way of doing this?

Transfinite numbers ;)

--
Peter Eisentraut Sernanders v�g 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

#12Dale Anderson
danderso@crystalsugar.com
In reply to: Peter Eisentraut (#11)
RE: unique row identifier data type exhausted . . .

Seems to me that 64 bit OIDs is enough for any rational sensible person, and if it's not enough for you, then you have way too much time to think about it.

e99re41@DoCS.UU.SE 04/26/00 08:53AM >>>

On Wed, 26 Apr 2000, Tom Cook wrote:

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

That's not going to work anyway. To store information about a molecule you
need at least one such molecule to hold that state, barring major
revolutions in storage technology. :-)

Admittedly, ~2x10^20 is a very large number, but that's what they thought
about 2000, also...

A while ago I said that in order to exhaust the oid space you need to add
1 million new records a day for more than 10 years. Then someone said, ok,
what if I have an email service with 1 million users that each get 10
emails a day. Then you're talking about 1 year. But in order to exhaust 64
bits, you can have 10^9 users (i.e., everyone), getting two million emails
a day for 1000 years. That seems pretty safe for as long as I care.

Of course to store all molecules you really need more like 384 bits.

What I'm saying is, is there a better way of doing this?

Transfinite numbers ;)

--
Peter Eisentraut Sernanders v�g 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

#13Mark Dalphin
mdalphin@amgen.com
In reply to: Bruce Momjian (#10)
Re: unique row identifier data type exhausted . . .

Bruce Momjian wrote:

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

If you have to have OIDs at all, its a lot better than a 32 bit number. I
think it would be easier to switch to 64 bit OIDs than ditch them
completely.
The "serial" type should definitely be 64 bit. To make matters worse I
believe its really only a 31 bit number as the plus/minus symbol is
discarded. But I think moving to 64 bit will take place soon enough, when
it needs to, and it should shut everyone up.

If you look at that TODO list, oid's flowing over 32-bits is not
something we are losing sleep over. In fact, the first fix would be to
make sure oid's are truly treated as unsigned int's, thereby doubling
their range. I have done some of those myself, but I am sure there are
more areas that need fixing.

Illustra's solution was to use two int32's, making the upper 32-bit
value represent the site, so oid's remain unique as they move between
sites. If we picked a random 32-bit oid on initdb startup, that would
pretty much make them unique all the time.

--
Bruce Momjian                        |  http://www.op.net/~candle
pgman@candle.pha.pa.us               |  (610) 853-3000
+  If your life is a hard drive,     |  830 Blythe Avenue
+  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

While I am working on a system which could use 64-bit OIDs, and I think it is a
good idea to move to them, I wonder if the developers should consider the people
who are running older, "legacy" systems as well. Moving to a 64-bit OID would
add considerably to the space required (ie the overhead) to run the database.
Many Linux systems are "Linux" because Windows got too big and clunky to run
there. If possible, I'd suggest leaving the OID size as a compile time switch
so those who wish to run "light" can do so, and those who wish to tally the
molecules of the universe can think about how to compress the data to fit within
a 64-bit OID.

Mark

--
Mark Dalphin email: mdalphin@amgen.com
Mail Stop: 29-2-A phone: +1-805-447-4951 (work)
One Amgen Center Drive +1-805-375-0680 (home)
Thousand Oaks, CA 91320 fax: +1-805-499-9955 (work)

#14Jan Wieck
JanWieck@Yahoo.com
In reply to: Dale Anderson (#12)
Re: unique row identifier data type exhausted . . .

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

That's not going to work anyway. To store information about a molecule you
need at least one such molecule to hold that state, barring major
revolutions in storage technology. :-)

Maybe one or two quarks are enough to represent a single bit.
Then you can break this barrier and store the data, because
most molecules consists of more quarks.

But that's an incomplete approach again, because if we could
store the position of each quark and all other occurences of
energy (along with it's actual direction and speed), we could
add rules and/or triggers and end up with a complete UNIVERSE
simulator in Postgres.

Can someone ask IBM (Interstellar Business Machines Corp.)
what database they used in our UNIVERSE? Must be running in
our parent universe, so the real question is: "can we
determine the universe nesting level we actually live in?"

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck@debis.com (Jan Wieck) #

#15Tom Cook
tcook@lisa.com.au
In reply to: Dale Anderson (#12)
RE: unique row identifier data type exhausted . . .

Alright, alright already! Maybe 64-bit OIDs are sufficient for reasonable
people. My point was that, as soon as you set a limit on something,
someone will find a use for it which pushes that limit.

"640K should be enought for anyone." - William Gates

On Wed, 26 Apr 2000, Dale Anderson wrote:

Seems to me that 64 bit OIDs is enough for any rational sensible person, and if it's not enough for you, then you have way too much time to think about it.

e99re41@DoCS.UU.SE 04/26/00 08:53AM >>>

On Wed, 26 Apr 2000, Tom Cook wrote:

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

That's not going to work anyway. To store information about a molecule you
need at least one such molecule to hold that state, barring major
revolutions in storage technology. :-)

Admittedly, ~2x10^20 is a very large number, but that's what they thought
about 2000, also...

A while ago I said that in order to exhaust the oid space you need to add
1 million new records a day for more than 10 years. Then someone said, ok,
what if I have an email service with 1 million users that each get 10
emails a day. Then you're talking about 1 year. But in order to exhaust 64
bits, you can have 10^9 users (i.e., everyone), getting two million emails
a day for 1000 years. That seems pretty safe for as long as I care.

Of course to store all molecules you really need more like 384 bits.

What I'm saying is, is there a better way of doing this?

Transfinite numbers ;)

--
Peter Eisentraut Sernanders v���g 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

--
Tom Cook - Software Engineer

"Never criticize a man until you've walked a mile in his shoes; that way,
when you criticize him, you're a mile away and have his shoes."
- Unknown

LISAcorp - www.lisa.com.au

--------------------------------------------------
38 Greenhill Rd. Level 3, 228 Pitt Street
Wayville, SA, 5034 Sydney, NSW, 2000

Phone: +61 8 8272 1555 Phone: +61 2 9283 0877
Fax: +61 8 8271 1199 Fax: +61 2 9283 0866
--------------------------------------------------