Using oids

Started by Malcolm Warrenover 22 years ago30 messages
#1Malcolm Warren
malcolm@villeinitalia.com

While updating to 7.3.4 I note with some alarm the following passage in
README.Debian.migration.gz written by Oliver Elphick:

"Some schema designs rely on the use of oids as row identifiers. This is
definitely not recommended, not least because oids are not guaranteed to
exist in all future versions of PostgreSQL. Oids are an internal feature
only. They are not suitable as candidate keys, since they are not
guaranteed to be unique; furthermore, the starting point for oids is likely
to change whenever a change to the database structure occurs."

While I have not used oids to join tables, I have used them extensively in
programming, because if Postgres has supplied a unique number for each row,
why on earth should I bother supplying another one of my own?

Like many people starting with Postgres, three or four years ago I
carefully read Bruce Momjian's excellent introductory book on Postgres,
which explained many initially difficult concepts with such clarity.

The book states "Every row in POSTGRESQL is assigned a unique, normally
invisible number called an object identification number (OID). When the
software is initialized with initdb, a counter is created and set to
approximately seventeen-thousand. The counter is used to uniquely number
every row. Although databases may be created and destroyed, the counter
continues to increase. It is used by all databases, so identification
numbers are always unique. No two rows in any table or in any database will
ever have the same object ID."

Further down we read:
"Object identification numbers can be used as primary and foreign key
values in joins. Since every row has a unique object ID, a separate column
is not needed to hold the row's unique number."

On the next page are listed the limitations of oids, for example they are
nonsequential, nonmodifiable, and not backed up by default, but for my uses
these were not problems at all. I have merely used the oid number as a
temporary unique identifier before assigning a permanent booking number to
it, which takes about a nanosecond, and in other similar cases.

To sum up: The Debian migration gzip file declares that oids are not
guaranteed to be unique, issues dire warnings about using them as keys and
worst of all states that they may be phased out in the future.

The book states that they are unique, tells you how to use them, actually
gives an example of using them as primary and foreign keys (which
fortunately I decided was not very wise) and certainly doesn't say anything
about phasing them out in the future.

Can anybody shed any light on this?
Malcolm Warren

#2Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Malcolm Warren (#1)
Re: Using oids

On 3 Sep 2003 at 10:27, Malcolm Warren wrote:

To sum up: The Debian migration gzip file declares that oids are not
guaranteed to be unique, issues dire warnings about using them as keys and
worst of all states that they may be phased out in the future.

The book states that they are unique, tells you how to use them, actually
gives an example of using them as primary and foreign keys (which
fortunately I decided was not very wise) and certainly doesn't say anything
about phasing them out in the future.

Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
times. They default to be available for new objects but that is for backwards
compatibility I believe. In future, they would default to be not available for
a particular object(hopefully). Right now you need to explicitly specify no
oids while creating tables etc.

About oids not being unique, oids can assume 4 billion different values. If you
have more than those many rows in a table, oids will wrap around and will no
longer be unique in that object.

About oids being eliminated, I am sure it would happen some time in the future,
looking at the development on this issue. Core team could elaborate more on
this.

Correct me if I am wrong.

HTH

Bye
Shridhar

--
Nusbaum's Rule: The more pretentious the corporate name, the smaller the
organization. (For instance, the Murphy Center for the Codification of Human
and Organizational Law, contrasted to IBM, GM, and AT&T.)

#3Bo Lorentsen
bl@netgroup.dk
In reply to: Shridhar Daithankar (#2)
Re: Using oids

On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:

Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
times. They default to be available for new objects but that is for backwards
compatibility I believe. In future, they would default to be not available for
a particular object(hopefully). Right now you need to explicitly specify no
oids while creating tables etc.

I do understand the limitation of the oid as implimented now, but why
remove the possibility to make unique row references (like after an
insert), instead of extenting the oid ?

I don't care if they wrap, or is a unique string or anything else, as
long as I can use it to refetch a row after an insert, without keeping
track of app. implimentation specific SERIAL fields.

About oids not being unique, oids can assume 4 billion different values. If you
have more than those many rows in a table, oids will wrap around and will no
longer be unique in that object.

I see that this is a problem, and 4 billion is not alot, but why not
make another format like in oracle, so that it still is possible to
refere to a row using a unique --- thing. There must be an internal oid
somewhere, what we may be able to use.

About oids being eliminated, I am sure it would happen some time in the future,
looking at the development on this issue. Core team could elaborate more on
this.

Sounds like a sad thing, if this is not replaced by something simular,
but more fit to a large DB.

/BL

#4Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Bo Lorentsen (#3)
Re: Using oids

On 3 Sep 2003 at 11:28, Bo Lorentsen wrote:

On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:

Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
times. They default to be available for new objects but that is for backwards
compatibility I believe. In future, they would default to be not available for
a particular object(hopefully). Right now you need to explicitly specify no
oids while creating tables etc.

I do understand the limitation of the oid as implimented now, but why
remove the possibility to make unique row references (like after an
insert), instead of extenting the oid ?

I don't care if they wrap, or is a unique string or anything else, as
long as I can use it to refetch a row after an insert, without keeping
track of app. implimentation specific SERIAL fields.

Well, what I do is, declare a serate sequence, retrive next available value and
explicitly insert it into a integer field. That avoids having to retrieve the
latest value again.

I don't know if this is a widespread practice but I find it useful for more
than one way in the environment in which I program.

About oids not being unique, oids can assume 4 billion different values. If you
have more than those many rows in a table, oids will wrap around and will no
longer be unique in that object.

I see that this is a problem, and 4 billion is not alot, but why not
make another format like in oracle, so that it still is possible to
refere to a row using a unique --- thing. There must be an internal oid
somewhere, what we may be able to use.

I understand. With growing use of 64 bit hardware, 4 billion will be history
pretty soon.

However historically oids were assumed to be 32 bit. There could be places
which unintentionally assumed it as such. Cleaning all those places is pretty
difficult given the big code base postgresql has.

If you compile postgresql with Oid as 64 bit integer, that will work in most
cases probably. However it does not guarantee that it will always work. There
always could be some places which assumed 32 bit data types.

That is one of the problem as I understand.

(Gathered and extended from one of Tom's post. correct me if I am wrong)

About oids being eliminated, I am sure it would happen some time in the future,
looking at the development on this issue. Core team could elaborate more on
this.

Sounds like a sad thing, if this is not replaced by something simular,
but more fit to a large DB.

You can request this to be a TODO to hackers. They will decide if this is worth
having it. Personally I support it the way it is. Optionally available.

For large table containing billions of rows, Oids add to tuple size and overall
IO. If you are not using Oids, they become overhead. Ability to turn them off
is certainly nice..

Bye
Shridhar

--
Unfair animal names:-- tsetse fly -- bullhead-- booby -- duck-billed
platypus-- sapsucker -- Clarence -- Gary Larson

#5Malcolm Warren
malcolm@villeinitalia.com
In reply to: Bo Lorentsen (#3)
Re: Using oids

On Wed, 03 Sep 2003 11:28:01 +0200, Bo Lorentsen <bl@netgroup.dk> wrote:

On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:

Yes. It is correct. As of 7.3.x and onwards oids are optional at table
creation times. They default to be available for new objects but that is
for backwards compatibility I believe. In future, they would default to
be not available for a particular object(hopefully). Right now you need
to explicitly specify no oids while creating tables etc.

The point about oids is that they are so useful that many people have used
them extensively, me included.
And I did so on the basis of a book written by Bruce Mowjian, one of
Postgresql's major contributors.

If we are saying that there will always be backwards compatibility then I
can sleep at night.
If we are saying that at some point oids will exist no more then all of a
sudden I am going to have to re-write an awful lot of code.

Malcolm Warren

#6Bo Lorentsen
bl@netgroup.dk
In reply to: Shridhar Daithankar (#4)
Re: Using oids

On Wed, 2003-09-03 at 11:38, Shridhar Daithankar wrote:

Well, what I do is, declare a serate sequence, retrive next available value and
explicitly insert it into a integer field. That avoids having to retrieve the
latest value again.

Yeps, this is what I call an application specific implimentation, as one
can't do this at a more genral layer (that does not know about your
table layout).

Like having a general function that insert a row and return the newly
inserted row, containing the defaults set by PG. My code contain this
function (http://www.lue.dk/prj/dbc), and I have no way to make this
work if I'm not able to fetch the oid after an insert, in some way.

I don't know if this is a widespread practice but I find it useful for more
than one way in the environment in which I program.

You are not the only one recommenting this solution :-)

I understand. With growing use of 64 bit hardware, 4 billion will be history
pretty soon.

Agreed !

However historically oids were assumed to be 32 bit. There could be places
which unintentionally assumed it as such. Cleaning all those places is pretty
difficult given the big code base postgresql has.

One day I will try to look after myself, but what does PG do internaly,
when referring to rows in a unique way (lets say in an index) ?

If you compile postgresql with Oid as 64 bit integer, that will work in most
cases probably. However it does not guarantee that it will always work. There
always could be some places which assumed 32 bit data types.

But if convertet to a string type, all involved places would fail, and
no uncertency are involved when fixing it (well, a little too primitive
argument, I know) :-)

You can request this to be a TODO to hackers. They will decide if this is worth
having it. Personally I support it the way it is. Optionally available.

I'm not sure, if my skills reach this fare, but thanks for the advice.
And for the rest --- we disagree :-)

For large table containing billions of rows, Oids add to tuple size and overall
IO. If you are not using Oids, they become overhead. Ability to turn them off
is certainly nice..

Yeps, if they really are not nessesary.

/B

#7Malcolm Warren
malcolm@villeinitalia.com
In reply to: Shridhar Daithankar (#2)
Re: Using oids

On Wed, 03 Sep 2003 11:40:01 +0100, Oliver Elphick <olly@lfix.co.uk> wrote:

On Wed, 2003-09-03 at 10:39, Malcolm Warren wrote:

The point about oids is that they are so useful that many people have
used them extensively, me included.
And I did so on the basis of a book written by Bruce Mowjian, one of
Postgresql's major contributors.

If we are saying that there will always be backwards compatibility then
I can sleep at night.
If we are saying that at some point oids will exist no more then all of
a sudden I am going to have to re-write an awful lot of code.

I wrote that passage for the Debian package on the basis of various
emails I have read on the lists in the past. Whether the oid feature
will ever be completely removed, I cannot say.

It is certainly the case that if oids wrap round, uniqueness is no
longer guaranteed. You may be certain that your oid counter won't wrap,
but I cannot express such certainty for all possible users of the Debian
package.

It is my contention that it is wrong to build your database design on an
internal feature of the database; it automatically makes it unportable
to any other database. To my mind, it also demonstrates that the
database design is faulty. If the tables in question have candidate
keys, why not use them? If they don't, how can it be meaningful to use
the oid as a foreign key?

Thanks for your reply Oliver, it's great to hear direct from the source.

I agree with you about database design and in fact fortunately I don't use
oids as foreign keys, which I thought unwise.
However I have found oids very useful as temporary unique references to a
record in my programming.
If I had known when I started writing my code three years ago that there
was even the
slightest doubt about continuing with oids then I wouldn't have used them.

As it is, I've used them extensively, because like many people I am
fortunate enough to be only dealing in thousands, not even millions of
records. And also because I had read a lot of documentation and had seen
nothing
that even hinted at a problem using them, on the contrary, I found
references to their usefulness.

So, will there be backwards compatibility? Surely if they are going to be
got rid of (for the moment by default they are still used), there has to be
the possibility to switch them on again where necessary?

I think with this email I have had my final say. I see that there is still
debate going on about what to do with oids and I've been able to have put
my point. Thanks.

Malcolm Warren

#8Martijn van Oosterhout
kleptog@svana.org
In reply to: Bo Lorentsen (#6)
Re: Using oids

On Wed, Sep 03, 2003 at 12:20:42PM +0200, Bo Lorentsen wrote:

On Wed, 2003-09-03 at 11:38, Shridhar Daithankar wrote:

Well, what I do is, declare a serate sequence, retrive next available value and
explicitly insert it into a integer field. That avoids having to retrieve the
latest value again.

Yeps, this is what I call an application specific implimentation, as one
can't do this at a more genral layer (that does not know about your
table layout).

But your insert function needs to know something about the table it's
inserting into. The sequences have quite predicatable names. Besides, you
can set the name yourself (DCL does this IIRC).

Like having a general function that insert a row and return the newly
inserted row, containing the defaults set by PG. My code contain this
function (http://www.lue.dk/prj/dbc), and I have no way to make this
work if I'm not able to fetch the oid after an insert, in some way.

The only thing you need to know is the name of the primary key field. This
many be a problem in a generic layer. If you like you can make a UNIQUE
INDEX on the oid column and retry inserts when they fail.

In your code, do create an index on the OID column? If not, that's be a
performance hit,

However historically oids were assumed to be 32 bit. There could be places
which unintentionally assumed it as such. Cleaning all those places is pretty
difficult given the big code base postgresql has.

One day I will try to look after myself, but what does PG do internaly,
when referring to rows in a unique way (lets say in an index) ?

Something called CTID usually (AFAIK). It's a (block,tuple) pair. It changes
whenever you do an update. Or a vacuum.

For large table containing billions of rows, Oids add to tuple size and overall
IO. If you are not using Oids, they become overhead. Ability to turn them off
is certainly nice..

Yeps, if they really are not nessesary.

If I were doing it would extract the primary key of each table on startup
and then change that one line of code to:

os << "SELECT * FROM " << sTable << " WHERE "
<< prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

Say you have a LAST_ID function and you a table with more than one sequence,
which would it return?

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

"All that is needed for the forces of evil to triumph is for enough good
men to do nothing." - Edmond Burke
"The penalty good people pay for not being interested in politics is to be
governed by people worse than themselves." - Plato

#9Bo Lorentsen
bl@netgroup.dk
In reply to: Martijn van Oosterhout (#8)
Re: Using oids

On Wed, 2003-09-03 at 13:19, Martijn van Oosterhout wrote:

But your insert function needs to know something about the table it's
inserting into. The sequences have quite predicatable names. Besides, you
can set the name yourself (DCL does this IIRC).

No it don't know anything about the table it insert into. I simply do
the following :

1. INSERT data (comming from another layer)
2. Get the last oid
3. SELECT * FROM the same table where oid = what I just found.

I know absolutly nothing about the table, and I like it this way :-)

The only thing you need to know is the name of the primary key field. This
many be a problem in a generic layer. If you like you can make a UNIQUE
INDEX on the oid column and retry inserts when they fail.

Hmm, it all end up putting alot of information to a lower layer, and
this is sad as PG already knows, but it may not tell me.

In your code, do create an index on the OID column? If not, that's be a
performance hit,

I'm not sure what you mean !

Something called CTID usually (AFAIK). It's a (block,tuple) pair. It changes
whenever you do an update. Or a vacuum.

So no id for a row, but only for a row instance ? Is this the reason for
the growing index files ?

If I were doing it would extract the primary key of each table on startup
and then change that one line of code to:

os << "SELECT * FROM " << sTable << " WHERE "
<< prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

Thanks, but I have to be aware of the "prikey" name, and demand a prikey
for all tables to insert row into :-(

Hope this helps,

I know what you mean, but I don't like the impact of the solution.

/BL

#10Oliver Elphick
olly@lfix.co.uk
In reply to: Martijn van Oosterhout (#8)
Re: Using oids

On Wed, 2003-09-03 at 12:19, Martijn van Oosterhout wrote:

If I were doing it would extract the primary key of each table on startup
and then change that one line of code to:

os << "SELECT * FROM " << sTable << " WHERE "
<< prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

You cannot use currval() until you have used nextval() on the same
sequence in the same session.

--
Oliver Elphick Oliver.Elphick@lfix.co.uk
Isle of Wight, UK http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
========================================
"And he said unto his disciples, Therefore I say unto
you, Take no thought for your life, what ye shall eat;
neither for the body, what ye shall put on. For life
is more than meat, and the body is more than clothing.
Consider the ravens, for they neither sow nor reap;
they have neither storehouse nor barn; and yet God
feeds them; how much better you are than the birds!
Consider the lilies, how they grow; they toil
not, they spin not; and yet I say unto you, that
Solomon in all his glory was not arrayed like one of
these. If then God so clothe the grass, which is to
day in the field, and tomorrow is cast into the oven;
how much more will he clothe you, O ye of little
faith? And seek not what ye shall eat, or what ye
shall drink, neither be ye of doubtful mind.
But rather seek ye the kingdom of God; and all these
things shall be added unto you."
Luke 12:22-24; 27-29; 31.

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bo Lorentsen (#3)
Re: Using oids

Bo Lorentsen <bl@netgroup.dk> writes:

On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:

About oids not being unique, oids can assume 4 billion different
values. If you have more than those many rows in a table, oids will
wrap around and will no longer be unique in that object.

I see that this is a problem, and 4 billion is not alot, but why not
make another format like in oracle, so that it still is possible to
refere to a row using a unique --- thing.

The reason OIDs shouldn't be considered unique is that there is no
mechanism to enforce that they are unique --- unless you make one,
that is, create a unique index on OID for a table. The system does
not do that for you since it would be excessive overhead for tables
in which the user doesn't care about OID uniqueness. But I'd
definitely recommend it if you are using OIDs for row identifiers.

If you want a globally unique ID based on OIDs, use the table OID
concatenated with the row OID.

I don't foresee OIDs going away in the future, because they are used
internally by the system; but I don't foresee them being "improved"
either. Serial columns already do much of what people would like from
OIDs, and the developers' response to all questions along this line
is likely to be "use a serial column instead". It's possible that the
default for table creation will switch to WITHOUT OIDS at some future
time, though I don't consider that a done deal because of the backwards
compatibility issue.

There must be an internal oid
somewhere, what we may be able to use.

No, there isn't. There is only ctid, which is not useful as a long-term
row identifier, because UPDATE and VACUUM can change it.

regards, tom lane

#12Alvaro Herrera Munoz
alvherre@dcc.uchile.cl
In reply to: Malcolm Warren (#7)
Re: Using oids

On Wed, Sep 03, 2003 at 01:05:30PM +0200, Malcolm Warren wrote:

I agree with you about database design and in fact fortunately I don't use
oids as foreign keys, which I thought unwise. However I have found oids very
useful as temporary unique references to a record in my programming. If I
had known when I started writing my code three years ago that there was even
the slightest doubt about continuing with oids then I wouldn't have used
them.

You can create tables WITH OIDS (this is by default on 7.3, but will
probably changed in some future release). If you also create an unique
index on the oid column of the table, you have all you need. But beware
that some INSERTs will fail because the OID counter will wrap around at some
point. There is a non-zero probability that the newly generated OID will
collide with an existing tuple in that table; you have to be prepared to
repeat your query in that case, which can be a pain if you are doing
something else in the same transaction.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Lo esencial es invisible para los ojos" (A. de Saint Ex���pery)

#13Bo Lorentsen
bl@netgroup.dk
In reply to: Tom Lane (#11)
Re: Using oids

On Wed, 2003-09-03 at 16:13, Tom Lane wrote:

The reason OIDs shouldn't be considered unique is that there is no
mechanism to enforce that they are unique --- unless you make one,
that is, create a unique index on OID for a table. The system does
not do that for you since it would be excessive overhead for tables
in which the user doesn't care about OID uniqueness. But I'd
definitely recommend it if you are using OIDs for row identifiers.

Ok, so my little INSERT / SELECT show will continue to work for a long
time, as I only uses the oids on short term bacis.

If you want a globally unique ID based on OIDs, use the table OID
concatenated with the row OID.

Ok, this make sense !

No, there isn't. There is only ctid, which is not useful as a long-term
row identifier, because UPDATE and VACUUM can change it.

But there is no way for the client user to user these in a
"PQgetLastCtid" and "SELECT * from zxy where ctid = 42", so this will
not help :-)

Thanks anyway, may oid's live for a long time, and one day become grown
up 64 bit values :-)

/BL

#14Martijn van Oosterhout
kleptog@svana.org
In reply to: Bo Lorentsen (#9)
Re: Using oids

On Wed, Sep 03, 2003 at 01:47:01PM +0200, Bo Lorentsen wrote:

On Wed, 2003-09-03 at 13:19, Martijn van Oosterhout wrote:

The only thing you need to know is the name of the primary key field. This
many be a problem in a generic layer. If you like you can make a UNIQUE
INDEX on the oid column and retry inserts when they fail.

Hmm, it all end up putting alot of information to a lower layer, and
this is sad as PG already knows, but it may not tell me.

Well, in a sense it know and in a sense it doesn't. Sequences are not
considered special in terms of returning data to the client. It's just
another function from the parser's point of view.

In your code, do create an index on the OID column? If not, that's be a
performance hit,

I'm not sure what you mean !

If you know the OID of a row, PostgreSQL doesn't have a special lookup table
to find it. That's also why they're not unique; the backend would have to
scan through every table to find out if the next one is available.

So, unless you specifically add an index to the table, looking up by OID
will always trigger a sequential scan.

That said, there is no reason why someone couldn't create a last_sequence()
function so you could say SELECT currval( last_sequence() ). Ofcourse, if
your table has no SERIAL field, you're stuffed either way.
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

"All that is needed for the forces of evil to triumph is for enough good
men to do nothing." - Edmond Burke
"The penalty good people pay for not being interested in politics is to be
governed by people worse than themselves." - Plato

#15Dennis Gearon
gearond@fireserve.net
In reply to: Shridhar Daithankar (#2)
Re: Using oids

The elimination is in concert with the dying of popularity in 'Object
Oriented Databases', right?

Shridhar Daithankar wrote:

Show quoted text

On 3 Sep 2003 at 10:27, Malcolm Warren wrote:

To sum up: The Debian migration gzip file declares that oids are not
guaranteed to be unique, issues dire warnings about using them as keys and
worst of all states that they may be phased out in the future.

The book states that they are unique, tells you how to use them, actually
gives an example of using them as primary and foreign keys (which
fortunately I decided was not very wise) and certainly doesn't say anything
about phasing them out in the future.

Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
times. They default to be available for new objects but that is for backwards
compatibility I believe. In future, they would default to be not available for
a particular object(hopefully). Right now you need to explicitly specify no
oids while creating tables etc.

About oids not being unique, oids can assume 4 billion different values. If you
have more than those many rows in a table, oids will wrap around and will no
longer be unique in that object.

About oids being eliminated, I am sure it would happen some time in the future,
looking at the development on this issue. Core team could elaborate more on
this.

Correct me if I am wrong.

HTH

Bye
Shridhar

--
Nusbaum's Rule: The more pretentious the corporate name, the smaller the
organization. (For instance, the Murphy Center for the Codification of Human
and Organizational Law, contrasted to IBM, GM, and AT&T.)

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

#16Dennis Gearon
gearond@fireserve.net
In reply to: Oliver Elphick (#10)
Re: Using oids

Why is that, anyway, and why should it be?

Oliver Elphick wrote:

Show quoted text

On Wed, 2003-09-03 at 12:19, Martijn van Oosterhout wrote:

If I were doing it would extract the primary key of each table on startup
and then change that one line of code to:

os << "SELECT * FROM " << sTable << " WHERE "
<< prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

You cannot use currval() until you have used nextval() on the same
sequence in the same session.

#17Doug McNaught
doug@mcnaught.org
In reply to: Shridhar Daithankar (#2)
Re: Using oids

Dennis Gearon <gearond@fireserve.net> writes:

Oliver Elphick wrote:

You cannot use currval() until you have used nextval() on the same
sequence in the same session.

Why is that, anyway, and why should it be?

Because that's what currval() does. It doesn't have anything to do
with sequence values in other sessions. It gives you the last value
*you* got for thee sequence, so if you haven't called nextval() yet
you should and do get an error.

Maybe it should have been called lastval(), but that could be a bit
misleading too...

-Doug

#18Bo Lorentsen
bl@netgroup.dk
In reply to: Martijn van Oosterhout (#14)
Re: Using oids

On Wed, 2003-09-03 at 17:28, Martijn van Oosterhout wrote:

If you know the OID of a row, PostgreSQL doesn't have a special lookup table
to find it. That's also why they're not unique; the backend would have to
scan through every table to find out if the next one is available.

Ahh, thats not nice, hav'nt checked that, yet.

So, unless you specifically add an index to the table, looking up by OID
will always trigger a sequential scan.

I thought it was much more easy for PG to find these, but I quess ctid
are the one that is fast to find.

That said, there is no reason why someone couldn't create a last_sequence()
function so you could say SELECT currval( last_sequence() ). Ofcourse, if
your table has no SERIAL field, you're stuffed either way.

Not as nice as oid's.

/BL

#19Greg Stark
gsstark@mit.edu
In reply to: Bo Lorentsen (#9)
Re: Using oids

Bo Lorentsen <bl@netgroup.dk> writes:

If I were doing it would extract the primary key of each table on startup
and then change that one line of code to:

os << "SELECT * FROM " << sTable << " WHERE "
<< prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

Thanks, but I have to be aware of the "prikey" name, and demand a prikey
for all tables to insert row into :-(

This is an issue faced mostly by driver developers that want to provide high
level abstract interfaces.

The problem is that using OIDs is basically imposing a primary key on every
table even when the application designer didn't want one. They're mostly
redundant because most tables will have a primary key, wasteful for small
tables, and inadequate for large tables.

I don't like hard coding the assumption that the sequence name is based on the
primary key column name either though. Not every table will have a primary key
of "serial" type. Consider reference tables where the primary key is
non-arbitrary value. Even when it is, the sequence name can be truncated.

The new binary FE protocol included some discussion of API features to allow
drivers like JDBC get column information. I believe that API included an
indication of what the primary key column was. I'm not sure it includes a hook
to get the value of the last insertion, presumably via the sequence. If it
does I would think that would be far preferable to using OIDs.

The disadvantage: tables with no primary key whatsoever would couldn't be
supported by your high level abstraction. I only end up with tables with no
primary keys for many-to-many relationships (or one-to-many sets of immutable
data which amounts to the same thing) anyways. You want to insert, do mass
deletions, but never update such records anyways.

The pros: no extra overhead for OIDs, more portable to other databases.

--
greg

#20Bruno Wolff III
bruno@wolff.to
In reply to: Dennis Gearon (#16)
Re: Using oids

On Wed, Sep 03, 2003 at 08:46:42 -0700,
Dennis Gearon <gearond@fireserve.net> wrote:

Why is that, anyway, and why should it be?

Because it reduces contention by giving each backend its own pool
of sequence values. But until you call nextval a backend won't
have any values reserved.

#21Darko Prenosil
darko.prenosil@finteh.hr
In reply to: Bo Lorentsen (#13)
Re: Using oids

On Wednesday 03 September 2003 17:24, Bo Lorentsen wrote:

On Wed, 2003-09-03 at 16:13, Tom Lane wrote:

The reason OIDs shouldn't be considered unique is that there is no
mechanism to enforce that they are unique --- unless you make one,
that is, create a unique index on OID for a table. The system does
not do that for you since it would be excessive overhead for tables
in which the user doesn't care about OID uniqueness. But I'd
definitely recommend it if you are using OIDs for row identifiers.

Ok, so my little INSERT / SELECT show will continue to work for a long
time, as I only uses the oids on short term bacis.

If you want a globally unique ID based on OIDs, use the table OID
concatenated with the row OID.

Ok, this make sense !

No, there isn't. There is only ctid, which is not useful as a long-term
row identifier, because UPDATE and VACUUM can change it.

But there is no way for the client user to user these in a
"PQgetLastCtid" and "SELECT * from zxy where ctid = 42", so this will
not help :-)

Thanks anyway, may oid's live for a long time, and one day become grown
up 64 bit values :-)

Any other way, a lot of (my) code will become useless :-(

Regards !

#22Jonathan Bartlett
johnnyb@eskimo.com
In reply to: Shridhar Daithankar (#2)
Re: Using oids

About oids not being unique, oids can assume 4 billion different values. If you
have more than those many rows in a table, oids will wrap around and will no
longer be unique in that object.

Not quite. After 4 billion inserts (even spread across millions of
tables), you run out of OIDs and they will no longer be unique. OIDs I
think were originally meant to be globally unique identifiers, but they
are no longer so, and are really no longer useful. When I want globally
unique identifiers, I use an int8 column + sequence.

Jon

#23Jonathan Bartlett
johnnyb@eskimo.com
In reply to: Malcolm Warren (#5)
Re: Using oids

If we are saying that there will always be backwards compatibility then I
can sleep at night.
If we are saying that at some point oids will exist no more then all of a
sudden I am going to have to re-write an awful lot of code.

Why? Couldn't you do alter table X add column oid int; alter table X
alter oid set default nextval('oids'); create sequence oids ...

Jon

Show quoted text

Malcolm Warren

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#24Jonathan Bartlett
johnnyb@eskimo.com
In reply to: Bo Lorentsen (#9)
Re: Using oids

No it don't know anything about the table it insert into. I simply do
the following :

1. INSERT data (comming from another layer)
2. Get the last oid
3. SELECT * FROM the same table where oid = what I just found.

I know absolutly nothing about the table, and I like it this way :-)

The way I do it is to have a global sequence called 'objects' that spits
out 64-bit values, and then EVERY TABLE has a 64-bit field called
object_id, which defaults to nextval('objects') if I don't specify it.
So, on every table no matter what, I could do:

1. select nextval('objects');
2. INSERT data (comming from another layer, but set object_id to the
value I got in #1)
3. SELECT * FROM the same table where oid = what I just selected in #1

Jon

#25Bo Lorentsen
bl@netgroup.dk
In reply to: Greg Stark (#19)
Re: Using oids

On Wed, 2003-09-03 at 18:24, Greg Stark wrote:

I am maintaining a driver layer, so that exactly why I care about this
kind of problem :-)

The pros: no extra overhead for OIDs, more portable to other databases.

So when will this kind of meta data be provided, to let me detect if
there is at "PRIMARY KEY" on a table. ?

/BL

#26Jonathan Bartlett
johnnyb@eskimo.com
In reply to: Bo Lorentsen (#13)
Re: Using oids

If you want a globally unique ID based on OIDs, use the table OID
concatenated with the row OID.

Ok, this make sense !

Are you sure this works after you hit the 4 billion mark?

#27Bo Lorentsen
bl@netgroup.dk
In reply to: Jonathan Bartlett (#26)
Re: Using oids

On Wed, 2003-09-03 at 22:12, Jonathan Bartlett wrote:

Are you sure this works after you hit the 4 billion mark?

As long as the returened oid is unique on the table in current session !

/BL

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jonathan Bartlett (#26)
Re: Using oids

Jonathan Bartlett <johnnyb@eskimo.com> writes:

If you want a globally unique ID based on OIDs, use the table OID
concatenated with the row OID.

Ok, this make sense !

Are you sure this works after you hit the 4 billion mark?

If you have a unique index on OID on each table for which you care, yes.

As someone else pointed out, you do then have to cope with the
possibility of insertions failing because of OID conflicts.

regards, tom lane

#29Ashley Cambrell
ash@freaky-namuh.com
In reply to: Martijn van Oosterhout (#14)
Re: [HACKERS] Using oids

That said, there is no reason why someone couldn't create a last_sequence()
function so you could say SELECT currval( last_sequence() ). Ofcourse, if
your table has no SERIAL field, you're stuffed either way.

Instead of SELECT currval( last_sequence() ), what about implementing
oracl type binding?

Ala
http://groups.google.com.au/groups?hl=en&amp;lr=&amp;ie=UTF-8&amp;oe=UTF-8&amp;th=5cc63a569577d024#link2

With the new FE/BE changes, how easy would it be to implement? (I
mentioned it the FE/BE discussions)

Ashley Cambrell

#30Martijn van Oosterhout
kleptog@svana.org
In reply to: Bo Lorentsen (#25)
Re: Using oids

On Wed, Sep 03, 2003 at 10:08:47PM +0200, Bo Lorentsen wrote:

On Wed, 2003-09-03 at 18:24, Greg Stark wrote:

I am maintaining a driver layer, so that exactly why I care about this
kind of problem :-)

The pros: no extra overhead for OIDs, more portable to other databases.

So when will this kind of meta data be provided, to let me detect if
there is at "PRIMARY KEY" on a table. ?

That metadata has been around for a while, it's all stored in the system
tables. It's not returned with each query though. pg_attribute and pg_class
tell you almost everything you need to know.

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

"All that is needed for the forces of evil to triumph is for enough good
men to do nothing." - Edmond Burke
"The penalty good people pay for not being interested in politics is to be
governed by people worse than themselves." - Plato