OID wraparound: summary and proposal
At the same time that we announce support for optional OIDs,
we should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.
... if an appropriate unique constraint is explicitly created.
Seems reasonable --- that will give people notice that we're thinking
about separate-OID-generator-per-table ideas.
Imho we should think about adding other parts to the external representation
of OID before we start thinking about moving from 4 to 8 bytes in the heap.
Essentially the oid would then be a concatenated e.g. 16 byte number,
that is constructed with:
oid128 = installation oid<<96 + class oid<<64 + for_future_use<<32 + tuple oid
Imho walking that direction would serve the "OID" idea a lot better,
and could actually guarantee a globally unique oid, if the "installation
oid" was centrally managed.
It has the additional advantage of knowing the class by only looking at the oid.
The btree code could be specially tuned to only consider the lower 4(or 8) bytes
on insert and make an early exit for select where oid = wrong class id.
Andreas
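For illustration, the 16-byte layout Andreas describes above can be sketched in a few lines of Python. This is only one reading of the formula, assuming each of the four parts is an unsigned 32-bit value; the names pack_oid128 and unpack_oid128 are made up for this sketch and are not part of PostgreSQL.

```python
MASK32 = 0xFFFFFFFF  # assumption: every field is an unsigned 32-bit value


def pack_oid128(installation_oid, class_oid, tuple_oid, for_future_use=0):
    """Concatenate four 32-bit fields into one 128-bit integer."""
    for value in (installation_oid, class_oid, for_future_use, tuple_oid):
        if not 0 <= value <= MASK32:
            raise ValueError("each field must fit in 32 bits")
    return (
        (installation_oid << 96)
        | (class_oid << 64)
        | (for_future_use << 32)
        | tuple_oid
    )


def unpack_oid128(oid128):
    """Split a 128-bit value into (installation, class, future, tuple)."""
    return (
        (oid128 >> 96) & MASK32,
        (oid128 >> 64) & MASK32,
        (oid128 >> 32) & MASK32,
        oid128 & MASK32,
    )


# Hypothetical values, chosen only to exercise the functions.
oid = pack_oid128(installation_oid=7, class_oid=16384, tuple_oid=123456)

# The class can be read back from the combined value alone.
class_of_oid = unpack_oid128(oid)[1]

# The 16-byte external form, most significant field first.
external = oid.to_bytes(16, "big")
```

Because the tuple oid occupies the low 32 bits, code that only cares about the per-table part can mask it out with oid & MASK32, which is the kind of shortcut the btree remark hints at.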
I would just like to comment that for our project, GNU Enterprise, we
use our own 128 bit object ID that is unique (UUID) for every row in
all tables.
It seems to me, without having looked into it, that having both a
PostgreSQL UID and our own 128 bit objectid (UUID) is redundant and
slows the whole process down. But we are storing object data in the
database and require an absolutely unique objectid. We are planning
for enterprise usage and expect to need 128 bits to uniquely define
our objects.
So I would request strongly that we have an option for a 128 bit
unique id for all rows in the database and/or that it is configurable
so we can best decide how to use it. We would like either to use our
own and have the PostgreSQL uid fast and small, or to have it larger
and slower but remove the need to generate our own uid.
Neil
neilt@gnue.org
GNU Enterprise
http://www.gnuenterprise.org/
http://www.gnuenterprise.org/~neilt/sc.html
At 10:17 AM +0200 8/3/01, Zeugswetter Andreas SB wrote:
At the same time that we announce support for optional OIDs,
we should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.
... if an appropriate unique constraint is explicitly created.
Seems reasonable --- that will give people notice that we're thinking
about separate-OID-generator-per-table ideas.
Imho we should think about adding other parts to the external representation
of OID before we start thinking about moving from 4 to 8 bytes in the heap.
Essentially the oid would then be a concatenated e.g. 16 byte number,
that is constructed with:
oid128 = installation oid<<96 + class oid<<64 + for_future_use<<32 + tuple oid
Imho walking that direction would serve the "OID" idea a lot better,
and could actually guarantee a globally unique oid, if the "installation
oid" was centrally managed.
It has the additional advantage of knowing the class by only looking at the oid.
The btree code could be specially tuned to only consider the lower 4(or 8) bytes
on insert and make an early exit for select where oid = wrong class id.
Neil Tiffin wrote:
I would just like to comment that for our project, GNU Enterprise, we
use our own 128 bit object ID that is unique (UUID) for every row in
all tables.
It seems to me, without having looked into it, that having both a
PostgreSQL UID and our own 128 bit objectid (UUID) is redundant and
slows the whole process down. But we are storing object data in the
database and require an absolutely unique objectid. We are planning
for enterprise usage and expect to need 128 bits to uniquely define
our objects.
Is it just a 128-bit int from a sequence or does it have some internal
structure?
What kind of enterprise do you expect to have more than the
18 446 744 073 709 551 615 objects that can be uniquely identified
by 64 bits?
-------------
Hannu
At 10:09 AM +0500 8/7/01, Hannu Krosing wrote:
Neil Tiffin wrote:
I would just like to comment that for our project, GNU Enterprise, we
use our own 128 bit object ID that is unique (UUID) for every row in
all tables.
It seems to me, without having looked into it, that having both a
PostgreSQL UID and our own 128 bit objectid (UUID) is redundant and
slows the whole process down. But we are storing object data in the
database and require an absolutely unique objectid. We are planning
for enterprise usage and expect to need 128 bits to uniquely define
our objects.
Is it just a 128-bit int from a sequence or does it have some internal
structure?
What kind of enterprise do you expect to have more than the
18 446 744 073 709 551 615 objects that can be uniquely identified
by 64 bits?
Our objectid is a UUID from libuuid (provided by e2fsprogs; it requires
the development files, and the Debian package uuid-dev provides all the
necessary files). We use the text representation, which IIRC is 32
characters (36 minus the "-"), to store it in the database. (And I don't
think this is the best way to do it.) As for 64 bits being enough, you may
just be right. Our developer that did this part of the code has left
(and we are taking the opportunity to examine this).
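To make the storage cost of the text representation concrete, here is a minimal sketch using Python's standard uuid module as a stand-in for libuuid. It only compares the sizes of the usual representations of one UUID; how the value would actually be stored in a column is not shown and is left open here.

```python
import uuid

# A random (version 4) UUID, standing in for one produced by libuuid.
object_id = uuid.uuid4()

text_with_hyphens = str(object_id)      # canonical 8-4-4-4-12 hex form
text_without_hyphens = object_id.hex    # hex digits only
raw_bytes = object_id.bytes             # the 128-bit value itself

sizes = {
    "text with hyphens": len(text_with_hyphens),
    "text without hyphens": len(text_without_hyphens),
    "raw bytes": len(raw_bytes),
}
for label, size in sizes.items():
    print(f"{label}: {size}")

# The raw form loses nothing: it converts back to the same UUID.
round_trip = uuid.UUID(bytes=raw_bytes)
```

The text form takes twice the space of the raw 16 bytes, which supports the suspicion that storing text is not the best way to do it.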
We will eventually compete with SAP, PeopleSoft etc. Consider that
SAP has about 20,000 tables to represent an enterprise, add a system
life of 10 years, and you start to knock down the number
very fast.
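As a rough check on how fast the number gets knocked down, the figures above can be run through a back-of-the-envelope calculation. The even split of the 64-bit space across tables and the 365-day year are assumptions made for this sketch, not something stated in the thread.

```python
ID_SPACE = 2 ** 64                      # distinct 64-bit identifiers
TABLES = 20_000                         # table count quoted above for SAP
YEARS = 10                              # system lifetime quoted above
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: leap days ignored

# Assumption: the id space is divided evenly between all tables.
ids_per_table = ID_SPACE // TABLES

# Sustained insert rate per table that would exhaust its share in YEARS.
ids_per_table_per_second = ids_per_table // (YEARS * SECONDS_PER_YEAR)

print(f"{ids_per_table:,} ids per table")
print(f"{ids_per_table_per_second:,} ids per table per second")
```

Under those assumptions each table could still consume roughly 2.9 million ids every second for the full ten years, so the 64-bit space shrinks more slowly than the table count alone suggests.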
I think in the short term we could be happy with a 64 bit id, as we
don't even have our first application working (but we are within a
couple of months) and it will be years before we have a system that
will perform in large scale environments.
In either case the perfect solution, for us, would be to be able to
configure the PostgreSQL uid as none, 64 bit or 128 bit uid at
compile time. A default of 64 bits would be just fine. But we need
to have the uid unique for the database or we will still have to
create and use our own uid (and that will slow everything down).
I have not even considered multiple database servers running
different databases, which is our design goal. In this case we would
like to have a slimmed down (and blazingly fast) PostgreSQL server in
which we manage the uid in our middleware. This is because the uid
must be unique across all servers and database vendors. (I don't
claim to be a database guru, so if we are all wet here please feel
free to help correct our misunderstandings.)
--
Neil
Neil Tiffin <ntiffin@earthlink.net> writes:
I have not even considered multiple database servers running
different databases, which is our design goal. In this case we would
like to have a slimmed down (and blazingly fast) PostgreSQL server in
which we manage the uid in our middleware. This is because the uid
must be unique across all servers and database vendors.
Given those requirements, it seems like your UID *must* be an
application-defined column; there's no way you'll get a bunch of
different database vendors to all sign on to your approach to UIDs.
So in reality, I think the feature you want is precisely to be able
to suppress Postgres' automatic OID generation on your table(s), since
it's of no value to you. The number of cycles saved per insert isn't
going to be all that large, but they'll add up...
regards, tom lane
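A minimal sketch of what an application-defined UID column looks like from the middleware side follows. SQLite (via Python's standard sqlite3 module) is used purely as a stand-in backend because it needs no server; the table name customer and the helper new_uid are hypothetical, and suppressing PostgreSQL's own OID generation is not shown.

```python
import sqlite3
import uuid


def new_uid():
    """Generate the identifier in the application, not in the database."""
    return uuid.uuid4().bytes  # 16 raw bytes


# Assumption: an in-memory SQLite database stands in for any vendor's server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (uid BLOB PRIMARY KEY, name TEXT NOT NULL)"
)

# The middleware supplies the key itself on every insert.
uid = new_uid()
conn.execute("INSERT INTO customer (uid, name) VALUES (?, ?)", (uid, "Acme"))

# Lookups use the application-defined column like any other primary key.
row = conn.execute(
    "SELECT name FROM customer WHERE uid = ?", (uid,)
).fetchone()
conn.close()
```

Since the key is an ordinary column filled in by the application, nothing in this pattern depends on a vendor-specific row identifier, which is the point of the argument above.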
At 11:22 AM -0400 8/7/01, Tom Lane wrote:
Neil Tiffin <ntiffin@earthlink.net> writes:
I have not even considered multiple database servers running
different databases, which is our design goal. In this case we would
like to have a slimmed down (and blazingly fast) PostgreSQL server in
which we manage the uid in our middleware. This is because the uid
must be unique across all servers and database vendors.
Given those requirements, it seems like your UID *must* be an
application-defined column; there's no way you'll get a bunch of
different database vendors to all sign on to your approach to UIDs.
So in reality, I think the feature you want is precisely to be able
to suppress Postgres' automatic OID generation on your table(s), since
it's of no value to you. The number of cycles saved per insert isn't
going to be all that large, but they'll add up...
That sounds about right. It's amazing how having to write this stuff
down clarifies one's thoughts.
--
Neil
Neil Tiffin wrote:
I have not even considered multiple database servers running different
databases, which is our design goal. In this case we would like to have
a slimmed down (and blazingly fast) PostgreSQL server in which we manage
the uid in our middleware. This is because the uid must be unique
across all servers and database vendors. (I don't claim to be a
database guru, so if we are all wet here please feel free to help
correct our misunderstandings.)
I am not 100% sure, but I would believe that the
oid/uid/whatever_we_call_it only has to be unique within the table.
At least as long as you don't exchange data between different databases.
As soon as you transfer data from db a to db b, it's good to have an
object id that is unique in the world.
--
Reinhard Mueller
GNU Enterprise project
http://www.gnue.org
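The difference between per-table uniqueness and world-wide uniqueness shows up as soon as rows from two databases are put together. The sketch below is a toy model under stated assumptions: plain dictionaries stand in for tables, and the helpers make_rows and uuid_source are hypothetical names used only here.

```python
import itertools
import uuid


def make_rows(names, id_source):
    """Assign the next id from id_source to each name, as one database would."""
    return {next(id_source): name for name in names}


def uuid_source():
    """Endless supply of independently generated random UUIDs."""
    while True:
        yield uuid.uuid4()


# Two independent databases, each numbering its own rows from 1.
db_a = make_rows(["invoice", "order"], itertools.count(1))
db_b = make_rows(["customer", "payment"], itertools.count(1))
colliding_ids = db_a.keys() & db_b.keys()

# The same data keyed by UUIDs generated separately on each side.
db_a_uuid = make_rows(["invoice", "order"], uuid_source())
db_b_uuid = make_rows(["customer", "payment"], uuid_source())
colliding_uuids = db_a_uuid.keys() & db_b_uuid.keys()

# Merging is only safe when no key appears on both sides.
merged = {**db_a_uuid, **db_b_uuid}
```

With local counters every id in db_a also exists in db_b, so a transfer from one to the other would need renumbering; with random UUIDs a clash is possible in principle but so unlikely that the merge keeps all four rows.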