looking for a globally unique row ID

Started by Rafal Pietrak over 8 years ago · 46 messages · pgsql-general
#1Rafal Pietrak
rafal@ztk-rp.eu

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

An obvious candidate - a single SERIAL (the same sequence) used in every
table that needs that ID - does not guarantee (2).

On the other hand, having a dedicated table just for the purpose of
holding that UNIQUE column, with all the "client tables" having an FK
pointing to it, does not guarantee that multiple "client tables" won't use
the same ID from that table. ... and I don't know of a way to make an FK
"backfire" from that table into a multitude of client tables.

I thought of using OIDs on tables that need this unique ID ... but
querying the web returns warnings about the OID "rollover" problem. As I'm
not quite sure how to manage that, nor whether (3) is guaranteed (rather
than just being lucky during a couple of tests), I think this is not an
option here.

I'd appreciate any hints, as I've been banging my head against this
problem for a long time now.

Thanks in advance,

-R

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2John R Pierce
pierce@hogranch.com
In reply to: Rafal Pietrak (#1)
Re: looking for a globally unique row ID

On 9/14/2017 12:45 AM, Rafal Pietrak wrote:

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

Isn't that the problem that GUIDs are supposed to answer?

--
john r pierce, recycling bits in santa cruz


#3Michael Paquier
michael@paquier.xyz
In reply to: Rafal Pietrak (#1)
Re: looking for a globally unique row ID

On Thu, Sep 14, 2017 at 4:45 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.
[...]
I'd appreciate any hints, as I'm bitting my head against this problem
for a long time, now.

A UUID would give you some room for requirement 1; 2 and 3 can be covered
by the use of a serial column.
--
Michael


#4George Neuner
gneuner2@comcast.net
In reply to: Rafal Pietrak (#1)
Re: looking for a globally unique row ID

On Thu, 14 Sep 2017 09:45:59 +0200, Rafal Pietrak <rafal@ztk-rp.eu>
wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

UUID is the obvious choice, but it does take a lot of space.

Something like this might do the job:
http://rob.conery.io/2014/05/28/a-better-id-generator-for-postgresql/

George


#5Rafal Pietrak
rafal@ztk-rp.eu
In reply to: George Neuner (#4)
Re: looking for a globally unique row ID

W dniu 14.09.2017 o 10:57, George Neuner pisze:

On Thu, 14 Sep 2017 09:45:59 +0200, Rafal Pietrak <rafal@ztk-rp.eu>
wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

Seeing the answers, I feel I should probably have added:

4. It does not need to guarantee "planetary-wide" uniqueness. Meaning:
backup/restore is expected to duplicate those IDs on a second
instance of the database.

UUID is the obvious choice, but it does take a lot of space.

I was hoping for something like a database-scoped "primary key" - which
in particular does not need to be anything big... provided the dataset
is small.

As far as I can tell, a UUID is an ID that is "simple/fast" to generate
and has an "extremely low" probability of collisions.

Instead I was looking for mechanisms/SQL idioms which don't have to be
particularly efficient, but where, once an ID is generated, its uniqueness
is assured by the database no matter what. Including UPDATEs, i.e.
assignment of a completely new ID to a particular row.

But I understand I may quit searching - there is nothing "so simple".

Something like this might do the job:
http://rob.conery.io/2014/05/28/a-better-id-generator-for-postgresql/

I'll have a look, but this is not quite the tool I'm looking for.

George

Thank you all for answers.

-R


#6vinny
vinny@xs4all.nl
In reply to: Rafal Pietrak (#5)
Re: looking for a globally unique row ID

On 2017-09-14 15:06, Rafal Pietrak wrote:

W dniu 14.09.2017 o 10:57, George Neuner pisze:

On Thu, 14 Sep 2017 09:45:59 +0200, Rafal Pietrak <rafal@ztk-rp.eu>
wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database
backup/restore/upgrade.

Seeing the answers, I feel I should probably have added:

4. It does not need to guarantee "planetary-wide" uniqueness. Meaning:
backup/restore is expected to duplicate those IDs on a second
instance of the database.

UUID is the obvious choice, but it does take a lot of space.

I was hoping for something like a database-scoped "primary key" - which
in particular does not need to be anything big.... provided the dataset
is small.

As far as I can tell, a UUID is an ID that is "simple/fast" to generate
and has an "extremely low" probability of collisions.

Instead I was looking for mechanisms/SQL idioms which don't have to be
particularly efficient, but where, once an ID is generated, its uniqueness
is assured by the database no matter what. Including UPDATEs, i.e.
assignment of a completely new ID to a particular row.

But I understand I may quit searching - there is nothing "so simple".

If it is only one database, on one server, then couldn't you just use
one sequence? If you prefix the value with some identifier of the current
table then you cannot get duplicates across tables, even if you reset the
sequence.
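
A minimal sketch of vinny's idea, assuming hypothetical table names and
using the high decimal digits of a bigint as the per-table tag:

```sql
-- One shared sequence; each table folds a distinct tag into the value,
-- so even resetting the sequence cannot produce a cross-table duplicate.
CREATE SEQUENCE shared_seq;

CREATE TABLE invoices (
    -- tag 1 in the high digits, sequence value in the low digits
    id bigint PRIMARY KEY DEFAULT 1 * 10000000000 + nextval('shared_seq')
);

CREATE TABLE letters (
    id bigint PRIMARY KEY DEFAULT 2 * 10000000000 + nextval('shared_seq')
);
```

As with a plain shared sequence, this only guards values produced by the
DEFAULT; it does not stop an application from writing an arbitrary id
explicitly.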


#7Merlin Moncure
mmoncure@gmail.com
In reply to: Rafal Pietrak (#1)
Re: looking for a globally unique row ID

On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (e.i. during future UPDATES).

3. guarantees persistence of value across database backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

A shared sequence meets all of those requirements. I tend to prefer
this over GUIDs. GUIDs are twice the size, and randomized, which has
some unpleasant performance characteristics. The advantage they
bring is being able to set up multiple generation points (say, across
multiple database servers) without fear of conflict.

Sequence generation is very fast, so there is no performance-related
argument against the approach.
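
A minimal sketch of the shared-sequence setup Merlin describes (table
names are hypothetical):

```sql
CREATE SEQUENCE global_id;

CREATE TABLE invoices (
    id    bigint PRIMARY KEY DEFAULT nextval('global_id'),
    total numeric
);

CREATE TABLE letters (
    id   bigint PRIMARY KEY DEFAULT nextval('global_id'),
    body text
);
-- Both tables draw ids from the one counter, so freshly generated ids
-- never collide across tables. Note it does not, by itself, stop an
-- UPDATE (or an explicit INSERT) from writing a duplicate id.
```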

merlin


#8Karl Czajkowski
karlcz@isi.edu
In reply to: vinny (#6)
Re: looking for a globally unique row ID

On Sep 14, vinny modulated:

If it is only one database, on one server, then couldn't you just
use one sequence? If you prefix the value with some identifier of the
current table then you cannot get duplicates across tables, even if
you reset the sequence.

I didn't follow the whole thread, so I apologize if I'm repeating
earlier suggestions.

We use a shared sequence to issue the new identifiers, and in fact
limited the sequence to 43 bits so they can serialize as JSON numbers
for the benefit of our clients. We disabled wrapping, so it will fail
if we exhaust the range.

If you rapidly churn through identifiers and could envision exhausting
64 bits in your database's lifetime, you should probably just use
UUIDs instead of a sequence. A timestamp-based UUID still has
reasonable sorting and indexing properties.

To "guarantee" uniqueness with a shared sequence or UUID generator,
you can use a trigger to prevent override of identifiers from SQL. As
long as you always use the correct value generator during INSERT and
disallow mutation of identifiers during UPDATE, the rows will not
share identifiers.

Karl


#9Steve Atkins
steve@blighty.com
In reply to: Karl Czajkowski (#8)
Re: looking for a globally unique row ID

On Sep 14, 2017, at 8:38 AM, Karl Czajkowski <karlcz@isi.edu> wrote:

On Sep 14, vinny modulated:

If it is only one database, on one server, then couldn't you just
use one sequence? If you prefix the value with some identifier of the
current table then you cannot get duplicates across tables, even if
you reset the sequence.

I didn't follow the whole thread, so I apologize if I'm repeating
earlier suggestions.

We use a shared sequence to issue the new identifiers, and in fact
limited the sequence to 43 bits so they can serialize as JSON numbers
for the benefit of our clients. We disabled wrapping, so it will fail
if we exhaust the range.

If you rapidly churn through identifiers and could envision exhausting
64 bits in your database's lifetime, you should probably just use

2^63 nanoseconds is about three centuries.

Unless you need to generate identifiers in multiple places, a simple
bigserial is good enough. (If you do need to generate unique identifiers
at facebook / instagram / twitter scale then there are other options, but
you're almost certainly not that big.)

For distributed ids on a system you control there are a bunch of 64 bit
id generation algorithms that work well. Twitter snowflake was one of the
earlier ones.

Where UUIDs or GUIDs shine is when you want to be able to generate
ids with a reasonably guarantee that nobody else, anywhere on the planet
or off, ever has or ever will generate the same ID. If you're not in that
situation you don't really need the behaviour they try to guarantee.

UUIDs instead of a sequence. A timestamp-based UUID still has
reasonable sorting and indexing properties.

To "guarantee" uniqueness with a shared sequence or UUID generator,
you can use a trigger to prevent override of identifiers from SQL. As
long as you always use the correct value generator during INSERT and
disallow mutation of identifiers during UPDATE, the rows will not
share identifiers.

Cheers,
Steve


#10Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Merlin Moncure (#7)
Re: looking for a globally unique row ID

W dniu 14.09.2017 o 15:54, Merlin Moncure pisze:

On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

A shared sequence meets all of those requirements. I tend to prefer

Not really.

As I said, I'm not looking for performance or "fair probability" of
planetary-wide uniqueness.

My main objective is the "guarantee", which I've tried to indicate by
referring to "future UPDATEs".

What I mean here is functionality similar to a "primary key" or "unique
constraint". Whenever somebody (an application - importantly, a faulty
application!) tries to INSERT or UPDATE a non-unique value there (a value
which could well have been generated earlier by a UUID algorithm, or even
a sequence), and that value already exists in some table that uses that
(mysterious) "global primary key", the application just fails the
transaction, like any other "not unique" constraint failure.

That's the goal.

A multitude of tables using a single sequence does not give that guarantee.

As I've said, the solution closest to my target is a separate table with
just one column holding that "global primary key", which gets
inserted/updated by triggers on insert/update of the "client tables" ...
only I'm not so sure how to "cleanly" manage a multitude of tables using
the same key from that "global table of keys" - that is, its "back
references".

So I'm stuck with a seriously incomplete solution.

That's why I have the impression that I'm going in entirely the wrong
direction here.

this over GUID. GUIDs are twice the size, and randomized which has
some unpleasant performance characteristics. The advantage they
bring is being able to set up multiple generation points (say, across
multiple database servers) without fear of conflict.

Sequence generation is very fast so there is no performance related
argument not to use the approach.

merlin


#11Rob Sargent
robjsargent@gmail.com
In reply to: Rafal Pietrak (#10)
Re: looking for a globally unique row ID

On 09/14/2017 11:11 AM, Rafal Pietrak wrote:

Not really.

As I said, I'm not looking for performance or "fair probability" of
planetary-wide uniqueness.

My main objective is the "guarantee". Which I've tried to indicate
referring to "future UPDATEs".

What I mean here is functionality similar to a "primary key" or "unique
constraint". Whenever somebody (an application - importantly, a faulty
application!) tries to INSERT or UPDATE a non-unique value there (a value
which could well have been generated earlier by a UUID algorithm, or even
a sequence), and that value already exists in some table that uses that
(mysterious) "global primary key", the application just fails the
transaction, like any other "not unique" constraint failure.

That's the goal.

A multitude of tables using a single sequence does not give that guarantee.

As I've said, the solution closest to my target is a separate table with
just one column holding that "global primary key", which gets
inserted/updated by triggers on insert/update of the "client tables" ...
only I'm not so sure how to "cleanly" manage a multitude of tables using
the same key from that "global table of keys" - that is, its "back
references".

So I'm stuck with a seriously incomplete solution.

That's why I have the impression that I'm going in entirely the wrong
direction here.

So you care if the same id is used in separate, unrelated tables? What's
your fear here? And I completely get the confusion generated by the
same small integer being re-used in various contexts ("sample id" is the
bane for me). Could you use a sufficiently accurate time value?


#12Kenneth Marshall

In reply to: Rafal Pietrak (#10)
Re: looking for a globally unique row ID

On Thu, Sep 14, 2017 at 07:11:19PM +0200, Rafal Pietrak wrote:

As I said, I'm not looking for performance or "fair probability" of
planetary-wide uniqueness.

My main objective is the "guarantee". Which I've tried to indicate
referring to "future UPDATEs".

What I mean here is functionality similar to a "primary key" or "unique
constraint". Whenever somebody (an application - importantly, a faulty
application!) tries to INSERT or UPDATE a non-unique value there (a value
which could well have been generated earlier by a UUID algorithm, or even
a sequence), and that value already exists in some table that uses that
(mysterious) "global primary key", the application just fails the
transaction, like any other "not unique" constraint failure.

That's the goal.

That's why I have the impression that I'm going in entirely the wrong
direction here.

Hi Rafal,

How many tables do you need to support a unique key across? My approach
to problems like this is to provide constraints that will allow normal
DB functions to provide these assurances. For example, give each table
its own sequence generator as a primary key, but make the sequences
disjoint. There are 9,592 prime numbers less than 100,000. Give each
table one of those as the increment, and within that table you will never
hit a sequence value generated for a second table. This will at least
allow you to prevent any table from ever using the value for another
table. Obviously, this may not fit your use case, but it provides another
way to attack the problem. Good luck.
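
Ken's scheme might be sketched as follows. Note Gavin's caveat later in
the thread: prime *increments* alone do not prevent collisions (an
increment-2 and an increment-3 sequence both produce 6), so the sketch
below uses the closely related classic variant - every sequence gets the
*same* increment N but a distinct start offset - which does partition the
integers. Table names are hypothetical:

```sql
-- Two tables sharing the id space: with increment 3 and offsets 1 and 2,
-- table_a emits 1, 4, 7, ... and table_b emits 2, 5, 8, ... - provably
-- disjoint sets (distinct residues mod 3).
CREATE SEQUENCE table_a_seq START WITH 1 INCREMENT BY 3;
CREATE SEQUENCE table_b_seq START WITH 2 INCREMENT BY 3;

CREATE TABLE table_a (
    id bigint PRIMARY KEY DEFAULT nextval('table_a_seq')
);
CREATE TABLE table_b (
    id bigint PRIMARY KEY DEFAULT nextval('table_b_seq')
);
```

This still only guards freshly generated values; nothing here stops an
UPDATE from writing an id belonging to the other table's range.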

Regards,
Ken


#13Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Rob Sargent (#11)
Re: looking for a globally unique row ID

W dniu 14.09.2017 o 19:30, Rob Sargent pisze:

On 09/14/2017 11:11 AM, Rafal Pietrak wrote:

[------------------]

So I'm stuck with a seriously incomplete solution.

That's why I have the impression that I'm going in entirely the wrong
direction here.

So you care if the same id is used in separate, unrelated tables? What's
your fear here? And I completely get the confusion generated by the

You can call it fear, but I see it as an attempt at "schema/design
resilience".

Like with a "unique constraint": you lay out the schema, and whatever bug
gets planted into an application, the database does not let you put
inconsistent data there.

And answering your question, my problem is actually trivial. I need this
to systematically cover document identification within an archive. There
are a couple of document "classes" I need to cover; their representation
in the DB requires different columns, so they are registered in
different tables. I started the project with one table for all those
classes, and it quite soon became monstrous.

So every document class got its own table (and its private unique ID).

But the problem arises when I need to reference those documents along
their lifetime. Every class needs a separate table for that. I cannot
have a single "history" table (document-ID, dispatch-date, recipient), as
I'm not able to make an FK from a single table into a multitude of tables
(with separate classes). So the forest of tables grows.

The above schema would get significantly simpler if only I could have a
"global primary key"... or something equivalent.

And as I said, this is not a problem of performance (or identification
space exhaustion). It's a problem of "schema explosion". Something that
is actually quite simple becomes complex (if not unmanageable) just
because I cannot see a good tool to manage the "globality".

I've been kicking this problem around for some time now - rewriting the
basic schema numerous times, and:

1. either I get a monstrous "document master table" - which very
effectively leads to content inconsistency (documents get attributes
which don't belong to their classes),

2. or the schema explodes: documents stored in smaller (fewer
columns) specialized by-class tables all need their private
"management tables" which FK into their respective document tables.

Both ways turned out to be quite expensive in maintenance.

Throwing in actual numbers: 12 basic classes of documents; 17 tables
registering various operations a document may undergo during its lifetime.
Variant (2) above makes it 12*17 = 204 tables, which I'm currently
maintaining... and it's too much. With variant (1) I simply wasn't able
to effectively keep document attributes consistent.

Thus I'm searching for tools (paradigms/SQL idioms) that would fit the
problem.

same small integer being re-used in various contexts ("sample id" is the
bane for me). Could you use a sufficiently accurate time value?

But thank you all for answers.

-R


#14Rob Sargent
robjsargent@gmail.com
In reply to: Rafal Pietrak (#13)
Re: looking for a globally unique row ID

On 09/14/2017 02:39 PM, Rafal Pietrak wrote:

W dniu 14.09.2017 o 19:30, Rob Sargent pisze:

On 09/14/2017 11:11 AM, Rafal Pietrak wrote:

[------------------]

So I'm stuck with a seriously incomplete solution.

That's why I have the impression that I'm going in entirely the wrong
direction here.

So you care if the same id is used in separate, unrelated tables? What's
your fear here? And I completely get the confusion generated by the

You can call it fear, but I see it as an attempt at "schema/design
resilience".

Like with a "unique constraint": you lay out the schema, and whatever bug
gets planted into an application, the database does not let you put
inconsistent data there.

And answering your question, my problem is actually trivial. I need this
to systematically cover document identification within an archive. There
are a couple of document "classes" I need to cover; their representation
in the DB requires different columns, so they are registered in
different tables. I started the project with one table for all those
classes, and it quite soon became monstrous.

So every document class got its own table (and its private unique ID).

But the problem arises when I need to reference those documents along
their lifetime. Every class needs a separate table for that. I cannot
have a single "history" table (document-ID, dispatch-date, recipient), as
I'm not able to make an FK from a single table into a multitude of tables
(with separate classes). So the forest of tables grows.

The above schema would get significantly simpler if only I could have a
"global primary key"... or something equivalent.

And as I said, this is not a problem of performance (or identification
space exhaustion). It's a problem of "schema explosion". Something that
is actually quite simple becomes complex (if not unmanageable) just
because I cannot see a good tool to manage the "globality".

I've been kicking this problem around for some time now - rewriting the
basic schema numerous times, and:

1. either I get a monstrous "document master table" - which very
effectively leads to content inconsistency (documents get attributes
which don't belong to their classes),

2. or the schema explodes: documents stored in smaller (fewer
columns) specialized by-class tables all need their private
"management tables" which FK into their respective document tables.

Both ways turned out to be quite expensive in maintenance.

Throwing in actual numbers: 12 basic classes of documents; 17 tables
registering various operations a document may undergo during its lifetime.
Variant (2) above makes it 12*17 = 204 tables, which I'm currently
maintaining... and it's too much. With variant (1) I simply wasn't able
to effectively keep document attributes consistent.

Thus I'm searching for tools (paradigms/SQL idioms) that would fit the
problem.

Isn't this typically handled with an inheritance (parent-child)
setup? MasterDocument has id, subtype, and any common columns (create
date etc.); then the dependents use the same id from master to complete
the data for a given type. This is really common in ORM tools. It's not
clear from the description whether the operations could be similarly
handled (operation id, operation type as master of 17 dependent
operationSpecifics; there is also the "Activity Model").
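
Rob's parent-child layout might be sketched like this (table and column
names are hypothetical):

```sql
-- One master row per document; class-specific columns live in child
-- tables that reuse the master's id as both PK and FK.
CREATE TABLE master_document (
    id      bigserial PRIMARY KEY,
    subtype text NOT NULL,
    created timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE invoice_document (
    id     bigint PRIMARY KEY REFERENCES master_document (id),
    amount numeric NOT NULL
);

CREATE TABLE letter_document (
    id   bigint PRIMARY KEY REFERENCES master_document (id),
    body text
);

-- The "operation" tables can now FK to master_document.id once,
-- regardless of document class:
CREATE TABLE document_history (
    document_id   bigint NOT NULL REFERENCES master_document (id),
    dispatch_date date,
    recipient     text
);
```

This collapses the 12*17 combinations into 12 child tables plus 17
operation tables.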

same small integer being re-used in various contexts ("sample id" is the
bane for me). Could you use a sufficiently accurate time value?

But thank you all for answers.

-R


#15Karl Czajkowski
karlcz@isi.edu
In reply to: Rafal Pietrak (#10)
Re: looking for a globally unique row ID

On Sep 14, Rafal Pietrak modulated:

My main objective is the "guarantee". Which I've tried to indicate
referring to "future UPDATEs".

With a well-behaved application, it is sufficient to define each
ID column as:

id int8 UNIQUE NOT NULL DEFAULT nextval('sharedsequence')

and ensure that you always use defaults for INSERT and never
set the column during UPDATE.

You can also use a BEFORE/FOR EACH ROW trigger to enforce these
conventions, if you are worried about accidental violations in your
SQL. In that case, leave the DEFAULT as NULL in the table definition,
but make the trigger do this for INSERT:

NEW.id := nextval('sharedsequence');

and this enforcement check for UPDATE:

IF OLD.id != NEW.id THEN RAISE EXCEPTION 'id is immutable'; END IF;

If that's not a strong enough guarantee, I'm not sure I understand
your problem statement...
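
Put together, Karl's trigger approach might look like this sketch
(sequence, table, and function names are hypothetical):

```sql
CREATE SEQUENCE sharedsequence;

CREATE TABLE doc (
    id      int8 UNIQUE,   -- DEFAULT left as NULL; the trigger fills it in
    payload text
);

CREATE FUNCTION enforce_shared_id() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        NEW.id := nextval('sharedsequence');   -- always use the generator
    ELSIF OLD.id IS DISTINCT FROM NEW.id THEN
        RAISE EXCEPTION 'id is immutable';     -- forbid changing it later
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER doc_id_guard
    BEFORE INSERT OR UPDATE ON doc
    FOR EACH ROW EXECUTE PROCEDURE enforce_shared_id();
```

The same trigger function can be attached to every table that shares the
sequence.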

Karl


#16Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Kenneth Marshall (#12)
Re: looking for a globally unique row ID

On 15/09/17 06:15, Kenneth Marshall wrote:

On Thu, Sep 14, 2017 at 07:11:19PM +0200, Rafal Pietrak wrote:

As I said, I'm not looking for performance or "fair probability" of
planetary-wide uniqueness.

My main objective is the "guarantee". Which I've tried to indicate
referring to "future UPDATEs".

What I mean here is functionality similar to a "primary key" or "unique
constraint". Whenever somebody (an application - importantly, a faulty
application!) tries to INSERT or UPDATE a non-unique value there (a value
which could well have been generated earlier by a UUID algorithm, or even
a sequence), and that value already exists in some table that uses that
(mysterious) "global primary key", the application just fails the
transaction, like any other "not unique" constraint failure.

That's the goal.

That's why I have the impression that I'm going in entirely the wrong
direction here.

Hi Rafal,

How many tables do you need to support a unique key across? My approach
to problems like this is to provide constraints that will allow normal
DB functions to provide these assurances. For example, give each table
its own sequence generator as a primary key, but make the sequences
disjoint. There are 9,592 prime numbers less than 100,000. Give each
table one of those as the increment, and within that table you will never
hit a sequence value generated for a second table. This will at least
allow you to prevent any table from ever using the value for another
table. Obviously, this may not fit your use case, but it provides another
way to attack the problem.
Good luck.

Regards,
Ken

Hmm...

increment 2: 2 4 6 8 10 ...

increment 3: 3 6 9 ...

increment 5: 5 10 15 ...

Using a prime as the increment is not sufficient to guarantee uniqueness:
the increment-2 and increment-3 sequences collide at 6, and the
increment-2 and increment-5 sequences collide at 10.

You have to ensure that the product of the 2 smallest primes you use is
greater than any number you'd need to generate. With such large primes
you may run out of sequence numbers faster than you would like!

Cheers,
Gavin


#17Merlin Moncure
mmoncure@gmail.com
In reply to: Rafal Pietrak (#10)
Re: looking for a globally unique row ID

On Thu, Sep 14, 2017 at 12:11 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

W dniu 14.09.2017 o 15:54, Merlin Moncure pisze:

On Thu, Sep 14, 2017 at 2:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

A shared sequence meets all of those requirements. I tend to prefer

Not really.

As I said, I'm not looking for performance or "fair probability" of
planetary-wide uniqueness.

My main objective is the "guarantee". Which I've tried to indicate
referring to "future UPDATEs".

What I mean here is functionality similar to a "primary key" or "unique
constraint". Whenever somebody (an application - importantly, a faulty
application!) tries to INSERT or UPDATE a non-unique value there (a value
which could well have been generated earlier by a UUID algorithm, or even
a sequence), and that value already exists in some table that uses that
(mysterious) "global primary key", the application just fails the
transaction, like any other "not unique" constraint failure.

hm, I don't understand you. Are you trying to defend against forged
or erroneous data? With both sequences and GUIDs, the assumption is
that the application will use the generated id and send it back to the
database. With sequences, the database *guarantees* that the generated
identifier is unique, with no exceptions; there will never be a duplicate
value. Can you give a hypothetical example of how you think they
wouldn't work?

merlin


#18David G. Johnston
david.g.johnston@gmail.com
In reply to: Rafal Pietrak (#1)
Re: looking for a globally unique row ID

On Thu, Sep 14, 2017 at 12:45 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (i.e. during future UPDATEs).

3. guarantees persistence of value across database backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

I don't see how PostgreSQL can provide an absolute guarantee here. As
others have been saying, you can get very close, though. All the
approaches share the same basic property - you need to convert client
SQL into "requests for changes" and perform the actual changes within
system-managed code, while restricting any possibility for applications
to make those changes themselves. You can get a basic version of this
using triggers. Or you can go all-out and write API functions for every
one of these tables and have the application perform CRUD only via
these functions. These functions, and not the application, would then
control key generation.
You disallow updating IDs, and when inserting an ID you first insert it
into an "id log" table that has a unique index on it; if that insertion
succeeds you can then associate it with the record being inserted into
the main table. You could set up an FK as well, but it wouldn't provide
the needed guarantee.
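[Editorial note: a minimal sketch of the trigger + "id log" variant
described above; all table, function, and trigger names are
hypothetical.]

```sql
-- Shared log of every ID ever issued; its primary key is the
-- "global unique" constraint.
CREATE TABLE id_log (id bigint PRIMARY KEY);

CREATE TABLE invoices (id bigint PRIMARY KEY, amount numeric);

CREATE OR REPLACE FUNCTION log_global_id() RETURNS trigger AS $$
BEGIN
    -- Fails like any other unique-constraint violation if the ID was
    -- ever used before. Rows are never deleted from id_log, so a
    -- value stays "taken" for the lifetime of the database.
    INSERT INTO id_log (id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Register on INSERT; on UPDATE only when the ID actually changes.
CREATE TRIGGER invoices_log_insert BEFORE INSERT ON invoices
    FOR EACH ROW EXECUTE PROCEDURE log_global_id();
CREATE TRIGGER invoices_log_update BEFORE UPDATE OF id ON invoices
    FOR EACH ROW WHEN (NEW.id IS DISTINCT FROM OLD.id)
    EXECUTE PROCEDURE log_global_id();
```

The same pair of triggers would be attached to every participating
table; as noted above, a superuser or a misconfigured role can still
bypass it.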

Nothing will stop an administrator, or a user with incorrectly configured
permissions, from bypassing all of that but properly configured application
roles will be unable to do so.

David J.

#19Joshua D. Drake
jd@commandprompt.com
In reply to: David G. Johnston (#18)
Re: looking for a globally unique row ID

On 09/14/2017 03:27 PM, David G. Johnston wrote:

On Thu, Sep 14, 2017 at 12:45 AM, Rafal Pietrak <rafal@ztk-rp.eu
<mailto:rafal@ztk-rp.eu>>wrote:

Hello everybody,

Can anybody help me find a way to implement an ID which:

1. guarantees being unique across multiple tables.

2. guarantees its uniqueness not only during INSERT, but also during the
lifetime of the database/application (e.i. during future UPDATES).

3. guarantees persistence of value across database
backup/restore/upgrade.

an obvious candidate - a single SERIAL() (same serial) used in every
table that needs that ID does not guarantee (2).

Isn't this what GUIDs are for?
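[Editorial note: a sketch of the GUID/UUID route, assuming the pgcrypto
extension is available; names are hypothetical.]

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Each row gets a 128-bit random UUID; collisions are possible in
-- theory but astronomically unlikely in practice.
CREATE TABLE documents (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    body text
);
```

Note the uniqueness here is probabilistic, and nothing stops an UPDATE
from writing a duplicate unless this is combined with something like
the "id log" table approach.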

JD

--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
***** Unless otherwise stated, opinions are my own. *****


#20George Neuner
gneuner2@comcast.net
In reply to: Rafal Pietrak (#1)
Re: looking for a globally unique row ID

On Thu, 14 Sep 2017 17:02:05 -0500, Merlin Moncure
<mmoncure@gmail.com> wrote:

... With sequences, the database *guarantees* that the
identifier is unique with no exceptions; there will never be a duplicate
value. Can you give a hypothetical example of how you think they
wouldn't work?

Jumping in here, but a month or so back someone here was complaining
about a restore resetting all the sequences in the database. Obviously
a strange situation [I've never seen it], but something to worry about
if you are relying on sequences for uniqueness.

Upthread I suggested the timestamp+counter approach. I wasn't
thinking about this issue specifically, but it is immune to the
counter being reset [accidentally or otherwise].
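[Editorial note: the timestamp+counter scheme might be sketched like
this; the bit widths and names are illustrative. The high bits come
from the clock, so even a counter reset after a restore cannot
reproduce an ID issued in an earlier second - assuming fewer than 2^20
IDs per second and a clock that never runs backwards.]

```sql
-- Low 20 bits cycle; the clock disambiguates across resets.
CREATE SEQUENCE id_counter MAXVALUE 1048575 CYCLE;

CREATE OR REPLACE FUNCTION next_global_id() RETURNS bigint AS $$
    -- seconds-since-epoch in the high bits, counter in the low bits
    SELECT (extract(epoch FROM clock_timestamp())::bigint << 20)
         | nextval('id_counter');
$$ LANGUAGE sql;
```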

George


#21Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Rob Sargent (#14)
#22Alban Hertroys
haramrae@gmail.com
In reply to: Rafal Pietrak (#21)
In reply to: Gavin Flower (#16)
#24Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Alban Hertroys (#22)
#25Rob Sargent
robjsargent@gmail.com
In reply to: Rafal Pietrak (#21)
#26Adam Brusselback
adambrusselback@gmail.com
In reply to: Rob Sargent (#25)
#27Rob Sargent
robjsargent@gmail.com
In reply to: Adam Brusselback (#26)
#28Chris Browne
cbbrowne@acm.org
In reply to: Adam Brusselback (#26)
#29Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Rob Sargent (#27)
In reply to: Rafal Pietrak (#10)
#31Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Chris Browne (#28)
#32Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Chris Browne (#28)
#33Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Jehan-Guillaume (ioguix) de Rorthais (#30)
#34Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Rafal Pietrak (#32)
#35Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Tomas Vondra (#34)
#36Adam Brusselback
adambrusselback@gmail.com
In reply to: Rafal Pietrak (#35)
#37Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Adam Brusselback (#36)
#38Rob Sargent
robjsargent@gmail.com
In reply to: Rafal Pietrak (#37)
#39Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Rob Sargent (#38)
#40rob stone
floriparob@gmail.com
In reply to: Rafal Pietrak (#39)
#41Rob Sargent
robjsargent@gmail.com
In reply to: Rafal Pietrak (#39)
#42Rafal Pietrak
rafal@ztk-rp.eu
In reply to: rob stone (#40)
#43Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Rob Sargent (#41)
#44Rob Sargent
robjsargent@gmail.com
In reply to: Rafal Pietrak (#43)
In reply to: Rafal Pietrak (#33)
#46Rafal Pietrak
rafal@ztk-rp.eu
In reply to: Jehan-Guillaume (ioguix) de Rorthais (#45)