GUID for postgreSQL
Hi All,
We are a small developing house in South Africa, which is in the process of
porting our Apps from
MS SQL to PostgreSQL. We use the newid() [globally unique identifier
(GUID)] function in SQL a lot, and need the same for pg.
Our development platform is .NET using c#. We also plan to start using Mono
C# in the future.
We will be deploying PostgreSQL on Windows (2003 Server) and Linux (Ubuntu)
platforms.
We have search the web and found c code that can do this, but we do not have
c programmers.
Are there anybody that can help us compiling these for us, we need it on
both OSs?
Thanks
Regards,
Riaan van der Westhuizen
CEO
Huizensoft (Pty) Ltd
Tel: +27 44 871 5534
Fax: +27 44 871 5098
This e-mail and any attachments thereto is confidential and
is intended solely for the use of the addressee's.
If you are not the intended recipient, be advised that any use,
dissemination, forwarding, printing, or copying of this e-mail is strictly
prohibited.
Huizensoft (Pty) Ltd accepts no liability for any views or opinions
expressed in
this e-mail or for any loss or damages that may be suffered by any person
whomsoever, arising from, or in connection with, or caused by, the use of
this e-mail.
Am Mittwoch, den 27.07.2005, 09:48 +0200 schrieb Riaan van der
Westhuizen:
Hi All,
We are a small developing house in South Africa, which is in the process of
porting our Apps from
MS SQL to PostgreSQL. We use the newid() [globally unique identifier
(GUID)] function in SQL a lot, and need the same for pg.Our development platform is .NET using c#. We also plan to start using Mono
C# in the future.
We will be deploying PostgreSQL on Windows (2003 Server) and Linux (Ubuntu)
platforms.We have search the web and found c code that can do this, but we do not have
c programmers.Are there anybody that can help us compiling these for us, we need it on
both OSļæ½s?
I'd create a sequence:
CREATE SEQUENCE global_unique_id_seq;
and a function:
CREATE OR REPLACE FUNCTION newid()
RETURNS text AS
$BODY$ SELECT nextval('global_unique_id_seq')::text; $BODY$
LANGUAGE 'sql' VOLATILE;
now every call to newid() returns a garantied unique id for
say the next 18446744073709551616 calls.
Of course you can obfuscate the ID even more using
md5, include servername and so on, but this will not improve
security in any way (unless you mix data with 2nd database)
This e-mail and any attachments thereto is confidential and
is intended solely for the use of the addressee's.
If you are not the intended recipient, be advised that any use,
dissemination, forwarding, printing, or copying of this e-mail is strictly
prohibited.Huizensoft (Pty) Ltd accepts no liability for any views or opinions
expressed in
this e-mail or for any loss or damages that may be suffered by any person
whomsoever, arising from, or in connection with, or caused by, the use of
this e-mail.
I'd skip this disclaimer as it is not relevant to law and makes the
company look a bit moronic (sorry) especially on mailinglists.
Tino
Hi Riaan.!
i think that is a better solution, to use the postgres native type
"serial", that it is a autoincremental number type. You can see it
on the postgres manual. It's very usefull because of you only need to
redifine the type of your id vars. For example, on this moment you
have
CREATE TABLE test ( name varchar(50), id int );
insert into test values ('prueba',newid);
On postgres you can resplace thoses
CREATE TABLE test ( name varchar(50), id serial primary key);
insert into test values ('prueba');
Serial type create automatically a sequence asocited to id. and always
you insert something using its default value, the serial id
autoincrements its value.
thanks , i have not a good english but i try to help !
Atte:
Edwin Barrios !
Gerente iBand Networks Ltda
Show quoted text
On 7/27/05, Riaan van der Westhuizen <riaan@huizensoft.co.za> wrote:
Hi All,
We are a small developing house in South Africa, which is in the process of
porting our Apps from
MS SQL to PostgreSQL. We use the newid() [globally unique identifier
(GUID)] function in SQL a lot, and need the same for pg.Our development platform is .NET using c#. We also plan to start using Mono
C# in the future.
We will be deploying PostgreSQL on Windows (2003 Server) and Linux (Ubuntu)
platforms.We have search the web and found c code that can do this, but we do not have
c programmers.Are there anybody that can help us compiling these for us, we need it on
both OS's?Thanks
Regards,
Riaan van der Westhuizen
CEOHuizensoft (Pty) Ltd
Tel: +27 44 871 5534
Fax: +2744 871 5098This e-mail and any attachments thereto is confidential and
is intended solely for the use of the addressee's.
If you are not the intended recipient, be advised that any use,
dissemination, forwarding, printing, or copying of this e-mail is strictly
prohibited.Huizensoft (Pty) Ltd accepts no liability for any views or opinions
expressed in
this e-mail or for any loss or damages that may be suffered by any person
whomsoever, arising from, or in connection with, or caused by, the use of
this e-mail.---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
On Jul 27, 2005, at 4:49 AM, Tino Wildenhain wrote:
I'd create a sequence:
CREATE SEQUENCE global_unique_id_seq;
and a function:
CREATE OR REPLACE FUNCTION newid()
RETURNS text AS
$BODY$ SELECT nextval('global_unique_id_seq')::text; $BODY$
LANGUAGE 'sql' VOLATILE;now every call to newid() returns a garantied unique id for
say the next 18446744073709551616 calls.
Of course you can obfuscate the ID even more using
md5, include servername and so on, but this will not improve
security in any way (unless you mix data with 2nd database)
This is not really a viable replacement for a GUID == globally unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifiers will be
the same. Using a sequence will only support uniqueness for a single
database.
Best,
John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
On Wed, 2005-07-27 at 15:32, John DeSoi wrote:
On Jul 27, 2005, at 4:49 AM, Tino Wildenhain wrote:
I'd create a sequence:
CREATE SEQUENCE global_unique_id_seq;
and a function:
CREATE OR REPLACE FUNCTION newid()
RETURNS text AS
$BODY$ SELECT nextval('global_unique_id_seq')::text; $BODY$
LANGUAGE 'sql' VOLATILE;now every call to newid() returns a garantied unique id for
say the next 18446744073709551616 calls.
Of course you can obfuscate the ID even more using
md5, include servername and so on, but this will not improve
security in any way (unless you mix data with 2nd database)This is not really a viable replacement for a GUID == globally unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifiers will be
the same. Using a sequence will only support uniqueness for a single
database.
So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.
Windows uses the MAC address in GUID generation.
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Scott Marlowe
Sent: Wednesday, July 27, 2005 1:47 PM
To: John DeSoi
Cc: Tino Wildenhain; Riaan van der Westhuizen; Postgresql-General
Subject: Re: [GENERAL] GUID for postgreSQLOn Wed, 2005-07-27 at 15:32, John DeSoi wrote:
On Jul 27, 2005, at 4:49 AM, Tino Wildenhain wrote:
I'd create a sequence:
CREATE SEQUENCE global_unique_id_seq;
and a function:
CREATE OR REPLACE FUNCTION newid()
RETURNS text AS
$BODY$ SELECT nextval('global_unique_id_seq')::text; $BODY$
LANGUAGE 'sql' VOLATILE;now every call to newid() returns a garantied unique id for
say the next 18446744073709551616 calls.
Of course you can obfuscate the ID even more using
md5, include servername and so on, but this will not improve
security in any way (unless you mix data with 2nd database)This is not really a viable replacement for a GUID == globally
unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifiers will be
the same. Using a sequence will only support uniqueness for a single
database.So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.---------------------------(end of
broadcast)---------------------------
Show quoted text
TIP 3: Have you checked our extensive FAQ?
Import Notes
Resolved by subject fallback
On Jul 27, 2005, at 4:46 PM, Scott Marlowe wrote:
So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.
Like MD5, there is no 100% guarantee, but the collision possibility
supposed to be is very close to zero.
See http://en.wikipedia.org/wiki/GUID
John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
Yes, this is the problem with GUIDs... you can calculate them by mashing
toghether things like the time, a network address, and some random
numbers, which makes it very unlikely for a collision.... but at the end
of the day that G stand for global, *not* guaranteed.
On Wed, 27 Jul 2005, Scott Marlowe wrote:
Show quoted text
On Wed, 2005-07-27 at 15:32, John DeSoi wrote:
On Jul 27, 2005, at 4:49 AM, Tino Wildenhain wrote:
I'd create a sequence:
CREATE SEQUENCE global_unique_id_seq;
and a function:
CREATE OR REPLACE FUNCTION newid()
RETURNS text AS
$BODY$ SELECT nextval('global_unique_id_seq')::text; $BODY$
LANGUAGE 'sql' VOLATILE;now every call to newid() returns a garantied unique id for
say the next 18446744073709551616 calls.
Of course you can obfuscate the ID even more using
md5, include servername and so on, but this will not improve
security in any way (unless you mix data with 2nd database)This is not really a viable replacement for a GUID == globally unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifiers will be
the same. Using a sequence will only support uniqueness for a single
database.So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
On Wed, 2005-07-27 at 15:57, John DeSoi wrote:
On Jul 27, 2005, at 4:46 PM, Scott Marlowe wrote:
So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.Like MD5, there is no 100% guarantee, but the collision possibility
supposed to be is very close to zero.
Then I would think a better thought out solution would be one where your
unique ids ARE guaranteed to be unique, where you used something like
select 'astringuniqtothismachine'||nextval('localsequence');
That really would be guaranteed unique as long as you set up each
machine to have a string unique to it.
This is not really a viable replacement for a GUID ==
globally unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifierswill be the
same. Using a sequence will only support uniqueness for a single
database.So, how can two databases, not currently talking to one
another, guarantee that their GUIDs don't collide? using a
large randomly generated name space only reduces the chances
of collision, it doesn't actually guarantee it.
At least on Windows, the GUID is derived in part from the computers
primary MAC address. No, it's not a guarantee, but it's pretty unlikely
:-)
//Magnus
Import Notes
Resolved by subject fallback
You could guarantee it, for example...
Something like (pseudocode here):
create sequence local_id;
create domain guid AS text default ('54-' || (nextval(local_id))::text);
where 54 is the database id. In this way, every inserted GUID will be
guaranteed to contain a GUID in two parts: A database identifier and a
locally unique local identifier. These could then be parsed in a
reasonable way.
The only way I think one can come up with *guaranteed* globally unique
identifiers is to place such information such as we use with other
things that must be globally unique: have a locally unique identifier
along with a globally unique location identifieer. Sort of like we have
with IP addresses, MAC addresses, telephone numbers, etc...
Best Wishes,
Chris Travers
Metatron Technology Consulting
Ben wrote:
Show quoted text
Yes, this is the problem with GUIDs... you can calculate them by mashing
toghether things like the time, a network address, and some random
numbers, which makes it very unlikely for a collision.... but at the end
of the day that G stand for global, *not* guaranteed.On Wed, 27 Jul 2005, Scott Marlowe wrote:
On Wed, 2005-07-27 at 15:32, John DeSoi wrote:
On Jul 27, 2005, at 4:49 AM, Tino Wildenhain wrote:
I'd create a sequence:
CREATE SEQUENCE global_unique_id_seq;
and a function:
CREATE OR REPLACE FUNCTION newid()
RETURNS text AS
$BODY$ SELECT nextval('global_unique_id_seq')::text; $BODY$
LANGUAGE 'sql' VOLATILE;now every call to newid() returns a garantied unique id for
say the next 18446744073709551616 calls.
Of course you can obfuscate the ID even more using
md5, include servername and so on, but this will not improve
security in any way (unless you mix data with 2nd database)This is not really a viable replacement for a GUID == globally unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifiers will be
the same. Using a sequence will only support uniqueness for a single
database.So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
Magnus Hagander wrote:
At least on Windows, the GUID is derived in part from the computers
primary MAC address. No, it's not a guarantee, but it's pretty unlikely
:-)
The danger is not that the MAC address will be duplicated, but that
other factors will lead to an MD5 collision.
Unless you can show me that there is a 1:1 correspondence of all
possible unique factors going into the GUID generation and the output,
then I will say it is still no guarantee.
Just because two documents or files have the same MD5 doesn't mean that
they are the same files either. I.e. you can't go searching all files
by MD5 checksums and expecting to find the right one. OTOH, MD5
provides reasonable assurance that any given file (once you know its
intended MD5) has not been tampered with. I.e. MD5 is not meant to
preclude collisions, but rather it is meant to preclude *intentional*
collisions. Similarly, if we want a guaranteed uniqueness to a GUID we
have to have some sort of unique string to the GUID prepended to it (not
merely used in a hash).
So you could use the Mac address of the machine, I guess, if you wanted
to....
Best Wishes,
Chris Travers
Metatron Technology Consulting
There is a "privacy hole" from using the MAC address. (Read it in the
WIKI article someone else posted).
Probably, it would be better to use a one way hash of the MAC address.
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of Chris Travers
Sent: Wednesday, July 27, 2005 2:27 PM
To: Magnus Hagander; pgsql-general
Subject: Re: [GENERAL] GUID for postgreSQLMagnus Hagander wrote:
At least on Windows, the GUID is derived in part from the computers
primary MAC address. No, it's not a guarantee, but it's pretty
unlikely
:-)
The danger is not that the MAC address will be duplicated, but that
other factors will lead to an MD5 collision.Unless you can show me that there is a 1:1 correspondence of all
possible unique factors going into the GUID generation and the output,
then I will say it is still no guarantee.Just because two documents or files have the same MD5 doesn't mean
that
they are the same files either. I.e. you can't go searching all files
by MD5 checksums and expecting to find the right one. OTOH, MD5
provides reasonable assurance that any given file (once you know its
intended MD5) has not been tampered with. I.e. MD5 is not meant to
preclude collisions, but rather it is meant to preclude *intentional*
collisions. Similarly, if we want a guaranteed uniqueness to a GUID
we
have to have some sort of unique string to the GUID prepended to it
(not
merely used in a hash).
So you could use the Mac address of the machine, I guess, if you
wanted
to....
Best Wishes,
Chris Travers
Metatron Technology Consulting---------------------------(end of
broadcast)---------------------------
Show quoted text
TIP 5: don't forget to increase your free space map settings
Import Notes
Resolved by subject fallback
Am Mittwoch, den 27.07.2005, 23:03 +0200 schrieb Magnus Hagander:
This is not really a viable replacement for a GUID ==
globally unique
identifier. Here global means that if I use the application in
multiple databases, I'm guaranteed that no two identifierswill be the
same. Using a sequence will only support uniqueness for a single
database.
So, how can two databases, not currently talking to one
another, guarantee that their GUIDs don't collide? using a
large randomly generated name space only reduces the chances
of collision, it doesn't actually guarantee it.
well, then give each database a numer, name or whatever and
pad sequence with it. Nothing simpler then that.
The global uniqueness is just a wishfull dream. Nobody can
garantie it. If you use a distingushed name or number for
each of your databases, its garantied.
At least on Windows, the GUID is derived in part from the computers
primary MAC address. No, it's not a guarantee, but it's pretty unlikely
:-)
Yes, thats one way.. But really you just need a domain (not
related to the internet meaning of domains ;) Anything
that lets you identify (or at least distinguish) _your_
databases. (unlikely you are working with every or random
databases in the world ;)
You dont even need a hash function (md5, sha1) if uniqueness is
all you need. Otoh, the often raised clash argument with md5 does not
count here because collisions dont happen just per coincidence
given the very limited rule (database-identifier + serial)
Use Dblink and do a select off of a sequence on just one of the boxes?
You could set up a view that uses DBlink on all the boxes that points to
the master seq box.
should work.
Scott Marlowe wrote:
Show quoted text
So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
On Wed, Jul 27, 2005 at 05:40:11PM -0500, Tony Caduto wrote:
Use Dblink and do a select off of a sequence on just one of the boxes?
You could set up a view that uses DBlink on all the boxes that points to
the master seq box.should work.
It'll make the whole thing painfully slow.
--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
<inflex> really, I see PHP as like a stange amalgamation of C, Perl, Shell
<crab> inflex: you know that "amalgam" means "mixture with mercury",
more or less, right?
<crab> i.e., "deadly poison"
On Jul 27, 2005, at 5:00 PM, Scott Marlowe wrote:
Then I would think a better thought out solution would be one where
your
unique ids ARE guaranteed to be unique, where you used something likeselect 'astringuniqtothismachine'||nextval('localsequence');
That really would be guaranteed unique as long as you set up each
machine to have a string unique to it.
I have implemented this type of approach in distributed systems. The
problem is users who make a copy of their database, continue to use
both copies, and then call you when they try to merge things
together. I would say user opportunity to mess this up is way more
likely than having a GUID collision.
I'm not saying that GUIDs are the ultimate solution to this problem.
The original poster brought up the need to store GUIDs in a database.
There are protocols and standards that require GUIDs and I merely
agree it would be nice to have a GUID data type.
John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
On Wed, Jul 27, 2005 at 07:43:08PM -0400, John DeSoi wrote:
I'm not saying that GUIDs are the ultimate solution to this problem.
The original poster brought up the need to store GUIDs in a database.
There are protocols and standards that require GUIDs and I merely
agree it would be nice to have a GUID data type.
AFAIR there is one on gborg.
--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Just treat us the way you want to be treated + some extra allowance
for ignorance." (Michael Brusser)
I don't think it would be that slow, unless running on a modem.
It would really depend on what you are doing, calling dblink 1000 times
a second would be slow, but I could see it being
used succesfully for other applications.
Alvaro Herrera wrote:
Show quoted text
On Wed, Jul 27, 2005 at 05:40:11PM -0500, Tony Caduto wrote:
Use Dblink and do a select off of a sequence on just one of the boxes?
You could set up a view that uses DBlink on all the boxes that points to
the master seq box.should work.
It'll make the whole thing painfully slow.
Also sprach Scott Marlowe (smarlowe@g2switchworks.com)
On Wed, 2005-07-27 at 15:57, John DeSoi wrote:
On Jul 27, 2005, at 4:46 PM, Scott Marlowe wrote:
So, how can two databases, not currently talking to one another,
guarantee that their GUIDs don't collide? using a large randomly
generated name space only reduces the chances of collision, it doesn't
actually guarantee it.Like MD5, there is no 100% guarantee, but the collision possibility
supposed to be is very close to zero.Then I would think a better thought out solution would be one where your
unique ids ARE guaranteed to be unique, where you used something likeselect 'astringuniqtothismachine'||nextval('localsequence');
That really would be guaranteed unique as long as you set up each
machine to have a string unique to it.
I was a little bit confused about the uniqueness of GUID (esp. using
only 16 Bit [1]Why do people want to have IPv6? -- PGP FPR: CF74 D5F2 4871 3E5C FFFE 0130 11F4 C41E B3FB AE33 -- Der Geist des Kriegers sollte mit Beginn des Neujahrstages bis zum Ende des Jahres vom Gedanken an seinen Tod beherrscht werden.) and read the article about the UUID:
http://en.wikipedia.org/wiki/Universally_Unique_Identifier
It states:
Conceptually, the original (version 1) generation scheme for UUIDs was
to concatenate the UUID version with the MAC address of the computer
that is generating the UUID, and with the number of 100-nanosecond
intervals since the adoption of the Gregorian calendar. In practice,
the actual algorithm is more complicated. This scheme has been
criticized in that it is not sufficiently 'opaque'; it reveals both
the identity of the computer that generated the UUID and the time at
which it did so.
Several other generation algorithms have been developed and
incorporated into the standard, including a scheme relying only on
random numbers (version 4 UUIDs), and a scheme deriving a UUID from a
URL via MD5 (version 3 UUIDs) or SHA-1 (version 5 UUIDs) hashing.
The 5.0 release of Java provides a class that will produce 128-bit
UUIDs. The API documentation for the java.util.UUID class refers to
ISO/IEC 11578:1996.
So as I understand the GUIDs only apply to the Microsoft Universe[TM], or
are there any other serious apllications using it?
I don't see any value of that, if not all real databases support one
GUID standard (with more than 16 Bit) worldwide. It might also be a
good idea to reserve space of the hash to identify the database
(vendor), just like FCC IDs or MACs do, but that would require a
central authority assigning those names and numbers.
[1]: Why do people want to have IPv6? -- PGP FPR: CF74 D5F2 4871 3E5C FFFE 0130 11F4 C41E B3FB AE33 -- Der Geist des Kriegers sollte mit Beginn des Neujahrstages bis zum Ende des Jahres vom Gedanken an seinen Tod beherrscht werden.
--
PGP FPR: CF74 D5F2 4871 3E5C FFFE 0130 11F4 C41E B3FB AE33
--
Der Geist des Kriegers sollte mit Beginn des Neujahrstages bis zum Ende
des Jahres vom Gedanken an seinen Tod beherrscht werden.
Daijouji Shigesuke in "Budo Shoshin Shuu"