Is it a good practice to use SERIAL as Primary Key?
Hi all,
I am wondering if it is a good practice to use SERIAL index as primary key,
as it is only available up to 9999999?
Currently I am dealing with storing LDAP users into Postgres and I am
looking for a better way to make use of the DN as primary key instead of
SERIAL index.
Any advice or suggestion is appreciated.
Thanks.
Regards,
Carter
On Nov 22, 2006, at 6:23 PM, carter ck wrote:
> Hi all,
> I am wondering if it is a good practice to use SERIAL index as
> primary key, as it is only available up to 9999999?
Where did you get that idea? A serial should be good up to at least
2,000,000,000 or so, and if that's not enough there's always bigserial.
> Currently I am dealing with storing LDAP users into Postgres and I
> am looking for a better way to make use of the DN as primary key
> instead of SERIAL index. Any advice or suggestion is appreciated.
If you want a synthetic primary key then a serial field is the
easiest way to create one.
Cheers,
Steve
"carter ck" <carterck32@hotmail.com> writes:
> I am wondering if it is a good practice to use SERIAL index as primary key,
> as it is only available up to 9999999?
Where in the world did you get that idea?
SERIAL goes up to 2^31 (2 billion); if you need more use BIGSERIAL.
regards, tom lane
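For reference, a quick arithmetic sketch of the limits mentioned above. It assumes serial is backed by a signed 32-bit integer and bigserial by a signed 64-bit integer; the insert rate is a made-up figure, used only to give a feel for the scale.

```python
# Upper bounds of the integer types assumed to back serial and bigserial.
SERIAL_MAX = 2**31 - 1      # signed 32-bit: 2,147,483,647
BIGSERIAL_MAX = 2**63 - 1   # signed 64-bit: 9,223,372,036,854,775,807

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_until_exhausted(max_value: int, inserts_per_second: float) -> float:
    """Years before a sequence starting at 1 reaches max_value at a steady rate."""
    return max_value / inserts_per_second / SECONDS_PER_YEAR


# Hypothetical load: 100 new rows per second, around the clock.
RATE = 100
print(f"serial:    {SERIAL_MAX:,} values, ~{years_until_exhausted(SERIAL_MAX, RATE):.2f} years")
print(f"bigserial: {BIGSERIAL_MAX:,} values, ~{years_until_exhausted(BIGSERIAL_MAX, RATE):.3g} years")
```

Either way, the ceiling is far above the 9999999 assumed in the question.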
> I am wondering if it is a good practice to use SERIAL index as primary key,
> as it is only available up to 9999999?
That isn't true. It is much larger than that. If you need more than that there is always
bigserial.
serial = int4
bigserial = int8
""
The type names serial and serial4 are equivalent: both create integer columns. The type names
bigserial and serial8 work just the same way, except that they create a bigint column. bigserial
should be used if you anticipate the use of more than 231 identifiers over the lifetime of the
table.
""
http://www.postgresql.org/docs/8.2/interactive/datatype-numeric.html#DATATYPE-SERIAL
> Currently I am dealing with storing LDAP users into Postgres and I am
> looking for a better way to make use of the DN as primary key instead of
> SERIAL index. Any advice or suggestion is appreciated.
Here is a similar discussion that you may be interested in:
http://archives.postgresql.org/pgsql-general/2006-10/msg00024.php
Regards,
Richard Broersma Jr.
Richard Broersma Jr <rabroersma@yahoo.com> writes:
> "" The type names serial and serial4 are equivalent: both create integer
> columns. The type names bigserial and serial8 work just the same way, except
> that they create a bigint column. bigserial should be used if you anticipate
> the use of more than 231 identifiers over the lifetime of the table. ""
> http://www.postgresql.org/docs/8.2/interactive/datatype-numeric.html#DATATYPE-SERIAL
What would be those "231 identifiers"?
--
Jorge Godoy <jgodoy@gmail.com>
> > "" The type names serial and serial4 are equivalent: both create integer
> > columns. The type names bigserial and serial8 work just the same way, except
> > that they create a bigint column. bigserial should be used if you anticipate
> > the use of more than 231 identifiers over the lifetime of the table. ""
> > http://www.postgresql.org/docs/8.2/interactive/datatype-numeric.html#DATATYPE-SERIAL
> What would be those "231 identifiers"?
Oops, when I copied that text from the 8.2 docs I didn't catch that formatting error.
231 should read 2^31.
Regards,
Richard Broersma Jr.
On 11/22/06 20:23, carter ck wrote:
> Hi all,
> I am wondering if it is a good practice to use SERIAL index as primary
> key, as it is only available up to 9999999?
> Currently I am dealing with storing LDAP users into Postgres and I am
> looking for a better way to make use of the DN as primary key instead of
> SERIAL index. Any advice or suggestion is appreciated.
I'm one of those who thinks that a (possibly multisegment) natural
key *does* exist, and that if you think it doesn't, your design is
wrong.
And for those times when numeric sequences *are* needed
(employee_id and account_number for example) they should include a
check digit, to ensure that you don't mis-type a number and charge
the wrong account.
[I'm old enough to have worked in a Service Bureau where lots of women
keypunched form data into Mohawk key-to-tape machines, and check
digits, which are also in credit cards and SSNs, are a perfect way
to protect against typos.]
--
Ron Johnson, Jr.
Jefferson LA USA
Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
On Thu, Nov 23, 2006 at 10:23:55AM -0600, Ron Johnson wrote:
> For those times when numeric sequences *are* needed
> (employee_id and account_number for example) they should include a
> check digit, to ensure that you don't mis-type a number and charge
> the wrong account.
Sure, but the check digit does not need to be stored, as it can be
regenerated on demand. The user interface just verifies the check
digit, then throws it away.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
On 11/23/06 10:49, Martijn van Oosterhout wrote:
> On Thu, Nov 23, 2006 at 10:23:55AM -0600, Ron Johnson wrote:
> > For those times when numeric sequences *are* needed
> > (employee_id and account_number for example) they should include a
> > check digit, to ensure that you don't mis-type a number and charge
> > the wrong account.
> Sure, but the check digit does not need to be stored, as it can be
> regenerated on demand. The user interface just verifies the check
> digit, then throws it away.
$ SET GEEZER
$ WRITE SYS$OUTPUT "THAT'S JUST EXTRA CYCLES WASTED BY THE"
$ WRITE SYS$OUTPUT "CLIENT. BETTER TO USE THEM FOR SOME OTHER"
$ WRITE SYS$OUTPUT "MORE PRODUCTIVE PURPOSE."
$ SET NOGEEZER
That's the VAX/VMS in me oozing out. But seriously, regenerate it
on demand??? That's not how it works. This isn't a CRC or hash
function.
--
Ron Johnson, Jr.
Hi,
> > Sure, but the check digit does not need to be stored, as it can be
> > regenerated on demand. The user interface just verifies the check
> > digit, then throws it away.
> $ SET GEEZER
> $ WRITE SYS$OUTPUT "THAT'S JUST EXTRA CYCLES WASTED BY THE"
> $ WRITE SYS$OUTPUT "CLIENT. BETTER TO USE THEM FOR SOME OTHER"
> $ WRITE SYS$OUTPUT "MORE PRODUCTIVE PURPOSE."
> $ SET NOGEEZER
> That's the VAX/VMS in me oozing out. But seriously, regenerate it
> on demand??? That's not how it works. This isn't a CRC or hash
> function.
Well, a check digit _is_ a kind of CRC. It is redundant information. For
every number there is only one correct check digit, which means that the
check digit does not add extra information to the number. So why store it?
You will need to add the check digit on most (all?) output that is
interpreted by humans. The software itself can just use the number itself
(assuming you don't need to check the integrity of the software).
If you store the number in the database, I would suggest making the db check
the number on all input too. Otherwise you might end up with invalid data in
the database.
- Sander
On 11/23/06 12:38, Sander Steffann wrote:
> Hi,
> > > Sure, but the check digit does not need to be stored, as it can be
> > > regenerated on demand. The user interface just verifies the check
> > > digit, then throws it away.
> > $ SET GEEZER
> > $ WRITE SYS$OUTPUT "THAT'S JUST EXTRA CYCLES WASTED BY THE"
> > $ WRITE SYS$OUTPUT "CLIENT. BETTER TO USE THEM FOR SOME OTHER"
> > $ WRITE SYS$OUTPUT "MORE PRODUCTIVE PURPOSE."
> > $ SET NOGEEZER
> > That's the VAX/VMS in me oozing out. But seriously, regenerate it
> > on demand??? That's not how it works. This isn't a CRC or hash
> > function.
> Well, a check digit _is_ a kind of CRC. It is redundant information. For
> every number there is only one correct check digit, which means that the
> check digit does not add extra information to the number. So why store it?
Because it's *part of* the id number. The way we implement it, it's
the one's digit.
c = f(n)
n' = n*10 + c
n' is what is stored in id column of the relevant table.
This way, if "you" mistype an id number, or it gets garbled in
transmission, etc., it can be algorithmically determined whether
that is a valid number, and only if it is a valid number
do you hit the database.
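A minimal sketch of the scheme described above, in Python. The post does not say which function f is used; the Luhn algorithm (the one used on credit card numbers) is assumed here purely for illustration, and all function names are hypothetical.

```python
def luhn_check_digit(n: int) -> int:
    """c = f(n): Luhn check digit for the payload number n (assumed choice of f)."""
    total = 0
    # Walk the payload digits from the right, doubling every second digit,
    # starting with the rightmost one.
    for i, ch in enumerate(reversed(str(n))):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def append_check_digit(n: int) -> int:
    """n' = n*10 + c: the value that would be stored in the id column."""
    return n * 10 + luhn_check_digit(n)


def is_valid_id(n_prime: int) -> bool:
    """Check an entered id without touching the database."""
    if n_prime < 0:
        return False
    payload, c = divmod(n_prime, 10)
    return luhn_check_digit(payload) == c


stored_id = append_check_digit(7992739871)
print(stored_id, is_valid_id(stored_id))   # valid as generated
print(is_valid_id(stored_id + 10))         # one digit changed: rejected
```

A single mistyped digit always changes the expected check digit under Luhn, so the bad id is rejected before any query is issued.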
Bottom line: check digits are in SSNs and credit card numbers, for a
good reason.
> You will need to add the check digit on most (all?) output that is
> interpreted by humans. The software itself can just use the number
> itself (assuming you don't need to check the integrity of the software).
> If you store the number in the database, I would suggest making the db
> check the number on all input too. Otherwise you might end up with
> invalid data in the database.
--
Ron Johnson, Jr.
> Bottom line: check digits are in SSNs
Uhm, no they're not. And this is of course one of the huge problems with
SSNs. (Although not quite as bad as the fact that they're not strictly
unique. Yes, really, duplicates have been issued in the past.)
--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice
On Thu, 2006-11-23 at 10:23, Ron Johnson wrote:
> On 11/22/06 20:23, carter ck wrote:
> > Hi all,
> > I am wondering if it is a good practice to use SERIAL index as primary
> > key, as it is only available up to 9999999?
> > Currently I am dealing with storing LDAP users into Postgres and I am
> > looking for a better way to make use of the DN as primary key instead of
> > SERIAL index. Any advice or suggestion is appreciated.
> I'm one of those who thinks that a (possibly multisegment) natural
> key *does* exist, and that if you think it doesn't, your design is
> wrong.
Spend some time in the travel industry... The tax category ZO means
Passenger Service Charge in Denmark. Or Greenland, or Faroe Islands.
And can be entered more than once. And the travel agent has to look at
the context of the travel itinerary to know which one(s) it is.
Sadly, the real world has many data problems created by idiots in suits
30 years ago that aren't going to go away any time soon.
On Mon, 2006-11-27 at 11:47 -0600, Scott Marlowe wrote:
> On Thu, 2006-11-23 at 10:23, Ron Johnson wrote:
> > On 11/22/06 20:23, carter ck wrote:
> > > Hi all,
> > > I am wondering if it is a good practice to use SERIAL index as primary
> > > key, as it is only available up to 9999999?
> > > Currently I am dealing with storing LDAP users into Postgres and I am
> > > looking for a better way to make use of the DN as primary key instead of
> > > SERIAL index.
Bigserial?
Joshua D. Drake
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
On 11/27/06 11:26, Scott Ribe wrote:
> > Bottom line: check digits are in SSNs
> Uhm, no they're not. And this is of course one of the huge problems with
> SSNs. (Although not quite as bad as the fact that they're not strictly
> unique. Yes, really, duplicates have been issued in the past.)
Hmm, you're right. Other kinds of important numbers have check
digits, though.
http://www.cs.nmsu.edu/~cssem/DickOct18.pdf
--
Ron Johnson, Jr.
On 11/27/06 11:47, Scott Marlowe wrote:
> On Thu, 2006-11-23 at 10:23, Ron Johnson wrote:
> > On 11/22/06 20:23, carter ck wrote:
> > > Hi all,
> > > I am wondering if it is a good practice to use SERIAL index as primary
> > > key, as it is only available up to 9999999?
> > > Currently I am dealing with storing LDAP users into Postgres and I am
> > > looking for a better way to make use of the DN as primary key instead of
> > > SERIAL index. Any advice or suggestion is appreciated.
> > I'm one of those who thinks that a (possibly multisegment) natural
> > key *does* exist, and that if you think it doesn't, your design is
> > wrong.
> Spend some time in the travel industry... The tax category ZO means
> Passenger Service Charge in Denmark. Or Greenland, or Faroe Islands.
> And can be entered more than once. And the travel agent has to look at
> the context of the travel itinerary to know which one(s) it is.
> Sadly, the real world has many data problems created by idiots in suits
> 30 years ago that aren't going to go away any time soon.
Yes, that's the point. They are legacy designs, and that portion of
the design is wrong.
--
Ron Johnson, Jr.
> Yes, that's the point. They are legacy designs, and that portion of
> the design is wrong.
I'll weigh in with my .02 on this subject. After much pain and agony in
the real world, I have taken the stance that every table in my database
must have an arbitrary, numeric primary key (generally autogenerated).
I have found that this gets me into a lot of arguments with other
database guys, but never seems to cause any problems for me.
Conversely, I have seen innumerable problems in the real world caused by
the usage of actual data as primary keys.
Perhaps I am amazingly ignorant, but I have yet to find a case where my
approach causes any real problems. What does using "real" data as a
primary key buy you? The only real advantages I can see are that an
individual record's data will be somewhat more human-readable without
joining to other tables, and that your search queries can be simpler
because they don't have to join against other tables.
On the (many) occasions that I have worked on databases with "real" data
as primary keys, I just saw so many problems arise. In the real world,
data changes, even supposedly unchangeable data. When using arbitrary
primary keys, all you have to do is change the data in the one table
where it lives. If you are using real data as your keys, you have to write
complex queries or code to "fix" your data when the supposedly
unchangeable data changes.
Anyway, I'm sure this is a huge argument, but that's my .02
Simply put, it doesn't scale as well.
If a table already has candidate keys, then you've presumably got unique
indices on them. A surrogate primary key adds another segment of data
to pass through I/O and another index to maintain. Under high loads,
those extra cycles will cost you transactions per minute.
If you're able to throw hardware at the problem to compensate for
performance and data size issues, it's not a problem. Most databases
are run on systems that are overkill already. If, OTOH, you're running
a system that needs to be able to process billions of transactions with
exabytes of data (say, for example, a comprehensive multi-national health
record database) then you're going to be as interested in SQL tuning as
it's possible to be because no amount of hardware will be enough.
The other argument is that it's redundant data with no real meaning to
the domain, meaning using surrogate keys technically violates low-order
normal forms.
As far as data changing, if you're using foreign key constraints
properly you should never need to issue more than one UPDATE command.
ON UPDATE CASCADE is your friend.
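A small sketch of that point. It uses SQLite through Python's standard sqlite3 module only so that it runs without a server; PostgreSQL accepts the same ON UPDATE CASCADE clause on foreign keys. The table names, column names, and values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE users (
    login TEXT PRIMARY KEY               -- natural key
);
CREATE TABLE accounts (
    account_no INTEGER PRIMARY KEY,
    login      TEXT NOT NULL
               REFERENCES users(login) ON UPDATE CASCADE
);
INSERT INTO users (login) VALUES ('old_login');
INSERT INTO accounts (account_no, login) VALUES (1, 'old_login'), (2, 'old_login');
""")

# One UPDATE on the parent row; the referencing rows follow automatically.
conn.execute("UPDATE users SET login = 'new_login' WHERE login = 'old_login'")
conn.commit()

account_logins = [row[0] for row in conn.execute(
    "SELECT login FROM accounts ORDER BY account_no")]
print(account_logins)
```

The application issues a single statement, although the database still rewrites every referencing row behind the scenes.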
It is always possible to design a domain model which perfectly captures
business logic. However, it is *not* always possible to actually
implement that domain in a computerized RDBMS, nor is it always
practical. Just as the domain model represents an estimated
implementation of the real world information, an RDBMS is just an
estimated implementation of the relational model.
--
Brandon Aiken
CS/IT Systems Engineer
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of John McCawley
Sent: Monday, November 27, 2006 1:53 PM
To: Ron Johnson
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] IS it a good practice to use SERIAL as Primary
Key?
John,
> I'll weigh in with my .02 on this subject. After much pain and agony in
> the real world, I have taken the stance that every table in my database
> must have an arbitrary, numeric primary key (generally autogenerated).
I feel the same.
In the "real world" there is no such thing as a primary key. At least not
over time. Not enough people understand the concept of a primary key to make
those things existent in the real world.
So we take an artificial primary key, and the most reliable way is to create
it yourself.
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Python: the only language with more web frameworks than keywords.
On Nov 27, 2006, at 1:21 PM, Brandon Aiken wrote:
> The other argument is that it's redundant data with no real meaning to
> the domain, meaning using surrogate keys technically violates low-order
> normal forms.
It has real meaning in the sense that it is an internal identifier
that doesn't change. My bank set my online login to a stupid 5
letters of my name plus last four digits of SSN, and they "can not
change" it. Most likely, it is the primary key, used as a
foreign key to all the financial data. Dumb, dumb, dumb.
If, OTOH, they would go with an internal id, it would be trivial to
change the login id.
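A sketch of the internal-id design, with invented table names and values. SQLite via Python's standard sqlite3 module stands in for PostgreSQL here so the example is self-contained; in PostgreSQL the id column would typically be declared as serial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,      -- internal id, never shown to the customer
    login TEXT NOT NULL UNIQUE      -- visible identifier, free to change
);
CREATE TABLE transactions (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount      NUMERIC NOT NULL
);
INSERT INTO customers (id, login) VALUES (1, 'morto1234');
INSERT INTO transactions (customer_id, amount) VALUES (1, 25.00), (1, -10.50);
""")

# Changing the login touches exactly one row; no child table stores it.
conn.execute("UPDATE customers SET login = 'dmorton' WHERE id = 1")
conn.commit()

rows = conn.execute("""
    SELECT c.login, COUNT(t.id)
    FROM customers c JOIN transactions t ON t.customer_id = c.id
    GROUP BY c.login
""").fetchall()
print(rows)
```

The financial rows keep pointing at the same customer because they reference the id, not the login.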
David Morton
Maia Mailguard http://www.maiamailguard.com
mortonda@dgrmm.net