Any *real* reason to choose a natural, composite PK over a surrogate, simple PK?

Started by dananrg@yahoo.comalmost 20 years ago6 messagesgeneral
Jump to latest
#1dananrg@yahoo.com
dananrg@yahoo.com

In the book "Practical Issues in Database Management", Fabian Pascal
notes three reasons for choosing one PK over another - familiarity,
stability, and simplicity.

He notes further that those influenced by OO db design tend to use
simple, surrogate keys for all PKs in all databases they design; that
this is not *precluded* by relational theory, but that there's somehow
something illicit about it.

Today at least, and why I ask, I think it's a good rule of thumb to
create surrogate keys for almost all tables.

"Familiarity" seems like a spurious concern, and a poor tradeoff
against both stability (guaranteeing you are uniquely identifying rows)
and simplicity (with queries, and others intuiting your design).

What am I missing? Why use a composite key *ever* aside from
"familiarity?" Could someone give a real-world example where
"familiarity" is a compelling reason to choose a composite PK, and
trumps stability and simplicity?

Stability seems to be the single-most important factor to consider. If
the database can't uniquely identify a row, what's the point? Choosing
a surrogate key guarantees stability.

Dana

#2dananrg@yahoo.com
dananrg@yahoo.com
In reply to: dananrg@yahoo.com (#1)
Re: Any *real* reason to choose a natural, composite PK over a surrogate, simple PK?

If one decides to use a composite key, beyond how many attributes
should one seriously consider creating a surrogate key instead? 4? 5?
Less? I have seen a composite key composed of 5 attributes and thought
- why? What's the value over a surrogate key?

I guess choosing a candidate key (presuming the candidates are
legitimate candidates) is at least partially a matter of taste. What
taste would compel people to choose composite keys composed of more
than 2-3 attributes?

#3Brent Wood
b.wood@niwa.co.nz
In reply to: dananrg@yahoo.com (#2)
Re: Any *real* reason to choose a natural, composite PK

A bit simplistic, but in a nutshelll....

They are used for different things. A natural PK is used for data
integrity purposes, and if it is a single attr, it can also be a
performance enhancing index.

An artificial key is used to speed up queries, by allowing joins on a
single indexed attribute.

So use your composite, data driven, unique index as an integrity rule to
prevent duplicates, and use an artificial PK to enhance performance on
queries & joins. When you need to enhance performance is largely up to
you, unless you require your design to always maximise efficiency.

HTH,

Brent Wood

#4Trent Shipley
tshipley@deru.com
In reply to: dananrg@yahoo.com (#1)
Re: Any *real* reason to choose a natural, composite PK over a surrogate, simple PK?

On Thursday 2006-06-08 05:48, dananrg@yahoo.com wrote:

What am I missing? Why use a composite key *ever* aside from
"familiarity?" Could someone give a real-world example where
"familiarity" is a compelling reason to choose a composite PK, and
trumps stability and simplicity?

Another "familiarity" translates into "self-documentation" and thus is a major
software engineering desideratum. For some designers that might be reason
enough to use a composite key.

Using a surrogate key is arguably not a gain in simplicity. It adds a column
to the table design. It is populated with a non-intuitive sequence number.
The table now has a surrogate primary key and the alternate composite key.
The only gain in simplicity is for some machine operations.

Furthermore the surrogate key allows pseudo-uniqueness. If composite key over
rows ABC has two identical values, abc and abc', the composite key must still
have a unique constraint (trading back much efficiency that might be gained
with the surrogate key) to insure that the identical values are not masked by
the primary key constraint on the surrogate key.

Likewise, the stability provided by a surrogate key is arguably illusory. If
N is the primary key and the values in composite key ABC change then the
surrogate key N simply masks poor design. If ABC is not stable then the
initial analysis was flawed and ABC was not a valid candidate for a primary
key.

N only provides stability if the contents of ABC change in such a way that ABC
remains unique.

Show quoted text

Stability seems to be the single-most important factor to consider. If
the database can't uniquely identify a row, what's the point? Choosing
a surrogate key guarantees stability.

Dana

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#5Paul M Foster
paulf@quillandmouse.com
In reply to: dananrg@yahoo.com (#1)
Re: Any *real* reason to choose a natural, composite PK

dananrg@yahoo.com wrote:

In the book "Practical Issues in Database Management", Fabian Pascal
notes three reasons for choosing one PK over another - familiarity,
stability, and simplicity.

He notes further that those influenced by OO db design tend to use
simple, surrogate keys for all PKs in all databases they design; that
this is not *precluded* by relational theory, but that there's somehow
something illicit about it.

Today at least, and why I ask, I think it's a good rule of thumb to
create surrogate keys for almost all tables.

"Familiarity" seems like a spurious concern, and a poor tradeoff
against both stability (guaranteeing you are uniquely identifying rows)
and simplicity (with queries, and others intuiting your design).

What am I missing? Why use a composite key *ever* aside from
"familiarity?" Could someone give a real-world example where
"familiarity" is a compelling reason to choose a composite PK, and
trumps stability and simplicity?

Stability seems to be the single-most important factor to consider. If
the database can't uniquely identify a row, what's the point? Choosing
a surrogate key guarantees stability.

Surrogate keys have the advantage of performance. A composite key
composed of four fields of 50 characters each could create performance
problems. But a unique serial avoids this problem.

Of course, one of the drawbacks of surrogate keys is that, if you design
your table so that those four fields together are unique, once you index
the table on the serial key, there's nothing to guarantee your four
fields will *stay* unique.

Another reason for surrogate keys is that there may not be any
*meaningful* combination of fields to make up a unique key. That is,
although you may indeed have four unique fields for your table, together
they really have no meaning. For example, I have a log table in one of
my applications which adds several records per job. Yes, I could make
the key jobno + sequence_number. But why not just let the primary key be
a serial? Certainly simpler.

Here's another reason for using surrogate keys (real world). You have a
PHP application where you're passing GET parameters to the next PHP
script in the chain. If you're going to query a table with one of these
long composite keys, you've got to go through the operation of
concatenating all these values together in order to pass it via the GET
parameter. A real pain. It's far easier to retrieve the single serial
key field and pass that.

Overall, I'd say this is the problem with guys who write books full of
theory, expecting students to buy their pronouncements. Yes, some of
these guys have real world experience. But really, you should hang out
with some people who actually do this for a living after you read the
theory guys. The theory's keen and all that. But it's no substitute for
getting your hands dirty in the real world. Things are often very
different out there. Give me a mechanical engineer any day over an
architect who's never been near a building. (I don't know much about
Frank Lloyd Wright's education, but he built some beautiful buildings
which didn't stand up well to the elements.)

--
Paul M. Foster

#6Tim Hart
tjhart@mac.com
In reply to: Trent Shipley (#4)
Re: Any *real* reason to choose a natural, composite PK over a surrogate, simple PK?

While this statement is accurate, it isn't very precise. Needs change.
Requirements change. Usage changes. Any one of these changes can
invalidate a very correct initial analysis. A wise designer anticipates
change to minimize impact on both current work *and* future development
effort. Artificial keys are a very simple and effective guard against
human assumption and protect future design robustness.

Tim

On Jun 8, 2006, at 7:59 PM, Trent Shipley wrote:

Show quoted text

Likewise, the stability provided by a surrogate key is arguably
illusory. If
N is the primary key and the values in composite key ABC change then
the
surrogate key N simply masks poor design. If ABC is not stable then
the
initial analysis was flawed and ABC was not a valid candidate for a
primary
key.

N only provides stability if the contents of ABC change in such a way
that ABC
remains unique.