Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)

Started by Gonzo Rock, over 24 years ago | 8 messages | general

GonzoRock@Excite.com

over 24 years ago

A question for those of you who consider yourselves crack database designers.

I am currently moving a large database (100+ tables) into pgSQL, with the intention of deploying against 'any' SQL database in the future. The development side will rigorously use standard SQL constructs with no unique/proprietary extensions.

My question concerns establishing the relationships.

Currently Relationships between tables are established via a Unique Integer ID like this:

* = a primary key

PartTypes     Customer      Parts
---------     --------      -----
PartTypeID    CustomerID    PartID
*PartType     *Customer     PartTypeID
              Address       CustomerID
                            *PartNumber (2-field primary key)
                            *PartRevision (2-field primary key)
                            PartName

However, I have read lots of texts saying the relational design should instead look like this:

* = a primary key

PartTypes     Customer      Parts
---------     --------      -----
*PartType     *Customer     PartType
              Address       *PartNumber (2-field primary key)
                            *PartRevision (2-field primary key)
                            PartName
                            Customer

Both techniques have a unique foreign key back to the parent tables, but the first uses No.Meaningful.Info.Integer.Data for the foreign key while the second uses Human.Understandable.ForeignKeys.

Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables.

Thanks!

Mike Mascari

mascarm@mascari.com

over 24 years ago

In reply to: Gonzo Rock (#1)

Re: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)

I prefer using unique integer ids generated from sequences rather than
keys composed of meaningful values.

Advantages:

Client side applications can store/handle the unique integer ids more
readily than having to deal with composite primary keys composed of
varying data types. For example, I can stuff the id associated with a
particular record easily in list boxes, combo boxes, edit controls, etc.
via SetItemData() or some other appropriate method. It's a bit more
complicated to track database records via composite keys of something
like: part no, vendor no, vendor group.

Updating the data doesn't require cascading updates. If you use keys
with meaning, the referential integrity constraints must support
cascading updates so if the key changes in the primary table the change
is cascaded to all referencing tables as well. Earlier versions of most
databases (Access, Oracle, etc.) only provided cascading deletes under
the assumption you would be using sequence generated keys.

Downside:

Many queries might require more joins against the primary table to fetch
the relevant information associated with the numerical id, whereas keys
composed of solely the values with which they are associated might not
require the joins, which will speed some applications. I now have some
queries with 20-way joins. But PostgreSQL provides a way to explicitly
set the path the planner will choose and so the execution of the query
is instantaneous. I'm not sure about other databases. In earlier
versions, I had to denormalize a bit solely for performance reasons.

In the past, I used to use composite keys and switched to the purely
sequence generated path and don't regret it at all. Of course, you'll
still have a unique constraint on the what-would-have-been meaningful
primary key.

Hope that helps,

Mike Mascari
mascarm@mascari.com
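
A minimal sketch of the approach described above: an integer surrogate key for relationships, plus a unique constraint on the would-have-been natural key. It uses Python's sqlite3 module only so that it runs anywhere; on PostgreSQL the ids would come from a sequence instead. The snake_case table and column names and the sample values are assumptions made for illustration, not taken from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE part_types (
    part_type_id INTEGER PRIMARY KEY,      -- surrogate key (auto-assigned)
    part_type    TEXT NOT NULL UNIQUE      -- natural key stays unique
);
CREATE TABLE parts (
    part_id       INTEGER PRIMARY KEY,     -- surrogate key
    part_type_id  INTEGER NOT NULL REFERENCES part_types (part_type_id),
    part_number   TEXT NOT NULL,
    part_revision TEXT NOT NULL,
    part_name     TEXT,
    UNIQUE (part_number, part_revision)    -- the two-field natural key
);
""")

conn.execute("INSERT INTO part_types (part_type) VALUES ('Fastener')")
type_id = conn.execute(
    "SELECT part_type_id FROM part_types WHERE part_type = 'Fastener'"
).fetchone()[0]
conn.execute(
    "INSERT INTO parts (part_type_id, part_number, part_revision, part_name) "
    "VALUES (?, '12345636', 'C', 'Bolt')",
    (type_id,),
)

# Renaming the part type touches one row; no cascading update is needed
# because parts refers to it by integer id only.
conn.execute(
    "UPDATE part_types SET part_type = 'Fasteners' WHERE part_type_id = ?",
    (type_id,),
)
renamed = conn.execute(
    "SELECT p.part_number, t.part_type FROM parts AS p "
    "JOIN part_types AS t ON t.part_type_id = p.part_type_id"
).fetchone()

# The unique constraint still rejects a second row with the same natural key.
try:
    conn.execute(
        "INSERT INTO parts (part_type_id, part_number, part_revision, part_name) "
        "VALUES (?, '12345636', 'C', 'Duplicate bolt')",
        (type_id,),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(renamed, duplicate_rejected)
```

The join in the final SELECT is the cost mentioned under "Downside": the readable part type now lives one table away.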


Gonzo Rock

GonzoRock@Excite.com

over 24 years ago

In reply to: Mike Mascari (#2)

RE: [SQL] Database Design Question

OK... Fair Enough... Good Points indeed y'all.

Well... what about the problem of users trying to query the database?

You know... like when using Crystal Reports or something?

SELECT * FROM HistoryTable
WHERE PartID = SomeInteger

vs

SELECT * FROM HistoryTable
WHERE PartNum = 12345636 AND PartRev = 'C'

How are they supposed to know what the PartID is?

Anyway, that is why I was considering changing... current users always have trouble peering into the database. They don't quite get it.


Ryan Mahoney

ryan@paymentalliance.net

over 24 years ago

In reply to: Gonzo Rock (#3)

Re: RE: [SQL] Database Design Question

SELECT * FROM HistoryTable
WHERE PartNum = 12345636 AND PartRev = 'C'

is equivalent to:

SELECT t1.* FROM HistoryTable t1, PartTable t2
WHERE t2.PartName = 'airplane' AND t1.PartRev = 'C' AND t2.PartNum = t1.PartNum

You can create these joins for your users and show them that they only need to
swap out the name.

-r


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html


Oliver Elphick

olly@lfix.co.uk

over 24 years ago

In reply to: Mike Mascari (#2)

Re: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)

Gonzo Rock wrote:

Is one recommended over the other??? Sure appreciate the commentary before I
get in too deep with all these tables.

The second sounds OK, but only if the chosen field is truly a candidate key.
"Customer" does not sound like one - suppose you have two 'John Smith's?
This is why most real-world applications use unique numbers or codes.
Of course you could (probably) differentiate the 'John Smith's by address,
but then the address has to be typed in as well as the name. A code is
much easier.

It all depends on the nature of the data.

--
Oliver Elphick Oliver.Elphick@lfix.co.uk
Isle of Wight http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47 6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C
========================================
"But the wisdom that is from above is first pure, then
peaceable, gentle, and easy to be intreated, full of
mercy and good fruits, without partiality, and without
hypocrisy." James 3:17


Mike Mascari

mascarm@mascari.com

over 24 years ago

In reply to: Gonzo Rock (#1)

Re: RE: [SQL] Database Design Question

Gonzo Rock wrote:

OK... Fair Enough... Good Points indeed y'all.

Well... What about the problem of users trying to Query the Database??

You know... like when using Crystal Reports or something?

SELECT * FROM HistoryTable
WHERE PartID = SomeInteger

vs

SELECT * FROM HistoryTable
WHERE PartNum = 12345636 AND PartRev = 'C'

How are they supposed to know what the PartID is?

Anyway, that is why I was considering changing... current users always have trouble peering into the database. They don't quite get it.

Depending upon the sophistication of your users, you might want to
consider constructing a number of views where the data is pre-joined
(totally denormalized). We essentially do the same thing for both the
reasons you provide as well as for security purposes (row security)
based upon the value of CURRENT_USER.

Hope that helps,

Mike Mascari
mascarm@mascari.com
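
A rough sketch of the pre-joined view idea, again using Python's sqlite3 module so it can be run as-is. The view name history_report, the table layouts and the sample rows are all hypothetical. The row-security part based on CURRENT_USER is not shown, since SQLite has no notion of database users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts (
    part_id   INTEGER PRIMARY KEY,
    part_num  TEXT NOT NULL,
    part_rev  TEXT NOT NULL,
    part_name TEXT,
    UNIQUE (part_num, part_rev)
);
CREATE TABLE history (
    history_id INTEGER PRIMARY KEY,
    part_id    INTEGER NOT NULL REFERENCES parts (part_id),
    event      TEXT NOT NULL
);

-- The view hides the integer id and exposes the human-readable columns.
CREATE VIEW history_report AS
SELECT p.part_num, p.part_rev, p.part_name, h.event
FROM history AS h
JOIN parts AS p ON p.part_id = h.part_id;

INSERT INTO parts (part_id, part_num, part_rev, part_name) VALUES
    (1, '12345636', 'C', 'airplane'),
    (2, '12345636', 'D', 'airplane');
INSERT INTO history (part_id, event) VALUES
    (1, 'released'),
    (2, 'drafted'),
    (1, 'shipped');
""")

# A report user filters on part number and revision and never sees part_id.
rows = conn.execute(
    "SELECT event FROM history_report "
    "WHERE part_num = ? AND part_rev = ? ORDER BY event",
    ("12345636", "C"),
).fetchall()
print(rows)
```

With a view like this, a reporting tool can be pointed at history_report instead of the base tables, so the surrogate keys stay an internal detail.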

omid omoomi

oomoomi@hotmail.com

over 24 years ago

In reply to: Mike Mascari (#6)

Re: Database Design Question

Hi,
In addition, I think integer primary keys are useful for databases that do
not support ON UPDATE CASCADE.
Say you have to change a PartType for any reason: with the integer format
you face no table-integrity problem, because you only update the
description in the PartTypes table.
But PG currently supports cascading updates.
Omid
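
For comparison, a small sketch of what the meaningful-key design relies on: a foreign key declared ON UPDATE CASCADE, so that renaming a part type propagates to the referencing rows. It runs on SQLite through Python's sqlite3 module; the same clause is standard SQL, but whether a given database supports it is exactly the portability question raised above. Table names and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
CREATE TABLE part_types (
    part_type TEXT PRIMARY KEY             -- meaningful (natural) key
);
CREATE TABLE parts (
    part_number   TEXT NOT NULL,
    part_revision TEXT NOT NULL,
    part_type     TEXT NOT NULL
        REFERENCES part_types (part_type) ON UPDATE CASCADE,
    PRIMARY KEY (part_number, part_revision)
);
INSERT INTO part_types VALUES ('Fastener');
INSERT INTO parts VALUES ('12345636', 'C', 'Fastener');
""")

# Changing the key value in the parent table rewrites the child rows too.
conn.execute(
    "UPDATE part_types SET part_type = 'Fasteners' WHERE part_type = 'Fastener'"
)
child_type = conn.execute("SELECT part_type FROM parts").fetchone()[0]
print(child_type)
```

Without the ON UPDATE CASCADE clause, the same UPDATE would be rejected as a foreign-key violation while child rows still reference the old value.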

From: A_Schnabel@t-online.de (Andre Schnabel)
To: "Gonzo Rock" <GonzoRock@Excite.com>, <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Database Design Question
Date: Fri, 27 Jul 2001 21:06:50 +0200

Don't really know if I am a crack... but...

Your 1st design would be faster when joining the tables in a query or view.
Furthermore, an index on the IDs (should be integers, right?) would use
much less storage space than an index on character fields.

The 2nd design is preferred by theoretical purists. The data are much more
self-explanatory. If you only have a Parts record, you can see which
PartType and Customer it belongs to without querying the other tables. With
your 1st design you would have to.

I think it's a question of performance, storage space and readability.
If you need high performance, use the 1st design.
If you need a design readable by people who don't work with it day to day,
use the 2nd method.

It's only my opinion; it need not be right.

CU,
Andre


James Orr

james@lrgmail.com

over 24 years ago

In reply to: Gonzo Rock (#1)

Re: Database Design Question

----- Original Message -----
From: "Gonzo Rock" <GonzoRock@Excite.com>
To: <pgsql-general@postgresql.org>
Cc: <pgsql-sql@postgresql.org>
Sent: Friday, July 27, 2001 4:21 PM
Subject: RE: [SQL] Database Design Question

OK... Fair Enough... Good Points indeed y'all.

Well... What about the problem of users trying to Query the Database??

You know... like when using Crystal Reports or something?

SELECT * FROM HistoryTable
WHERE PartID = SomeInteger

vs

SELECT * FROM HistoryTable
WHERE PartNum = 12345636 AND PartRev = 'C'

How are they supposed to know what the PartID is?

Anyway, that is why I was considering changing... current users always have
trouble peering into the database. They don't quite get it.

Columns in search conditions don't HAVE to be indexed, and you can have more
than one index. So you could have your primary index on PartID, which would be
used by your applications, and another index on PartNum and PartRev if those
are frequently searched fields for Crystal Reports etc.
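
A brief sketch of that suggestion, using Python's sqlite3 module so it is runnable: the integer id remains the primary key, and a second index covers the two columns that report users search on. The index name idx_parts_num_rev and the sample rows are assumptions; the plan text printed at the end is SQLite-specific and varies between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts (
    part_id   INTEGER PRIMARY KEY,         -- primary index, used by applications
    part_num  TEXT NOT NULL,
    part_rev  TEXT NOT NULL,
    part_name TEXT
);
-- Secondary index for human-driven searches by number and revision.
CREATE INDEX idx_parts_num_rev ON parts (part_num, part_rev);

INSERT INTO parts (part_num, part_rev, part_name) VALUES
    ('12345636', 'B', 'airplane'),
    ('12345636', 'C', 'airplane');
""")

query = "SELECT part_id, part_name FROM parts WHERE part_num = ? AND part_rev = ?"
args = ("12345636", "C")

found = conn.execute(query, args).fetchall()

# Ask SQLite how it would run the query; the last column of each plan row
# is a human-readable description of the step.
plan_detail = " ".join(
    str(row[-1]) for row in conn.execute("EXPLAIN QUERY PLAN " + query, args)
)
print(found)
print(plan_detail)
```

The lookup by part number and revision returns the integer id, which is also one answer to "how are they supposed to know what the PartID is".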