Lookup tables

Started by Rich Shepardabout 1 year ago49 messagesgeneral
Jump to latest
#1Rich Shepard
rshepard@appl-ecosys.com

Should lookup tables have a numeric FK column as well as the description column?

If so, how should I add an FK to the two lookup tables in my database?

TIA,

Rich

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#1)
Re: Lookup tables

On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com> wrote:

Should lookup tables have a numeric FK column as well as the description
column?

If so, how should I add an FK to the two lookup tables in my database?

Most do (have a surrogate PK) since it removes cascading updates and is a
smaller value.

Lots of alter tables and update queries.

David J.

#3Rich Shepard
rshepard@appl-ecosys.com
In reply to: Rich Shepard (#1)
Re: Lookup tables

On Tue, 4 Feb 2025, Rob Sargent wrote:

I would definitely add an Id for each of the looked up values. The code
can use the Id (for the join or lookup) and the string value can safely be
changed (think typo) without ripple effect.ᅵ It also prevents other tables
from referencing the lookup witrh bogus values (those not in the lookup
table)

Unless your lookup tables are huge I would create a new table matching
your current table but with an identity column and load from you original
table.
https://www.postgresql.org/docs/current/ddl-identity-columns.html

Thanks, Rob. Each lookup table has only a very few rows,

I'll make new tables today.

Regards,

Rich

#4Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#2)
Re: Lookup tables

On Tue, 4 Feb 2025, David G. Johnston wrote:

Most do (have a surrogate PK) since it removes cascading updates and is a
smaller value.

Lots of alter tables and update queries.

David,

That's a good point.

Thanks,

Rich

#5Ron
ronljohnsonjr@gmail.com
In reply to: David G. Johnston (#2)
Re: Lookup tables

On Tue, Feb 4, 2025 at 9:41 AM David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:

Should lookup tables have a numeric FK column as well as the description
column?

If so, how should I add an FK to the two lookup tables in my database?

Most do (have a surrogate PK) since it removes cascading updates

How does a synthetic PK "remove cascading updates"? Doesn't the decision
on whether or not to cascade update depend on the ON UPDATE CASCADE clause
of the FK definition?

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Ron (#5)
Re: Lookup tables

On Tuesday, February 4, 2025, Ron Johnson <ronljohnsonjr@gmail.com> wrote:

On Tue, Feb 4, 2025 at 9:41 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:

Should lookup tables have a numeric FK column as well as the description
column?

If so, how should I add an FK to the two lookup tables in my database?

Most do (have a surrogate PK) since it removes cascading updates

How does a synthetic PK "remove cascading updates"? Doesn't the decision
on whether or not to cascade update depend on the ON UPDATE CASCADE clause
of the FK definition?

People don’t change synthetic PKs so updates never have to happen on the FK
side. Labels do get changed, in which case you have to update the FK label
side.

David J.

#7Ron
ronljohnsonjr@gmail.com
In reply to: Rich Shepard (#1)
Re: Lookup tables

On Tue, Feb 4, 2025 at 9:28 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

Should lookup tables have a numeric FK column as well as the description
column?

Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a lookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#8David G. Johnston
david.g.johnston@gmail.com
In reply to: Ron (#7)
Re: Lookup tables

On Tuesday, February 4, 2025, Ron Johnson <ronljohnsonjr@gmail.com> wrote:

On Tue, Feb 4, 2025 at 9:28 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

Should lookup tables have a numeric FK column as well as the description
column?

Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a lookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)

The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serve
as the unique value is reasonable - we only add surrogates for optimization.

David J.

#9Rich Shepard
rshepard@appl-ecosys.com
In reply to: Ron (#7)
Re: Lookup tables

On Tue, 4 Feb 2025, Ron Johnson wrote:

Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a lookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)

Ron,

Yes, each has a single column of type names, industrytypes and statustypes.
I've always used the descriptive names in queries.

Rich

#10Ron
ronljohnsonjr@gmail.com
In reply to: David G. Johnston (#6)
Re: Lookup tables

On Tue, Feb 4, 2025 at 9:59 AM David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Tuesday, February 4, 2025, Ron Johnson <ronljohnsonjr@gmail.com> wrote:

On Tue, Feb 4, 2025 at 9:41 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:

Should lookup tables have a numeric FK column as well as the
description column?

If so, how should I add an FK to the two lookup tables in my database?

Most do (have a surrogate PK) since it removes cascading updates

How does a synthetic PK "remove cascading updates"? Doesn't the decision
on whether or not to cascade update depend on the ON UPDATE CASCADE clause
of the FK definition?

People don’t change synthetic PKs so updates never have to happen on the
FK side. Labels do get changed, in which case you have to update the FK
label side.

That's the argument between a synthetic PK and a natural PK. (Everywhere
I've seen natural PK used -- that was 25 years ago -- you didn't use the
whole text of the description of the PK, you used a code, like 'HRLY' for
hourly wage employees and 'SLRY' for salaried employees. Then, HRLY or
SLRY would go in the t_employee table income_type VARCHAR(4) column,
referencing a lookup table.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#11Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#8)
Re: Lookup tables

On Tue, 4 Feb 2025, David G. Johnston wrote:

The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serve as
the unique value is reasonable - we only add surrogates for optimization.

David,

The industrytypes table has 26 rows, the statustypes table has 8 rows. Your
explanation suggests that for this database adding a PK to each table adds
little, if anything.

Regards,

Rich

#12Tony Shelver
tshelver@gmail.com
In reply to: Rich Shepard (#1)
Re: Lookup tables

Seen the suggestions.

After decades of playing with the alternatives, I don't overthink it
anymore. Pretty much my lookup tables consist of an autogenerated primary
key (smallint or integer), a code (name?) and a description as default.

If there are a lot of tables fitting a common data format, at times I have
consolidated them into a single table with a 'lookup type' column.

Depending on application requirements, sometimes there I will provide short
and long display names, and a generic description for further
clarification, but usually those requirements are overkill.

Tony

On Tue, 4 Feb 2025 at 16:28, Rich Shepard <rshepard@appl-ecosys.com> wrote:

Show quoted text

Should lookup tables have a numeric FK column as well as the description
column?

If so, how should I add an FK to the two lookup tables in my database?

TIA,

Rich

#13Ron
ronljohnsonjr@gmail.com
In reply to: Rich Shepard (#9)
Re: Lookup tables

On Tue, Feb 4, 2025 at 10:05 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

On Tue, 4 Feb 2025, Ron Johnson wrote:

Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a

lookup

table is to store some sort of a code in the "child" table instead of the
whole text of the description.)

Ron,

Yes, each has a single column of type names, industrytypes and statustypes.
I've always used the descriptive names in queries.

Yeah, that's definitely Bad Design, for the reason David enumerated.

In the lookup table, the PK can be either synthetic (an integer that means
nothing) or natural (a short text code, typically four characters) that is
an abbreviation of the description.

Synthetic keys have been the norm for the past 25+ years, but I have a soft
spot for natural keys.

Natural keys become unwieldy, though, when more than two columns are
required to define primality.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#14David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#11)
Re: Lookup tables

On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com> wrote:

On Tue, 4 Feb 2025, David G. Johnston wrote:

The point of a lookup table is to provide a unique list of authoritative

values for some purpose. Kinda like an enum. But having the label serve as
the unique value is reasonable - we only add surrogates for optimization.

David,

The industrytypes table has 26 rows, the statustypes table has 8 rows. Your
explanation suggests that for this database adding a PK to each table adds
little, if anything.

It’s the FK side where the cost savings are experienced.

David J.

#15Ron
ronljohnsonjr@gmail.com
In reply to: Rich Shepard (#11)
Re: Lookup tables

On Tue, Feb 4, 2025 at 10:08 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:

On Tue, 4 Feb 2025, David G. Johnston wrote:

The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serve

as

the unique value is reasonable - we only add surrogates for optimization.

David,

The industrytypes table has 26 rows, the statustypes table has 8 rows. Your
explanation suggests that for this database adding a PK to each table adds
little, if anything.

How big is the database? A tiny 500MB db just for you can get by with poor
design. (But then, why are you using PG instead of SQLite?)

More importantly, will you ever update the descriptions? Of course not!
Famous last words. Having a separate PK means that you update one row in
one column, while what you've done means that tens/hundreds of thousands of
rows in possibly dozens of tables need to be updated.

It also means that *you* can easily change things in your ad hoc
database *without
forgetting* to update a table.

This is called an "update anomaly" in relational design theory.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#16Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#14)
Re: Lookup tables

On Tue, 4 Feb 2025, David G. Johnston wrote:

It’s the FK side where the cost savings are experienced.

David,

Okay.

Thanks,

Rich

#17Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ron (#15)
Re: Lookup tables

On 2/4/25 07:19, Ron Johnson wrote:

How big is the database?  A tiny 500MB db just for you can get by with
poor design.  (But then, why are you using PG instead of SQLite?)

For the reasons listed here:

https://sqlite.org/quirks.html

In particular:

[...]

2. SQLite Is Embedded, Not Client-Server

3. Flexible Typing

SQLite is flexible with regard to datatypes. Datatypes are advisory
rather than mandatory.

4. Foreign Key Enforcement Is Off By Default

[...]

More importantly, will you ever update the descriptions?  Of course not!
Famous last words.  Having a separate PK means that you update one row
in one column, while what you've done means that tens/hundreds of
thousands of rows in possibly dozens of tables need to be updated.

It also means that *you* can easily change things in your ad hoc
database *without forgetting* to update a table.

This is called an "update anomaly" in relational design theory.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

--
Adrian Klaver
adrian.klaver@aklaver.com

#18Michał Kłeczek
michal@kleczek.org
In reply to: Rich Shepard (#1)
Re: Lookup tables

On 4 Feb 2025, at 15:27, Rich Shepard <rshepard@appl-ecosys.com> wrote:

Should lookup tables have a numeric FK column as well as the description column?

If so, how should I add an FK to the two lookup tables in my database?

I’ve read the whole thread and the reasoning for having (numeric) autogenerated surrogate key is:
a) performance
b) no cascading updates

I would like to add another dimension to this discussion: logical consistency.

Imagine the following simplified schema with surrogate PK:

create table restaurant (
restaurant_id int not null primary key generated always as identity,
name text not null unique
);
create table restaurant_visit (
visit_id int not null primary key generated always as identity,
guest_username text not null,
when date not null,
restaurant_id not null foreign key restaurant(restaurant_id),
rating smallint not null
);

Let’s say users on your website register restaurant visits and rate them.

The query to register a visit is:
insert into restaurant_visit
select $user, current_date, restaurant_id, $rating
from restaurant where name = $restaurant_name

It is now completely unclear what it means to change the name of the restaurant for already registered visits.
Is it still the same restaurant with a different name or a different restaurant?

Or let say someone swaps names of two restaurants.
That means a user that goes to the same restaurant every day would register visits to two different restaurants!

Using the name of a restaurant as primary key gets rid of these logical anomalies because
the database model now reflects facts from reality.

Having surrogate keys makes your relational database more like a network/object oriented database where rows don’t represent facts but rather some entities that have identity independent from their attributes.

Thanks,
Michal

#19Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Michał Kłeczek (#18)
Re: Lookup tables

Am Tue, Feb 04, 2025 at 05:31:13PM +0100 schrieb Michał Kłeczek:

It is now completely unclear what it means to change the name of the restaurant for already registered visits.
Is it still the same restaurant with a different name or a different restaurant?

Or let say someone swaps names of two restaurants.
That means a user that goes to the same restaurant every day would register visits to two different restaurants!

Using the name of a restaurant as primary key gets rid of these logical anomalies because
the database model now reflects facts from reality.

Reality tends to become so ambiguous as to not be
reflectable (two entirely different restaurants eventually,
within the flow of time, carry the very same name).

A primary key is very likely not the proper place to reflect
arbitrary business logic (is it the same restaurant or not ?
what if two restaurants have the same name at the same time
?). Primary keys are tools at the technical level.

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

#20Rob Sargent
robjsargent@gmail.com
In reply to: Karsten Hilbert (#19)
Re: Lookup tables

On 2/4/25 09:51, Karsten Hilbert wrote:

Am Tue, Feb 04, 2025 at 05:31:13PM +0100 schrieb Michał Kłeczek:

It is now completely unclear what it means to change the name of the restaurant for already registered visits.
Is it still the same restaurant with a different name or a different restaurant?

Or let say someone swaps names of two restaurants.
That means a user that goes to the same restaurant every day would register visits to two different restaurants!

Using the name of a restaurant as primary key gets rid of these logical anomalies because
the database model now reflects facts from reality.

Reality tends to become so ambiguous as to not be
reflectable (two entirely different restaurants eventually,
within the flow of time, carry the very same name).

A primary key is very likely not the proper place to reflect
arbitrary business logic (is it the same restaurant or not ?
what if two restaurants have the same name at the same time
?). Primary keys are tools at the technical level.

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

That OP is using a single column is interesting:  Taking this notion to
the restaurant/visit theme, the list of names or restaurants becomes a
proper table 'restaurant' (name, addresss, phone, website, etc).  The
name is as useless as a key as first/last is for person.

#21Ron
ronljohnsonjr@gmail.com
In reply to: Michał Kłeczek (#18)
#22Rob Sargent
robjsargent@gmail.com
In reply to: Ron (#21)
#23Michał Kłeczek
michal@kleczek.org
In reply to: Karsten Hilbert (#19)
#24Ron
ronljohnsonjr@gmail.com
In reply to: Rob Sargent (#22)
#25David G. Johnston
david.g.johnston@gmail.com
In reply to: Michał Kłeczek (#18)
#26Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Ron (#7)
#27Michał Kłeczek
michal@kleczek.org
In reply to: Thiemo Kellner (#26)
#28Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Michał Kłeczek (#27)
#29David G. Johnston
david.g.johnston@gmail.com
In reply to: Michał Kłeczek (#27)
#30Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michał Kłeczek (#27)
#31Rich Shepard
rshepard@appl-ecosys.com
In reply to: Rich Shepard (#1)
#32David G. Johnston
david.g.johnston@gmail.com
In reply to: Rich Shepard (#31)
#33Rich Shepard
rshepard@appl-ecosys.com
In reply to: David G. Johnston (#32)
#34Rich Shepard
rshepard@appl-ecosys.com
In reply to: Rich Shepard (#3)
#35Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michał Kłeczek (#27)
#36Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Thiemo Kellner (#35)
#37Michał Kłeczek
michal@kleczek.org
In reply to: David G. Johnston (#25)
#38Michał Kłeczek
michal@kleczek.org
In reply to: Thiemo Kellner (#35)
#39Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: David G. Johnston (#25)
#40Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michał Kłeczek (#23)
#41Michał Kłeczek
michal@kleczek.org
In reply to: Thiemo Kellner (#40)
#42Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michał Kłeczek (#41)
#43Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michał Kłeczek (#38)
#44Michał Kłeczek
michal@kleczek.org
In reply to: Thiemo Kellner (#43)
#45Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Michał Kłeczek (#44)
#46Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Thiemo Kellner (#35)
#47Michał Kłeczek
michal@kleczek.org
In reply to: Peter J. Holzer (#46)
#48Thiemo Kellner
thiemo@gelassene-pferde.biz
In reply to: Peter J. Holzer (#46)
#49Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Michał Kłeczek (#47)