Lookup tables
Should lookup tables have a numeric FK column as well as the description column?
If so, how should I add an FK to the two lookup tables in my database?
TIA,
Rich
On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com> wrote:
Should lookup tables have a numeric FK column as well as the description
column?If so, how should I add an FK to the two lookup tables in my database?
Most do (have a surrogate PK) since it removes cascading updates and is a
smaller value.
Lots of alter tables and update queries.
David J.
On Tue, 4 Feb 2025, Rob Sargent wrote:
I would definitely add an Id for each of the looked up values. The code
can use the Id (for the join or lookup) and the string value can safely be
changed (think typo) without ripple effect.ᅵ It also prevents other tables
from referencing the lookup witrh bogus values (those not in the lookup
table)Unless your lookup tables are huge I would create a new table matching
your current table but with an identity column and load from you original
table.
https://www.postgresql.org/docs/current/ddl-identity-columns.html
Thanks, Rob. Each lookup table has only a very few rows,
I'll make new tables today.
Regards,
Rich
Import Notes
Reply to msg id not found: 0def87f9-6fa6-49a7-b0bc-f313c066d387@gmail.com
On Tue, 4 Feb 2025, David G. Johnston wrote:
Most do (have a surrogate PK) since it removes cascading updates and is a
smaller value.Lots of alter tables and update queries.
David,
That's a good point.
Thanks,
Rich
On Tue, Feb 4, 2025 at 9:41 AM David G. Johnston <david.g.johnston@gmail.com>
wrote:
On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:Should lookup tables have a numeric FK column as well as the description
column?If so, how should I add an FK to the two lookup tables in my database?
Most do (have a surrogate PK) since it removes cascading updates
How does a synthetic PK "remove cascading updates"? Doesn't the decision
on whether or not to cascade update depend on the ON UPDATE CASCADE clause
of the FK definition?
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Tuesday, February 4, 2025, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Tue, Feb 4, 2025 at 9:41 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:Should lookup tables have a numeric FK column as well as the description
column?If so, how should I add an FK to the two lookup tables in my database?
Most do (have a surrogate PK) since it removes cascading updates
How does a synthetic PK "remove cascading updates"? Doesn't the decision
on whether or not to cascade update depend on the ON UPDATE CASCADE clause
of the FK definition?
People don’t change synthetic PKs so updates never have to happen on the FK
side. Labels do get changed, in which case you have to update the FK label
side.
David J.
On Tue, Feb 4, 2025 at 9:28 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:
Should lookup tables have a numeric FK column as well as the description
column?
Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a lookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Tuesday, February 4, 2025, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Tue, Feb 4, 2025 at 9:28 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:Should lookup tables have a numeric FK column as well as the description
column?Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a lookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)
The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serve
as the unique value is reasonable - we only add surrogates for optimization.
David J.
On Tue, 4 Feb 2025, Ron Johnson wrote:
Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of a lookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)
Ron,
Yes, each has a single column of type names, industrytypes and statustypes.
I've always used the descriptive names in queries.
Rich
On Tue, Feb 4, 2025 at 9:59 AM David G. Johnston <david.g.johnston@gmail.com>
wrote:
On Tuesday, February 4, 2025, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Tue, Feb 4, 2025 at 9:41 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com>
wrote:Should lookup tables have a numeric FK column as well as the
description column?If so, how should I add an FK to the two lookup tables in my database?
Most do (have a surrogate PK) since it removes cascading updates
How does a synthetic PK "remove cascading updates"? Doesn't the decision
on whether or not to cascade update depend on the ON UPDATE CASCADE clause
of the FK definition?People don’t change synthetic PKs so updates never have to happen on the
FK side. Labels do get changed, in which case you have to update the FK
label side.
That's the argument between a synthetic PK and a natural PK. (Everywhere
I've seen natural PK used -- that was 25 years ago -- you didn't use the
whole text of the description of the PK, you used a code, like 'HRLY' for
hourly wage employees and 'SLRY' for salaried employees. Then, HRLY or
SLRY would go in the t_employee table income_type VARCHAR(4) column,
referencing a lookup table.
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Tue, 4 Feb 2025, David G. Johnston wrote:
The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serve as
the unique value is reasonable - we only add surrogates for optimization.
David,
The industrytypes table has 26 rows, the statustypes table has 8 rows. Your
explanation suggests that for this database adding a PK to each table adds
little, if anything.
Regards,
Rich
Seen the suggestions.
After decades of playing with the alternatives, I don't overthink it
anymore. Pretty much my lookup tables consist of an autogenerated primary
key (smallint or integer), a code (name?) and a description as default.
If there are a lot of tables fitting a common data format, at times I have
consolidated them into a single table with a 'lookup type' column.
Depending on application requirements, sometimes there I will provide short
and long display names, and a generic description for further
clarification, but usually those requirements are overkill.
Tony
On Tue, 4 Feb 2025 at 16:28, Rich Shepard <rshepard@appl-ecosys.com> wrote:
Show quoted text
Should lookup tables have a numeric FK column as well as the description
column?If so, how should I add an FK to the two lookup tables in my database?
TIA,
Rich
On Tue, Feb 4, 2025 at 10:05 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:
On Tue, 4 Feb 2025, Ron Johnson wrote:
Does your lookup table just have one column? (That's what your question
seems to imply, but that makes no sense, since the whole point of alookup
table is to store some sort of a code in the "child" table instead of the
whole text of the description.)Ron,
Yes, each has a single column of type names, industrytypes and statustypes.
I've always used the descriptive names in queries.
Yeah, that's definitely Bad Design, for the reason David enumerated.
In the lookup table, the PK can be either synthetic (an integer that means
nothing) or natural (a short text code, typically four characters) that is
an abbreviation of the description.
Synthetic keys have been the norm for the past 25+ years, but I have a soft
spot for natural keys.
Natural keys become unwieldy, though, when more than two columns are
required to define primality.
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Tuesday, February 4, 2025, Rich Shepard <rshepard@appl-ecosys.com> wrote:
On Tue, 4 Feb 2025, David G. Johnston wrote:
The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serve as
the unique value is reasonable - we only add surrogates for optimization.David,
The industrytypes table has 26 rows, the statustypes table has 8 rows. Your
explanation suggests that for this database adding a PK to each table adds
little, if anything.
It’s the FK side where the cost savings are experienced.
David J.
On Tue, Feb 4, 2025 at 10:08 AM Rich Shepard <rshepard@appl-ecosys.com>
wrote:
On Tue, 4 Feb 2025, David G. Johnston wrote:
The point of a lookup table is to provide a unique list of authoritative
values for some purpose. Kinda like an enum. But having the label serveas
the unique value is reasonable - we only add surrogates for optimization.
David,
The industrytypes table has 26 rows, the statustypes table has 8 rows. Your
explanation suggests that for this database adding a PK to each table adds
little, if anything.
How big is the database? A tiny 500MB db just for you can get by with poor
design. (But then, why are you using PG instead of SQLite?)
More importantly, will you ever update the descriptions? Of course not!
Famous last words. Having a separate PK means that you update one row in
one column, while what you've done means that tens/hundreds of thousands of
rows in possibly dozens of tables need to be updated.
It also means that *you* can easily change things in your ad hoc
database *without
forgetting* to update a table.
This is called an "update anomaly" in relational design theory.
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Tue, 4 Feb 2025, David G. Johnston wrote:
It’s the FK side where the cost savings are experienced.
David,
Okay.
Thanks,
Rich
On 2/4/25 07:19, Ron Johnson wrote:
How big is the database? A tiny 500MB db just for you can get by with
poor design. (But then, why are you using PG instead of SQLite?)
For the reasons listed here:
https://sqlite.org/quirks.html
In particular:
[...]
2. SQLite Is Embedded, Not Client-Server
3. Flexible Typing
SQLite is flexible with regard to datatypes. Datatypes are advisory
rather than mandatory.
4. Foreign Key Enforcement Is Off By Default
[...]
More importantly, will you ever update the descriptions? Of course not!
Famous last words. Having a separate PK means that you update one row
in one column, while what you've done means that tens/hundreds of
thousands of rows in possibly dozens of tables need to be updated.It also means that *you* can easily change things in your ad hoc
database *without forgetting* to update a table.This is called an "update anomaly" in relational design theory.
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
--
Adrian Klaver
adrian.klaver@aklaver.com
On 4 Feb 2025, at 15:27, Rich Shepard <rshepard@appl-ecosys.com> wrote:
Should lookup tables have a numeric FK column as well as the description column?
If so, how should I add an FK to the two lookup tables in my database?
I’ve read the whole thread and the reasoning for having (numeric) autogenerated surrogate key is:
a) performance
b) no cascading updates
I would like to add another dimension to this discussion: logical consistency.
Imagine the following simplified schema with surrogate PK:
create table restaurant (
restaurant_id int not null primary key generated always as identity,
name text not null unique
);
create table restaurant_visit (
visit_id int not null primary key generated always as identity,
guest_username text not null,
when date not null,
restaurant_id not null foreign key restaurant(restaurant_id),
rating smallint not null
);
Let’s say users on your website register restaurant visits and rate them.
The query to register a visit is:
insert into restaurant_visit
select $user, current_date, restaurant_id, $rating
from restaurant where name = $restaurant_name
It is now completely unclear what it means to change the name of the restaurant for already registered visits.
Is it still the same restaurant with a different name or a different restaurant?
Or let say someone swaps names of two restaurants.
That means a user that goes to the same restaurant every day would register visits to two different restaurants!
Using the name of a restaurant as primary key gets rid of these logical anomalies because
the database model now reflects facts from reality.
Having surrogate keys makes your relational database more like a network/object oriented database where rows don’t represent facts but rather some entities that have identity independent from their attributes.
Thanks,
Michal
Am Tue, Feb 04, 2025 at 05:31:13PM +0100 schrieb Michał Kłeczek:
It is now completely unclear what it means to change the name of the restaurant for already registered visits.
Is it still the same restaurant with a different name or a different restaurant?Or let say someone swaps names of two restaurants.
That means a user that goes to the same restaurant every day would register visits to two different restaurants!Using the name of a restaurant as primary key gets rid of these logical anomalies because
the database model now reflects facts from reality.
Reality tends to become so ambiguous as to not be
reflectable (two entirely different restaurants eventually,
within the flow of time, carry the very same name).
A primary key is very likely not the proper place to reflect
arbitrary business logic (is it the same restaurant or not ?
what if two restaurants have the same name at the same time
?). Primary keys are tools at the technical level.
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
On 2/4/25 09:51, Karsten Hilbert wrote:
Am Tue, Feb 04, 2025 at 05:31:13PM +0100 schrieb Michał Kłeczek:
It is now completely unclear what it means to change the name of the restaurant for already registered visits.
Is it still the same restaurant with a different name or a different restaurant?Or let say someone swaps names of two restaurants.
That means a user that goes to the same restaurant every day would register visits to two different restaurants!Using the name of a restaurant as primary key gets rid of these logical anomalies because
the database model now reflects facts from reality.Reality tends to become so ambiguous as to not be
reflectable (two entirely different restaurants eventually,
within the flow of time, carry the very same name).A primary key is very likely not the proper place to reflect
arbitrary business logic (is it the same restaurant or not ?
what if two restaurants have the same name at the same time
?). Primary keys are tools at the technical level.Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
That OP is using a single column is interesting: Taking this notion to
the restaurant/visit theme, the list of names or restaurants becomes a
proper table 'restaurant' (name, addresss, phone, website, etc). The
name is as useless as a key as first/last is for person.