Need schema design advice

Started by Matthew Wilsonover 17 years ago11 messagesgeneral
Jump to latest
#1Matthew Wilson
matt@tplus1.com

I need to track employees and their preferred locations, shifts, and
stations.

For example, I need to track that Alice prefers to work the morning
shift at the west-side location, and she likes to work the cash-register
station.

Also, I need to track that Bob likes the west-side and north-side
locations, likes the night shift, and likes the dishwasher station. Note
the one-to-many relationship between Bob and his preferred locations. I
need to support that possibility.

So, I see two ways to make my tables, and I'd like some advice.

FIRST METHOD:

create table preferred_location (
employee_id int references employee (id),
location_id int references location (id));

create table preferred_shift (
employee_id int references employee (id),
shift int references shift (id));

create table preferred_station (
employee_id int references employee (id),
station_id int references station (id));

SECOND METHOD:

create table preferences (

employee_id int references employee (id),
other_table_name text, /
other_table_id int));

In the second method, I'd store tuples like this in the preferences
table:

(<Alice's ID>, 'location', <west-side location ID>),
(<Alice's ID>, 'shift', <morning shift ID>)
(<Alice's ID>, 'station', <cash register station ID>)

The nice thing about the second approach is I can extend this to store
all sorts of preferences as I dream them up. But on the downside, I
don't have any FK constraints.

I suspect this is a pretty common dilemma. Any commentary from the
experts on this list is welcome.

Thanks in advance!

Matt

#2Scott Bailey
artacus@comcast.net
In reply to: Matthew Wilson (#1)
Re: Need schema design advice

I need to track employees and their preferred locations, shifts, and
stations.

For example, I need to track that Alice prefers to work the morning
shift at the west-side location, and she likes to work the cash-register
station.

Also, I need to track that Bob likes the west-side and north-side
locations, likes the night shift, and likes the dishwasher station. Note
the one-to-many relationship between Bob and his preferred locations. I
need to support that possibility.

So, I see two ways to make my tables, and I'd like some advice.

FIRST METHOD:

create table preferred_location (
employee_id int references employee (id),
location_id int references location (id));

create table preferred_shift (
employee_id int references employee (id),
shift int references shift (id));

create table preferred_station (
employee_id int references employee (id),
station_id int references station (id));

SECOND METHOD:

create table preferences (

employee_id int references employee (id),
other_table_name text, /
other_table_id int));

In the second method, I'd store tuples like this in the preferences
table:

(<Alice's ID>, 'location', <west-side location ID>),
(<Alice's ID>, 'shift', <morning shift ID>)
(<Alice's ID>, 'station', <cash register station ID>)

The nice thing about the second approach is I can extend this to store
all sorts of preferences as I dream them up. But on the downside, I
don't have any FK constraints.

I suspect this is a pretty common dilemma. Any commentary from the
experts on this list is welcome.

I tend to favor the second approach because it is more extensible. I
might add an additional field to the preferences table. Something like
preference_order so that you can record someone's primary pick from a
secondary one.

Artacus

#3Jeff Soules
soules@gmail.com
In reply to: Matthew Wilson (#1)
Re: Need schema design advice

On Sat, Oct 11, 2008 at 1:10 PM, Matthew Wilson <matt@tplus1.com> wrote:

I need to track employees and their preferred locations, shifts, and
stations.

For example, I need to track that Alice prefers to work the morning
shift at the west-side location, and she likes to work the cash-register
station.

Also, I need to track that Bob likes the west-side and north-side
locations, likes the night shift, and likes the dishwasher station. Note
the one-to-many relationship between Bob and his preferred locations. I
need to support that possibility.

So, I see two ways to make my tables, and I'd like some advice.

FIRST METHOD:

create table preferred_location (
employee_id int references employee (id),
location_id int references location (id));

create table preferred_shift (
employee_id int references employee (id),
shift int references shift (id));

create table preferred_station (
employee_id int references employee (id),
station_id int references station (id));

SECOND METHOD:

create table preferences (

employee_id int references employee (id),
other_table_name text, /
other_table_id int));

In the second method, I'd store tuples like this in the preferences
table:

(<Alice's ID>, 'location', <west-side location ID>),
(<Alice's ID>, 'shift', <morning shift ID>)
(<Alice's ID>, 'station', <cash register station ID>)

The nice thing about the second approach is I can extend this to store
all sorts of preferences as I dream them up. But on the downside, I
don't have any FK constraints.

I suspect this is a pretty common dilemma. Any commentary from the
experts on this list is welcome.

Thanks in advance!

Matt

I'm certainly not an expert, but hopefully my commentary will still be
somewhat helpful.

Your "method 2" is something called an Entity-Attribute-Value table
design[1]See e.g. http://en.wikipedia.org/wiki/Entity-Attribute-Value_model. There was a discussion on this list a couple weeks ago
about the merits and drawbacks of designing your tables this way.

Honestly, it probably depends on what your ultimate needs are.

As the "Downsides" section of the Wiki link [1]See e.g. http://en.wikipedia.org/wiki/Entity-Attribute-Value_model shows, most of the
problems with EAV really start to emerge when the tables get huge and
you're dealing with hundreds of thousands to millions of entities,
each with potentially hundreds of attribute-value pairs. If you're
intending to roll out your application for every Starbucks on your
continent, that might start to be a problem. (From my experience,
implementations like this over large data sets suffer a big
performance hit and carry a lot of data integrity baggage.) If you're
talking about something for use in your chain of three internet cafes
around one town, and you aren't going to have more than a dozen
Attributes per Entity, it probably doesn't matter, because the
complications will be more manageable without screwing something up.

That said, by going the EAV/"Method-2" route, you're gaining
flexibility, but at the cost of increased complication, and ultimately
repurposing a relational database to do something that isn't very
database-like, that's really more like a spreadsheet. (So why not
just use a spreadsheet?) You have little room for recording
additional information, like ordering preferences, or indicating that
(say) a station preference depends on a location preference, or that a
shift time depends on day of the week, etc -- so you're probably not
getting as much flexibility as you think. Sure, you could add an
"Extra_Data" column, so you have rows:
Marie-Location-West-1,
Marie-Location-East-2,
Marie-Shift-Evening-Tuesday,
Marie-Station-Register-West,
Marie-Shift-Morning-Sunday,
etc. But you can see the data integrity nightmare already, when you
somehow manage to record "Marie-Shift-Register-1". Not to mention
that you'll have to do format conversions for that "Extra_Data" field,
and incorporate logic somewhere else in your program that deciphers
whatever's in the generic data field to come up with ordering
preferences for locations, station preferences by shift times, or
whatever else you want to store.

Essentially, in my humble opinion, you're putting off the problem of
thinking about the nature and structure of your data, and most
importantly, what you're going to use that data for, when ultimately
those are the decisions that should be guiding how you design and use
the database. Particularly given that this sounds like a
management-efficiency project rather than one that your business (I'm
assuming it's your business) hinges upon, I am imagining that you have
time
to be sure about exactly what you want to do with the data. If you
expect that your business needs will change dramatically over the
lifetime of the product, or you don't have time to make these
decisions now, then maybe the flexibility outweighs the drawbacks.

Good luck!

[1]: See e.g. http://en.wikipedia.org/wiki/Entity-Attribute-Value_model

#4Matthew Wilson
matt@tplus1.com
In reply to: Matthew Wilson (#1)
Re: Need schema design advice

Jeff, this is *exactly* the kind of feedback I was hoping to get.
Thanks so much for the link and the explanation.

Matt

#5Martin Gainty
mgainty@hotmail.com
In reply to: Matthew Wilson (#4)
Re: Need schema design advice

knee deep in a schema design myself ..curious as to which advice did jeff offer for schema design?

thanks
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.

To: pgsql-general@postgresql.org
From: matt@tplus1.com
Subject: Re: [GENERAL] Need schema design advice
Date: Sun, 12 Oct 2008 13:55:42 +0000

Jeff, this is *exactly* the kind of feedback I was hoping to get.
Thanks so much for the link and the explanation.

Matt

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

_________________________________________________________________
See how Windows connects the people, information, and fun that are part of your life.
http://clk.atdmt.com/MRT/go/msnnkwxp1020093175mrt/direct/01/

#6Sam Mason
sam@samason.me.uk
In reply to: Matthew Wilson (#1)
Re: Need schema design advice

On Sat, Oct 11, 2008 at 05:10:26PM +0000, Matthew Wilson wrote:

I need to track employees and their preferred locations, shifts, and
stations.

As always there's a trade off between "general" EAV style designs and
more specific ones (as noted by Jeff). One, more EAV style, design that
sprung to mind is:

CREATE TABLE preftypes (
preftype TEXT PRIMARY KEY
);

CREATE TABLE prefopts (
prefopt TEXT PRIMARY KEY,
preftype TEXT REFERENCES preftype
);

CREATE TABLE emps (
empid TEXT PRIMARY KEY
);

CREATE TABLE empprefs (
empid TEXT REFERENCES emps,
prefopt TEXT REFERENCES prefopts,
PRIMARY KEY (empid, prefopt)
);

INSERT INTO preftypes (prefname) VALUES
('location'), ('shift'), ('station');

INSERT INTO emps (empid) VALUES
('alice'), ('bob');

INSERT INTO prefopts (preftype, prefopt) VALUES
('location', 'west-side'),
('location', 'north-side'),
('shift', 'morning'),
('shift', 'night'),
('station', 'cash-register'),
('station', 'dishwasher');

INSERT INTO empprefs (empid, prefopt) VALUES
('alice', 'west-side'),
('alice', 'morning'),
('alice', 'cash-register'),
('bob', 'west-side'),
('bob', 'north-side'),
('bob', 'night'),
('bob', 'dishwasher');

you may want to move the "preftype" into the primary key of the
"prefopts" table; that would force you to reference it in the "empprefs"
table making queries asking for employee's preferences to specific
preftypes easier.

create table preferences (

employee_id int references employee (id),
other_table_name text, /
other_table_id int));

"other_table_name" sounds like bad style; no real way to enforce
integrity constraints (rules/triggers maybe, but it would be a bit of a
fiddle and prone to subtle bugs).

The scheme I gave should allow you to add new preference types, but it
makes it difficult to add details about the types' options. As always,
it's a trade off between what you're optimizing for. If you're adding
more preference types then go for a EAV style design, if you're going to
want to add more details about the preferences (this does seem to be the
common case, which is why most people here stay away from EAV designs).

This is easier to get started with, so if it's just going to be a quick
tech demo then this may be good. Be warned though that code from demos
tends to live much longer than you'd ever possibly expect so doing the
more long winded thing first may be easier--even a couple of months down
the line.

Sam

#7Sam Mason
sam@samason.me.uk
In reply to: Sam Mason (#6)
Re: Need schema design advice

On Sun, Oct 12, 2008 at 08:12:40PM +0100, I wrote:

As always, it's a trade off between what you're optimizing for. If
you're adding more preference types then go for a EAV style design, if
you're going to want to add more details about the preferences (this
does seem to be the common case, which is why most people here stay
away from EAV designs)[...]

oops, got distracted and forgot to finish this sentence!

...then you're better off with a non-EAV style design.

Sam

#8Martin Gainty
mgainty@hotmail.com
In reply to: Sam Mason (#7)
Re: Need schema design advice

could you provide a brief explanation of EAV ?

thx
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.

Date: Sun, 12 Oct 2008 20:22:14 +0100
From: sam@samason.me.uk
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Need schema design advice

On Sun, Oct 12, 2008 at 08:12:40PM +0100, I wrote:

As always, it's a trade off between what you're optimizing for. If
you're adding more preference types then go for a EAV style design, if
you're going to want to add more details about the preferences (this
does seem to be the common case, which is why most people here stay
away from EAV designs)[...]

oops, got distracted and forgot to finish this sentence!

...then you're better off with a non-EAV style design.

Sam

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

_________________________________________________________________
Stay up to date on your PC, the Web, and your mobile phone with Windows Live.
http://clk.atdmt.com/MRT/go/msnnkwxp1020093185mrt/direct/01/

#9Ben
bench@silentmedia.com
In reply to: Martin Gainty (#8)
Re: Need schema design advice

On Oct 12, 2008, at 5:51 PM, Martin Gainty wrote:

could you provide a brief explanation of EAV ?

Instead of:

create table vehicles
(
kind text primary key,
wheels int
);
insert into vehicles (kind, wheels) values ('car',4);
insert into vehicles (kind, wheels) values ('bike',2);

create table boats
(
kind text primary key,
displacement int
);
insert into boats (kind,displacement) values ('small boat',1000);
insert into boats (kind,displacement) values ('big boat',300000);

... in an EAV model you would do something like:

create table eav
(
kind text primary key,
attr text,
value text
);
insert into eav (kind, attr, value) values ('car','wheels','4');
insert into eav (kind, attr, value) values ('bike','wheels','2');
insert into eav (kind, attr, value) values ('small
boat','displacement','1000');
insert into eav (kind, attr, value) values ('big
boat','displacement','300000');

Show quoted text
#10Rodrigo De León
rdeleonp@gmail.com
In reply to: Ben (#9)
Re: Need schema design advice

On Sun, Oct 12, 2008 at 8:10 PM, Ben Chobot <bench@silentmedia.com> wrote:

On Oct 12, 2008, at 5:51 PM, Martin Gainty wrote:

could you provide a brief explanation of EAV ?

(...) in an EAV model you would do something like:
create table eav
(
kind text primary key,
attr text,
value text
);
insert into eav (kind, attr, value) values ('car','wheels','4');
insert into eav (kind, attr, value) values ('bike','wheels','2');
insert into eav (kind, attr, value) values ('small
boat','displacement','1000');
insert into eav (kind, attr, value) values ('big
boat','displacement','300000');

Truly Hideous (TM).

Martin, I recommend you read the following:
http://joecelkothesqlapprentice.blogspot.com/2006/04/using-one-table-vs-many.html

#11Lew
noone@lwsc.ehost-services.com
In reply to: Martin Gainty (#8)
Re: Need schema design advice

Martin Gainty wrote:

could you provide a brief explanation of EAV ?

Please avoid HTML and eschew top-posting. The post from Jeff Soules in this
thread included the advice:

See e.g. http://en.wikipedia.org/wiki/Entity-Attribute-Value_model

which points to an explanation.

--
Lew