Database design pattern - product feature

Started by Adrian Stern · almost 11 years ago · 12 messages · general
#1 Adrian Stern
adrian.stern@unchained.ch

Hi, I'm new

I've been working as the sole administrator of various PostgreSQL projects
for a while now, all of which were Django projects.
Since a new project is starting and we've found the need for a more generic
approach, I would like to ask a few questions.

I would like to implement a pattern similar to the product feature pattern
explained in the Silverston book, The Data Model Resource Book Vol. 1. It
is simply explained: there is a table PRODUCT holding the fields all the
products share, then there is the table PRODUCT_FEATURE, both of them in a
many-to-many relationship.

PRODUCT <--- m -------- n ---> PRODUCT_FEATURE (a table in between of
course)

PRODUCT_FEATURE --> PF
PRODUCT --> P
TABLE IN BETWEEN --> TIB

PF defines the feature type while P stands for the product the feature is
applied to. Some of these PF can have values of different types (text,
numbers, floating point, blob, ...) which would be applied to TIB.

I don't like the idea of having numerous empty fields prepared in TIB just
to store occasional values of different types, therefore I need to
specialize those TIB values.

Now how would I do that?

I could create some tables solely to hold [NUM], [TEXT], [BLOB], [ETC]
values and reference them with the TIB PK. When using them I could create a
view TIBV containing all of [NUM, TEXT, BLOB, ETC] in the same column
called Value, and join it with TIB to get the value of a PF.
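A minimal sketch of that layout (PostgreSQL-flavoured; all names are invented, not from the book, and PRODUCT and PRODUCT_FEATURE are assumed to have integer `id` primary keys):

```sql
-- "TIB": which feature applies to which product.
CREATE TABLE product_feature_applicability (
    id         serial PRIMARY KEY,
    product_id integer NOT NULL REFERENCES product (id),
    feature_id integer NOT NULL REFERENCES product_feature (id),
    UNIQUE (product_id, feature_id)
);

-- One narrow table per value type, keyed by the TIB PK.
CREATE TABLE tib_num (
    tib_id integer PRIMARY KEY REFERENCES product_feature_applicability (id),
    value  numeric NOT NULL
);
CREATE TABLE tib_text (
    tib_id integer PRIMARY KEY REFERENCES product_feature_applicability (id),
    value  text NOT NULL
);

-- "TIBV": every value in one column, cast to text as the common type.
CREATE VIEW tibv AS
    SELECT tib_id, value::text AS value FROM tib_num
    UNION ALL
    SELECT tib_id, value FROM tib_text;
```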

But is this a good idea?
Is there a better way?

Also, I would have to create a pivot table in order to list all the
products with all their features. As this is implemented in C (afaik) I
suppose it is rather fast, or at least fast enough, but I do not actually
know. What I do know is: there are about 30 product types and around 50
possible product features. One product can have up to approximately 25 PF,
but most have around 5 to 10.
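A sketch of such a pivot using plain conditional aggregation instead of a dedicated pivot table (PostgreSQL 9.4+ for `FILTER`; the feature names are invented, and the per-type value tables are assumed to be unified in a view called `tibv`):

```sql
-- One row per product, one column per hand-picked feature.
SELECT p.name,
       max(v.value) FILTER (WHERE f.name = 'color')  AS color,
       max(v.value) FILTER (WHERE f.name = 'weight') AS weight
FROM product p
JOIN product_feature_applicability tib ON tib.product_id = p.id
JOIN product_feature f ON f.id = tib.feature_id
JOIN tibv v ON v.tib_id = tib.id
GROUP BY p.name;
```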

Do you think a pivot table is a good idea?
What alternatives do I have?

There is room for caching since the dataset is not updated too often.

regards, adrian

#2 Dorian Hoxha
dorian.hoxha@gmail.com
In reply to: Adrian Stern (#1)
Re: Database design pattern - product feature

What about keeping all the dynamic columns of each product in a json(b)
column? Maybe you can add constraints that check the product_type and
json->field->type?
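Something like this, maybe (a sketch with illustrative names; jsonb needs PostgreSQL 9.4+):

```sql
CREATE TABLE product (
    id       serial PRIMARY KEY,
    name     text  NOT NULL,
    type     text  NOT NULL,
    features jsonb NOT NULL DEFAULT '{}',
    -- Example constraint: a 'cable' must carry a numeric "length" key.
    CONSTRAINT cable_needs_length CHECK (
        type <> 'cable'
        OR (features ? 'length'
            AND jsonb_typeof(features -> 'length') = 'number')
    )
);
```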


#3 Dorian Hoxha
dorian.hoxha@gmail.com
In reply to: Adrian Stern (#1)
Re: Database design pattern - product feature

Please do reply-all so you also reply to the list.

It's not ideal to develop with SQLite and deploy on PostgreSQL. You should
keep your 'dev' as close to 'prod' as possible.

Product_feature is another table in this case?

On Tue, Jun 2, 2015 at 11:44 AM, Adrian Stern <adrian.stern@unchained.ch>
wrote:


Database changeability is not a requirement. It just comes with Django and
makes development so much easier, since I can develop on SQLite and deploy
wherever I want. Django's ORM is not "great", I agree, but it certainly
does not suck, and there are alternatives like SQLAlchemy which are far
more powerful.

But yeah, I get what you're trying to tell me, and I agree: this PostgreSQL
jsonb feature looks really nice, much easier to use than the whole XQuery
stuff. The complete EAV pattern described on Wikipedia is actually pretty
complex and does not seem easy at all to implement in Django. Therefore,
maybe I should run a few tests with the json plugins.

So let me sketch another approach.

PRODUCT -> P
- name
- type
- features (jsonb)

PRODUCT_FEATURE -> PF
- name
- description
- datatype
- validation

P now has the features field of type jsonb, which allows keys specified in
PF together with a value of the given "datatype", or simply a valid one. PF
holds the key name, its datatype for generating the GUI, and some
validation pattern for input sanitizing. There is no relation between the
tables.

Getting the description is not an issue. I could even create a view
mapping the jsonb keys to rows.
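Such a view could be a simple expansion of the jsonb column; a sketch, assuming the PRODUCT table above with an integer `id` primary key:

```sql
-- One row per product and jsonb key/value pair.
CREATE VIEW product_feature_rows AS
SELECT p.id AS product_id, kv.key, kv.value
FROM product p,
     jsonb_each_text(p.features) AS kv (key, value);
```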

Yes I like your approach. Is there anything I should be aware of? Some
do's and don'ts or known pitfalls?

#4 Adrian Stern
adrian.stern@unchained.ch
In reply to: Dorian Hoxha (#3)
Re: Database design pattern - product feature

Sorry. Will do in the future.

Product_feature is a table describing the valid keys for product features.
With this it is possible to limit keys to specific groups of products.

Freundliche Grüsse

Adrian Stern
unchained - web solutions

adrian.stern@unchained.ch
+41 79 292 83 47


#5 Dorian Hoxha
dorian.hoxha@gmail.com
In reply to: Adrian Stern (#4)
Re: Database design pattern - product feature

So product_feature is only one row for each product_type, right? Looks good.


#6 William Dunn
dunnwjr@gmail.com
In reply to: Adrian Stern (#1)
Re: Database design pattern - product feature

Hello Adrian,

May I ask why you need a non-standard model? By standard models I mean the
following:

1) When you don't need subclass-specific database constraints: keep all
subclasses in the same table, and subclasses that do not have an attribute
have that column null. This has the best performance because no joins are
required when querying both superclass and subclass attributes, and all the
data for an object will be in the same block on disk. The disadvantage is
that you cannot enforce constraints, such as not-null, on subclass-specific
attribute columns because the constraints would also be applied to the
superclasses. If you can ensure that your application is the only way data
can be inserted, those constraints will naturally be enforced there.

2) When you need subclass-specific database constraints: use an ORM such as
Django's ORM or SQLAlchemy, which has one table with the superclass
attributes and a table for each subclass with its subclass-specific
attributes. This is slower because joins will be needed and the data for an
object will be in two different blocks on disk, but it allows you to
enforce constraints within the database, which will be checked whenever any
application tries to insert values. There is a lot of complexity added
because there will be so many small tables and indexes, but the ORM takes
care of that for you.
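In plain SQL, option 2 boils down to something like this (tables invented for illustration):

```sql
-- Superclass table: attributes shared by every product.
CREATE TABLE product (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

-- One table per subclass; the PK doubles as the FK to the superclass,
-- and subclass-specific constraints live here.
CREATE TABLE cable_product (
    id        integer PRIMARY KEY REFERENCES product (id),
    length_mm integer NOT NULL CHECK (length_mm > 0)
);
```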

*Will J. Dunn*
*willjdunn.com <http://willjdunn.com>*


#7 Adrian Stern
adrian.stern@unchained.ch
In reply to: William Dunn (#6)
Re: Database design pattern - product feature

Hi William, thanks for joining the conversation.

1) We do hope for constraints since a connection to an ERP system is
possible in the future. We want to plan ahead.

2) As for the subclass approach: I would need about 30 subclasses and it
will get really hard to add new products since a change in the database
will be necessary each time. That's why we want a more generic approach.

Maybe I don't understand you right, because of the language barrier. Can
you provide me a link to a subclassing example? Do you mean
https://docs.djangoproject.com/en/1.8/topics/db/models/#multi-table-inheritance ?

ORM is a given in my case. This is not a high performance application.

Freundliche Grüsse

Adrian Stern
unchained - web solutions

adrian.stern@unchained.ch
+41 79 292 83 47


#8 Rob Sargent
robjsargent@gmail.com
In reply to: Adrian Stern (#7)
Re: Database design pattern - product feature



I don't understand 'ORM is a given' in conjunction with a reluctance toward db evolution. These seem oxymoronic.

#9 Roxanne Reid-Bennett
rox@tara-lu.com
In reply to: Adrian Stern (#7)
Re: Database design pattern - product feature


At the root of your description it appears to me that you are choosing
essentially an EAV design pattern. Constraints and case-specific
validation become difficult (or impossible) to consistently enforce. I
have personal experience in cleaning up a system that used this *exact*
pattern (product -> product attribute). Different developers approached
updating information in different ways. They also chose to store the
attributes as text to avoid the text-vs-number issue. However, they
failed to enforce any validation or make any attempt to prevent
duplicates. That destroyed hope of supporting an underlying business
need to search for specific values during data analysis. (Think of how
many different ways you can misspell "Poughkeepsie" - but hey... it's
"just a name".)

I inherited the results of poor development controls and poor control
over the weakness of the design - e.g. validation... and the cleanup has
been long, and painful.

I think you should evaluate your unease with having to update the
database on release (potentially many times) carefully for what it is
and why you have it. [I'm not saying it is invalid - just know why you
have it] Because no matter how well you design your system - databases
evolve. Manage that.

Anybody can muck up part of a project and cause garbage - but speaking
from experience... this design pattern really encourages it. If you
choose to use it - then you HAVE to control where and when
inserts/updates are done and be very careful with specifying how
validation is to be done to the entire development team (and potentially
the users)... and then review the code (and/or data) regularly.

Wide tables/sparse fill on the columns... I haven't tried to calculate
the overhead on this, but disk space is generally considered to be
cheap. [That doesn't mean your case wouldn't be criminally wasteful.]
Choosing one wide table or 30 sub-tables to deal with detail data: I don't
know how that directly affects Postgres' performance. If you choose
30 tables... meh... at least you'll know exactly where your data is,
and exactly what is allowed for each and every variant. Remember to
enforce a consistent, guessable naming convention.

All that said - there are reasons to use this pattern to gain the power
of a dynamically expandable configuration of allowed values. I just find
the use cases where the gain (flexibility) is worth the pain (validation
control) to be few and far between.

just my $0.01

Roxanne


--
[At other schools] I think the most common fault in general is to teach students how to pass exams instead of teaching them the science.
Donald Knuth

#10 Jan de Visser
jan@de-visser.net
In reply to: Roxanne Reid-Bennett (#9)
Re: Database design pattern - product feature

On June 3, 2015 02:04:28 PM Roxanne Reid-Bennett wrote:

I think you should evaluate your unease with having to update the database
on release (potentially many times) carefully for what it is and why you
have it. [I'm not saying it is invalid - just know why you have it]
Because no matter how well you design your system - databases evolve.
Manage that.

Having been guilty of designing an EAV system before, I think I know his
hesitation: some databases (*cough* Oracle *cough*) don't allow DDL in
transactions (or at least back when I worked with Oracle it didn't), making
dynamic table creation a bit of a problem, especially when you run in a
container which gives you little flexibility in your tx handling (i.e. you
get it all the time, or never). This used to be a problem in many
iterations of J2EE containers. Also, lots of DBAs get skittish when they
hear about applications doing DDL. And again, many of the technological
roadblocks are fixed by now, but the hesitation remains.

Combine this with the fact that you want users to be able to create new
products, which should be built up out of existing and/or newly defined
attributes, and you quickly end up with something EAV-like. Because you
don't want your product management people coming crying to your DBAs to
have a new table defined for a new product. You want the product management
people to point-and-click their way through a nice GUI.

So this is what led me to that EAV design for this exact problem: we
started off with Oracle as the database and a temperamental J2EE container,
found out we couldn't do DDL (or at least DDL was hard), and by the time we
were on less brain-dead containers and databases the "damage" was done and
there was no going back.

But in my defense I will say that mine was one of the prettiest EAV systems
ever built. In my opinion at least :-)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11 Adrian Stern
adrian.stern@unchained.ch
In reply to: Jan de Visser (#10)
Re: Database design pattern - product feature

(Sorry Jan for receiving this twice)

@ Roxanne & Jan:
Thank you both for your Input. I absolutely see your point against EAV.

I will work out two variants: one based on the specific solution, where I
would create a (bigger) table for each group of similar products, and one
based on the dynamic JSON solution.
With this we can have a good discussion about the pros and cons in our next
meeting. In the end, the customer has to decide how important the
flexibility is for him. (And yes, the customer has IT people who will
understand why and what we talk about.)

So thanks for doubting my approach, it has been helpful!

Freundliche Grüsse

Adrian Stern
unchained - web solutions

adrian.stern@unchained.ch
+41 79 292 83 47


#12William Dunn
dunnwjr@gmail.com
In reply to: Adrian Stern (#7)
Re: Database design pattern - product feature

Hello Adrian,

Sorry for the late response. I've not used Django, so I can't provide
specific guidance on how you would implement it there, but in SQLAlchemy
that model is called "joined table inheritance":
http://docs.sqlalchemy.org/en/latest/orm/inheritance.html#joined-table-inheritance

That is also the behavior of Hibernate (Java) when using "table per
subclass" mapping.
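Independent of the ORM, the table layout that joined-table inheritance /
table-per-subclass mapping produces can be sketched in plain SQL. Here is a
minimal runnable sketch using Python's sqlite3; the subclass and column names
(cable_product, length_m) are made up for illustration, not taken from the
thread:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL            -- attributes shared by all products
);
CREATE TABLE cable_product (      -- one table per subclass
    id       INTEGER PRIMARY KEY REFERENCES product(id),
    length_m REAL NOT NULL        -- subclass-specific NOT NULL is enforceable
);
""")
con.execute("INSERT INTO product (id, name) VALUES (1, 'patch cable')")
con.execute("INSERT INTO cable_product (id, length_m) VALUES (1, 2.5)")

# Reading a subclass instance needs a join -- this is the join the ORM
# issues transparently for joined-table inheritance:
row = con.execute("""
    SELECT p.name, c.length_m
    FROM product p JOIN cable_product c ON c.id = p.id
    WHERE p.id = 1
""").fetchone()
print(row)  # -> ('patch cable', 2.5)
```

The ORM hides both tables behind a single class; the layout above is what
actually ends up in the database.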

Will J. Dunn
willjdunn.com

On Wed, Jun 3, 2015 at 2:50 AM, Adrian Stern <adrian.stern@unchained.ch>
wrote:

Hi William, thanks for joining the conversation.

1) We do want constraints, since a connection to an ERP system is
possible in the future. We want to plan ahead.

2) As for the subclass approach: I would need about 30 subclasses, and
adding new products would get really hard since a database change would
be necessary each time. That's why we want a more generic approach.

Maybe I'm not understanding you correctly because of the language
barrier. Can you provide me with a link to a subclassing example? Do you
mean
https://docs.djangoproject.com/en/1.8/topics/db/models/#multi-table-inheritance
?

ORM is a given in my case. This is not a high-performance application.

Kind regards

Adrian Stern
unchained - web solutions

adrian.stern@unchained.ch
+41 79 292 83 47

On Tue, Jun 2, 2015 at 5:35 PM, William Dunn <dunnwjr@gmail.com> wrote:

Hello Adrian,

May I ask why you need a non-standard model? By standard models I mean
the following:

1) When you don't need subclass-specific database constraints: put all
subclasses in the same table, and leave the columns a subclass does not
use null. This has the best performance because no joins are required
when querying both superclass and subclass attributes, and all the data
for an object will be in the same block on disk. The disadvantage is
that you cannot enforce constraints, such as NOT NULL, on
subclass-specific columns, because the constraints would also be applied
to rows of every other subclass. If you can ensure that your application
is the only way data can be inserted, those constraints can naturally be
enforced there.
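A minimal runnable sketch of this single-table variant using Python's sqlite3
(the column names are made up for illustration); note that the
subclass-specific columns have to remain nullable:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    ptype    TEXT NOT NULL,  -- discriminator: which subclass this row is
    length_m REAL,           -- cable-only attribute: must stay nullable
    voltage  REAL            -- adapter-only attribute: must stay nullable
);
""")
con.execute("INSERT INTO product VALUES (1, 'patch cable', 'cable', 2.5, NULL)")

# No join needed: shared and subclass attributes come from one row.
row = con.execute(
    "SELECT name, length_m FROM product WHERE id = 1"
).fetchone()
print(row)  # -> ('patch cable', 2.5)
```

The price is visible in the schema itself: nothing stops a 'cable' row from
having a NULL length_m, so that check has to live in the application.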

2) When you need subclass-specific database constraints: use an ORM such
as Django's ORM or SQLAlchemy, which maps one table with the superclass
attributes plus a table for each subclass with its subclass-specific
attributes. This is slower because joins will be needed and the data for
an object will be in two different blocks on disk, but it allows you to
enforce constraints within the database, which will be checked whenever
any application tries to insert values. There is a lot of complexity
added because there will be so many small tables and indexes, but the
ORM takes care of that for you.

Will J. Dunn
willjdunn.com

On Mon, Jun 1, 2015 at 10:35 AM, Adrian Stern <adrian.stern@unchained.ch>
wrote:

Hi, I'm new

I've been working as the sole administrator of various PostgreSQL
projects for a while now, all of which were Django projects. Since a new
project is starting and we've found the need for a more generic
approach, I would like to ask a few questions.

I would like to implement a pattern similar to the product feature
pattern explained in the Silverston book (The Data Model Resource Book,
Vol. 1). It is simple to explain: there is a table PRODUCT holding the
fields all the products share, and a table PRODUCT_FEATURE, the two of
them in a "many to many" relationship.

PRODUCT <--- m -------- n ---> PRODUCT_FEATURE (a table in between of
course)

PRODUCT_FEATURE --> PF
PRODUCT --> P
TABLE IN BETWEEN --> TIB

PF defines the feature type while P stands for the product the feature
is applied to. Some of these PF can have values of different types
(text, number, floating point, blob, ...) which would be stored in TIB.

I don't like the idea of having numerous empty fields prepared in TIB
just to store occasional values of different types, therefore I need to
specialize those TIB values.

Now how would I do that?

I could create some tables solely for holding [NUM], [TEXT], [BLOB],
[ETC] and reference them with the TIB PK. When using them I could create
a view TIBV presenting all of [NUM, TEXT, BLOB, ETC] in the same column
called Value, and join it with TIB to get the value of a PF.

But is this a good idea?
Is there a better way?

Also, I would have to create a pivot table in order to list all the
products with all their features. As this is implemented in C (afaik) I
suppose it is rather fast, or at least fast enough, but I do not
actually know. What I do know is that there are about 30 product types
and around 50 possible product features. One product can have up to
approximately 25 PF, but mostly around 5 to 10.

Do you think a pivot table is a good idea?
What alternatives do I have?

There is room for caching since the dataset is not updated too often.

regards, adrian
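The typed value tables plus the unifying TIBV view described in the original
post might look roughly like the following sketch (Python's sqlite3; the table
names and the cast-to-text unification are illustrative assumptions, not a
worked-out schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product         (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product_feature (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tib (                         -- the "table in between"
    id         INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES product(id),
    feature_id INTEGER REFERENCES product_feature(id)
);
-- One narrow table per value type instead of many mostly-empty columns:
CREATE TABLE tib_num  (tib_id INTEGER PRIMARY KEY REFERENCES tib(id), value REAL);
CREATE TABLE tib_text (tib_id INTEGER PRIMARY KEY REFERENCES tib(id), value TEXT);
-- TIBV: every typed value presented in one column called "value":
CREATE VIEW tibv AS
    SELECT tib_id, CAST(value AS TEXT) AS value FROM tib_num
    UNION ALL
    SELECT tib_id, value FROM tib_text;
""")
con.execute("INSERT INTO product VALUES (1, 'patch cable')")
con.execute("INSERT INTO product_feature VALUES (1, 'length_m')")
con.execute("INSERT INTO tib VALUES (1, 1, 1)")
con.execute("INSERT INTO tib_num VALUES (1, 2.5)")

# Join TIB with the view to get a product's feature values:
row = con.execute("""
    SELECT p.name, f.name, v.value
    FROM tib t
    JOIN product         p ON p.id = t.product_id
    JOIN product_feature f ON f.id = t.feature_id
    JOIN tibv            v ON v.tib_id = t.id
""").fetchone()
print(row)
```

One caveat of this design: pushing every value through a single column loses
type checking in the database, which is exactly the EAV trade-off the rest of
the thread discusses.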