Design question

Started by Mike Diehlover 17 years ago4 messagesgeneral
Jump to latest
#1Mike Diehl
mdiehl@diehlnet.com

Hi all,

I've got a design question that I need to ask before I go too far down what
might be the wrong road.

I've got a customer, who has multiple customers, who need to be able to upload
an excel spreadsheet into Postgres. Then they want to be able to slice and
dice that data.

The problem is that probably none of these spreadsheets will have the same
fields in them.

There are two ways to do this, that I can think of...

1. Create a table for each spreadsheet, using column headings as field names.
Every field would be a char/varchar. We might have a table to track which
client owns which table. This could amount to 10's of tables being added to
the db.

2. Create a table in which we store individual cells and associate them with
an owner. Then each client would essentially have one (huge?) table that
they can work with.

Design #1 is easy to implement, but might make management more difficult.
Design #2 is easy to manage, but the SQL needed to generate reports would
be "tricky." I'm intending to provide a report generator, so the complexity
of the reporting SQL can be mitigated.

So, which road should I travel down?

TIA,

--
Mike Diehl

#2James Strater
straterj@yahoo.com
In reply to: Mike Diehl (#1)
Re: Design question

Have you considered one large table with all of the columns from the various spreadsheets, then a separate view for each customer?

----- Original Message ----
From: Mike Diehl <mdiehl@diehlnet.com>
To: pgsql-general@postgresql.org
Sent: Wednesday, September 17, 2008 12:29:15 PM
Subject: [GENERAL] Design question

Hi all,

I've got a design question that I need to ask before I go too far down what
might be the wrong road.

I've got a customer, who has multiple customers, who need to be able to upload
an excel spreadsheet into Postgres. Then they want to be able to slice and
dice that data.

The problem is that probably none of these spreadsheets will have the same
fields in them.

There are two ways to do this, that I can think of...

1. Create a table for each spreadsheet, using column headings as field names.
Every field would be a char/varchar. We might have a table to track which
client owns which table. This could amount to 10's of tables being added to
the db.

2. Create a table in which we store individual cells and associate them with
an owner. Then each client would essentially have one (huge?) table that
they can work with.

Design #1 is easy to implement, but might make management more difficult.
Design #2 is easy to manage, but the SQL needed to generate reports would
be "tricky." I'm intending to provide a report generator, so the complexity
of the reporting SQL can be mitigated.

So, which road should I travel down?

TIA,

--
Mike Diehl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Scott Marlowe
scott.marlowe@gmail.com
In reply to: Mike Diehl (#1)
Re: Design question

On Wed, Sep 17, 2008 at 11:29 AM, Mike Diehl <mdiehl@diehlnet.com> wrote:

Hi all,

I've got a design question that I need to ask before I go too far down what
might be the wrong road.

I've got a customer, who has multiple customers, who need to be able to upload
an excel spreadsheet into Postgres. Then they want to be able to slice and
dice that data.

The problem is that probably none of these spreadsheets will have the same
fields in them.

There are two ways to do this, that I can think of...

1. Create a table for each spreadsheet, using column headings as field names.
Every field would be a char/varchar. We might have a table to track which
client owns which table. This could amount to 10's of tables being added to
the db.

If you choose this method, you might want to split out customers by
schema, to make it easier to manage their tables. Then, all you have
to do is either prefix the customer name in front of the table
reference or set it in your search_path after connecting.

2. Create a table in which we store individual cells and associate them with
an owner. Then each client would essentially have one (huge?) table that
they can work with.

This is basically going to be an EAV (entity, attribute, value) type
setup. They are notoriously hard to write useful queries against, and
generally a poor performer.

I'd go with option 1 myself.

#4Richard Huxton
dev@archonet.com
In reply to: Mike Diehl (#1)
Re: Design question

Mike Diehl wrote:

1. Create a table for each spreadsheet, using column headings as field names.
Every field would be a char/varchar. We might have a table to track which
client owns which table. This could amount to 10's of tables being added to
the db.

Give each client their own schema. Set permissions on the whole schema.
You should be fine with a few dozen or a few hundred tables. You can
also dump/restore a whole schema then.

--
Richard Huxton
Archonet Ltd