why use SCHEMA? any real-world examples?

Started by Miles Keatonover 21 years ago4 messagesgeneral
Jump to latest
#1Miles Keaton
mileskeaton@gmail.com

I just noticed PostgreSQL's schemas for my first time.
(http://www.postgresql.org/docs/current/static/ddl-schemas.html)

I Googled around, but couldn't find any articles describing WHY or
WHEN to use schemas in database design.

Since the manual says HOW, could anyone here who has used schemas take
a minute to describe to a newbie like me why you did? What benefits
did they offer you? Any drawbacks?

Thanks for your time.

- Miles

#2Richard Huxton
dev@archonet.com
In reply to: Miles Keaton (#1)
Re: why use SCHEMA? any real-world examples?

Miles Keaton wrote:

Since the manual says HOW, could anyone here who has used schemas take
a minute to describe to a newbie like me why you did? What benefits
did they offer you? Any drawbacks?

Well - it's a namespace feature, so at its simplest it lets you have two
objects with the same name. It also lets you have permission control
over them, and provides a convenient way to group items together. For
example, I usually have a "util" schema where I keep utility
functions/views for dba use rather than general users.

For a longer example, you might have a database with two sets of users -
sales and accounts. They both need to print reports, but not the same
set of reports. So - you create a "reports" table with an access code of
S=sales, A=accounts, *=everyone.

You wrap this with a view "my_reports" where you supply your user-type
(S/A) and get a list of reports you can access. However, your
application needs to supply the user-type and if someone can inject the
right SQL into your connection, they can gain access to any report.

So - you create 3 schemas: core, sales, accounts. You put the "reports"
table into core and two views into "sales" and "accounts", both named
"my_reports" and rewritten appropriately. You deny access to "reports"
directly, and make sure your application sets its "search_path" to
contain the relevant sales/accounts schema. Then "SELECT * FROM
my_reports" will show only those reports your login allows.

HTH
--
Richard Huxton
Archonet Ltd

#3Gregory S. Williamson
gsw@globexplorer.com
In reply to: Richard Huxton (#2)
Re: why use SCHEMA? any real-world examples?

As other posters have indicated, there's a convenience factor and an advantage to compartmentalizing data. In our case we don't care so much about user rights (a very useful aspect in and of itself), but more for performance issues.

We have some applications that use a lot of detailed data about properties in different counties. We have a central table that stores their spatial attributes and some data about the properties themselves. The table has several million rows currently -- selections based on a bounding box are very fast, but if we try to get a list of all properties on all streets with names like "Elm%" in a given county, the select is painfully slow as the index (county / street in this simplified case) lacks specificity -- any given county yields say a half million rows as candidates by county, with hundreds of possible street entries, so sequential scans are used.

Hence, I broke out some of the property data that needed to be searched by county, with each county in its own schema,and each schema has the same tables (so the schema called "f10675" has a "name_search" table that has the same name as the "f01223" schema, but its own contents.

The search tables all refer to the original data by a unique identifier that is common between the schema/search tables and the main store. The search in these schema based tables is much faster because the specificity of the index is much greater, yielding only dozens or hundreds of candidates out of hundreds of thousands of rows.

The extra space taken by redundant data storage is more than compensated for by speed in retrieval.

HTH clarify possibilties,

Greg WIlliamson
DBA
GlobeXplorer LLC

-----Original Message-----
From: Miles Keaton [mailto:mileskeaton@gmail.com]
Sent: Wed 11/24/2004 9:12 PM
To: pgsql-general@postgresql.org
Cc:
Subject: [GENERAL] why use SCHEMA? any real-world examples?
I just noticed PostgreSQL's schemas for my first time.
(http://www.postgresql.org/docs/current/static/ddl-schemas.html)

I Googled around, but couldn't find any articles describing WHY or
WHEN to use schemas in database design.

Since the manual says HOW, could anyone here who has used schemas take
a minute to describe to a newbie like me why you did? What benefits
did they offer you? Any drawbacks?

Thanks for your time.

- Miles

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

#4Daniel Martini
dmartini@uni-hohenheim.de
In reply to: Gregory S. Williamson (#3)
Re: why use SCHEMA? any real-world examples?

Hi,

Citing Miles Keaton <mileskeaton@gmail.com>:

I just noticed PostgreSQL's schemas for my first time.
(http://www.postgresql.org/docs/current/static/ddl-schemas.html)

I Googled around, but couldn't find any articles describing WHY or
WHEN to use schemas in database design.

When your data are too similar to be split into two databases but
at the same time too different to fit into common tables, a schema
comes in handy to keep your db tidy.

Since the manual says HOW, could anyone here who has used schemas take
a minute to describe to a newbie like me why you did?

We had agricultural experiments running here at our institute. We had
both field experiments outside and pot experiments in the greenhouse.
The data collected was mostly the same for both sets of experiments
(plant nutrient content, growth parameters like shoot length...), but
the number of samples taken per measured parameter was quite different
(e.g. for the pot experiments, nutrient data was available from each
and every plant, whereas we took samples in the field only from a subset
of plants). So both sets of experiments did not fit into one clean
normalized, relational model. On the other hand, it was quite desirable
to have all the data in one db, to be able to run queries across both
datasets at the same time from one connection. Schemas provided a nice
way to keep this all clean and simple.

What benefits
did they offer you?

Clean, logical organization of one projects data in one db.

Any drawbacks?

If you have tables with the same name in several schemas, only
the ones, which are in the first schema in the search path are
shown on \dt from psql. Not a major problem, but keep this in mind
when designing the db.

Regards,
Daniel