New SQL Datatype RECURRINGCHAR

Started by Nonameover 24 years ago22 messages
#1Noname
dbennett@jade.bensoft.com

Idea for a new SQL Data Type:

RECURRINGCHAR

The idea with RECURRINGCHAR is treated exactly like a VARCHAR in it's
usage. However, it's designed for table columns that store a small set of
repeated values (<=256 values). This allows for a great deal of savings in
the storage of the data.

Example:

Query:
select count(*) from order
Returns:
100,000

Query:
select distinct status from order
Returns:
OPEN
REWORK
PLANNED
RELEASED
FINISHED
SHIPPED

It's apparent that there is a lot of duplicate space used in the storage
of this information. The idea is if order.status was stored as a
RECURRINGCHAR
then the only data stored for the row would be a reference to the value of
the column. The actual values would be stored in a separate lookup table.

Advantages:

- Storage space is optimized.

- a query like:

select distinct {RECURRINGCHAR} from {table}

can be radically optimized

- Eliminates use of joins and extended knowledge of data relationships
for adhoc users.

This datatype could be extended to allow for larger sets of repeated
values:

RECURRINGCHAR1 (8-bit) up to 256 unique column values
RECURRINGCHAR2 (16-bit) up to 65536 unique column values

Reasoning behind using 'long reference values':

It is often an advantage to actually store an entire word representing a
business meaning as the value of a column (as opposed to a reference
number or mnemonic abbreviation ). This helps to make the system
'self documenting' and adds value to users who are performing adhoc
queries on the database.

----
David Bennett
President - Bensoft
912 Baltimore, Suite 200
Kansas City, MO 64105

#2Rod Taylor
rbt@barchord.com
In reply to: Noname (#1)
Re: New SQL Datatype RECURRINGCHAR

This is rather like MySQL's enum. I still opt for the join, and if
you like make a view for those who don't want to know the data
structure.
--
Rod Taylor

Your eyes are weary from staring at the CRT. You feel sleepy. Notice
how restful it is to watch the cursor blink. Close your eyes. The
opinions stated above are yours. You cannot imagine why you ever felt
otherwise.

----- Original Message -----
From: <dbennett@jade.bensoft.com>
To: <pgsql-hackers@postgresql.org>
Sent: Friday, June 29, 2001 6:05 PM
Subject: [HACKERS] New SQL Datatype RECURRINGCHAR

Idea for a new SQL Data Type:

RECURRINGCHAR

The idea with RECURRINGCHAR is treated exactly like a VARCHAR in

it's

usage. However, it's designed for table columns that store a small

set of

repeated values (<=256 values). This allows for a great deal of

savings in

the storage of the data.

Example:

Query:
select count(*) from order
Returns:
100,000

Query:
select distinct status from order
Returns:
OPEN
REWORK
PLANNED
RELEASED
FINISHED
SHIPPED

It's apparent that there is a lot of duplicate space used in the

storage

of this information. The idea is if order.status was stored as a
RECURRINGCHAR
then the only data stored for the row would be a reference to the

value of

the column. The actual values would be stored in a separate lookup

table.

Advantages:

- Storage space is optimized.

- a query like:

select distinct {RECURRINGCHAR} from {table}

can be radically optimized

- Eliminates use of joins and extended knowledge of data

relationships

for adhoc users.

This datatype could be extended to allow for larger sets of repeated
values:

RECURRINGCHAR1 (8-bit) up to 256 unique column values
RECURRINGCHAR2 (16-bit) up to 65536 unique column values

Reasoning behind using 'long reference values':

It is often an advantage to actually store an entire word

representing a

business meaning as the value of a column (as opposed to a reference
number or mnemonic abbreviation ). This helps to make the system
'self documenting' and adds value to users who are performing adhoc
queries on the database.

----
David Bennett
President - Bensoft
912 Baltimore, Suite 200
Kansas City, MO 64105

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 4: Don't 'kill -9' the postmaster

#3Alex Pilosov
alex@pilosoft.com
In reply to: Noname (#1)
Re: New SQL Datatype RECURRINGCHAR

This is not a good idea. You are probably coming from mysql background (no
offense :).

See comments inline.

On Fri, 29 Jun 2001 dbennett@jade.bensoft.com wrote:

Idea for a new SQL Data Type:

It's apparent that there is a lot of duplicate space used in the storage
of this information. The idea is if order.status was stored as a
RECURRINGCHAR
then the only data stored for the row would be a reference to the value of
the column. The actual values would be stored in a separate lookup table.

You should instead have another table with two columns, order_status_id
and order_status_desc, and join with it to get your data.

Advantages:

- Storage space is optimized.

- a query like:

select distinct {RECURRINGCHAR} from {table}

can be radically optimized

select distinct order_status_desc from order_status_lookup

- Eliminates use of joins and extended knowledge of data relationships
for adhoc users.

For adhoc users, you can create a view so they won't be aware of joins.

It is often an advantage to actually store an entire word representing a
business meaning as the value of a column (as opposed to a reference
number or mnemonic abbreviation ). This helps to make the system
'self documenting' and adds value to users who are performing adhoc
queries on the database.

No, that is against good database design and any database normalization.

-alex

#4David Bennett
dbennett@bensoft.com
In reply to: Alex Pilosov (#3)
RE: New SQL Datatype RECURRINGCHAR

It's apparent that there is a lot of duplicate space used in the storage
of this information. The idea is if order.status was stored as a
RECURRINGCHAR
then the only data stored for the row would be a reference to the value

of

the column. The actual values would be stored in a separate lookup table.

You should instead have another table with two columns, order_status_id
and order_status_desc, and join with it to get your data.

The idea is to simplify the process of storing and accessing the data.
Joins required
a deeper knowledge of the relational structure. This also complicates
application
programming, two tables must be maintained instead of just one.

select distinct {RECURRINGCHAR} from {table}

can be radically optimized

select distinct order_status_desc from order_status_lookup

Again the idea is to simplify. Reduce the number of tables required to
represent a business model.

- Eliminates use of joins and extended knowledge of data relationships
for adhoc users.

For adhoc users, you can create a view so they won't be aware of joins.

Now we have a master table, a lookup table AND a view?
even more complication....

It is often an advantage to actually store an entire word representing a
business meaning as the value of a column (as opposed to a reference
number or mnemonic abbreviation ). This helps to make the system
'self documenting' and adds value to users who are performing adhoc
queries on the database.

No, that is against good database design and any database normalization.

I would like to hear your argument on this. I don't see how optimizing
the storage of reference value breaks a normalization rule.

--Dave

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rod Taylor (#2)
Re: New SQL Datatype RECURRINGCHAR

"Rod Taylor" <rbt@barchord.com> writes:

This is rather like MySQL's enum.

Yes. If we were going to do anything like this, I'd vote for stealing
the "enum" API, lock stock and barrel --- might as well be compatible.

regards, tom lane

#6Alex Pilosov
alex@pilosoft.com
In reply to: David Bennett (#4)
RE: New SQL Datatype RECURRINGCHAR

On Tue, 3 Jul 2001, David Bennett wrote:

The idea is to simplify the process of storing and accessing the data.
Joins required a deeper knowledge of the relational structure. This
also complicates application programming, two tables must be
maintained instead of just one.

Sometimes, to maintain correctness, its necessary to have complex designs.

"All problems have simple, easy-to-understand, incorrect solutions".

Again the idea is to simplify. Reduce the number of tables required to
represent a business model.

Why? You should normalize your data, which _increases_ number of tables.

- Eliminates use of joins and extended knowledge of data relationships
for adhoc users.

For adhoc users, you can create a view so they won't be aware of joins.

Now we have a master table, a lookup table AND a view?
even more complication....

Well, that's called software development. If you don't want complications,
you can use MS-Access *:)

It is often an advantage to actually store an entire word representing a
business meaning as the value of a column (as opposed to a reference
number or mnemonic abbreviation ). This helps to make the system
'self documenting' and adds value to users who are performing adhoc
queries on the database.

No, that is against good database design and any database normalization.

I would like to hear your argument on this. I don't see how optimizing
the storage of reference value breaks a normalization rule.

What if tomorrow you will need to change text name for "OPEN" status to
"OPEN_PENDING_SOMETHING"? With your design, you will need to update all
rows in the table changing it. With normalized design, you just update the
lookup table. Etc, etc.

-alex

#7David Bennett
dbennett@bensoft.com
In reply to: Tom Lane (#5)
RE: New SQL Datatype RECURRINGCHAR

The only problem with 'enum' is that all of the possible values must be
specified at CREATE time. A logical extension to this would be to allow for
'dynamic extensions' to the list.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 03, 2001 4:49 PM
To: Rod Taylor
Cc: dbennett@jade.bensoft.com; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] New SQL Datatype RECURRINGCHAR

"Rod Taylor" <rbt@barchord.com> writes:

This is rather like MySQL's enum.

Yes. If we were going to do anything like this, I'd vote for stealing
the "enum" API, lock stock and barrel --- might as well be compatible.

regards, tom lane

#8David Bennett
dbennett@bensoft.com
In reply to: Alex Pilosov (#6)
RE: New SQL Datatype RECURRINGCHAR

various disagreements and "quotes"...

I agree that you disagree.... :)

RECURRINGCHAR does not break normal form. It simply optimizes the storage
of reference values (recurring keys). This allows for the use of 'long
words' as reference values with a great deal of system storage savings and a
boost in performance in certain circumstances. This is more a form of
'compression' then anything else, as a matter of fact, this is very similar
to the LZ78 family of substitutional compressors.

http://www.faqs.org/faqs/compression-faq/part2/section-1.html

The advantage here is that we are targeting a normalized value in it's
atomic state, The recurrence rate of this these values is extremely high
which allows us to store this data in a very small space and optimize the
access to this data by using the 'dictionary' that we create.

What if tomorrow you will need to change text name for "OPEN" status to
"OPEN_PENDING_SOMETHING"? With your design, you will need to update all
rows in the table changing it. With normalized design, you just update the
lookup table. Etc, etc.

In either model you would:

update master_table set status='OPEN_PENDING_SOMETHING' where status='OPEN'

This would not change, in fact, even in a normalized design you wouldn't
change the lookup table (parent) key. Perhaps you are misunderstanding my
initial concept. The MySQL 'enum' is close. However, it is static and
requires you to embed business data (your key list) in the DDL. The idea I
have here is to dynamically extend this list as needed. I am not saying
that the value can't relate to a parent (lookup) table. It's just not
necessary if the value is all that is needed.

--Dave (Hoping some other SQL developers are monitoring this thread :)

#9Alex Pilosov
alex@pilosoft.com
In reply to: David Bennett (#8)
RE: New SQL Datatype RECURRINGCHAR

On Fri, 6 Jul 2001, David Bennett wrote:

<rest snipped>

In either model you would:

update master_table set status='OPEN_PENDING_SOMETHING' where status='OPEN'

This would not change, in fact, even in a normalized design you
wouldn't change the lookup table (parent) key. Perhaps you are
misunderstanding my initial concept. The MySQL 'enum' is close.
However, it is static and requires you to embed business data (your
key list) in the DDL. The idea I have here is to dynamically extend
this list as needed. I am not saying that the value can't relate to a
parent (lookup) table. It's just not necessary if the value is all
that is needed.

You are making absolutely no sense.

Let me break it down:

a) To do an update of a key to a different value, you would need to do
following:
1) look up the new value in entire table, find if its already exists
2) If it exists, good.
3) if it doesn't, pick a next number. (out of some sequence, I suppose) to
represent the key.
4) do the actual update.

Step 1 without an index is a killer. Then, you need to have a certain
'table' to map the existing key values to their numerical representations.

How would this 'table' get populated? On startup? On select?

Its one thing to take 'enum' datatype, which I wouldn't disagree too
much with. Its another thing to suggest this kind of a scheme, which
should be really done with views and rules.

I.E. instead of (as you would have) table a(..., x recurringchar),
you must have two things:

table a_real (..., x int4)
table lookup (x int4, varchar value)

Then, have a view:
create view a as select ..., value from a_real, lookup where
a_real.x=lookup.x

Then create a rule on insert: (syntax may be rusty)
create rule foo
on insert on table a
do instead
...whatever magic you need to do the actual inserton, lookup, etc.

Show quoted text

--Dave (Hoping some other SQL developers are monitoring this thread :)

#10David Bennett
dbennett@bensoft.com
In reply to: Alex Pilosov (#9)
RE: New SQL Datatype RECURRINGCHAR

Alex,

I think I fully understand your position. Let me put wrap up our
conversation so far.

Given the application requirements:

1) contacts have a type.

2) new types must be added on the fly as needed.

3) types names rarely change.

4) the number of contacts should scale to support millions of records.

5) the number of types will be limited to under 64k

6) Users must be able to easily query contacts with readable types.

-----
In a nutshell you are recommending:
-----

create table contact_type (
code int2,
type char(16),
PRIMARY KEY ( code )
);

create table contact (
number serial,
name char(32),
type int2,
PRIMARY KEY ( number ),
FOREIGN KEY ( type ) REFERENCES contact_type ( code )
);

create view contact_with_readble_type as (
select c.number as number,
c.name as name,
t.type as type
from
contact c,
contact_type t
);

* To build a type lookup table:

1) Select type and code from contact_type
2) Build UI object which displays type and returns code

* In order to insert a new record with this model:

1) Look up to see if type exists
2) Insert new type
3) Get type ID
4) Insert contact record

* The adhoc query user is now faced with
the task of understanding 3 data tables.

-----
With recurringchar you could do this easily as:
-----

create table contact (
number serial,
name char(32),
type recurringchar1,
PRIMARY KEY ( number ),
);

* To build a type lookup table:

1) Select distinct type from contact (optimized access to recurringchar
dictionary)
2) Build UI object which displays and returns type.

* In order to insert a new record with this model:

1) Insert contact record

* The adhoc query user has one data table.

-----

Granted, changing the value of contact_type.type would require edits to the
contact records.
It may be possible to add simple syntax to allow editing of a 'recurringchar
dictionary' to
get around isolated problem which would only exist in certain applications.

Actually, maybe 'dictionary' or 'dictref' would be a better name for the
datatype.

#11Alex Pilosov
alex@pilosoft.com
In reply to: David Bennett (#10)
RE: New SQL Datatype RECURRINGCHAR

On Sat, 7 Jul 2001, David Bennett wrote:

-----
In a nutshell you are recommending:
-----

create table contact_type (
code int2,
type char(16),
PRIMARY KEY ( code )
);

create table contact (
number serial,
name char(32),
type int2,
PRIMARY KEY ( number ),
FOREIGN KEY ( type ) REFERENCES contact_type ( code )
);

create view contact_with_readble_type as (
select c.number as number,
c.name as name,
t.type as type
from
contact c,
contact_type t
);

* To build a type lookup table:

1) Select type and code from contact_type
2) Build UI object which displays type and returns code

Just 'select distinct' on a view should do just fine.

* In order to insert a new record with this model:

1) Look up to see if type exists
2) Insert new type
3) Get type ID
4) Insert contact record

This can be encapsulated with "ON INSERT" rule on a view.

* The adhoc query user is now faced with
the task of understanding 3 data tables.

No, only one view. All the logic is encapsulated there.

-----
With recurringchar you could do this easily as:
-----

create table contact (
number serial,
name char(32),
type recurringchar1,
PRIMARY KEY ( number ),
);

* To build a type lookup table:

1) Select distinct type from contact (optimized access to recurringchar
dictionary)
2) Build UI object which displays and returns type.

* In order to insert a new record with this model:

1) Insert contact record

* The adhoc query user has one data table.

-----

Granted, changing the value of contact_type.type would require edits
to the contact records. It may be possible to add simple syntax to
allow editing of a 'recurringchar dictionary' to get around isolated
problem which would only exist in certain applications.

Actually, maybe 'dictionary' or 'dictref' would be a better name for
the datatype.

These things belong in application or middleware (AKA views/triggers), not
in database server itself.

There are multiple problems with your implementation, for example,
transaction handling, assume this situation:

Tran A inserts a new contact with new type "foo", but does not commit.
Dictionary assigns value of N to 'foo'.

Tran B inserts a new contact with type foo. What value should be entered
in the dictionary? N? A new value?

If a type disappears from database, does its dictionary ID get reused?

All these questions are not simple questions, and its not up to database
to decide it. Your preferred solution belongs in your triggers/views, not
in core database.

#12Rod Taylor
rbt@barchord.com
In reply to: Alex Pilosov (#11)
Re: New SQL Datatype RECURRINGCHAR

This would be a potential feature of being able to insert into views
in general. Reversing the CREATE VIEW statement to accept inserts,
deletes and updates.

If true, focus on that. Theres lots of views that cannot be reversed
properly -- unions come to mind -- but perhaps this type of simple
join could be a first step in the package. I believe this is on the
TODO list already.

Different attack, but accomplishes the same thing within SQL standards
as I seem to recall views are supposed to do this where reasonable.

Failing that, implement this type of action the same way as foreign
keys. Via the described method with automagically created views,
tables, etc. Though I suggest leaving it in contrib for sometime.
Enum functionality isn't particularly useful to the majority whose
applications tend to pull out the numbers for states when the
application is opened (with the assumption they're generally static).

--
Rod Taylor

Your eyes are weary from staring at the CRT. You feel sleepy. Notice
how restful it is to watch the cursor blink. Close your eyes. The
opinions stated above are yours. You cannot imagine why you ever felt
otherwise.

----- Original Message -----
From: "Alex Pilosov" <alex@pilosoft.com>
To: "David Bennett" <dbennett@bensoft.com>
Cc: <pgsql-hackers@postgresql.org>
Sent: Saturday, July 07, 2001 9:24 PM
Subject: RE: [HACKERS] New SQL Datatype RECURRINGCHAR

On Sat, 7 Jul 2001, David Bennett wrote:

-----
In a nutshell you are recommending:
-----

create table contact_type (
code int2,
type char(16),
PRIMARY KEY ( code )
);

create table contact (
number serial,
name char(32),
type int2,
PRIMARY KEY ( number ),
FOREIGN KEY ( type ) REFERENCES contact_type ( code )
);

create view contact_with_readble_type as (
select c.number as number,
c.name as name,
t.type as type
from
contact c,
contact_type t
);

* To build a type lookup table:

1) Select type and code from contact_type
2) Build UI object which displays type and returns code

Just 'select distinct' on a view should do just fine.

* In order to insert a new record with this model:

1) Look up to see if type exists
2) Insert new type
3) Get type ID
4) Insert contact record

This can be encapsulated with "ON INSERT" rule on a view.

* The adhoc query user is now faced with
the task of understanding 3 data tables.

No, only one view. All the logic is encapsulated there.

-----
With recurringchar you could do this easily as:
-----

create table contact (
number serial,
name char(32),
type recurringchar1,
PRIMARY KEY ( number ),
);

* To build a type lookup table:

1) Select distinct type from contact (optimized access to

recurringchar

dictionary)
2) Build UI object which displays and returns type.

* In order to insert a new record with this model:

1) Insert contact record

* The adhoc query user has one data table.

-----

Granted, changing the value of contact_type.type would require

edits

to the contact records. It may be possible to add simple syntax to
allow editing of a 'recurringchar dictionary' to get around

isolated

problem which would only exist in certain applications.

Actually, maybe 'dictionary' or 'dictref' would be a better name

for

the datatype.

These things belong in application or middleware (AKA

views/triggers), not

in database server itself.

There are multiple problems with your implementation, for example,
transaction handling, assume this situation:

Tran A inserts a new contact with new type "foo", but does not

commit.

Dictionary assigns value of N to 'foo'.

Tran B inserts a new contact with type foo. What value should be

entered

in the dictionary? N? A new value?

If a type disappears from database, does its dictionary ID get

reused?

All these questions are not simple questions, and its not up to

database

to decide it. Your preferred solution belongs in your

triggers/views, not

in core database.

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#13Alex Pilosov
alex@pilosoft.com
In reply to: Rod Taylor (#12)
Re: New SQL Datatype RECURRINGCHAR

On Sat, 7 Jul 2001, Rod Taylor wrote:

This would be a potential feature of being able to insert into views
in general. Reversing the CREATE VIEW statement to accept inserts,
deletes and updates.

Definitely not a 'potential' feature, but a existing and documented one.
Read up on rules, especially 'ON INSERT DO INSTEAD' stuff. Its not
automatic, though.

If true, focus on that. Theres lots of views that cannot be reversed
properly -- unions come to mind -- but perhaps this type of simple
join could be a first step in the package. I believe this is on the
TODO list already.

On TODO list are updatable views in SQL sense of word, [i.e. automatic
updateability of a view which matches certain criteria].

Different attack, but accomplishes the same thing within SQL standards
as I seem to recall views are supposed to do this where reasonable.

Failing that, implement this type of action the same way as foreign
keys. Via the described method with automagically created views,
tables, etc. Though I suggest leaving it in contrib for sometime.
Enum functionality isn't particularly useful to the majority whose
applications tend to pull out the numbers for states when the
application is opened (with the assumption they're generally static).

Original suggestion was not for an enum type, it was for _dynamically
extensible_ data dictionary type.

ENUM is statically defined, and it wouldn't be too hard to implement, with
one exception: one more type-specific field needs to be added to
pg_attribute table, where would be stored argument for the type (such as,
length for a char/varchar types, length/precision for numeric type, and
possible values for a enum type).

This just needs a pronouncement that this addition is a good idea, and
then its a trivial thing to implement enum.

-alex

#14Rod Taylor
rbt@barchord.com
In reply to: Alex Pilosov (#13)
Re: New SQL Datatype RECURRINGCHAR

This would be a potential feature of being able to insert into

views

in general. Reversing the CREATE VIEW statement to accept

inserts,

deletes and updates.

Definitely not a 'potential' feature, but a existing and documented

one.

Read up on rules, especially 'ON INSERT DO INSTEAD' stuff. Its not
automatic, though.

Trust me, I know how to go about doing those kinds of things manually.
I was referring to the automated reveral -- which creates this
features in a very simple manner without any additions or changes to
system tables -- aside from reverse rules themselves which is a more
generic feature.

If true, focus on that. Theres lots of views that cannot be

reversed

properly -- unions come to mind -- but perhaps this type of simple
join could be a first step in the package. I believe this is on

the

TODO list already.

On TODO list are updatable views in SQL sense of word, [i.e.

automatic

updateability of a view which matches certain criteria].

Different attack, but accomplishes the same thing within SQL

standards

as I seem to recall views are supposed to do this where

reasonable.

Failing that, implement this type of action the same way as

foreign

keys. Via the described method with automagically created views,
tables, etc. Though I suggest leaving it in contrib for sometime.
Enum functionality isn't particularly useful to the majority whose
applications tend to pull out the numbers for states when the
application is opened (with the assumption they're generally

static).

Original suggestion was not for an enum type, it was for

_dynamically

extensible_ data dictionary type.

ENUMs from my memory are easily redefined. And since the database
they're implemented in requires table locks for everything, they can
appear dynamic (nothing is transaction safe in that thing anyway).

#15Rod Taylor
rbt@barchord.com
In reply to: Alex Pilosov (#13)
Re: New SQL Datatype RECURRINGCHAR

This would be a potential feature of being able to insert into

views

in general. Reversing the CREATE VIEW statement to accept

inserts,

deletes and updates.

Definitely not a 'potential' feature, but a existing and

documented

one.

Read up on rules, especially 'ON INSERT DO INSTEAD' stuff. Its not
automatic, though.

Trust me, I know how to go about doing those kinds of things

manually.

I was referring to the automated reveral -- which creates this
features in a very simple manner without any additions or changes to
system tables -- aside from reverse rules themselves which is a more
generic feature.

Hmm. My above statement lost all credibility in poor grammer and
speeling. Time for bed I suppose.

Anyway, the point is that some of the simple views should be straight
forward to reversing automatically if someone has the will and the
time it can be done. A while back a list of 'views which cannot be
reversed' was created and included things such as Unions,
Intersections, exclusions, aggregates, CASE statements, and a few more
items.

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rod Taylor (#15)
Re: New SQL Datatype RECURRINGCHAR

"Rod Taylor" <rbt@barchord.com> writes:

Anyway, the point is that some of the simple views should be straight
forward to reversing automatically if someone has the will and the
time it can be done. A while back a list of 'views which cannot be
reversed' was created and included things such as Unions,
Intersections, exclusions, aggregates, CASE statements, and a few more
items.

SQL92 has a notion that certain simple views are "updatable", while the
rest are not. In our terms this means that we should automatically
create ON INSERT/UPDATE/DELETE rules if the view is updatable according
to the spec. I have not bothered to chase down all the exact details
of the spec's "updatableness" restrictions, but they're along the same
lines you mention --- only one referenced table, no aggregation, no
set operations, all view outputs are simple column references, etc.

My feeling is that the restrictions are stringent enough to eliminate
most of the interesting uses of views, and hence an automatic rule
creation feature is not nearly as useful/important as it appears at
first glance. In real-world applications you'll have to expend some
thought on manual rule creation anyway.

regards, tom lane

#17Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Tom Lane (#16)
AW: New SQL Datatype RECURRINGCHAR

My feeling is that the restrictions are stringent enough to eliminate
most of the interesting uses of views, and hence an automatic rule
creation feature is not nearly as useful/important as it appears at
first glance.

The most prominent of the "interesting uses" probably beeing when the views
are part of the authorization system, since views are the only standardized
mechanism to restrict access at the row level. Imho not to be neglected.
(user xxx is only allowed to manipulate rows that belong to his department,
so he is only granted access to a view, not the main table)

Andreas

#18Hannu Krosing
hannu@tm.ee
In reply to: Zeugswetter Andreas SB (#17)
Re: AW: New SQL Datatype RECURRINGCHAR

Zeugswetter Andreas SB wrote:

My feeling is that the restrictions are stringent enough to eliminate
most of the interesting uses of views, and hence an automatic rule
creation feature is not nearly as useful/important as it appears at
first glance.

The most prominent of the "interesting uses" probably beeing when the views
are part of the authorization system, since views are the only standardized
mechanism to restrict access at the row level.

True, and often the views can be restricted to insert only data that
will be
visible using this view.

Imho not to be neglected.
(user xxx is only allowed to manipulate rows that belong to his department,
so he is only granted access to a view, not the main table)

This seems to be a little more complicated that Tom described (I.e. it
has
probably more than one relation involved or uses a function to get
CURRENT_USER's
department id)

IIRC MS Access has much broader repertoire of updatable views than
described
by Tom. Can be it's an extension to standard SQL though.

-----------------
Hannu

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#18)
Re: AW: New SQL Datatype RECURRINGCHAR

Hannu Krosing <hannu@tm.ee> writes:

Zeugswetter Andreas SB wrote:

The most prominent of the "interesting uses" probably beeing when the views
are part of the authorization system, since views are the only standardized
mechanism to restrict access at the row level.

True, and often the views can be restricted to insert only data that
will be visible using this view.

Right. The interesting question is whether an automatic rule creator
could be expected to derive the correct restrictions on
insert/update/delete given the WHERE clause of the view. Insert/delete
might not be too bad (at first thought, requiring the inserted/deleted
rows to pass the WHERE condition would do), but I'm not so sure about
update. Is it sufficient to require both the old and new states of the
row to pass the WHERE condition?

SQL92 gives this restriction on WHERE clauses for updatable views:

d) If the <table expression> immediately contained in QS imme-
diately contains a <where clause> WC, then no leaf generally
underlying table of QS shall be a generally underlying table
of any <query expression> contained in WC.

which conveys nothing to my mind :-(, except that they're restricting
sub-SELECTs in WHERE somehow. Can anyone translate that into English?

regards, tom lane

#20Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#19)
Re: AW: New SQL Datatype RECURRINGCHAR

Tom Lane wrote:

Hannu Krosing <hannu@tm.ee> writes:

Zeugswetter Andreas SB wrote:

The most prominent of the "interesting uses" probably beeing when the views
are part of the authorization system, since views are the only standardized
mechanism to restrict access at the row level.

True, and often the views can be restricted to insert only data that
will be visible using this view.

Right. The interesting question is whether an automatic rule creator
could be expected to derive the correct restrictions on
insert/update/delete given the WHERE clause of the view. Insert/delete
might not be too bad (at first thought, requiring the inserted/deleted
rows to pass the WHERE condition would do), but I'm not so sure about
update. Is it sufficient to require both the old and new states of the
row to pass the WHERE condition?

Yes, no other chance. Remember that the rule on SELECT is
allways applied to the scan that looks for the rows to
update, so you'd never have a chance to hit other rows
through the view.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#21Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#19)
Re: AW: New SQL Datatype RECURRINGCHAR

Tom Lane writes:

SQL92 gives this restriction on WHERE clauses for updatable views:

d) If the <table expression> immediately contained in QS imme-
diately contains a <where clause> WC, then no leaf generally
underlying table of QS shall be a generally underlying table
of any <query expression> contained in WC.

which conveys nothing to my mind :-(, except that they're restricting
sub-SELECTs in WHERE somehow. Can anyone translate that into English?

No table mentioned in the FROM-clause (in PG even implicitly) of the query
expression (or view definition) is allowed to be mentioned in a subquery
in the WHERE clause of the query expression (or view definition).

The phrasing "leaf" and "generally" underlying is only to make this
statement theoretically pure because you can create generally underlying
tables that look different but do the same thing (different join syntax),
whereas a leaf generally underlying table is guaranteed to be a real base
table.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#22Jan Wieck
JanWieck@Yahoo.com
In reply to: David Bennett (#10)
Re: New SQL Datatype RECURRINGCHAR

David Bennett wrote:

Alex,

I think I fully understand your position. Let me put wrap up our
conversation so far.
[lots of arguments for recurringchar]

All I've seen up to now is that you continue to mix up
simplification on the user side with data and content control
on the DB designer side. Do the users create all the tables
and would have to create the views, or is that more the job
of someone who's educated enough?

And about the multiple lookups and storage of new types, we
have procedural languages and database triggers.

This is no personal offense, but I just don't see why we
should adopt non-standard MySQLism for functionality that is
available through standard mechanisms with alot more
flexibility and control.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com