Dreaming About Redesigning SQL
Hi,
This is for relational database theory experts on one hand and
implementers of real-world applications on the other hand. If there were
a chance to start again and design SQL afresh, for best
cleanness/power/performance what changes would you make? What would
_your_ query language (and the underlying database concept) look like?
Seun Osewa
PS: I should want to post my ideas too for review but more
experienced/qualified people should come first
After takin a swig o' Arrakan spice grog, seunosewa@inaira.com (Seun Osewa) belched out...:
This is for relational database theory experts on one hand and
implementers of real-world applications on the other hand. If there were
a chance to start again and design SQL afresh, for best
cleanness/power/performance what changes would you make? What would
_your_ query language (and the underlying database concept) look
like?
There are two notable 'projects' out there:
1. There's Darwen and Date's "Tutorial D" language, defined as part
of their "Third Manifesto" about relational databases.
2. newSQL <http://newsql.sourceforge.net/>, where they are studying
two syntaxes, one based on Java, and one based on a
simplification (to my mind, oversimplification) of SQL.
The "newSQL" project suffers from their definition being something of
a "chip away everything that doesn't look like an elephant"
definition. They aren't defining, in "mathematical" terms, what their
language is supposed to be able to express; they are instead defining
a big grab-bag of minor syntactic features, and seem to expect that a
database system will emerge from that.
In contrast, "Tutorial D" is _all_ about mathematical definition of
what it is supposed to express, and the text is a tough read,
irrespective of other merits.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://cbbrowne.com/info/thirdmanifesto.html
DOS: n., A small annoying boot virus that causes random spontaneous
system crashes, usually just before saving a massive project. Easily
cured by Unix. See also MS-DOS, IBM-DOS, DR-DOS.
-- from David Vicker's .plan
Thanks for the links.
Christopher Browne <cbbrowne@acm.org> wrote in message news:<blkq9n$d9puv$4@ID-125932.news.uni-berlin.de>...
There are two notable 'projects' out there:
1. There's Darwen and Date's "Tutorial D" language, defined as part
of their "Third Manifesto" about relational databases.
2. newSQL <http://newsql.sourceforge.net/>, where they are studying
two syntaxes, one based on Java, and one based on a
simplification (to my mind, oversimplification) of SQL.
I was able to get a PDF copy of the "Third Manifesto" article here:
http://citeseer.nj.nec.com/darwen95third.html
but the details of Tutorial D do not seem to be part of that article.
NewSQL *might* be cool if someone found reason to use it in a DBMS.
Sometimes I wonder why it's so important to model data in the
"relational way", to think of data in the form of sets of tuples rather
than tables or lists or whatever. I mean, though it's elegant and based
on mathematical principles, I would like to know why it's the _right_
model to follow in designing a DBMS (or database). The way my mind
sees it, should we not rather be interested in what works?
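For readers unfamiliar with the distinction being asked about, here is a minimal Python sketch (not from the thread; the employee rows and names such as `as_list` are made up for illustration) of how a set of tuples differs from a list of rows: duplicates collapse and order carries no meaning.

```python
# Hypothetical rows of (name, department). The duplicate row is deliberate.
as_list = [("alice", "sales"), ("bob", "ops"), ("alice", "sales")]

# A relation body modelled as a set of tuples: duplicates collapse.
as_relation = frozenset(as_list)

# The same facts entered in a different order give an equal relation,
# whereas the corresponding lists compare unequal.
reordered = [("bob", "ops"), ("alice", "sales")]
same_relation = frozenset(reordered) == as_relation
same_list = reordered == [("alice", "sales"), ("bob", "ops")]

print(len(as_list), len(as_relation))  # 3 2
print(same_relation, same_list)        # True False
```

Whether that behaviour is "right" is exactly what the thread goes on to argue about; the sketch only shows what the set-based view commits you to.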
Seun Osewa
Christopher Browne wrote:
After takin a swig o' Arrakan spice grog, seunosewa@inaira.com (Seun Osewa) belched out...:
This is for relational database theory experts on one hand and
implementers of real-world applications on the other hand. If there were
a chance to start again and design SQL afresh, for best
cleanness/power/performance what changes would you make? What would
_your_ query language (and the underlying database concept) look
like?
There are two notable 'projects' out there:
1. There's Darwen and Date's "Tutorial D" language, defined as part
of their "Third Manifesto" about relational databases.
I read the Third Manifesto. There are many ideas in the TTM that have
strong arguments, although I must confess I haven't read any
critiques. A few (of many) points:
1) Strict adherence to the relational model, where all of SQL's
shortcomings are addressed:
A) No attribute ordering
B) No tuple ordering (sets aren't ordered)
C) No duplicate tuples (relations are sets)
D) No nulls (2VL sufficient. Missing information is meta-data)
E) No nullological mistakes (e.g., SUM of an empty relation is zero, AVG
is an error)
F) Generalized transitive closure
G) Declared attribute, relation variable, and database constraints,
including transition constraints
H) Candidate keys required (this has positive logical consequences for
the DBMS implementor)
I) Tuple and relation-valued attributes
J) No tuple-level operations
a bunch more...
2) The query language should be computationally complete. The user
should be able to author complete applications in the language, rather
than the language being a sublanguage. This reverses the query
sublanguage approach Codd proposed in "A Relational Model of Data for
Large Shared Data Banks":
http://www.acm.org/classics/nov95/s1p5.html
<sarcasm>
Thanks ACM for just putting part of the paper on-line, complete with
broken links and spelling errors!
</sarcasm>
3) The language (a D implementation) would ensure a separation between
the logical design of the application and the physical implementation.
The programmer should think in terms of the evaluation of relational
algebraic expressions, not manipulating physical records in disk
blocks in a file.
4) The type system should separate the actual, internal representation
from the possible representation, of which there might be many. For
example, a POINT may be internally expressed in Cartesian coordinates
but may supply both polar and Cartesian THE_ operators.
5) The type system should implement D & D's view of multiple
inheritance, where read-operators are inherited but write-operators
aren't. This eliminates the "Is a Circle an Ellipse?" dilemma imposed
by C++, for example. IOW, in a "D" language, a Circle is an Ellipse.
They reject Stonebraker's ideas of OIDs and relation variable
inheritance, which, of course, are in PostgreSQL.
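Point 4 above can be sketched roughly as follows. This is a Python illustration, not Tutorial D: the `Point` class, its two constructors, and the `the_*` method names are assumptions chosen to mirror the THE_ operators mentioned in the post. The value is stored one way internally but can be built and read through either representation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    # Actual (internal) representation: Cartesian coordinates.
    # Callers are expected to use the constructors and the_* readers below.
    _x: float
    _y: float

    # One constructor per possible representation.
    @classmethod
    def cartesian(cls, x: float, y: float) -> "Point":
        return cls(x, y)

    @classmethod
    def polar(cls, r: float, theta: float) -> "Point":
        return cls(r * math.cos(theta), r * math.sin(theta))

    # Read-only accessors in the spirit of THE_X, THE_Y, THE_R, THE_THETA.
    def the_x(self) -> float:
        return self._x

    def the_y(self) -> float:
        return self._y

    def the_r(self) -> float:
        return math.hypot(self._x, self._y)

    def the_theta(self) -> float:
        return math.atan2(self._y, self._x)


p = Point.polar(2.0, math.pi / 2)   # built from polar components
q = Point.cartesian(3.0, 4.0)       # built from Cartesian components
print(round(p.the_y(), 6), q.the_r())
```

Swapping the stored fields for polar ones would leave callers untouched as long as they only use the constructors and the `the_*` readers, which is the separation the post describes.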
It's a very provocative read. At a minimum, one can learn what to
avoid with SQL. The language looks neat on paper. Perhaps one day
someone will provide an open source implementation. One could envision
a "D" project along the same lines as the project that added SQL
to Postgres...
But I'd rather have PITR :-)
Mike Mascari
mascarm@mascari.com
Mike Mascari wrote on Sat, 04.10.2003 at 06:32:
2) The query language should be computationally complete. The user
should be able to author complete applications in the language, rather
than the language being a sublanguage.
To me it seems like requiring that one should be able to author complete
programs in regex.
Yes, when all you have is a hammer everything looks like a nail ;)
----------------
Hannu
The world rejoiced as mascarm@mascari.com (Mike Mascari) wrote:
It's a very provocative read. At a minimum, one can learn what to
avoid with SQL. The language looks neat on paper. Perhaps one day
someone will provide an open source implementation. One could envision
a "D" project along the same lines as the project that added SQL
to Postgres...
I think you summed it up nicely. The "manifesto" is a provocative, if
painful, read. It is very useful at pointing out "pointy edges" of
SQL that might be wise to avoid.
I'm not thrilled with the language; I think they have made a mistake
in trying to make it too abbreviation-oriented. They keep operator
names short, to a fault.
As you say, the most likely way for a "D" to emerge in a popular way
would be by someone adding the language to an existing database
system.
There is a project out on SourceForge for a "D implementation," called
"Duro." It takes the opposite approach; the operators are all defined
as C functions, so you write all your code in C. It uses a data store
built atop Berkeley DB.
I think an implementor would be better off using an SQL database
underneath, and using their code layer in between to accomplish the
"divorce" from the aspects of SQL that they disapprove of. Sort of
like MaVerick, a Pick implementation in Java that uses a DBMS such as
PostgreSQL as the underlying data store.
You do a "proof of concept" by building something that translates D
requests to SQL requests. And _then_ get a project going to put a "D
parser" in as an alternative to the SQL parser. (Yes, that
oversimplifies matters. Tough...)
--
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www3.sympatico.ca/cbbrowne/rdbms.html
Rules of the Evil Overlord #81. "If I am fighting with the hero atop a
moving platform, have disarmed him, and am about to finish him off and
he glances behind me and drops flat, I too will drop flat instead of
quizzically turning around to find out what he saw."
<http://www.eviloverlord.com/>
Seun Osewa wrote:
Sometimes I wonder why its so important to model data in the "rela-
tional way", to think of data in form of sets of tuples rather than
tables or lists or whatever. I mean, though its elegant and based
on mathematical principles I would like to know why its the _right_
model to follow in designing a DBMS (or database). The way my mind
sees it, should we not rather be interested in what works?
Relational is the _right_ model because 'it works'. It's the only truly comprehensive
data model and subject of decades of research. All other data models have been found to
be flawed and (nearly) discarded.
If you don't care for mathematical principles, there's always ad-hoc database models.
Check out Pick, OO and XML databases. They're interested in what works and ignore
elegance and mathematical principles.
--
Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)
Christopher Browne <cbbrowne@acm.org> wrote in message news:<m3lls1vzfc.fsf@wolfe.cbbrowne.com>...
I think an implementor would be better off using an SQL database
underneath, and using their code layer in between to accomplish the
"divorce" from the aspects of SQL that they disapprove of.
That is, in fact, the approach taken in a product called Dataphor
(see www.alphora.com). They have implemented a "D" language (called D4)
that translates into SQL and hence uses an underlying SQL Server, Oracle,
or DB2 DBMS as the engine.
It is, however, not a very easy mapping to do and you have to resort
to all sorts of unclean stuff to make it work...
regards,
Lauri Pietarinen
In article <ba87a3cf.0310031759.42dce77c@posting.google.com>, Seun Osewa
<seunosewa@inaira.com> writes
Thanks for the links.
Christopher Browne <cbbrowne@acm.org> wrote in message news:<blkq9n$d9puv$4@ID-125932.news.uni-berlin.de>...
There are two notable 'projects' out there:
1. There's Darwen and Date's "Tutorial D" language, defined as part
of their "Third Manifesto" about relational databases.
2. newSQL <http://newsql.sourceforge.net/>, where they are studying
two syntaxes, one based on Java, and one based on a
simplification (to my mind, oversimplification) of SQL.
I was able to get a PDF copy of the "Third Manifesto" article here:
http://citeseer.nj.nec.com/darwen95third.html
but the details of Tutorial D do not seem to be part of that article.
NewSQL *might* be cool if someone found reason to use it in a DBMS.
Is Darwen and Date's stuff the work where they said SQL was crap? As I
understand it, within about a year of designing SQL, at least one of
Codd and Date said it was rubbish and tried to replace it with something
"better".
Sometimes I wonder why it's so important to model data in the
"relational way", to think of data in the form of sets of tuples rather
than tables or lists or whatever. I mean, though it's elegant and based
on mathematical principles, I would like to know why it's the _right_
model to follow in designing a DBMS (or database). The way my mind
sees it, should we not rather be interested in what works?
I couldn't agree more (of course I would). As I like to put it, surely
Occam's Razor says that stuffing the four-dimensional world into a flat-
earth database can't be the optimal solution!
The trouble with so many SQL advocates is that they are so convinced of
the mathematical rightness of the relational model, that they forget it
is a *model* and, as such, needs to be shown as relevant to the real
world.
That said, I always think relationally when designing databases - it
helps. Look at the multi-value databases. Think relationally, you can
still store your data in normal form, but you're not stuffed by all the
irrelevant restrictions that relational databases tend to impose.
Get a freebie copy of jBASE, UniVerse or UniData, and try them out :-)
Cheers,
Wol
--
Anthony W. Youngman <pixie@thewolery.demon.co.uk>
'Yings, yow graley yin! Suz ae rikt dheu,' said the blue man, taking the
thimble. 'What *is* he?' said Magrat. 'They're gnomes,' said Nanny. The man
lowered the thimble. 'Pictsies!' Carpe Jugulum, Terry Pratchett 1998
Visit the MaVerick web-site - <http://www.maverick-dbms.org> Open Source Pick
In article <3F7F8E1A.474@ix.netcom.com>, Lee Fesperman
<firstsql@ix.netcom.com> writes
If you don't care for mathematical principles, there's always ad-hoc database models.
Check out Pick, OO and XML databases. They're interested in what works and ignore
elegance and mathematical principles.
Mathematical principles? You mean like Euclidean Geometry and Newtonian
Mechanics? They're perfectly solid, good, mathematically correct. Shame
they don't actually WORK all the time in the real world.
That's what I feel about relational, too ...
Cheers,
Wol
--
Anthony W. Youngman - wol at thewolery dot demon dot co dot uk
Witches are curious by definition and inquisitive by nature. She moved in. "Let
me through. I'm a nosey person.", she said, employing both elbows.
Maskerade : (c) 1995 Terry Pratchett
"Anthony W. Youngman" <thewolery@nospam.demon.co.uk> wrote in message news:<xTDLP1CFRIg$Ewjw@thewolery.demon.co.uk>...
In article <3F7F8E1A.474@ix.netcom.com>, Lee Fesperman
<firstsql@ix.netcom.com> writes
If you don't care for mathematical principles, there's always ad-hoc database models.
Check out Pick, OO and XML databases. They're interested in what works and ignore
elegance and mathematical principles.
Mathematical principles? You mean like Euclidean Geometry and Newtonian
Mechanics? They're perfectly solid, good, mathematically correct. Shame
they don't actually WORK all the time in the real world.
That's what I feel about relational, too ...
That explains the generally poor quality of your posts. You substitute
emotion for reason.
On 3 Oct 2003 21:39:03 GMT, Christopher Browne <cbbrowne@acm.org>
wrote:
There are two notable 'projects' out there:
1. There's Darwen and Date's "Tutorial D" language, defined as part
of their "Third Manifesto" about relational databases.2. newSQL <http://newsql.sourceforge.net/>, where they are studying
two syntaxes, one based on Java, and one based on a
simplification (to my mind, oversimplification) of SQL.
ISTR that Terry Halpin (of ORM fame) designed a language named
"ConQuer". I don't know the details, but I think Date's latest
edition refers to it in a note. Halpin's working on Visio at
Microsoft now, I think.
--
Mike Sherrill
Information Management Systems
-----Original Message-----
From: Seun Osewa [mailto:seunosewa@inaira.com]
Sent: Friday, October 03, 2003 11:52 AM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Dreaming About Redesigning SQL
Hi,
This is for relational database theory experts on one hand
and implementers of real-world applications on the other hand.
If there were a chance to start again and design SQL afresh,
for best cleanness/power/performance what changes would you
make? What would _your_ query language (and the underlying
database concept) look like?
Seun Osewa
PS: I should want to post my ideas too for review but more
experienced/qualified people should come first
I imagine you have read the 3rd database manifesto by Codd. I think
he's gone off the deep end a bit. You don't just throw away a trillion
dollars worth of effort and tools to make things mathematically
orthogonal.
However, on some things he is clearly right. For instance, null values
are evil. Programmers understand it, but end users will *always* be
surprised that the sum of the counts from:
SELECT COUNT(shirts) FROM clothing WHERE color = 'blue'
SELECT COUNT(shirts) FROM clothing WHERE NOT color = 'blue'
is not equal to the number of shirts in the inventory if any color
fields are null.
Therefore, his idea of using default values instead and never using null
is a good one.
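The surprise described above is easy to reproduce. The following is a minimal sketch using Python's bundled sqlite3 module; the three-row `clothing` table is invented for the demonstration, and only the two WHERE clauses come from the post.

```python
import sqlite3

# Hypothetical inventory: three shirts, one with an unknown (NULL) color.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clothing (shirts TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO clothing VALUES (?, ?)",
    [("oxford", "blue"), ("polo", "red"), ("tee", None)],
)


def count(where: str = "") -> int:
    """Run SELECT COUNT(shirts) with an optional WHERE clause."""
    sql = "SELECT COUNT(shirts) FROM clothing " + where
    return conn.execute(sql).fetchone()[0]


blue = count("WHERE color = 'blue'")
not_blue = count("WHERE NOT color = 'blue'")
total = count()

# The NULL-colored row satisfies neither predicate, so it is counted
# by neither query and the two counts do not add up to the total.
print(blue, not_blue, total)  # 1 1 3
```

For the NULL row both `color = 'blue'` and `NOT color = 'blue'` evaluate to unknown rather than true, which is why the row drops out of both counts.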
If SQL vendors would follow the ANSI/ISO standard to the letter, and
implement the latest iteration, that would solve all of the problems
that SQL tool users have to face.
I have tried, twice, to download the evaluation version of the Alphora
product for testing and it doesn't work. I guess there would be a lot
to learn from playing with it; the product is more than an RDBMS.
Regards,
Seun Osewa
lauri.pietarinen@atbusiness.com (Lauri Pietarinen) wrote:
That is, in fact, the approach taken in a product called Dataphor
(see www.alphora.com). They have implemented a "D" language (called D4)
that translates into SQL and hence uses an underlying SQL Server, Oracle,
or DB2 DBMS as the engine.
regards,
Lauri Pietarinen
DCorbit@connx.com ("Dann Corbit") wrote in message news:<D90A5A6C612A39408103E6ECDD77B829408BE9@voyager.corporate.connx.com>...
-----Original Message-----
From: Seun Osewa [mailto:seunosewa@inaira.com]
Sent: Friday, October 03, 2003 11:52 AM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Dreaming About Redesigning SQL
Hi,
This is for relational database theory experts on one hand
and implementers of real-world applications on the other hand.
If there were a chance to start again and design SQL afresh,
for best cleanness/power/performance what changes would you
make? What would your query language (and the underlying
database concept) look like?
Seun Osewa
PS: I should want to post my ideas too for review but more
experienced/qualified people should come first
I imagine you have read the 3rd database manifesto by Codd. I think
he's gone off the deep end a bit.
Dann, you are showing your ignorance. While Dr. Codd recently died, if
you think he wrote a third database manifesto, you have definitely
gone off the deep end yourself.
You don't just throw away a trillion
dollars worth of effort and tools to make things mathematically
orthogonal.
Again, you are showing your ignorance. Nobody has ever suggested
anything even remotely resembling the above.
However, on some things he is clearly right. For instance, null values
are evil.
Dr. Codd believed we need two NULLs. You ascribe correctness to the
one thing I think he clearly got wrong.
Programmers understand it
That's an absurd assertion.
Therefore, his idea of using default values instead and never using null
is a good one.
That is not his idea.
If SQL vendors would follow the ANSI/ISO standard to the letter, and
implement the latest iteration, that would solve all of the problems
that SQL tool users have to face.
Upon what do you base this ridiculous opinion?
seunosewa@inaira.com (Seun Osewa) wrote in message news:<ba87a3cf.0310080256.11846ef3@posting.google.com>...
I have tried, twice, to download the evaluation version of the Alphora
product for testing and it doesn't work. I guess there would be a lot
to learn from playing with it; the product is more than an RDBMS.
Aw, that's unfortunate. It took me a while to get it working.
It is in fact an integrated application development environment where
you can define a great part of your application in a declarative
fashion.
regards,
Lauri Pietarinen
Good question. Although I would want to move away from relational
databases too, if there is an RDBMS and one wants to query it, what
would I aim for? If you look at XQuery, you will see an example of
what I would definitely NOT aim for. Although the user of such a
language might very well be a technical person, instead of starting
with mathematics (relational calculus, relational algebra) I would
suggest starting with language. The mathematics of language is more
complex than the mathematics of relations, particularly simple
relations (such as 1NF tables).
If you look at the history of data persistence prior to Codd's 1970
ACM paper, you will see several attempts at this. One I have studied
of late is GIRLS (Generalized Information Retrieval Language and
System), specified by Don Nelson and implemented by several folks with
the most famous being Dick Pick. This GIRLS language was specified a
full 40 years ago and lives today in many IT shops under a variety of
about 10 different names, including IBM's UniQuery and Retrieve (for
UniData and Universe respectively). This language is flawed, as are
all, but so very close to what I would think would be a good approach.
It was written at TRW in order to make it so that the military in
Viet Nam could query their data without technical folks in the field.
It went into production in 1969 with the US Army. Prior to the end of
the Cold War, it was used by the CIA to track (the associated
database) and query about Russian spies in the US.
I would suggest ditching the entire relational model (as both overly
simplistic in its theory and overly complex in its implementation) and
start with English (that is one of the other names for the GIRLS
language). Note that language is also the starting point for putting
data in XML documents, but it sure doesn't seem to be the starting
point for XQuery, eh?
--dawn
Dawn M. Wolthuis
www.tincat-group.com
With all due respect, Dawn, you are an idiot.
"Dawn M. Wolthuis" <dwolt@iserv.net> wrote in message
news:6db906b2.0310091212.4f967cf5@posting.google.com...
Good question. Although I would want to move away from relational
databases too, if there is an RDBMS and one wants to query it, what
would I aim for? If you look at XQuery, you will see an example of
what I would definitely NOT aim for. Although the user of such a
language might very well be a technical person, instead of starting
with mathematics (relational calculus, relational algebra) I would
suggest starting with language. The mathematics of language is more
complex than the mathematics of relations, particularly simple
relations (such as 1NF tables).
If you look at the history of data persistence prior to Codd's 1970
ACM paper, you will see several attempts at this. One I have studied
of late is GIRLS (Generalized Information Retrieval Language and
System), specified by Don Nelson and implemented by several folks with
the most famous being Dick Pick. This GIRLS language was specified a
full 40 years ago and lives today in many IT shops under a variety of
about 10 different names, including IBM's UniQuery and Retrieve (for
UniData and Universe respectively). This language is flawed, as are
all, but so very close to what I would think would be a good approach.
It was written at TRW in order to make it so that the military in
Viet Nam could query their data without technical folks in the field.
It went into production in 1969 with the US Army. Prior to the end of
the cold war, it was used by the CIA to track (the associated
database) and query about Russian spies in the US.
I would suggest ditching the entire relational model (as both overly
simplistic in its theory and overly complex in its implementation) and
start with English (that is one of the other names for the GIRLS
language). Note that language is also the starting point for putting
data in XML documents, but it sure doesn't seem to be the starting
point for XQuery, eh?
--dawn
Dawn M. Wolthuis
www.tincat-group.com
The mathematics of language is more complex than the mathematics of
relations, particularly simple relations (such as 1NF tables). <<
Are you sure you know what you are talking about?
I would suggest ditching the entire relational model (as both overly
simplistic in its theory and overly complex in its implementation.. <<
Incredible! How about reading some books on the subject?
--
-- Anith
( Please reply to newsgroups only )
A long time ago, in a galaxy far, far away, dwolt@iserv.net (Dawn M. Wolthuis) wrote:
Good question. Although I would want to move away from relational
databases too, if there is an RDBMS and one wants to query it, what
would I aim for? If you look at XQuery, you will see an example of
what I would definitely NOT aim for. Although the user of such a
language might very well be a technical person, instead of starting
with mathematics (relational calculus, relational algebra) I would
suggest starting with language. The mathematics of language is more
complex than the mathematics of relations, particularly simple
relations (such as 1NF tables).
No, that _very much_ gets things backwards.
You need to have a clearly defined model of how the data is to be
manipulated before it makes any sense to try to make up a language.
--
output = reverse("moc.enworbbc" "@" "enworbbc")
http://cbbrowne.com/info/rdbms.html
To iterate is human; to recurse, divine.