Updatable views/with check option parsing
I have spent some time figuring out how to resolve the parsing conflicts in
Bernd Helmle's updatable views patch. The problem has now been reduced to
specifically this situation:
CREATE VIEW foo AS SELECT expr :: TIME . WITH
(where expr is a_expr or b_expr and TIME could also be TIMESTAMP or TIME(x) or
TIMESTAMP(x)).
The continuation here could be WITH TIME ZONE (calling for a shift) or WITH
CHECK OPTION (calling for a reduce).
All the usual ideas about unfolding the rules or making keywords more reserved
don't work (why should they). A one-token lookahead simply can't parse this.
I have had some ideas about trying to play around with the precedence rules --
giving WITH TIME ZONE a higher precedence than WITH CHECK OPTION -- but I
have no experience with that and I am apparently not doing it right, if that
is supposed to work at all.
If we can't get that to work, it seems that we are out of options unless we
want to just accept the conflicts.
How should we go about this, and what should Bernd do with his patch, which,
as I understand it, has been held up for quite a while simply because he is
concerned about this issue?
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Ühel kenal päeval, K, 2006-05-24 kell 13:13, kirjutas Peter Eisentraut:
I have spent some time figuring out how to resolve the parsing conflicts in
Bernd Helmle's updatable views patch. The problem has now been reduced to
specifically this situation:CREATE VIEW foo AS SELECT expr :: TIME . WITH
(where expr is a_expr or b_expr and TIME could also be TIMESTAMP or TIME(x) or
TIMESTAMP(x)).The continuation here could be WITH TIME ZONE (calling for a shift) or WITH
CHECK OPTION (calling for a reduce).All the usual ideas about unfolding the rules or making keywords more reserved
don't work (why should they). A one-token lookahead simply can't parse this.
Can't we teach tokenized a new token "WITH TIME ZONE" ?
I have had some ideas about trying to play around with the precedence rules --
giving WITH TIME ZONE a higher precedence than WITH CHECK OPTION -- but I
have no experience with that and I am apparently not doing it right, if that
is supposed to work at all.If we can't get that to work, it seems that we are out of options unless we
want to just accept the conflicts.How should we go about this, and what should Bernd do with his patch, which,
as I understand it, has been held up for quite a while simply because he is
concerned about this issue?
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
Hannu Krosing wrote:
Can't we teach tokenized a new token "WITH TIME ZONE" ?
No, that's three tokens, not one. We surely don't want to start making
white space significant.
cheers
andrew
Peter Eisentraut <peter_e@gmx.net> writes:
I have spent some time figuring out how to resolve the parsing conflicts in
Bernd Helmle's updatable views patch. The problem has now been reduced to
specifically this situation:
Could we see the proposed patches for gram.y?
If we can't get that to work, it seems that we are out of options unless we
want to just accept the conflicts.
Not acceptable, per prior discussions any time someone was too lazy to
fix their grammar patch completely ...
regards, tom lane
Am Mittwoch, 24. Mai 2006 20:42 schrieb Tom Lane:
Peter Eisentraut <peter_e@gmx.net> writes:
I have spent some time figuring out how to resolve the parsing conflicts
in Bernd Helmle's updatable views patch. The problem has now been
reduced to specifically this situation:Could we see the proposed patches for gram.y?
Here it is.
$ make -W gram.y gram.c
bison -y -d gram.y
conflicts: 4 shift/reduce
These are basically for instances of the same problem.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Attachments:
updatable-views-gram-patch.difftext/x-diff; charset=iso-8859-1; name=updatable-views-gram-patch.diffDownload
--- /home/pei/devel/pg82/pgsql/src/backend/parser/gram.y 2006-05-22 09:05:13.000000000 +0200
+++ gram.y 2006-05-26 09:41:21.000000000 +0200
@@ -339,7 +339,8 @@
%type <list> constraints_set_list
%type <boolean> constraints_set_mode
%type <str> OptTableSpace OptConsTableSpace OptTableSpaceOwner
-
+%type <list> opt_check_option
+%type <node> opt_check_mode
/*
* If you make any token changes, update the keyword table in
@@ -356,7 +357,7 @@
BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
BOOLEAN_P BOTH BY
- CACHE CALLED CASCADE CASE CAST CHAIN CHAR_P
+ CACHE CALLED CASCADE CASCADED CASE CAST CHAIN CHAR_P
CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
CLUSTER COALESCE COLLATE COLUMN COMMENT COMMIT
COMMITTED CONNECTION CONSTRAINT CONSTRAINTS CONVERSION_P CONVERT COPY CREATE CREATEDB
@@ -4618,12 +4619,12 @@
/*****************************************************************************
*
* QUERY:
- * CREATE [ OR REPLACE ] [ TEMP ] VIEW <viewname> '('target-list ')' AS <query>
+ * CREATE [ OR REPLACE ] [ TEMP ] VIEW <viewname> '('target-list ')' AS <query> [ WITH [ CASCADED | LOCAL ] CHECK OPTION ]
*
*****************************************************************************/
ViewStmt: CREATE OptTemp VIEW qualified_name opt_column_list
- AS SelectStmt
+ AS SelectStmt opt_check_option
{
ViewStmt *n = makeNode(ViewStmt);
n->replace = false;
@@ -4631,10 +4632,11 @@
n->view->istemp = $2;
n->aliases = $5;
n->query = (Query *) $7;
+ n->options = (List *) $8;
$$ = (Node *) n;
}
| CREATE OR REPLACE OptTemp VIEW qualified_name opt_column_list
- AS SelectStmt
+ AS SelectStmt opt_check_option
{
ViewStmt *n = makeNode(ViewStmt);
n->replace = true;
@@ -4642,10 +4644,25 @@
n->view->istemp = $4;
n->aliases = $7;
n->query = (Query *) $9;
+ n->options = (List *) $10;
$$ = (Node *) n;
}
;
+opt_check_option:
+ WITH opt_check_mode CHECK OPTION
+ {
+ $$ = list_make1( $2 );
+ }
+ | /* EMPTY */ { $$ = NIL; }
+ ;
+
+opt_check_mode:
+ CASCADED { $$ = (Node *)makeString("cascaded"); }
+ | LOCAL { $$ = (Node *)makeString("local"); }
+ | /* EMPTY */ { $$ = (Node *)makeString("cascaded"); }
+ ;
+
/*****************************************************************************
*
* QUERY:
@@ -8500,7 +8517,6 @@
| VARYING
| VIEW
| VOLATILE
- | WITH
| WITHOUT
| WORK
| WRITE
@@ -8551,8 +8567,6 @@
| SETOF
| SMALLINT
| SUBSTRING
- | TIME
- | TIMESTAMP
| TREAT
| TRIM
| VARCHAR
@@ -8608,6 +8622,7 @@
| ASC
| ASYMMETRIC
| BOTH
+ | CASCADED
| CASE
| CAST
| CHECK
@@ -8662,6 +8677,8 @@
| SYMMETRIC
| TABLE
| THEN
+ | TIME
+ | TIMESTAMP
| TO
| TRAILING
| TRUE_P
@@ -8671,6 +8688,7 @@
| USING
| WHEN
| WHERE
+ | WITH
;
Peter Eisentraut wrote:
Am Mittwoch, 24. Mai 2006 20:42 schrieb Tom Lane:
Peter Eisentraut <peter_e@gmx.net> writes:
I have spent some time figuring out how to resolve the parsing conflicts
in Bernd Helmle's updatable views patch. The problem has now been
reduced to specifically this situation:Could we see the proposed patches for gram.y?
Here it is.
$ make -W gram.y gram.c
bison -y -d gram.y
conflicts: 4 shift/reduceThese are basically for instances of the same problem.
I had a quick look - I don't think there is an easy answer with the
current proposed grammar. If we want to prevent shift/reduce conflicts I
suspect we'd need to use a different keyword than WITH, although I can't
think of one that couldn't be a trailing clause on a select statment,
which is the cause of the trouble. Another possibility would be to move
the optional WITH clause so that it would come before the AS clause.
Then there should be no conflict, I believe. Something like:
ViewStmt: CREATE OptTemp VIEW qualified_name opt_column_list
opt_check_option AS SelectStmt
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
I had a quick look - I don't think there is an easy answer with the
current proposed grammar. If we want to prevent shift/reduce conflicts I
suspect we'd need to use a different keyword than WITH, although I can't
think of one that couldn't be a trailing clause on a select statment,
which is the cause of the trouble. Another possibility would be to move
the optional WITH clause so that it would come before the AS clause.
Unfortunately the SQL99 spec is perfectly clear about what it wants:
<view definition> ::=
CREATE [ RECURSIVE ] VIEW <table name>
<view specification>
AS <query expression>
[ WITH [ <levels clause> ] CHECK OPTION ]
<levels clause> ::=
CASCADED
| LOCAL
I haven't had time to play with this yet, but I suspect the answer will
have to be that we reinstate the token-merging UNION JOIN kluge that I
just took out :-(. Or we could look into recognizing the whole thing as
one token in scan.l, but I suspect that doesn't work unless we give up
the no-backtrack property of the lexer, which would be more of a speed
hit than the intermediate function was. Anyway it should certainly be
soluble with token merging, if we can't find a pure grammar solution.
regards, tom lane
On Wed, May 24, 2006 at 01:13:06PM +0200, Peter Eisentraut wrote:
CREATE VIEW foo AS SELECT expr :: TIME . WITH
(where expr is a_expr or b_expr and TIME could also be TIMESTAMP or TIME(x) or
TIMESTAMP(x)).The continuation here could be WITH TIME ZONE (calling for a shift) or WITH
CHECK OPTION (calling for a reduce).All the usual ideas about unfolding the rules or making keywords more reserved
don't work (why should they). A one-token lookahead simply can't parse this.I have had some ideas about trying to play around with the precedence rules --
giving WITH TIME ZONE a higher precedence than WITH CHECK OPTION -- but I
have no experience with that and I am apparently not doing it right, if that
is supposed to work at all.
All precedence rules do is force the parser to either shift or reduce
without complaining about a conflict. i.e. it resolves the conflict by
forcing a particular option.
So all you would acheive with precedence rules is that you codify the
solution. For example: always shift and if the user specifies WITH
CHECK they get a parse error. It would be nice to be able to detect
this and tell the user to parenthesise the (expr::TIME) thus solving
the problem. Given that :: is not SQL-standard anyway, perhaps this is
not a bad solution.
Incidently, IIRC the default behaviour on conflict is a shift anyway,
so that what the patch already does anyway.
So we get:
CREATE VIEW foo AS SELECT expr :: TIME WITH TIME ZONE <-- OK
CREATE VIEW foo AS SELECT expr :: TIME WITH CHECK OPTION <-- parse error
CREATE VIEW foo AS SELECT (expr :: TIME) WITH CHECK OPTION <-- OK
Of course, any code that decompiles into SQL will have to be careful to
not produce unparseable SQL.
Have a ncie day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
From each according to his ability. To each according to his ability to litigate.
Martijn van Oosterhout <kleptog@svana.org> writes:
So we get:
CREATE VIEW foo AS SELECT expr :: TIME WITH TIME ZONE <-- OK
CREATE VIEW foo AS SELECT expr :: TIME WITH CHECK OPTION <-- parse error
CREATE VIEW foo AS SELECT (expr :: TIME) WITH CHECK OPTION <-- OK
I haven't really been following this conversation, but just on the off chance
this is a useful idea: Would it work to make "WITH" just a noise word? then
you would just need one token of look-ahead to recognize "TIME ZONE" or "CHECK
OPTION" instead of 2. I don't know what <levels clause>s look like so I'm not
sure if you would be able to recognize them without seeing the WITH. I'm not
even sure this works even if you can for that matter.
--
greg
Peter Eisentraut said:
Martijn van Oosterhout wrote:
Incidently, IIRC the default behaviour on conflict is a shift anyway,
so that what the patch already does anyway.So we get:
CREATE VIEW foo AS SELECT expr :: TIME WITH TIME ZONE <-- OK
CREATE VIEW foo AS SELECT expr :: TIME WITH CHECK OPTION <-- >
parse error CREATE VIEW foo AS SELECT (expr :: TIME) WITH CHECK OPTION
<-- OKYes, that's really the fundamental problem if you let shift/reduce
conflicts stand: the parser will behave weirdly in the conflict cases.There is a seemingly little known option in bison named %glr-parser,
which when turned on parses all of theses cases correctly. Maybe that
is worth considering.
Interesting.
Unfortunately, the manual says:
"The GLR parsers require a compiler for ISO C89 or later. In addition, they
use the inline keyword, which is not C89, but is C99 and is a common
extension in pre-C99 compilers. It is up to the user of these parsers to
handle portability issues."
Do we want such a restriction?
cheers
andrew
Import Notes
Reply to msg id not found: 200605262334.36499.peter_e@gmx.netReference msg id not found: 200605262334.36499.peter_e@gmx.net | Resolved by subject fallback
Martijn van Oosterhout wrote:
Incidently, IIRC the default behaviour on conflict is a shift anyway,
so that what the patch already does anyway.So we get:
CREATE VIEW foo AS SELECT expr :: TIME WITH TIME ZONE <-- OK
CREATE VIEW foo AS SELECT expr :: TIME WITH CHECK OPTION <-- > parse error
CREATE VIEW foo AS SELECT (expr :: TIME) WITH CHECK OPTION <-- OK
Yes, that's really the fundamental problem if you let shift/reduce
conflicts stand: the parser will behave weirdly in the conflict cases.
There is a seemingly little known option in bison named %glr-parser,
which when turned on parses all of theses cases correctly. Maybe that
is worth considering.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Andrew Dunstan wrote:
"The GLR parsers require a compiler for ISO C89 or later. In
addition, they use the inline keyword, which is not C89, but is C99
and is a common extension in pre-C99 compilers. It is up to the user
of these parsers to handle portability issues."
We already use inline, or handle its nonexistence, respectively.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Tom Lane said:
Peter Eisentraut <peter_e@gmx.net> writes:
Andrew Dunstan wrote:
"The GLR parsers require a compiler for ISO C89 or later. In
addition, they use the inline keyword, which is not C89, but is C99
and is a common extension in pre-C99 compilers. It is up to the user
of these parsers to handle portability issues."We already use inline, or handle its nonexistence, respectively.
Yeah, I don't see anything in that statement that we don't assume
already. The interesting question to me is how much different is
GLR from garden-variety bison; in particular, what's the parser
performance like?
As I understand it, it runs one parser pretty much like the standard LALR(1)
case, until it finds an ambiguity (shift/reduce or reduce/reduce) at which
stage it clones the parser to take parallel paths, working in lockstep, and
storing up semantic actions. When one of the clones encounters an error it
goes away, and the surviving clone takes its stored semantic actions. If
that's true, then probably the only significant performance hit is in cases
of ambiguity, and we would only have a handful of those, each lasting for
one token, so the performance hit should be very small. We'd have to test
it, of course ;-)
cheers
andrew
Import Notes
Reply to msg id not found: 2633.1148686004@sss.pgh.pa.usReference msg id not found: 2633.1148686004@sss.pgh.pa.us | Resolved by subject fallback
Peter Eisentraut <peter_e@gmx.net> writes:
Andrew Dunstan wrote:
"The GLR parsers require a compiler for ISO C89 or later. In
addition, they use the inline keyword, which is not C89, but is C99
and is a common extension in pre-C99 compilers. It is up to the user
of these parsers to handle portability issues."
We already use inline, or handle its nonexistence, respectively.
Yeah, I don't see anything in that statement that we don't assume
already. The interesting question to me is how much different is
GLR from garden-variety bison; in particular, what's the parser
performance like?
regards, tom lane
"Andrew Dunstan" <andrew@dunslane.net> writes:
As I understand it, it runs one parser pretty much like the standard LALR(1)
case, until it finds an ambiguity (shift/reduce or reduce/reduce) at which
stage it clones the parser to take parallel paths, working in lockstep, and
storing up semantic actions. When one of the clones encounters an error it
goes away, and the surviving clone takes its stored semantic actions. If
that's true, then probably the only significant performance hit is in cases
of ambiguity, and we would only have a handful of those, each lasting for
one token, so the performance hit should be very small. We'd have to test
it, of course ;-)
Yeah, I just read the same in the bison manual. The thing that's
bothering me is that a GLR parser would hide that ambiguity from you,
and thus changes in the grammar might cause us to incur performance
hits without realizing it. The manual indicates that the performance
is pretty awful whenever an ambiguity does occur.
regards, tom lane
Tom Lane wrote:
Yeah, I just read the same in the bison manual. The thing that's
bothering me is that a GLR parser would hide that ambiguity from you,
It doesn't really hide it. You still get the "N shift/reduce conflicts"
warnings from bison. You just know that they are being handled.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes:
Tom Lane wrote:
Yeah, I just read the same in the bison manual. The thing that's
bothering me is that a GLR parser would hide that ambiguity from you,
It doesn't really hide it. You still get the "N shift/reduce conflicts"
warnings from bison. You just know that they are being handled.
Well, that has the same problem that we've raised every other time
someone has said "I don't want to fix the grammar to not have any
conflicts". If bison only tells you "there were N conflicts",
how do you know these are the same N conflicts you had yesterday?
In a grammar that we hack around as much as we do with Postgres,
I really don't think that's acceptable.
I think that by far the most reliable solution is to put back the
"filter" yylex function that I removed a couple months ago:
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/parser/parser.c.diff?r1=1.64&r2=1.65
We can use the same technique that we used for UNION JOIN, but instead
join, say, WITH and TIME into one token and make the time datatype
productions look for "TIME WITHTIME ZONE" and so on. (I propose this
rather than putting the ugliness into WITH CHECK OPTION, because this
way we should only need one merged token and thus only one case to
check in the filter function; AFAICS we'd need three cases if we
merge tokens on that end of it.)
I'm not sure we can just revert the above-mentioned patch, because it
had some interactions with a later patch to use %option prefix.
Shouldn't be too hard to fix though. I'll put together a proposed
patch.
regards, tom lane
I wrote:
We can use the same technique that we used for UNION JOIN, but instead
join, say, WITH and TIME into one token and make the time datatype
productions look for "TIME WITHTIME ZONE" and so on. (I propose this
rather than putting the ugliness into WITH CHECK OPTION, because this
way we should only need one merged token and thus only one case to
check in the filter function; AFAICS we'd need three cases if we
merge tokens on that end of it.)
On investigation that turns out to have been a bad idea: if we do it
that way, it becomes necessary to promote WITH to a fully reserved word.
The counterexample is
CREATE VIEW v AS SELECT * FROM foo WITH ...
Is WITH an alias for foo (with no AS), or is it the start of a WITH
CHECK OPTION? No way to tell without lookahead.
While I don't think that making WITH a fully reserved word would cause
any great damage, I'm unwilling to do it just to save a couple of lines
of code. Accordingly, I propose the attached patch. This reinstates
the filter yylex function formerly used for UNION JOIN (in a slightly
cleaner fashion than it was previously done) and parses WITH CHECK
OPTION without any bison complaints, and with no new reserved words.
If no objections, I'll go ahead and apply this, and Peter can get on
with making the stub productions do something useful.
regards, tom lane
Ühel kenal päeval, R, 2006-05-26 kell 22:50, kirjutas Tom Lane:
I wrote:
We can use the same technique that we used for UNION JOIN, but instead
join, say, WITH and TIME into one token and make the time datatype
productions look for "TIME WITHTIME ZONE" and so on. (I propose this
rather than putting the ugliness into WITH CHECK OPTION, because this
way we should only need one merged token and thus only one case to
check in the filter function; AFAICS we'd need three cases if we
merge tokens on that end of it.)On investigation that turns out to have been a bad idea: if we do it
that way, it becomes necessary to promote WITH to a fully reserved word.
The counterexample isCREATE VIEW v AS SELECT * FROM foo WITH ...
Is WITH an alias for foo (with no AS), or is it the start of a WITH
CHECK OPTION? No way to tell without lookahead.While I don't think that making WITH a fully reserved word would cause
any great damage, I'm unwilling to do it just to save a couple of lines
of code.
I think we should go on and do promote WITH to a reserved keyword now
because eventually we have to do it anyway.
It is needed for recursive queries as well. I don't pretend to be an
expert bison coder, but I was unable to define a grammar for
SQL-standard recursive queries without making WITH a reserved keyword.
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
Hannu Krosing <hannu@skype.net> writes:
I think we should go on and do promote WITH to a reserved keyword now
because eventually we have to do it anyway.
It is needed for recursive queries as well.
I'm unconvinced. Recursive queries have WITH at the front, not the
back, so the parsing issues are entirely different.
If we do find that, we can easily adjust this code to simplify the
filter function at that time. But I don't agree with reserving words
just because we might need them for patches that don't exist yet.
regards, tom lane
While I don't think that making WITH a fully reserved word would
cause
any great damage, I'm unwilling to do it just to save a couple of
lines
of code.
I think we should go on and do promote WITH to a reserved keyword now
Oracle, MS-SQL, DB2, MySQL and Informix also have WITH reserved, so it
would
imho be ok to do it if it simplifies code.
Andreas
Import Notes
Resolved by subject fallback