[Bug Fix] ECPG: could not use set xxx to default statement

Started by Higuchi, Daisukealmost 7 years ago13 messages
#1Higuchi, Daisuke
higuchi.daisuke@jp.fujitsu.com
1 attachment(s)

Hi,

I found ECPG's bug which SET xxx = DEFAULT statement could not be used.

[PROBLEM]
When the source code (*.pgc) has "EXEC SQL set xxx = default;", created c program by ECPG has no " = default".
For example, "EXEC SQL set search_path = default;" in *.pgc will be converted to "{ ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "set search_path", ECPGt_EOIT, ECPGt_EORT);}" in c program.

[Investigation]
gram.y lacks ";" in the end of section "generic_set", so preproc.y's syntax is broken.

src/backend/parser/gram.y
-------------------------------------------
generic_set:

| var_name '=' DEFAULT
{
VariableSetStmt *n = makeNode(VariableSetStmt);
n->kind = VAR_SET_DEFAULT;
n->name = $1;
$$ = n;
}

set_rest_more: /* Generic SET syntaxes: */
-------------------------------------------

src/interfaces/ecpg/preproc/preproc.y
-------------------------------------------
generic_set:

| var_name TO DEFAULT
{
$$ = cat_str(2,$1,mm_strdup("to default"));
}
| var_name '=' DEFAULT set_rest_more:
generic_set
{
$$ = $1;
}
-------------------------------------------

I attached the patch which has ";" in the end of section "generic_set" and regression.

Regards,
Daisuke, Higuchi

Attachments:

FIX_ECPG_SET_TO_DEFAULT_v1.patchapplication/octet-stream; name=FIX_ECPG_SET_TO_DEFAULT_v1.patchDownload
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a68f78e..711dd2b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -1465,6 +1465,7 @@ generic_set:
 					n->name = $1;
 					$$ = n;
 				}
+			;
 
 set_rest_more:	/* Generic SET syntaxes: */
 			generic_set 						{$$ = $1;}
diff --git a/src/interfaces/ecpg/test/expected/sql-show.c b/src/interfaces/ecpg/test/expected/sql-show.c
index 1b52d5e..5997a15 100644
--- a/src/interfaces/ecpg/test/expected/sql-show.c
+++ b/src/interfaces/ecpg/test/expected/sql-show.c
@@ -112,7 +112,7 @@ if (sqlca.sqlcode < 0) sqlprint();}
 
   printf("Var: Standard conforming strings: %s\n", var);
 
-  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "set time zone PST8PDT", ECPGt_EOIT, ECPGt_EORT);
+  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "set standard_conforming_strings to default", ECPGt_EOIT, ECPGt_EORT);
 #line 30 "show.pgc"
 
 if (sqlca.sqlwarn[0] == 'W') sqlprint();
@@ -121,7 +121,7 @@ if (sqlca.sqlwarn[0] == 'W') sqlprint();
 if (sqlca.sqlcode < 0) sqlprint();}
 #line 30 "show.pgc"
 
-  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "show time zone", ECPGt_EOIT, 
+  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "show standard_conforming_strings", ECPGt_EOIT, 
 	ECPGt_char,(var),(long)25,(long)1,(25)*sizeof(char), 
 	ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);
 #line 31 "show.pgc"
@@ -132,9 +132,9 @@ if (sqlca.sqlwarn[0] == 'W') sqlprint();
 if (sqlca.sqlcode < 0) sqlprint();}
 #line 31 "show.pgc"
 
-  printf("Time Zone: %s\n", var);
+  printf("Var: Standard conforming strings: %s\n", var);
 
-  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "set transaction isolation level read committed", ECPGt_EOIT, ECPGt_EORT);
+  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "set time zone PST8PDT", ECPGt_EOIT, ECPGt_EORT);
 #line 34 "show.pgc"
 
 if (sqlca.sqlwarn[0] == 'W') sqlprint();
@@ -143,7 +143,7 @@ if (sqlca.sqlwarn[0] == 'W') sqlprint();
 if (sqlca.sqlcode < 0) sqlprint();}
 #line 34 "show.pgc"
 
-  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "show transaction isolation level", ECPGt_EOIT, 
+  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "show time zone", ECPGt_EOIT, 
 	ECPGt_char,(var),(long)25,(long)1,(25)*sizeof(char), 
 	ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);
 #line 35 "show.pgc"
@@ -154,9 +154,9 @@ if (sqlca.sqlwarn[0] == 'W') sqlprint();
 if (sqlca.sqlcode < 0) sqlprint();}
 #line 35 "show.pgc"
 
-  printf("Transaction isolation level: %s\n", var);
+  printf("Time Zone: %s\n", var);
 
-  { ECPGdisconnect(__LINE__, "ALL");
+  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "set transaction isolation level read committed", ECPGt_EOIT, ECPGt_EORT);
 #line 38 "show.pgc"
 
 if (sqlca.sqlwarn[0] == 'W') sqlprint();
@@ -165,6 +165,28 @@ if (sqlca.sqlwarn[0] == 'W') sqlprint();
 if (sqlca.sqlcode < 0) sqlprint();}
 #line 38 "show.pgc"
 
+  { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "show transaction isolation level", ECPGt_EOIT, 
+	ECPGt_char,(var),(long)25,(long)1,(25)*sizeof(char), 
+	ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);
+#line 39 "show.pgc"
+
+if (sqlca.sqlwarn[0] == 'W') sqlprint();
+#line 39 "show.pgc"
+
+if (sqlca.sqlcode < 0) sqlprint();}
+#line 39 "show.pgc"
+
+  printf("Transaction isolation level: %s\n", var);
+
+  { ECPGdisconnect(__LINE__, "ALL");
+#line 42 "show.pgc"
+
+if (sqlca.sqlwarn[0] == 'W') sqlprint();
+#line 42 "show.pgc"
+
+if (sqlca.sqlcode < 0) sqlprint();}
+#line 42 "show.pgc"
+
 
   return 0;
 }
diff --git a/src/interfaces/ecpg/test/expected/sql-show.stderr b/src/interfaces/ecpg/test/expected/sql-show.stderr
index c303a84..dc58659 100644
--- a/src/interfaces/ecpg/test/expected/sql-show.stderr
+++ b/src/interfaces/ecpg/test/expected/sql-show.stderr
@@ -44,33 +44,47 @@
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_get_data on line 27: RESULT: off offset: -1; array: no
 [NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ecpg_execute on line 30: query: set time zone PST8PDT; with 0 parameter(s) on connection ecpg1_regression
+[NO_PID]: ecpg_execute on line 30: query: set standard_conforming_strings to default; with 0 parameter(s) on connection ecpg1_regression
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_execute on line 30: using PQexec
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_process_output on line 30: OK: SET
 [NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ecpg_execute on line 31: query: show time zone; with 0 parameter(s) on connection ecpg1_regression
+[NO_PID]: ecpg_execute on line 31: query: show standard_conforming_strings; with 0 parameter(s) on connection ecpg1_regression
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_execute on line 31: using PQexec
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_process_output on line 31: correctly got 1 tuples with 1 fields
 [NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ecpg_get_data on line 31: RESULT: PST8PDT offset: -1; array: no
+[NO_PID]: ecpg_get_data on line 31: RESULT: on offset: -1; array: no
 [NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ecpg_execute on line 34: query: set transaction isolation level read committed; with 0 parameter(s) on connection ecpg1_regression
+[NO_PID]: ecpg_execute on line 34: query: set time zone PST8PDT; with 0 parameter(s) on connection ecpg1_regression
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_execute on line 34: using PQexec
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_process_output on line 34: OK: SET
 [NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ecpg_execute on line 35: query: show transaction isolation level; with 0 parameter(s) on connection ecpg1_regression
+[NO_PID]: ecpg_execute on line 35: query: show time zone; with 0 parameter(s) on connection ecpg1_regression
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_execute on line 35: using PQexec
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_process_output on line 35: correctly got 1 tuples with 1 fields
 [NO_PID]: sqlca: code: 0, state: 00000
-[NO_PID]: ecpg_get_data on line 35: RESULT: read committed offset: -1; array: no
+[NO_PID]: ecpg_get_data on line 35: RESULT: PST8PDT offset: -1; array: no
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_execute on line 38: query: set transaction isolation level read committed; with 0 parameter(s) on connection ecpg1_regression
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_execute on line 38: using PQexec
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_process_output on line 38: OK: SET
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_execute on line 39: query: show transaction isolation level; with 0 parameter(s) on connection ecpg1_regression
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_execute on line 39: using PQexec
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_process_output on line 39: correctly got 1 tuples with 1 fields
+[NO_PID]: sqlca: code: 0, state: 00000
+[NO_PID]: ecpg_get_data on line 39: RESULT: read committed offset: -1; array: no
 [NO_PID]: sqlca: code: 0, state: 00000
 [NO_PID]: ecpg_finish: connection ecpg1_regression closed
 [NO_PID]: sqlca: code: 0, state: 00000
diff --git a/src/interfaces/ecpg/test/expected/sql-show.stdout b/src/interfaces/ecpg/test/expected/sql-show.stdout
index 9319c5d..1b0d19c 100644
--- a/src/interfaces/ecpg/test/expected/sql-show.stdout
+++ b/src/interfaces/ecpg/test/expected/sql-show.stdout
@@ -1,5 +1,6 @@
 Var: Search path: public
 Var: Search path: public
 Var: Standard conforming strings: off
+Var: Standard conforming strings: on
 Time Zone: PST8PDT
 Transaction isolation level: read committed
diff --git a/src/interfaces/ecpg/test/sql/show.pgc b/src/interfaces/ecpg/test/sql/show.pgc
index 339678a..d243aba 100644
--- a/src/interfaces/ecpg/test/sql/show.pgc
+++ b/src/interfaces/ecpg/test/sql/show.pgc
@@ -27,6 +27,10 @@ int main() {
   EXEC SQL SHOW standard_conforming_strings INTO :var;
   printf("Var: Standard conforming strings: %s\n", var);
 
+  EXEC SQL SET standard_conforming_strings TO DEFAULT;
+  EXEC SQL SHOW standard_conforming_strings INTO :var;
+  printf("Var: Standard conforming strings: %s\n", var);
+
   EXEC SQL SET TIME ZONE PST8PDT;
   EXEC SQL SHOW TIME ZONE INTO :var;
   printf("Time Zone: %s\n", var);
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Higuchi, Daisuke (#1)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

"Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes:

[ missing semicolon in gram.y breaks ecpg parsing of same construct ]

That's pretty nasty. The fix in gram.y is certainly needed, but I'm
unexcited by the regression test additions you propose. What I really
want to know is why a syntax error in gram.y wasn't detected by any
of the tools we use, and whether we can do something about that.
Otherwise the next bug of the same kind may go just as undetected;
in fact, I've got little confidence there aren't other such omissions
already :-(

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

I wrote:

"Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes:

[ missing semicolon in gram.y breaks ecpg parsing of same construct ]

That's pretty nasty. The fix in gram.y is certainly needed, but I'm
unexcited by the regression test additions you propose. What I really
want to know is why a syntax error in gram.y wasn't detected by any
of the tools we use,

Ugh ... the Bison NEWS file has this:

* Changes in version 1.875, 2003-01-01:
...
- Semicolons are once again optional at the end of grammar rules.
This reverts to the behavior of Bison 1.33 and earlier, and improves
compatibility with Yacc.

I'd remembered how we had to run around and insert semicolons to satisfy
Bison 1.3-something, and supposed that that restriction still held.
But it doesn't. It seems though that our conversion script for creating
preproc.y depends on there being semicolons.

I think we need to fix that script to either cope with missing semicolons,
or at least complain about them. Too tired to look into how, right now.

regards, tom lane

#4Michael Meskes
meskes@postgresql.org
In reply to: Tom Lane (#3)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

On Tue, 2019-02-19 at 00:05 -0500, Tom Lane wrote:

I wrote:

"Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com> writes:

[ missing semicolon in gram.y breaks ecpg parsing of same
construct ]

That's pretty nasty. The fix in gram.y is certainly needed, but
I'm
unexcited by the regression test additions you propose. What I
really
want to know is why a syntax error in gram.y wasn't detected by any
of the tools we use,

I'm actually surprised it only shows by one incorrectly working rule
and did not mangle the whole file by combining to rules. I guess that's
because bison now finds the end of the rule somehow even without the
semicolon.

But it doesn't. It seems though that our conversion script for
creating
preproc.y depends on there being semicolons.

Yes, it does. There has to be a way for the script to find the end of a
rule and I wonder if bison's way can be used in such a simple perl
script too.

I think we need to fix that script to either cope with missing
semicolons,
or at least complain about them. Too tired to look into how, right
now.

If we can identify a missing semicolon we probably can also figure out
where it had to be.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL

#5Higuchi, Daisuke
higuchi.daisuke@jp.fujitsu.com
In reply to: Tom Lane (#3)
1 attachment(s)
RE: [Bug Fix] ECPG: could not use set xxx to default statement

Hi,

I think we need to fix that script to either cope with missing semicolons,
or at least complain about them. Too tired to look into how, right now.

I attached the patch which cope with missing semicolons.
Previous parse.pl find semicolon and dump data to buffer. When attached patch's parse.pl find new tokens before finding a semicolon, it also dumps data to buffer.

Regards,
Daisuke, Higuchi

Attachments:

FIX_ECPG_SET_TO_DEFAULT_v2.patchapplication/octet-stream; name=FIX_ECPG_SET_TO_DEFAULT_v2.patchDownload
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index 6dee500..4445d3d 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -25,6 +25,7 @@ my $yaccmode              = 0;
 my $header_included       = 0;
 my $feature_not_supported = 0;
 my $tokenmode             = 0;
+my $fill_semicolon        = 0;
 
 my (%buff, $infield, $comment, %tokens, %addons);
 my ($stmt_mode, @fields);
@@ -258,6 +259,7 @@ sub main
 			}
 			elsif ($arr[$fieldIndexer] eq '}')
 			{
+				$fill_semicolon = 1;
 				$brace_indent--;
 				next;
 			}
@@ -271,7 +273,9 @@ sub main
 			{
 				next;
 			}
-			if ($arr[$fieldIndexer] eq ';')
+
+			if (($arr[$fieldIndexer] =~ /[A-Za-z0-9]+:/ && $fill_semicolon eq 1)
+				|| $arr[$fieldIndexer] eq ';')
 			{
 				if ($copymode)
 				{
@@ -279,16 +283,23 @@ sub main
 					{
 						dump_line($stmt_mode, \@fields);
 					}
-					add_to_buffer('rules', ";\n\n");
+					if ($arr[$fieldIndexer] eq ';')
+					{
+						add_to_buffer('rules', ";\n\n");
+					}
 				}
 				else
 				{
 					$copymode = 1;
 				}
+				$fill_semicolon = 0;
 				@fields  = ();
 				$infield = 0;
 				$line    = '';
-				next;
+				if($arr[$fieldIndexer] eq ';')
+				{
+					next;
+				}
 			}
 
 			if ($arr[$fieldIndexer] eq '|')
#6Michael Meskes
meskes@postgresql.org
In reply to: Higuchi, Daisuke (#5)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

Higuchi-san,

I attached the patch which cope with missing semicolons.
Previous parse.pl find semicolon and dump data to buffer. When
attached patch's parse.pl find new tokens before finding a semicolon,
it also dumps data to buffer.

Now this seems to be much easier than I expected. Thanks. My first test
show two "missing" semicolons in gram.y. :)

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL

#7Michael Meskes
meskes@postgresql.org
In reply to: Higuchi, Daisuke (#5)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

Higuchi-san,

I attached the patch which cope with missing semicolons.
Previous parse.pl find semicolon and dump data to buffer. When
attached patch's parse.pl find new tokens before finding a semicolon,
it also dumps data to buffer.

It just occurred to me that check_rules.pl probably uses the same logic
to identify each rule and thus needs to be changed, too.

Also, IIRC bison allows blanks between the symbol name and the colon,
or in other words "generic_set:" is equal to "generic_set :". If this
happens after a "missing" semicolon I think your patch does not notice
the end of the rule.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL

#8Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Michael Meskes (#7)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

On 2/19/19 6:21 AM, Michael Meskes wrote:

Higuchi-san,

I attached the patch which cope with missing semicolons.
Previous parse.pl find semicolon and dump data to buffer. When
attached patch's parse.pl find new tokens before finding a semicolon,
it also dumps data to buffer.

It just occurred to me that check_rules.pl probably uses the same logic
to identify each rule and thus needs to be changed, too.

Also, IIRC bison allows blanks between the symbol name and the colon,
or in other words "generic_set:" is equal to "generic_set :". If this
happens after a "missing" semicolon I think your patch does not notice
the end of the rule.

Yeah, it also seems too possibly liberal about where it matches a rule
name. AUIU this should be the first token on a line; is that right?
OTOH, it won't handle any case where an action block is not the last
thing in the rule, since it only sets $fill_semicolon on seeing a
closing brace (on its own).

I just looked at the bison manual at gnu.org and also at `info bison` on
my local machine, and couldn't see any reference to semicolons being
optional at the end of a rule. Under the heading "Syntax of Grammar
Rules" it says this:

A Bison grammar rule has the following general form:

     RESULT: COMPONENTS...;

Making it optional without putting that in the manual is just awful.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#8)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

I just looked at the bison manual at gnu.org and also at `info bison` on
my local machine, and couldn't see any reference to semicolons being
optional at the end of a rule. Under the heading "Syntax of Grammar
Rules" it says this:
A Bison grammar rule has the following general form:
     RESULT: COMPONENTS...;
Making it optional without putting that in the manual is just awful.

Yeah. I wonder if they removed that info in 1.34 and failed to
put it back in 1.875?

Anyway, I'm of the opinion that omitting the semi is poor style
and our tools should insist on it even if Bison does not. Thus,
I think the correct fix is for the scripts to complain about a
missing semi, not cope.

My initial look at parse.pl last night left me feeling pretty
disheartened about its robustness in general --- for example,
it looks like { } /* or */ inside a string literal or Bison
character token would break it completely, because it wouldn't
distinguish those cases from the same things outside a string.
It's just luck we haven't broken it yet (or, perhaps, we have
and nobody exercised the relevant productions yet?).

Probably, somebody who's a better Perl programmer than me
ought to take point on improving that.

regards, tom lane

#10Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

On 2/19/19 9:29 AM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

I just looked at the bison manual at gnu.org and also at `info bison` on
my local machine, and couldn't see any reference to semicolons being
optional at the end of a rule. Under the heading "Syntax of Grammar
Rules" it says this:
A Bison grammar rule has the following general form:
     RESULT: COMPONENTS...;
Making it optional without putting that in the manual is just awful.

Yeah. I wonder if they removed that info in 1.34 and failed to
put it back in 1.875?

Anyway, I'm of the opinion that omitting the semi is poor style
and our tools should insist on it even if Bison does not. Thus,
I think the correct fix is for the scripts to complain about a
missing semi, not cope.

Yeah, agreed.

My initial look at parse.pl last night left me feeling pretty
disheartened about its robustness in general --- for example,
it looks like { } /* or */ inside a string literal or Bison
character token would break it completely, because it wouldn't
distinguish those cases from the same things outside a string.
It's just luck we haven't broken it yet (or, perhaps, we have
and nobody exercised the relevant productions yet?).

Probably, somebody who's a better Perl programmer than me
ought to take point on improving that.

Agreed.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#10)
1 attachment(s)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 2/19/19 9:29 AM, Tom Lane wrote:

Probably, somebody who's a better Perl programmer than me
ought to take point on improving that.

Agreed.

Not seeing any motion on this, here's a draft patch to make these
scripts complain about missing semicolons. Against the current
gram.y (which contains 2 such errors, as Michael noted) you
get output like

'/usr/bin/perl' ./parse.pl . < ../../../backend/parser/gram.y > preproc.y
unterminated rule at ./parse.pl line 370, <> line 1469.
make: *** [preproc.y] Error 255
make: *** Deleting file `preproc.y'

That's not *super* friendly, but it does give you the right line number
to look at in gram.y. We could adjust the script (and the Makefile)
further so that the message would cite the gram.y filename, but I'm not
sure if it's worth the trouble. Thoughts?

regards, tom lane

Attachments:

complain-about-missing-grammar-semicolons-1.patchtext/x-diff; charset=us-ascii; name=complain-about-missing-grammar-semicolons-1.patchDownload
diff --git a/src/interfaces/ecpg/preproc/check_rules.pl b/src/interfaces/ecpg/preproc/check_rules.pl
index 1388d05..3daff88 100644
--- a/src/interfaces/ecpg/preproc/check_rules.pl
+++ b/src/interfaces/ecpg/preproc/check_rules.pl
@@ -1,7 +1,7 @@
 #!/usr/bin/perl
 # src/interfaces/ecpg/preproc/check_rules.pl
 # test parser generator for ecpg
-# call with backend parser as stdin
+# call with backend grammar as stdin
 #
 # Copyright (c) 2009-2019, PostgreSQL Global Development Group
 #
@@ -47,6 +47,7 @@ my %replace_line = (
 
 my $block        = '';
 my $yaccmode     = 0;
+my $in_rule      = 0;
 my $brace_indent = 0;
 my (@arr, %found);
 my $comment     = 0;
@@ -131,10 +132,13 @@ while (<$parser_fh>)
 			$found{$block} = 1;
 			$cc++;
 			$block = '';
+			$in_rule = 0 if $arr[$fieldIndexer] eq ';';
 		}
 		elsif (($arr[$fieldIndexer] =~ '[A-Za-z0-9]+:')
 			|| $arr[ $fieldIndexer + 1 ] eq ':')
 		{
+			die "unterminated rule" if $in_rule;
+			$in_rule     = 1;
 			$non_term_id = $arr[$fieldIndexer];
 			$non_term_id =~ tr/://d;
 		}
@@ -145,6 +149,8 @@ while (<$parser_fh>)
 	}
 }
 
+die "unterminated rule" if $in_rule;
+
 close $parser_fh;
 if ($verbose)
 {
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index 6dee500..e219398 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -22,6 +22,7 @@ $path = "." unless $path;
 my $copymode              = 0;
 my $brace_indent          = 0;
 my $yaccmode              = 0;
+my $in_rule               = 0;
 my $header_included       = 0;
 my $feature_not_supported = 0;
 my $tokenmode             = 0;
@@ -288,6 +289,7 @@ sub main
 				@fields  = ();
 				$infield = 0;
 				$line    = '';
+				$in_rule = 0;
 				next;
 			}
 
@@ -365,6 +367,8 @@ sub main
 				$line    = '';
 				@fields  = ();
 				$infield = 1;
+				die "unterminated rule" if $in_rule;
+				$in_rule = 1;
 				next;
 			}
 			elsif ($copymode)
@@ -415,6 +419,7 @@ sub main
 			}
 		}
 	}
+	die "unterminated rule" if $in_rule;
 	return;
 }
 
#12Michael Meskes
meskes@postgresql.org
In reply to: Tom Lane (#11)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

Not seeing any motion on this, here's a draft patch to make these
scripts complain about missing semicolons. Against the current
gram.y (which contains 2 such errors, as Michael noted) you
get output like

Thanks Tom for looking into this. Are we agreed then that we want
gram.y to have semicolons?

'/usr/bin/perl' ./parse.pl . < ../../../backend/parser/gram.y >
preproc.y
unterminated rule at ./parse.pl line 370, <> line 1469.
make: *** [preproc.y] Error 255
make: *** Deleting file `preproc.y'

That's not *super* friendly, but it does give you the right line
number
to look at in gram.y. We could adjust the script (and the Makefile)
further so that the message would cite the gram.y filename, but I'm
not
sure if it's worth the trouble. Thoughts?

IMO it's not worth it. We all know where the grammar is and that the
ecpg tools only parse that one file. Why putting effort into writing it
down too?

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Meskes (#12)
Re: [Bug Fix] ECPG: could not use set xxx to default statement

Michael Meskes <meskes@postgresql.org> writes:

Not seeing any motion on this, here's a draft patch to make these
scripts complain about missing semicolons. Against the current
gram.y (which contains 2 such errors, as Michael noted) you
get output like

Thanks Tom for looking into this. Are we agreed then that we want
gram.y to have semicolons?

Hearing no objections, I pushed it that way.

That's not *super* friendly, but it does give you the right line
number to look at in gram.y. We could adjust the script (and the Makefile)
further so that the message would cite the gram.y filename, but I'm
not sure if it's worth the trouble. Thoughts?

IMO it's not worth it. We all know where the grammar is and that the
ecpg tools only parse that one file. Why putting effort into writing it
down too?

I did manage to fix the "die" commands so that you get something like

unterminated rule at grammar line 1461

without the extraneous detail about the script's internals.
That seems clear enough from here.

I'm still disturbed by the scripts' ability to get fooled by
braces or comment markers within quoted strings. However, that's
not something I have good ideas about how to fix, and there's not
any evidence that it's a live problem at the moment.

regards, tom lane