Sanding down some edge cases for PL/pgSQL reserved words

Started by Tom Lane9 months ago11 messages
#1Tom Lane
tgl@sss.pgh.pa.us
2 attachment(s)

This is a rather delayed response to the discussion of bug
#18693 [1]/messages/by-id/18693-65968418890877b4@postgresql.org, in which I wrote:

(It's kind of annoying that "strict" has to be double-quoted
in the RAISE NOTICE, especially since you get a rather misleading
error if it isn't. But that seems like a different discussion.)

As an example of that, if you don't double-quote "strict"
in this usage you get

regression=# do $$ declare r record; begin
SELECT a, b AS STRICT INTO r FROM (SELECT 'A' AS a, 'B' AS b) AS q;
RAISE NOTICE 'STRICT r.strict = %', r.strict;
end $$;
ERROR: record "r" has no field "strict"
LINE 1: r.strict
^
QUERY: r.strict
CONTEXT: PL/pgSQL function inline_code_block line 3 at RAISE

which is pretty bogus because the record *does* have a field
named "strict". The actual problem is that STRICT is a fully
reserved PL/pgSQL keyword, which means you need to double-quote
it if you want to use it this way.

The attached patches provide two independent responses to that:

1. AFAICS, there is no real reason for STRICT to be a reserved
rather than unreserved PL/pgSQL keyword, and for that matter not
EXECUTE either. Making them unreserved does allow some ambiguity,
but I don't think there's any surprises in how that ambiguity
would be resolved; and certainly we've preferred ambiguity over
introducing new reserved keywords in PL/pgSQL before. I think
these two just escaped that treatment by dint of being ancient.

2. That "has no field" error message is flat-out wrong. The now-known
way to trigger it has a different cause, and what's more, we simply do
not know at this point whether the malleable record type has such a
field. So in 0002 below I just changed it to assume that the problem
is a reserved field name. We might find another way to reach that
failure in future, but I doubt that "has no field" would be the right
thing to say in any case.

This is v19 material at this point, so I'll stick it on the CF queue.

regards, tom lane

[1]: /messages/by-id/18693-65968418890877b4@postgresql.org

Attachments:

v1-0001-De-reserve-keywords-EXECUTE-and-STRICT-in-PL-pgSQ.patchtext/x-diff; charset=us-ascii; name*0=v1-0001-De-reserve-keywords-EXECUTE-and-STRICT-in-PL-pgSQ.p; name*1=atchDownload
From 2d19cbfbbd9aa4166cbeb46195d7a7b04e8fda52 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 25 Apr 2025 16:27:13 -0400
Subject: [PATCH v1 1/2] De-reserve keywords EXECUTE and STRICT in PL/pgSQL.

On close inspection, there does not seem to be a strong reason
why these should be fully-reserved keywords.  I guess they just
escaped consideration in previous attempts to minimize PL/pgSQL's
list of reserved words.
---
 src/pl/plpgsql/src/pl_gram.y              | 13 +++++++++----
 src/pl/plpgsql/src/pl_reserved_kwlist.h   |  2 --
 src/pl/plpgsql/src/pl_scanner.c           |  2 +-
 src/pl/plpgsql/src/pl_unreserved_kwlist.h |  2 ++
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 5612e66d023..7b672ea5179 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -1368,7 +1368,8 @@ for_control		: for_variable K_IN
 						int			tok = yylex(&yylval, &yylloc, yyscanner);
 						int			tokloc = yylloc;
 
-						if (tok == K_EXECUTE)
+						if (tok_is_keyword(tok, &yylval,
+										   K_EXECUTE, "execute"))
 						{
 							/* EXECUTE means it's a dynamic FOR loop */
 							PLpgSQL_stmt_dynfors *new;
@@ -2135,7 +2136,8 @@ stmt_open		: K_OPEN cursor_variable
 								yyerror(&yylloc, NULL, yyscanner, "syntax error, expected \"FOR\"");
 
 							tok = yylex(&yylval, &yylloc, yyscanner);
-							if (tok == K_EXECUTE)
+							if (tok_is_keyword(tok, &yylval,
+											   K_EXECUTE, "execute"))
 							{
 								int			endtoken;
 
@@ -2536,6 +2538,7 @@ unreserved_keyword	:
 				| K_ERRCODE
 				| K_ERROR
 				| K_EXCEPTION
+				| K_EXECUTE
 				| K_EXIT
 				| K_FETCH
 				| K_FIRST
@@ -2581,6 +2584,7 @@ unreserved_keyword	:
 				| K_SLICE
 				| K_SQLSTATE
 				| K_STACKED
+				| K_STRICT
 				| K_TABLE
 				| K_TABLE_NAME
 				| K_TYPE
@@ -3514,7 +3518,8 @@ make_return_query_stmt(int location, YYSTYPE *yylvalp, YYLTYPE *yyllocp, yyscan_
 	new->stmtid = ++plpgsql_curr_compile->nstatements;
 
 	/* check for RETURN QUERY EXECUTE */
-	if ((tok = yylex(yylvalp, yyllocp, yyscanner)) != K_EXECUTE)
+	tok = yylex(yylvalp, yyllocp, yyscanner);
+	if (!tok_is_keyword(tok, yylvalp, K_EXECUTE, "execute"))
 	{
 		/* ordinary static query */
 		plpgsql_push_back_token(tok, yylvalp, yyllocp, yyscanner);
@@ -3597,7 +3602,7 @@ read_into_target(PLpgSQL_variable **target, bool *strict, YYSTYPE *yylvalp, YYLT
 		*strict = false;
 
 	tok = yylex(yylvalp, yyllocp, yyscanner);
-	if (strict && tok == K_STRICT)
+	if (strict && tok_is_keyword(tok, yylvalp, K_STRICT, "strict"))
 	{
 		*strict = true;
 		tok = yylex(yylvalp, yyllocp, yyscanner);
diff --git a/src/pl/plpgsql/src/pl_reserved_kwlist.h b/src/pl/plpgsql/src/pl_reserved_kwlist.h
index ce7b0c9d331..f3ef2cbd8d7 100644
--- a/src/pl/plpgsql/src/pl_reserved_kwlist.h
+++ b/src/pl/plpgsql/src/pl_reserved_kwlist.h
@@ -33,7 +33,6 @@ PG_KEYWORD("case", K_CASE)
 PG_KEYWORD("declare", K_DECLARE)
 PG_KEYWORD("else", K_ELSE)
 PG_KEYWORD("end", K_END)
-PG_KEYWORD("execute", K_EXECUTE)
 PG_KEYWORD("for", K_FOR)
 PG_KEYWORD("foreach", K_FOREACH)
 PG_KEYWORD("from", K_FROM)
@@ -44,7 +43,6 @@ PG_KEYWORD("loop", K_LOOP)
 PG_KEYWORD("not", K_NOT)
 PG_KEYWORD("null", K_NULL)
 PG_KEYWORD("or", K_OR)
-PG_KEYWORD("strict", K_STRICT)
 PG_KEYWORD("then", K_THEN)
 PG_KEYWORD("to", K_TO)
 PG_KEYWORD("using", K_USING)
diff --git a/src/pl/plpgsql/src/pl_scanner.c b/src/pl/plpgsql/src/pl_scanner.c
index d08187dafcb..19825e5c718 100644
--- a/src/pl/plpgsql/src/pl_scanner.c
+++ b/src/pl/plpgsql/src/pl_scanner.c
@@ -53,7 +53,7 @@ IdentifierLookup plpgsql_IdentifierLookup = IDENTIFIER_LOOKUP_NORMAL;
  * We try to avoid reserving more keywords than we have to; but there's
  * little point in not reserving a word if it's reserved in the core grammar.
  * Currently, the following words are reserved here but not in the core:
- * BEGIN BY DECLARE EXECUTE FOREACH IF LOOP STRICT WHILE
+ * BEGIN BY DECLARE FOREACH IF LOOP WHILE
  */
 
 /* ScanKeywordList lookup data for PL/pgSQL keywords */
diff --git a/src/pl/plpgsql/src/pl_unreserved_kwlist.h b/src/pl/plpgsql/src/pl_unreserved_kwlist.h
index 98f99ec470c..b48c5a645ff 100644
--- a/src/pl/plpgsql/src/pl_unreserved_kwlist.h
+++ b/src/pl/plpgsql/src/pl_unreserved_kwlist.h
@@ -58,6 +58,7 @@ PG_KEYWORD("elsif", K_ELSIF)
 PG_KEYWORD("errcode", K_ERRCODE)
 PG_KEYWORD("error", K_ERROR)
 PG_KEYWORD("exception", K_EXCEPTION)
+PG_KEYWORD("execute", K_EXECUTE)
 PG_KEYWORD("exit", K_EXIT)
 PG_KEYWORD("fetch", K_FETCH)
 PG_KEYWORD("first", K_FIRST)
@@ -103,6 +104,7 @@ PG_KEYWORD("scroll", K_SCROLL)
 PG_KEYWORD("slice", K_SLICE)
 PG_KEYWORD("sqlstate", K_SQLSTATE)
 PG_KEYWORD("stacked", K_STACKED)
+PG_KEYWORD("strict", K_STRICT)
 PG_KEYWORD("table", K_TABLE)
 PG_KEYWORD("table_name", K_TABLE_NAME)
 PG_KEYWORD("type", K_TYPE)
-- 
2.43.5

v1-0002-Improve-error-report-for-PL-pgSQL-reserved-word-u.patchtext/x-diff; charset=us-ascii; name*0=v1-0002-Improve-error-report-for-PL-pgSQL-reserved-word-u.p; name*1=atchDownload
From ed82817f7d72446474db18b4dac3516e57dba725 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 25 Apr 2025 17:22:27 -0400
Subject: [PATCH v1 2/2] Improve error report for PL/pgSQL reserved word used
 as a field name.

The current code in resolve_column_ref (dating to commits 01f7d2990
and fe24d7816) believes that not finding a RECFIELD datum is a
can't-happen case, in consequence of which I didn't spend a whole lot
of time considering what to do if it did happen.  But it turns out
that it *can* happen if the would-be field name is a fully-reserved
PL/pgSQL keyword.  Change the error message to describe that
situation, and add a test case demonstrating it.

This might need further refinement if anyone can find other ways to
trigger a failure here; but without an example it's not clear what
other error to throw.
---
 src/pl/plpgsql/src/expected/plpgsql_misc.out | 22 ++++++++++++++++++++
 src/pl/plpgsql/src/pl_comp.c                 | 19 ++++++++++-------
 src/pl/plpgsql/src/sql/plpgsql_misc.sql      | 16 ++++++++++++++
 3 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/src/pl/plpgsql/src/expected/plpgsql_misc.out b/src/pl/plpgsql/src/expected/plpgsql_misc.out
index a6511df08ec..7c87029783a 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_misc.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_misc.out
@@ -65,3 +65,25 @@ do $$ declare x public.foo%rowtype; begin end $$;
 ERROR:  relation "public.foo" does not exist
 CONTEXT:  compilation of PL/pgSQL function "inline_code_block" near line 1
 do $$ declare x public.misc_table%rowtype; begin end $$;
+-- Test handling of a reserved word as a record field name
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r.foreach = %', r.foreach;
+end $$;
+NOTICE:  r.x = 1
+ERROR:  field name "foreach" is a reserved key word
+LINE 1: r.foreach
+        ^
+HINT:  Use double quotes to quote it.
+QUERY:  r.foreach
+CONTEXT:  PL/pgSQL function inline_code_block line 5 at RAISE
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r."foreach" = %', r."foreach";
+end $$;
+NOTICE:  r.x = 1
+NOTICE:  r."foreach" = 2
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 519f7695d7c..5589adb2474 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1201,17 +1201,22 @@ resolve_column_ref(ParseState *pstate, PLpgSQL_expr *expr,
 				}
 
 				/*
-				 * We should not get here, because a RECFIELD datum should
-				 * have been built at parse time for every possible qualified
-				 * reference to fields of this record.  But if we do, handle
-				 * it like field-not-found: throw error or return NULL.
+				 * Ideally we'd never get here, because a RECFIELD datum
+				 * should have been built at parse time for every qualified
+				 * reference to a field of this record that appears in the
+				 * source text.  However, plpgsql_yylex will not build such a
+				 * datum unless the field name lexes as token type IDENT.
+				 * Hence, if the would-be field name is a PL/pgSQL reserved
+				 * word, we lose.  Assume that that's what happened and tell
+				 * the user to quote it, unless the caller prefers we just
+				 * return NULL.
 				 */
 				if (error_if_no_field)
 					ereport(ERROR,
-							(errcode(ERRCODE_UNDEFINED_COLUMN),
-							 errmsg("record \"%s\" has no field \"%s\"",
-									(nnames_field == 1) ? name1 : name2,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("field name \"%s\" is a reserved key word",
 									colname),
+							 errhint("Use double quotes to quote it."),
 							 parser_errposition(pstate, cref->location)));
 			}
 			break;
diff --git a/src/pl/plpgsql/src/sql/plpgsql_misc.sql b/src/pl/plpgsql/src/sql/plpgsql_misc.sql
index d3a7f703a75..4c31b9d11fc 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_misc.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_misc.sql
@@ -37,3 +37,19 @@ do $$ declare x foo.bar%rowtype; begin end $$;
 do $$ declare x foo.bar.baz%rowtype; begin end $$;
 do $$ declare x public.foo%rowtype; begin end $$;
 do $$ declare x public.misc_table%rowtype; begin end $$;
+
+-- Test handling of a reserved word as a record field name
+
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r.foreach = %', r.foreach;
+end $$;
+
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r."foreach" = %', r."foreach";
+end $$;
-- 
2.43.5

#2Joel Jacobson
joel@compiler.org
In reply to: Tom Lane (#1)
Re: Sanding down some edge cases for PL/pgSQL reserved words

On Sat, Apr 26, 2025, at 06:44, Tom Lane wrote:

This is a rather delayed response to the discussion of bug
#18693 [1], in which I wrote:

...

which is pretty bogus because the record *does* have a field
named "strict". The actual problem is that STRICT is a fully
reserved PL/pgSQL keyword, which means you need to double-quote
it if you want to use it this way.

I'd like to briefly raise an old nostalgic PL/pgSQL dream of mine that might be
affected by this change.

For years, I've felt we could benefit from introducing convenience syntax to
explicitly require that exactly one row is affected by a query, something which
currently requires using a somewhat cumbersome workaround:

- Using `... INTO STRICT ...` for `SELECT`,
- Using `RETURNING ... INTO STRICT ...` for `DELETE/UPDATE/INSERT`, or
- Checking `ROW_COUNT` via `GET DIAGNOSTICS` and raising an error if not exactly one row.

I think it would be more convenient and intuitive if we could simply write:

```
STRICT [SELECT | UPDATE | INSERT | DELETE] ...;
```

That is, allowing `STRICT` followed directly by any regular `SELECT`, `UPDATE`,
`INSERT`, or `DELETE` command, explicitly enforcing exactly one affected row.

Changing `STRICT` to become an unreserved keyword in PL/pgSQL would effectively
close the window of opportunity for this syntax, as it would introduce ambiguity
in command parsing.

I was actually not aware of STRICT already being a reserved PL/pgSQL keyword.
Had I known that, I would have proposed this convenience syntax already since
a long time ago.

I wonder how often developers truly need to use "strict" as a field name versus
the potential usage of a clean and explicit syntax for enforcing single-row
results without additional verbosity.

/Joel

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joel Jacobson (#2)
Re: Sanding down some edge cases for PL/pgSQL reserved words

"Joel Jacobson" <joel@compiler.org> writes:

For years, I've felt we could benefit from introducing convenience syntax to
explicitly require that exactly one row is affected by a query, something which
currently requires using a somewhat cumbersome workaround:

- Using `... INTO STRICT ...` for `SELECT`,
- Using `RETURNING ... INTO STRICT ...` for `DELETE/UPDATE/INSERT`, or
- Checking `ROW_COUNT` via `GET DIAGNOSTICS` and raising an error if not exactly one row.

I think it would be more convenient and intuitive if we could simply write:

```
STRICT [SELECT | UPDATE | INSERT | DELETE] ...;
```

Meh. I don't really have an opinion on whether this is worth bespoke
syntax, but if it is:

(1) I don't see why we'd restrict it to plpgsql as opposed to
implementing it in core SQL.

(2) Putting the keyword at the front seems fairly un-SQL-like.
For SELECT, "SELECT STRICT ..." would seem more natural, as it calls
back to SELECT DISTINCT; or you could imagine integrating it into the
LIMIT clause. Not as sure what to do for the DML commands, but
somewhere near where we put RETURNING seems saner.

Also, even if we did do it in plpgsql exactly as you suggest, making
it unreserved doesn't move the needle on whether that's possible.
Most of plpgsql's statement-starting keywords are unreserved.

But please, don't hijack this thread for that discussion ...

regards, tom lane

#4Joel Jacobson
joel@compiler.org
In reply to: Tom Lane (#3)
Re: Sanding down some edge cases for PL/pgSQL reserved words

On Sat, Apr 26, 2025, at 05:10, Tom Lane wrote:

"Joel Jacobson" <joel@compiler.org> writes:

For years, I've felt we could benefit from introducing convenience syntax to
explicitly require that exactly one row is affected by a query, something which
currently requires using a somewhat cumbersome workaround:

- Using `... INTO STRICT ...` for `SELECT`,
- Using `RETURNING ... INTO STRICT ...` for `DELETE/UPDATE/INSERT`, or
- Checking `ROW_COUNT` via `GET DIAGNOSTICS` and raising an error if not exactly one row.

I think it would be more convenient and intuitive if we could simply write:

```
STRICT [SELECT | UPDATE | INSERT | DELETE] ...;
```

Meh. I don't really have an opinion on whether this is worth bespoke
syntax, but if it is:

(1) I don't see why we'd restrict it to plpgsql as opposed to
implementing it in core SQL.

Good point, I agree, that would be much better.

(2) Putting the keyword at the front seems fairly un-SQL-like.
For SELECT, "SELECT STRICT ..." would seem more natural, as it calls
back to SELECT DISTINCT; or you could imagine integrating it into the
LIMIT clause. Not as sure what to do for the DML commands, but
somewhere near where we put RETURNING seems saner.

Also, even if we did do it in plpgsql exactly as you suggest, making
it unreserved doesn't move the needle on whether that's possible.
Most of plpgsql's statement-starting keywords are unreserved.

But please, don't hijack this thread for that discussion ...

Understood, and thanks for clarifying this change doesn't affect the strictness idea.

/Joel

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#1)
Re: Sanding down some edge cases for PL/pgSQL reserved words

Hi

I started reviewing this patch.

so 7. 6. 2025 v 18:41 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

This is a rather delayed response to the discussion of bug
#18693 [1], in which I wrote:

(It's kind of annoying that "strict" has to be double-quoted
in the RAISE NOTICE, especially since you get a rather misleading
error if it isn't. But that seems like a different discussion.)

As an example of that, if you don't double-quote "strict"
in this usage you get

regression=# do $$ declare r record; begin
SELECT a, b AS STRICT INTO r FROM (SELECT 'A' AS a, 'B' AS b) AS q;
RAISE NOTICE 'STRICT r.strict = %', r.strict;
end $$;
ERROR: record "r" has no field "strict"
LINE 1: r.strict
^
QUERY: r.strict
CONTEXT: PL/pgSQL function inline_code_block line 3 at RAISE

which is pretty bogus because the record *does* have a field
named "strict". The actual problem is that STRICT is a fully
reserved PL/pgSQL keyword, which means you need to double-quote
it if you want to use it this way.

The attached patches provide two independent responses to that:

1. AFAICS, there is no real reason for STRICT to be a reserved
rather than unreserved PL/pgSQL keyword, and for that matter not
EXECUTE either. Making them unreserved does allow some ambiguity,
but I don't think there's any surprises in how that ambiguity
would be resolved; and certainly we've preferred ambiguity over
introducing new reserved keywords in PL/pgSQL before. I think
these two just escaped that treatment by dint of being ancient.

There is no issue.

2. That "has no field" error message is flat-out wrong. The now-known
way to trigger it has a different cause, and what's more, we simply do
not know at this point whether the malleable record type has such a
field. So in 0002 below I just changed it to assume that the problem
is a reserved field name. We might find another way to reach that
failure in future, but I doubt that "has no field" would be the right
thing to say in any case.

The proposed patch is a zero invasive solution. But the question is why we
cannot allow plpgsql reserved keywords in recfilds?

There should not be any collisions. Isn't there a better solution to
modify plpgsql_yylex instead and allow all keywords after '.' ? Sure. It
will be more invasive.

Regards

Pavel

Show quoted text

This is v19 material at this point, so I'll stick it on the CF queue.

regards, tom lane

[1]
/messages/by-id/18693-65968418890877b4@postgresql.org

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#5)
Re: Sanding down some edge cases for PL/pgSQL reserved words

Hi

2. That "has no field" error message is flat-out wrong. The now-known
way to trigger it has a different cause, and what's more, we simply do
not know at this point whether the malleable record type has such a
field. So in 0002 below I just changed it to assume that the problem
is a reserved field name. We might find another way to reach that
failure in future, but I doubt that "has no field" would be the right
thing to say in any case.

The proposed patch is a zero invasive solution. But the question is why we
cannot allow plpgsql reserved keywords in recfilds?

There should not be any collisions. Isn't there a better solution to
modify plpgsql_yylex instead and allow all keywords after '.' ? Sure. It
will be more invasive.

Is there some description of what keywords should be reserved? If I
remember correctly, the scanner was changed more times, and maybe more
reserved keywords are not necessary.

Regards

Pavel

Show quoted text

Regards

Pavel

This is v19 material at this point, so I'll stick it on the CF queue.

regards, tom lane

[1]
/messages/by-id/18693-65968418890877b4@postgresql.org

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#6)
Re: Sanding down some edge cases for PL/pgSQL reserved words

Pavel Stehule <pavel.stehule@gmail.com> writes:

Is there some description of what keywords should be reserved? If I
remember correctly, the scanner was changed more times, and maybe more
reserved keywords are not necessary.

Per the comment in pl_scanner.c:

* We try to avoid reserving more keywords than we have to; but there's
* little point in not reserving a word if it's reserved in the core grammar.
* Currently, the following words are reserved here but not in the core:
* BEGIN BY DECLARE EXECUTE FOREACH IF LOOP STRICT WHILE

This patch gets rid of EXECUTE and STRICT, but the others are harder
to de-reserve. I think most of the rest are there because they can
follow a block or loop label, and the same comment observes

* (We still have to reserve initial keywords that might follow a block
* label, unfortunately, since the method used to determine if we are at
* start of statement doesn't recognize such cases.

regards, tom lane

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#7)
Re: Sanding down some edge cases for PL/pgSQL reserved words

ne 8. 6. 2025 v 23:49 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Is there some description of what keywords should be reserved? If I
remember correctly, the scanner was changed more times, and maybe more
reserved keywords are not necessary.

Per the comment in pl_scanner.c:

* We try to avoid reserving more keywords than we have to; but there's
* little point in not reserving a word if it's reserved in the core
grammar.
* Currently, the following words are reserved here but not in the core:
* BEGIN BY DECLARE EXECUTE FOREACH IF LOOP STRICT WHILE

This patch gets rid of EXECUTE and STRICT, but the others are harder
to de-reserve. I think most of the rest are there because they can
follow a block or loop label, and the same comment observes

* (We still have to reserve initial keywords that might follow a block
* label, unfortunately, since the method used to determine if we are at
* start of statement doesn't recognize such cases.

Looks so block label is a problem, but loop label not - and then BEGIN
DECLARE WHEN is really required reserved world
by gram.y

Maybe these comments are a little bit obsolete. Probably is not a good idea
to make unreserved words keywords used
as read_sql_xxxx delimiter: WHEN, LOOP, WHILE, INTO, USING, IN, FROM, and
maybe some other. This is probably
main reason why PL/pgSQL has these keywords marked as reserved.

Maybe there should be a new assert, that checks so the keywords used as
delimiters are reserved keywords.

I checked the list of reserved words of Ada language or PL/SQL language and
we are significantly different.

I can imagine two situations.

a) current state + Tom's patch that reports so keywords are reserved

b) ignore the keyword after the "dot" symbol, and allow the reserved
keyword as a record field without limits. SQL now allows using a lot of
keywords as labels without
necessity of using AS or double quoting.

Both variants can work well I think - a) is more strict, zero invasive, b)
is more user friendly, but small typo can hide some problems.

What do you think about it?

Regards

Pavel

Show quoted text

regards, tom lane

#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#1)
Re: Sanding down some edge cases for PL/pgSQL reserved words

Hi

1. AFAICS, there is no real reason for STRICT to be a reserved
rather than unreserved PL/pgSQL keyword, and for that matter not
EXECUTE either. Making them unreserved does allow some ambiguity,
but I don't think there's any surprises in how that ambiguity
would be resolved; and certainly we've preferred ambiguity over
introducing new reserved keywords in PL/pgSQL before. I think
these two just escaped that treatment by dint of being ancient.

I checked other reserved keywords and I didn't see any reason to be
reserved keywords
for K_TO, K_NOT.

K_FOREACH, and K_WHILE are reserved probably because are used after
opt_loop_label - but it is not necessary

Other keywords are used as some delimiter or as protection against parser's
conflicts.

Regards

Pavel

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#5)
3 attachment(s)
Re: Sanding down some edge cases for PL/pgSQL reserved words

Hi

ne 8. 6. 2025 v 6:25 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:

Hi

I started reviewing this patch.

so 7. 6. 2025 v 18:41 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

This is a rather delayed response to the discussion of bug
#18693 [1], in which I wrote:

(It's kind of annoying that "strict" has to be double-quoted
in the RAISE NOTICE, especially since you get a rather misleading
error if it isn't. But that seems like a different discussion.)

As an example of that, if you don't double-quote "strict"
in this usage you get

regression=# do $$ declare r record; begin
SELECT a, b AS STRICT INTO r FROM (SELECT 'A' AS a, 'B' AS b) AS q;
RAISE NOTICE 'STRICT r.strict = %', r.strict;
end $$;
ERROR: record "r" has no field "strict"
LINE 1: r.strict
^
QUERY: r.strict
CONTEXT: PL/pgSQL function inline_code_block line 3 at RAISE

which is pretty bogus because the record *does* have a field
named "strict". The actual problem is that STRICT is a fully
reserved PL/pgSQL keyword, which means you need to double-quote
it if you want to use it this way.

The attached patches provide two independent responses to that:

1. AFAICS, there is no real reason for STRICT to be a reserved
rather than unreserved PL/pgSQL keyword, and for that matter not
EXECUTE either. Making them unreserved does allow some ambiguity,
but I don't think there's any surprises in how that ambiguity
would be resolved; and certainly we've preferred ambiguity over
introducing new reserved keywords in PL/pgSQL before. I think
these two just escaped that treatment by dint of being ancient.

There is no issue.

2. That "has no field" error message is flat-out wrong. The now-known
way to trigger it has a different cause, and what's more, we simply do
not know at this point whether the malleable record type has such a
field. So in 0002 below I just changed it to assume that the problem
is a reserved field name. We might find another way to reach that
failure in future, but I doubt that "has no field" would be the right
thing to say in any case.

The proposed patch is a zero invasive solution. But the question is why we
cannot allow plpgsql reserved keywords in recfilds?

There should not be any collisions. Isn't there a better solution to
modify plpgsql_yylex instead and allow all keywords after '.' ? Sure. It
will be more invasive.

Looks so nobody has any motivation to do some deeper changes to reduce
prohibition of reserved words. It is true, so in the real world it is not
an issue.

I did a review, and I didn't find any issue.

All tests passed without problems. I'll mark this patch as ready for commit.

Maybe the usage of unreserved words as variables or field names can be
tested a little bit more. See patch 0003

Regards

Pavel

Show quoted text

Regards

Pavel

This is v19 material at this point, so I'll stick it on the CF queue.

regards, tom lane

[1]
/messages/by-id/18693-65968418890877b4@postgresql.org

Attachments:

0002-Improve-error-report-for-PL-pgSQL-reserved-word-used.patchtext/x-patch; charset=US-ASCII; name=0002-Improve-error-report-for-PL-pgSQL-reserved-word-used.patchDownload
From 03c62909c4eac5bcc7c7e07d221656fd3c406b33 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 25 Apr 2025 17:22:27 -0400
Subject: [PATCH 2/3] Improve error report for PL/pgSQL reserved word used as a
 field name.

The current code in resolve_column_ref (dating to commits 01f7d2990
and fe24d7816) believes that not finding a RECFIELD datum is a
can't-happen case, in consequence of which I didn't spend a whole lot
of time considering what to do if it did happen.  But it turns out
that it *can* happen if the would-be field name is a fully-reserved
PL/pgSQL keyword.  Change the error message to describe that
situation, and add a test case demonstrating it.

This might need further refinement if anyone can find other ways to
trigger a failure here; but without an example it's not clear what
other error to throw.
---
 src/pl/plpgsql/src/expected/plpgsql_misc.out | 22 ++++++++++++++++++++
 src/pl/plpgsql/src/pl_comp.c                 | 19 ++++++++++-------
 src/pl/plpgsql/src/sql/plpgsql_misc.sql      | 16 ++++++++++++++
 3 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/src/pl/plpgsql/src/expected/plpgsql_misc.out b/src/pl/plpgsql/src/expected/plpgsql_misc.out
index a6511df08ec..7c87029783a 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_misc.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_misc.out
@@ -65,3 +65,25 @@ do $$ declare x public.foo%rowtype; begin end $$;
 ERROR:  relation "public.foo" does not exist
 CONTEXT:  compilation of PL/pgSQL function "inline_code_block" near line 1
 do $$ declare x public.misc_table%rowtype; begin end $$;
+-- Test handling of a reserved word as a record field name
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r.foreach = %', r.foreach;
+end $$;
+NOTICE:  r.x = 1
+ERROR:  field name "foreach" is a reserved key word
+LINE 1: r.foreach
+        ^
+HINT:  Use double quotes to quote it.
+QUERY:  r.foreach
+CONTEXT:  PL/pgSQL function inline_code_block line 5 at RAISE
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r."foreach" = %', r."foreach";
+end $$;
+NOTICE:  r.x = 1
+NOTICE:  r."foreach" = 2
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index b80c59447fb..ee961425a5b 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1211,17 +1211,22 @@ resolve_column_ref(ParseState *pstate, PLpgSQL_expr *expr,
 				}
 
 				/*
-				 * We should not get here, because a RECFIELD datum should
-				 * have been built at parse time for every possible qualified
-				 * reference to fields of this record.  But if we do, handle
-				 * it like field-not-found: throw error or return NULL.
+				 * Ideally we'd never get here, because a RECFIELD datum
+				 * should have been built at parse time for every qualified
+				 * reference to a field of this record that appears in the
+				 * source text.  However, plpgsql_yylex will not build such a
+				 * datum unless the field name lexes as token type IDENT.
+				 * Hence, if the would-be field name is a PL/pgSQL reserved
+				 * word, we lose.  Assume that that's what happened and tell
+				 * the user to quote it, unless the caller prefers we just
+				 * return NULL.
 				 */
 				if (error_if_no_field)
 					ereport(ERROR,
-							(errcode(ERRCODE_UNDEFINED_COLUMN),
-							 errmsg("record \"%s\" has no field \"%s\"",
-									(nnames_field == 1) ? name1 : name2,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("field name \"%s\" is a reserved key word",
 									colname),
+							 errhint("Use double quotes to quote it."),
 							 parser_errposition(pstate, cref->location)));
 			}
 			break;
diff --git a/src/pl/plpgsql/src/sql/plpgsql_misc.sql b/src/pl/plpgsql/src/sql/plpgsql_misc.sql
index d3a7f703a75..4c31b9d11fc 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_misc.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_misc.sql
@@ -37,3 +37,19 @@ do $$ declare x foo.bar%rowtype; begin end $$;
 do $$ declare x foo.bar.baz%rowtype; begin end $$;
 do $$ declare x public.foo%rowtype; begin end $$;
 do $$ declare x public.misc_table%rowtype; begin end $$;
+
+-- Test handling of a reserved word as a record field name
+
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r.foreach = %', r.foreach;
+end $$;
+
+do $$ declare r record;
+begin
+  select 1 as x, 2 as foreach into r;
+  raise notice 'r.x = %', r.x;
+  raise notice 'r."foreach" = %', r."foreach";
+end $$;
-- 
2.49.0

0003-test-of-usage-of-unreserved-words-as-variables-or-re.patchtext/x-patch; charset=US-ASCII; name=0003-test-of-usage-of-unreserved-words-as-variables-or-re.patchDownload
From 108d531ef3443bdb0f609f4bb06538a3cc9f204f Mon Sep 17 00:00:00 2001
From: "okbob@github.com" <pavel.stehule@gmail.com>
Date: Sun, 15 Jun 2025 14:23:08 +0200
Subject: [PATCH 3/3] test of usage of unreserved words as variables or record
 fields

---
 src/pl/plpgsql/src/expected/plpgsql_misc.out | 14 ++++++++++++++
 src/pl/plpgsql/src/sql/plpgsql_misc.sql      | 13 +++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/src/pl/plpgsql/src/expected/plpgsql_misc.out b/src/pl/plpgsql/src/expected/plpgsql_misc.out
index 7c87029783a..16a6c74b954 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_misc.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_misc.out
@@ -65,6 +65,20 @@ do $$ declare x public.foo%rowtype; begin end $$;
 ERROR:  relation "public.foo" does not exist
 CONTEXT:  compilation of PL/pgSQL function "inline_code_block" near line 1
 do $$ declare x public.misc_table%rowtype; begin end $$;
+-- Test handling of unreserved word as a variable names
+-- and record field names.
+do $$
+declare
+  execute int;
+  r record;
+begin
+  execute := 10;
+  raise notice 'execute = %', execute;
+  select 1 as "strict" into r;
+  raise notice 'r.strict = %', r.strict;
+end $$;
+NOTICE:  execute = 10
+NOTICE:  r.strict = 1
 -- Test handling of a reserved word as a record field name
 do $$ declare r record;
 begin
diff --git a/src/pl/plpgsql/src/sql/plpgsql_misc.sql b/src/pl/plpgsql/src/sql/plpgsql_misc.sql
index 4c31b9d11fc..f29045039b9 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_misc.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_misc.sql
@@ -38,6 +38,19 @@ do $$ declare x foo.bar.baz%rowtype; begin end $$;
 do $$ declare x public.foo%rowtype; begin end $$;
 do $$ declare x public.misc_table%rowtype; begin end $$;
 
+-- Test handling of unreserved word as a variable names
+-- and record field names.
+do $$
+declare
+  execute int;
+  r record;
+begin
+  execute := 10;
+  raise notice 'execute = %', execute;
+  select 1 as "strict" into r;
+  raise notice 'r.strict = %', r.strict;
+end $$;
+
 -- Test handling of a reserved word as a record field name
 
 do $$ declare r record;
-- 
2.49.0

0001-De-reserve-keywords-EXECUTE-and-STRICT-in-PL-pgSQL.patchtext/x-patch; charset=US-ASCII; name=0001-De-reserve-keywords-EXECUTE-and-STRICT-in-PL-pgSQL.patchDownload
From 143525045be93923cd721829867e33f3cbb5620b Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 25 Apr 2025 16:27:13 -0400
Subject: [PATCH 1/3] De-reserve keywords EXECUTE and STRICT in PL/pgSQL.

On close inspection, there does not seem to be a strong reason
why these should be fully-reserved keywords.  I guess they just
escaped consideration in previous attempts to minimize PL/pgSQL's
list of reserved words.
---
 src/pl/plpgsql/src/pl_gram.y              | 13 +++++++++----
 src/pl/plpgsql/src/pl_reserved_kwlist.h   |  2 --
 src/pl/plpgsql/src/pl_scanner.c           |  2 +-
 src/pl/plpgsql/src/pl_unreserved_kwlist.h |  2 ++
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 5612e66d023..7b672ea5179 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -1368,7 +1368,8 @@ for_control		: for_variable K_IN
 						int			tok = yylex(&yylval, &yylloc, yyscanner);
 						int			tokloc = yylloc;
 
-						if (tok == K_EXECUTE)
+						if (tok_is_keyword(tok, &yylval,
+										   K_EXECUTE, "execute"))
 						{
 							/* EXECUTE means it's a dynamic FOR loop */
 							PLpgSQL_stmt_dynfors *new;
@@ -2135,7 +2136,8 @@ stmt_open		: K_OPEN cursor_variable
 								yyerror(&yylloc, NULL, yyscanner, "syntax error, expected \"FOR\"");
 
 							tok = yylex(&yylval, &yylloc, yyscanner);
-							if (tok == K_EXECUTE)
+							if (tok_is_keyword(tok, &yylval,
+											   K_EXECUTE, "execute"))
 							{
 								int			endtoken;
 
@@ -2536,6 +2538,7 @@ unreserved_keyword	:
 				| K_ERRCODE
 				| K_ERROR
 				| K_EXCEPTION
+				| K_EXECUTE
 				| K_EXIT
 				| K_FETCH
 				| K_FIRST
@@ -2581,6 +2584,7 @@ unreserved_keyword	:
 				| K_SLICE
 				| K_SQLSTATE
 				| K_STACKED
+				| K_STRICT
 				| K_TABLE
 				| K_TABLE_NAME
 				| K_TYPE
@@ -3514,7 +3518,8 @@ make_return_query_stmt(int location, YYSTYPE *yylvalp, YYLTYPE *yyllocp, yyscan_
 	new->stmtid = ++plpgsql_curr_compile->nstatements;
 
 	/* check for RETURN QUERY EXECUTE */
-	if ((tok = yylex(yylvalp, yyllocp, yyscanner)) != K_EXECUTE)
+	tok = yylex(yylvalp, yyllocp, yyscanner);
+	if (!tok_is_keyword(tok, yylvalp, K_EXECUTE, "execute"))
 	{
 		/* ordinary static query */
 		plpgsql_push_back_token(tok, yylvalp, yyllocp, yyscanner);
@@ -3597,7 +3602,7 @@ read_into_target(PLpgSQL_variable **target, bool *strict, YYSTYPE *yylvalp, YYLT
 		*strict = false;
 
 	tok = yylex(yylvalp, yyllocp, yyscanner);
-	if (strict && tok == K_STRICT)
+	if (strict && tok_is_keyword(tok, yylvalp, K_STRICT, "strict"))
 	{
 		*strict = true;
 		tok = yylex(yylvalp, yyllocp, yyscanner);
diff --git a/src/pl/plpgsql/src/pl_reserved_kwlist.h b/src/pl/plpgsql/src/pl_reserved_kwlist.h
index ce7b0c9d331..f3ef2cbd8d7 100644
--- a/src/pl/plpgsql/src/pl_reserved_kwlist.h
+++ b/src/pl/plpgsql/src/pl_reserved_kwlist.h
@@ -33,7 +33,6 @@ PG_KEYWORD("case", K_CASE)
 PG_KEYWORD("declare", K_DECLARE)
 PG_KEYWORD("else", K_ELSE)
 PG_KEYWORD("end", K_END)
-PG_KEYWORD("execute", K_EXECUTE)
 PG_KEYWORD("for", K_FOR)
 PG_KEYWORD("foreach", K_FOREACH)
 PG_KEYWORD("from", K_FROM)
@@ -44,7 +43,6 @@ PG_KEYWORD("loop", K_LOOP)
 PG_KEYWORD("not", K_NOT)
 PG_KEYWORD("null", K_NULL)
 PG_KEYWORD("or", K_OR)
-PG_KEYWORD("strict", K_STRICT)
 PG_KEYWORD("then", K_THEN)
 PG_KEYWORD("to", K_TO)
 PG_KEYWORD("using", K_USING)
diff --git a/src/pl/plpgsql/src/pl_scanner.c b/src/pl/plpgsql/src/pl_scanner.c
index d08187dafcb..19825e5c718 100644
--- a/src/pl/plpgsql/src/pl_scanner.c
+++ b/src/pl/plpgsql/src/pl_scanner.c
@@ -53,7 +53,7 @@ IdentifierLookup plpgsql_IdentifierLookup = IDENTIFIER_LOOKUP_NORMAL;
  * We try to avoid reserving more keywords than we have to; but there's
  * little point in not reserving a word if it's reserved in the core grammar.
  * Currently, the following words are reserved here but not in the core:
- * BEGIN BY DECLARE EXECUTE FOREACH IF LOOP STRICT WHILE
+ * BEGIN BY DECLARE FOREACH IF LOOP WHILE
  */
 
 /* ScanKeywordList lookup data for PL/pgSQL keywords */
diff --git a/src/pl/plpgsql/src/pl_unreserved_kwlist.h b/src/pl/plpgsql/src/pl_unreserved_kwlist.h
index 98f99ec470c..b48c5a645ff 100644
--- a/src/pl/plpgsql/src/pl_unreserved_kwlist.h
+++ b/src/pl/plpgsql/src/pl_unreserved_kwlist.h
@@ -58,6 +58,7 @@ PG_KEYWORD("elsif", K_ELSIF)
 PG_KEYWORD("errcode", K_ERRCODE)
 PG_KEYWORD("error", K_ERROR)
 PG_KEYWORD("exception", K_EXCEPTION)
+PG_KEYWORD("execute", K_EXECUTE)
 PG_KEYWORD("exit", K_EXIT)
 PG_KEYWORD("fetch", K_FETCH)
 PG_KEYWORD("first", K_FIRST)
@@ -103,6 +104,7 @@ PG_KEYWORD("scroll", K_SCROLL)
 PG_KEYWORD("slice", K_SLICE)
 PG_KEYWORD("sqlstate", K_SQLSTATE)
 PG_KEYWORD("stacked", K_STACKED)
+PG_KEYWORD("strict", K_STRICT)
 PG_KEYWORD("table", K_TABLE)
 PG_KEYWORD("table_name", K_TABLE_NAME)
 PG_KEYWORD("type", K_TYPE)
-- 
2.49.0

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#10)
Re: Sanding down some edge cases for PL/pgSQL reserved words

Pavel Stehule <pavel.stehule@gmail.com> writes:

Looks so nobody has any motivation to do some deeper changes to reduce
prohibition of reserved words. It is true, so in the real world it is not
an issue.

Certainly anyone who's annoyed is free to do more work here.

All tests passed without problems. I'll mark this patch as ready for commit.
Maybe the usage of unreserved words as variables or field names can be
tested a little bit more. See patch 0003

Pushed with the test case as you suggested. Thanks for reviewing!

regards, tom lane