Making the subquery alias optional in the FROM clause
This was discussed previously in [1]/messages/by-id/1487773980.3143.15.camel@oopsware.de, and there seemed to be general
consensus in favour of it, but no new patch emerged.
Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.
It still generates an eref for the subquery RTE, which has a made-up
relation name, but that is marked as not visible on the
ParseNamespaceItem, so it doesn't conflict with anything else, need
not be unique, and cannot be used for qualified references to the
subquery's columns.
The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:
CREATE OR REPLACE VIEW test_view AS
SELECT *
FROM (SELECT a, b FROM foo),
(SELECT c, d FROM bar)
WHERE a = c;
\sv test_view
CREATE OR REPLACE VIEW public.test_view AS
SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d
FROM ( SELECT foo.a,
foo.b
FROM foo) subquery,
( SELECT bar.c,
bar.d
FROM bar) subquery_1
WHERE subquery.a = subquery_1.c
Regards,
Dean
Attachments:
make-subquery-alias-optional.patchtext/x-patch; charset=US-ASCII; name=make-subquery-alias-optional.patchDownload
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
new file mode 100644
index 80bb8ad..8cae456
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -51,7 +51,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replacea
[ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
[ TABLESAMPLE <replaceable class="parameter">sampling_method</replaceable> ( <replaceable class="parameter">argument</replaceable> [, ...] ) [ REPEATABLE ( <replaceable class="parameter">seed</replaceable> ) ] ]
- [ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
+ [ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
<replaceable class="parameter">with_query_name</replaceable> [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] )
[ WITH ORDINALITY ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
@@ -490,11 +490,18 @@ TABLE [ ONLY ] <replaceable class="param
output were created as a temporary table for the duration of
this single <command>SELECT</command> command. Note that the
sub-<command>SELECT</command> must be surrounded by
- parentheses, and an alias <emphasis>must</emphasis> be
- provided for it. A
+ parentheses. A
<link linkend="sql-values"><command>VALUES</command></link> command
can also be used here.
</para>
+
+ <para>
+ An alias may be provided to name the output of the
+ sub-<command>SELECT</command>, and optionally rename its columns.
+ Note that according to the SQL standard, a
+ sub-<command>SELECT</command> <emphasis>must</emphasis> have an alias,
+ whereas in <productname>PostgreSQL</productname> it is optional.
+ </para>
</listitem>
</varlistentry>
@@ -2041,6 +2048,16 @@ SELECT 2+2;
</para>
</refsect2>
+ <refsect2>
+ <title>Omitting sub-<command>SELECT</command> aliases in <literal>FROM</literal></title>
+
+ <para>
+ According to the SQL standard, a sub-<command>SELECT</command> in the
+ <literal>FROM</literal> list must have an alias. In
+ <productname>PostgreSQL</productname>, this alias may be omitted.
+ </para>
+ </refsect2>
+
<refsect2>
<title><literal>ONLY</literal> and Inheritance</title>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
new file mode 100644
index 969c9c1..3b25f7c
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -13346,33 +13346,6 @@ table_ref: relation_expr opt_alias_claus
n->lateral = false;
n->subquery = $1;
n->alias = $2;
- /*
- * The SQL spec does not permit a subselect
- * (<derived_table>) without an alias clause,
- * so we don't either. This avoids the problem
- * of needing to invent a unique refname for it.
- * That could be surmounted if there's sufficient
- * popular demand, but for now let's just implement
- * the spec and see if anyone complains.
- * However, it does seem like a good idea to emit
- * an error message that's better than "syntax error".
- */
- if ($2 == NULL)
- {
- if (IsA($1, SelectStmt) &&
- ((SelectStmt *) $1)->valuesLists)
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("VALUES in FROM must have an alias"),
- errhint("For example, FROM (VALUES ...) [AS] foo."),
- parser_errposition(@1)));
- else
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("subquery in FROM must have an alias"),
- errhint("For example, FROM (SELECT ...) [AS] foo."),
- parser_errposition(@1)));
- }
$$ = (Node *) n;
}
| LATERAL_P select_with_parens opt_alias_clause
@@ -13382,23 +13355,6 @@ table_ref: relation_expr opt_alias_claus
n->lateral = true;
n->subquery = $2;
n->alias = $3;
- /* same comment as above */
- if ($3 == NULL)
- {
- if (IsA($2, SelectStmt) &&
- ((SelectStmt *) $2)->valuesLists)
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("VALUES in FROM must have an alias"),
- errhint("For example, FROM (VALUES ...) [AS] foo."),
- parser_errposition(@2)));
- else
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("subquery in FROM must have an alias"),
- errhint("For example, FROM (SELECT ...) [AS] foo."),
- parser_errposition(@2)));
- }
$$ = (Node *) n;
}
| joined_table
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
new file mode 100644
index c655d18..5a18107
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -404,16 +404,6 @@ transformRangeSubselect(ParseState *psta
Query *query;
/*
- * We require user to supply an alias for a subselect, per SQL92. To relax
- * this, we'd have to be prepared to gin up a unique alias for an
- * unlabeled subselect. (This is just elog, not ereport, because the
- * grammar should have enforced it already. It'd probably be better to
- * report the error here, but we don't have a good error location here.)
- */
- if (r->alias == NULL)
- elog(ERROR, "subquery in FROM must have an alias");
-
- /*
* Set p_expr_kind to show this parse level is recursing to a subselect.
* We can't be nested within any expression, so don't need save-restore
* logic here.
@@ -430,10 +420,14 @@ transformRangeSubselect(ParseState *psta
pstate->p_lateral_active = r->lateral;
/*
- * Analyze and transform the subquery.
+ * Analyze and transform the subquery. Note that if the subquery doesn't
+ * have an alias, it can't be explicitly selected for locking, but locking
+ * might still be required (if there is an all-tables locking clause).
*/
query = parse_sub_analyze(r->subquery, pstate, NULL,
- isLockedRefname(pstate, r->alias->aliasname),
+ isLockedRefname(pstate,
+ r->alias == NULL ? NULL :
+ r->alias->aliasname),
true);
/* Restore state */
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
new file mode 100644
index 926dcbf..68ff2cc
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1577,7 +1577,11 @@ addRangeTableEntryForRelation(ParseState
* Then, construct and return a ParseNamespaceItem for the new RTE.
*
* This is much like addRangeTableEntry() except that it makes a subquery RTE.
- * Note that an alias clause *must* be supplied.
+ *
+ * If the subquery does not have an alias, the auto-generated relation name in
+ * the returned ParseNamespaceItem will be marked as not visible, and so only
+ * unqualified references to the subquery columns will be allowed, and the
+ * relation name will not conflict with others in the pstate's namespace list.
*/
ParseNamespaceItem *
addRangeTableEntryForSubquery(ParseState *pstate,
@@ -1587,7 +1591,6 @@ addRangeTableEntryForSubquery(ParseState
bool inFromCl)
{
RangeTblEntry *rte = makeNode(RangeTblEntry);
- char *refname = alias->aliasname;
Alias *eref;
int numaliases;
List *coltypes,
@@ -1595,6 +1598,7 @@ addRangeTableEntryForSubquery(ParseState
*colcollations;
int varattno;
ListCell *tlistitem;
+ ParseNamespaceItem *nsitem;
Assert(pstate != NULL);
@@ -1602,7 +1606,7 @@ addRangeTableEntryForSubquery(ParseState
rte->subquery = subquery;
rte->alias = alias;
- eref = copyObject(alias);
+ eref = alias ? copyObject(alias) : makeAlias("subquery", NIL);
numaliases = list_length(eref->colnames);
/* fill in any unspecified alias columns, and extract column type info */
@@ -1634,7 +1638,7 @@ addRangeTableEntryForSubquery(ParseState
ereport(ERROR,
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("table \"%s\" has %d columns available but %d columns specified",
- refname, varattno, numaliases)));
+ eref->aliasname, varattno, numaliases)));
rte->eref = eref;
@@ -1665,8 +1669,15 @@ addRangeTableEntryForSubquery(ParseState
* Build a ParseNamespaceItem, but don't add it to the pstate's namespace
* list --- caller must do that if appropriate.
*/
- return buildNSItemFromLists(rte, list_length(pstate->p_rtable),
- coltypes, coltypmods, colcollations);
+ nsitem = buildNSItemFromLists(rte, list_length(pstate->p_rtable),
+ coltypes, coltypmods, colcollations);
+
+ /*
+ * Mark it visible as a relation name only if it had a user-written alias.
+ */
+ nsitem->p_rel_visible = (alias != NULL);
+
+ return nsitem;
}
/*
@@ -2520,6 +2531,10 @@ addRangeTableEntryForENR(ParseState *pst
* This is used when we have not yet done transformLockingClause, but need
* to know the correct lock to take during initial opening of relations.
*
+ * Note that refname may be NULL (for a subquery without an alias), in which
+ * case the relation can't be locked by name, but it might still be locked if
+ * a locking clause requests that all tables be locked.
+ *
* Note: we pay no attention to whether it's FOR UPDATE vs FOR SHARE,
* since the table-level lock is the same either way.
*/
@@ -2544,7 +2559,7 @@ isLockedRefname(ParseState *pstate, cons
/* all tables used in query */
return true;
}
- else
+ else if (refname != NULL)
{
/* just the named tables */
ListCell *l2;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
new file mode 100644
index c3937a6..9d285fe
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -11728,7 +11728,11 @@ get_from_clause_item(Node *jtnode, Query
else if (rte->rtekind == RTE_SUBQUERY ||
rte->rtekind == RTE_VALUES)
{
- /* Alias is syntactically required for SUBQUERY and VALUES */
+ /*
+ * For a subquery, always print alias. This makes the output SQL
+ * spec-compliant, even though we allow the alias to be omitted on
+ * input.
+ */
printalias = true;
}
else if (rte->rtekind == RTE_CTE)
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
new file mode 100644
index cf9c759..962ebf6
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -246,8 +246,11 @@ struct ParseState
* visible for qualified references) but it does hide their columns
* (unqualified references to the columns refer to the JOIN, not the member
* tables, so we must not complain that such a reference is ambiguous).
- * Various special RTEs such as NEW/OLD for rules may also appear with only
- * one flag set.
+ * Conversely, a subquery without an alias does not hide the columns selected
+ * by the subquery, but it does hide the auto-generated relation name (so the
+ * subquery columns are visible for unqualified references only). Various
+ * special RTEs such as NEW/OLD for rules may also appear with only one flag
+ * set.
*
* While processing the FROM clause, namespace items may appear with
* p_lateral_only set, meaning they are visible only to LATERAL
diff --git a/src/interfaces/ecpg/preproc/ecpg.addons b/src/interfaces/ecpg/preproc/ecpg.addons
new file mode 100644
index 94f7d4a..e94da2a
--- a/src/interfaces/ecpg/preproc/ecpg.addons
+++ b/src/interfaces/ecpg/preproc/ecpg.addons
@@ -449,12 +449,6 @@ ECPG: into_clauseINTOOptTempTableName bl
$$= cat2_str(mm_strdup("into"), $2);
}
| ecpg_into { $$ = EMPTY; }
-ECPG: table_refselect_with_parensopt_alias_clause addon
- if ($2 == NULL)
- mmerror(PARSE_ERROR, ET_ERROR, "subquery in FROM must have an alias");
-ECPG: table_refLATERAL_Pselect_with_parensopt_alias_clause addon
- if ($3 == NULL)
- mmerror(PARSE_ERROR, ET_ERROR, "subquery in FROM must have an alias");
ECPG: TypenameSimpleTypenameopt_array_bounds block
{ $$ = cat2_str($1, $2.str); }
ECPG: TypenameSETOFSimpleTypenameopt_array_bounds block
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
new file mode 100644
index 45c75ee..f264a40
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -30,6 +30,12 @@ SELECT * FROM ((SELECT 1 AS x)) ss;
1
(1 row)
+SELECT * FROM ((SELECT 1 AS x)), ((SELECT * FROM ((SELECT 2 AS y))));
+ x | y
+---+---
+ 1 | 2
+(1 row)
+
(SELECT 2) UNION SELECT 2;
?column?
----------
@@ -196,6 +202,53 @@ SELECT f1 AS "Correlated Field"
3
(5 rows)
+-- Subselects without aliases
+SELECT count FROM (SELECT COUNT(DISTINCT name) FROM road);
+ count
+-------
+ 2911
+(1 row)
+
+SELECT COUNT(*) FROM (SELECT DISTINCT name FROM road);
+ count
+-------
+ 2911
+(1 row)
+
+SELECT * FROM (SELECT * FROM int4_tbl), (VALUES (123456)) WHERE f1 = column1;
+ f1 | column1
+--------+---------
+ 123456 | 123456
+(1 row)
+
+CREATE VIEW view_unnamed_ss AS
+SELECT * FROM (SELECT * FROM (SELECT abs(f1) AS a1 FROM int4_tbl)),
+ (SELECT * FROM int8_tbl)
+ WHERE a1 < 10 AND q1 > a1 ORDER BY q1, q2;
+SELECT * FROM view_unnamed_ss;
+ a1 | q1 | q2
+----+------------------+-------------------
+ 0 | 123 | 456
+ 0 | 123 | 4567890123456789
+ 0 | 4567890123456789 | -4567890123456789
+ 0 | 4567890123456789 | 123
+ 0 | 4567890123456789 | 4567890123456789
+(5 rows)
+
+\sv view_unnamed_ss
+CREATE OR REPLACE VIEW public.view_unnamed_ss AS
+ SELECT subquery.a1,
+ subquery_1.q1,
+ subquery_1.q2
+ FROM ( SELECT subquery_2.a1
+ FROM ( SELECT abs(int4_tbl.f1) AS a1
+ FROM int4_tbl) subquery_2) subquery,
+ ( SELECT int8_tbl.q1,
+ int8_tbl.q2
+ FROM int8_tbl) subquery_1
+ WHERE subquery.a1 < 10 AND subquery_1.q1 > subquery.a1
+ ORDER BY subquery_1.q1, subquery_1.q2
+DROP VIEW view_unnamed_ss;
--
-- Use some existing tables in the regression test
--
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
new file mode 100644
index 94ba91f..e4b74a0
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -13,6 +13,8 @@ SELECT 1 AS zero WHERE 1 IN (SELECT 2);
SELECT * FROM (SELECT 1 AS x) ss;
SELECT * FROM ((SELECT 1 AS x)) ss;
+SELECT * FROM ((SELECT 1 AS x)), ((SELECT * FROM ((SELECT 2 AS y))));
+
(SELECT 2) UNION SELECT 2;
((SELECT 2)) UNION SELECT 2;
@@ -80,6 +82,24 @@ SELECT f1 AS "Correlated Field"
WHERE (f1, f2) IN (SELECT f2, CAST(f3 AS int4) FROM SUBSELECT_TBL
WHERE f3 IS NOT NULL);
+-- Subselects without aliases
+
+SELECT count FROM (SELECT COUNT(DISTINCT name) FROM road);
+SELECT COUNT(*) FROM (SELECT DISTINCT name FROM road);
+
+SELECT * FROM (SELECT * FROM int4_tbl), (VALUES (123456)) WHERE f1 = column1;
+
+CREATE VIEW view_unnamed_ss AS
+SELECT * FROM (SELECT * FROM (SELECT abs(f1) AS a1 FROM int4_tbl)),
+ (SELECT * FROM int8_tbl)
+ WHERE a1 < 10 AND q1 > a1 ORDER BY q1, q2;
+
+SELECT * FROM view_unnamed_ss;
+
+\sv view_unnamed_ss
+
+DROP VIEW view_unnamed_ss;
+
--
-- Use some existing tables in the regression test
--
On Mon, Jun 27, 2022 at 9:49 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.It still generates an eref for the subquery RTE, which has a made-up
relation name, but that is marked as not visible on the
ParseNamespaceItem, so it doesn't conflict with anything else, need
not be unique, and cannot be used for qualified references to the
subquery's columns.The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:CREATE OR REPLACE VIEW test_view AS
SELECT *
FROM (SELECT a, b FROM foo),
(SELECT c, d FROM bar)
WHERE a = c;\sv test_view
CREATE OR REPLACE VIEW public.test_view AS
SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d
FROM ( SELECT foo.a,
foo.b
FROM foo) subquery,
( SELECT bar.c,
bar.d
FROM bar) subquery_1
WHERE subquery.a = subquery_1.c
It doesn't play that well if you have something called subquery though:
CREATE OR REPLACE VIEW test_view AS
SELECT *
FROM (SELECT a, b FROM foo),
(SELECT c, d FROM bar), (select relname from pg_class limit
1) as subquery
WHERE a = c;
\sv test_view
CREATE OR REPLACE VIEW public.test_view AS
SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d,
subquery_2.relname
FROM ( SELECT foo.a,
foo.b
FROM foo) subquery,
( SELECT bar.c,
bar.d
FROM bar) subquery_1,
( SELECT pg_class.relname
FROM pg_class
LIMIT 1) subquery_2
WHERE subquery.a = subquery_1.c
While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery). More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.
On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud <rjuju123@gmail.com> wrote:
More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.
As to forcing SQL-complaint queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries, and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the utility
of a mode where it warns or errors on non-compliant queries, at least in
principle).
As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were allowed)?
If the name is never used, why are we required to supply it?
On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud <rjuju123@gmail.com> wrote:
It doesn't play that well if you have something called subquery though:
[example that changes a user-provided alias]
While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).
It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:
CREATE OR REPLACE VIEW test_view AS
SELECT x.a, y.b
FROM foo AS x,
(SELECT b FROM foo AS x) AS y;
\sv test_view
CREATE OR REPLACE VIEW public.test_view AS
SELECT x.a,
y.b
FROM foo x,
( SELECT x_1.b
FROM foo x_1) y
and similarly it may invent technically unnecessary aliases where
there were none before. The query-deparsing code has never been
alias-preserving, unless you take care to give everything a globally
unique alias.
Regards,
Dean
On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:
On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud <rjuju123@gmail.com> wrote:
It doesn't play that well if you have something called subquery though:
[example that changes a user-provided alias]
While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:
Regardless, is there any reason to not just prefix our made-up aliases with
"pg_" to make it perfectly clear they were generated by the system and are
basically implementation details as opposed to something that appeared in
the originally written query?
I suppose, "because we've haven't until now, so why start" suffices...but
still doing a rename/suffixing because of query rewriting and inventing one
where we made it optional seem different enough to justify implementing
something different.
David J.
On Mon, 27 Jun 2022 at 19:43, David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud <rjuju123@gmail.com> wrote:
It doesn't play that well if you have something called subquery though:
[example that changes a user-provided alias]
While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:Regardless, is there any reason to not just prefix our made-up aliases with "pg_" to make it perfectly clear they were generated by the system and are basically implementation details as opposed to something that appeared in the originally written query?
I suppose, "because we've haven't until now, so why start" suffices...but still doing a rename/suffixing because of query rewriting and inventing one where we made it optional seem different enough to justify implementing something different.
I think "pg_" would be a bad idea, since it's too easily confused with
things like system catalogs. The obvious precedent we have for a
made-up alias is "unnamed_join", so perhaps "unnamed_subquery" would
be better.
Regards,
Dean
Hi,
On Mon, Jun 27, 2022 at 12:03:20PM -0400, Isaac Morland wrote:
On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud <rjuju123@gmail.com> wrote:
More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.As to forcing SQL-complaint queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries, and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the utility
of a mode where it warns or errors on non-compliant queries, at least in
principle).
Sure, but it doesn't mean that we should support even more non-compliant syntax
without any restraint. In this case, I don't see much benefit as it's not
solving performance problem or something like that.
As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were allowed)?
I think It's a bad habit because as far as I can see it's not supported on
mysql or sqlserver.
If the name is never used, why are we required to supply it?
I'm not saying that I'm thrilled having to do so, but it's also not a huge
trouble. And since it's required I have the habit to automatically put some
random alias if I'm writing some one shot query that indeed doesn't need to use
the alias.
But similarly, I many times relied on the fact that writable CTE are executed
even if not explicitly referenced. So by the same argument shouldn't we allow
something like this?
WITH (INSERT INTO t SELECT * pending WHERE ts < now())
SELECT now() AS last_processing_time;
On Tue, 28 Jun 2022 at 00:32, Julien Rouhaud <rjuju123@gmail.com> wrote:
As to forcing SQL-complaint queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries,
and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see theutility
of a mode where it warns or errors on non-compliant queries, at least in
principle).Sure, but it doesn't mean that we should support even more non-compliant
syntax
without any restraint. In this case, I don't see much benefit as it's not
solving performance problem or something like that.
It's improving developer performance by eliminating the need to make up
utterly useless names. I don't care if behind the scenes names are
assigned, although it would be even better if the names didn't exist at
all. I just want the computer to do stuff for me that requires absolutely
no human judgement whatsoever.
As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were
allowed)?
I think It's a bad habit because as far as I can see it's not supported on
mysql or sqlserver.
So it’s a bad habit to use features of Postgres that aren’t available on
MySQL or SQL Server?
For myself, I don’t care one bit about whether my code will run on those
systems, or Oracle: as far as I’m concerned I write Postgres applications,
not SQL applications. Of course, many people have a need to support other
systems, so I appreciate the care we take to document the differences from
the standard, and I hope we will continue to support standard queries. But
if it’s a bad habit to use Postgres-specific features, why do we create any
of those features?
If the name is never used, why are we required to supply it?
But similarly, I many times relied on the fact that writable CTE are
executed
even if not explicitly referenced. So by the same argument shouldn't we
allow
something like this?WITH (INSERT INTO t SELECT * pending WHERE ts < now())
SELECT now() AS last_processing_time;
I’m not necessarily opposed to allowing this too. But the part which causes
me annoyance is normal subquery naming.
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.
As I said in that thread, I'm not super enthused about this, but I was
clearly in the minority so I think it should go forward.
Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.
Hm. Looking at the code surrounding what you touched, I'm reminded
that we allow JOIN nodes to not have an alias, and represent that
situation by rte->alias == NULL. I wonder if it'd be better in the
long run to make alias-less subqueries work similarly, rather than
generating something that after-the-fact will be indistinguishable
from a user-written alias. If that turns out to not work well,
I'd agree with "unnamed_subquery" as the inserted name.
Also, what about VALUES clauses? It seems inconsistent to remove
this restriction for sub-SELECT but not VALUES. Actually it looks
like your patch already does remove that restriction in gram.y,
but you didn't follow through elsewhere.
As far as the docs go, I think it's sufficient to mention the
inconsistency with SQL down at the bottom; we don't need a
redundant in-line explanation.
regards, tom lane
On Tue, 5 Jul 2022 at 19:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.As I said in that thread, I'm not super enthused about this, but I was
clearly in the minority so I think it should go forward.
Cool. Thanks for looking.
Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.Hm. Looking at the code surrounding what you touched, I'm reminded
that we allow JOIN nodes to not have an alias, and represent that
situation by rte->alias == NULL. I wonder if it'd be better in the
long run to make alias-less subqueries work similarly,
That is what the patch does: transformRangeSubselect() passes a NULL
alias to addRangeTableEntryForSubquery(), which has been modified to
cope with that in a similar way to addRangeTableEntryForJoin() and
other addRangeTableEntryFor...() functions.
So for example, with the following query, this is what the output from
the parser looks like:
SELECT * FROM (SELECT 1);
query->rtable:
rte:
rtekind = RTE_SUBQUERY
alias = NULL
eref = { aliasname = "subquery", colnames = ... }
rather than
generating something that after-the-fact will be indistinguishable
from a user-written alias. If that turns out to not work well,
I'd agree with "unnamed_subquery" as the inserted name.
The result is distinguishable from a user-written alias, because
rte->alias is NULL. I think the confusion is that when I suggested
using "unnamed_subquery", I was referring to rte->eref->aliasname, and
I still think it's a good idea to change that, for consistency with
unnamed joins.
Also, what about VALUES clauses? It seems inconsistent to remove
this restriction for sub-SELECT but not VALUES. Actually it looks
like your patch already does remove that restriction in gram.y,
but you didn't follow through elsewhere.
It does support unnamed VALUES clauses in the FROM list (there's a
regression test exercising that). It wasn't necessary to make any
additional code changes because addRangeTableEntryForValues() already
supported having a NULL alias, and it all just flowed through.
In fact, the grammar forces you to enclose a VALUES clause in the FROM
list in parentheses, so this ends up being an unnamed subquery in the
FROM list as well. For example:
SELECT * FROM (VALUES(1),(2),(3));
produces
query->rtable:
rte:
rtekind = RTE_SUBQUERY
alias = NULL
eref = { aliasname = "subquery", colnames = ... }
subquery->rtable:
rte:
rtekind = RTE_VALUES
alias = NULL
eref = { aliasname = "*VALUES*", colnames = ... }
So it's not really any different from a normal subquery.
As far as the docs go, I think it's sufficient to mention the
inconsistency with SQL down at the bottom; we don't need a
redundant in-line explanation.
OK, fair enough.
I'll post an update in a little while, but first, I found a bug, which
revealed a pre-existing bug in transformLockingClause(). I'll start a
new thread for that, since it'd be good to get that resolved
independently of this patch.
Regards,
Dean
On Wed, 6 Jul 2022 at 15:09, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
I'll post an update in a little while, but first, I found a bug, which
revealed a pre-existing bug in transformLockingClause(). I'll start a
new thread for that, since it'd be good to get that resolved
independently of this patch.
Attached is an update with the following changes:
* Docs updated as suggested.
* transformLockingClause() updated to skip subquery and values rtes
without aliases.
* eref->aliasname changed to "unnamed_subquery" for subqueries without aliases.
Regards,
Dean
Attachments:
make-subquery-alias-optional-v2.patchtext/x-patch; charset=US-ASCII; name=make-subquery-alias-optional-v2.patchDownload
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
new file mode 100644
index 80bb8ad..410c80e
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -51,7 +51,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replacea
[ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
[ TABLESAMPLE <replaceable class="parameter">sampling_method</replaceable> ( <replaceable class="parameter">argument</replaceable> [, ...] ) [ REPEATABLE ( <replaceable class="parameter">seed</replaceable> ) ] ]
- [ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
+ [ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
<replaceable class="parameter">with_query_name</replaceable> [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] )
[ WITH ORDINALITY ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
@@ -490,8 +490,8 @@ TABLE [ ONLY ] <replaceable class="param
output were created as a temporary table for the duration of
this single <command>SELECT</command> command. Note that the
sub-<command>SELECT</command> must be surrounded by
- parentheses, and an alias <emphasis>must</emphasis> be
- provided for it. A
+ parentheses, and an alias can be provided in the same way as for a
+ table. A
<link linkend="sql-values"><command>VALUES</command></link> command
can also be used here.
</para>
@@ -2041,6 +2041,16 @@ SELECT 2+2;
</para>
</refsect2>
+ <refsect2>
+ <title>Omitting sub-<command>SELECT</command> aliases in <literal>FROM</literal></title>
+
+ <para>
+ According to the SQL standard, a sub-<command>SELECT</command> in the
+ <literal>FROM</literal> list must have an alias. In
+ <productname>PostgreSQL</productname>, this alias may be omitted.
+ </para>
+ </refsect2>
+
<refsect2>
<title><literal>ONLY</literal> and Inheritance</title>
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
new file mode 100644
index 8ed2c4b..6688c2a
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -3291,7 +3291,7 @@ transformLockingClause(ParseState *pstat
foreach(rt, qry->rtable)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rt);
- char *rtename;
+ char *rtename = rte->eref->aliasname;
++i;
if (!rte->inFromCl)
@@ -3302,15 +3302,22 @@ transformLockingClause(ParseState *pstat
* name and needs to be skipped (otherwise it might hide a
* base relation with the same name), except if it has a USING
* alias, which *is* visible.
+ *
+ * Subquery and values RTEs without aliases are never visible
+ * as relation names and must always be skipped.
*/
- if (rte->rtekind == RTE_JOIN && rte->alias == NULL)
+ if (rte->alias == NULL)
{
- if (rte->join_using_alias == NULL)
+ if (rte->rtekind == RTE_JOIN)
+ {
+ if (rte->join_using_alias == NULL)
+ continue;
+ rtename = rte->join_using_alias->aliasname;
+ }
+ else if (rte->rtekind == RTE_SUBQUERY ||
+ rte->rtekind == RTE_VALUES)
continue;
- rtename = rte->join_using_alias->aliasname;
}
- else
- rtename = rte->eref->aliasname;
if (strcmp(rtename, thisrel->relname) == 0)
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
new file mode 100644
index 0523013..0f8d056
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -13346,33 +13346,6 @@ table_ref: relation_expr opt_alias_claus
n->lateral = false;
n->subquery = $1;
n->alias = $2;
- /*
- * The SQL spec does not permit a subselect
- * (<derived_table>) without an alias clause,
- * so we don't either. This avoids the problem
- * of needing to invent a unique refname for it.
- * That could be surmounted if there's sufficient
- * popular demand, but for now let's just implement
- * the spec and see if anyone complains.
- * However, it does seem like a good idea to emit
- * an error message that's better than "syntax error".
- */
- if ($2 == NULL)
- {
- if (IsA($1, SelectStmt) &&
- ((SelectStmt *) $1)->valuesLists)
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("VALUES in FROM must have an alias"),
- errhint("For example, FROM (VALUES ...) [AS] foo."),
- parser_errposition(@1)));
- else
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("subquery in FROM must have an alias"),
- errhint("For example, FROM (SELECT ...) [AS] foo."),
- parser_errposition(@1)));
- }
$$ = (Node *) n;
}
| LATERAL_P select_with_parens opt_alias_clause
@@ -13382,23 +13355,6 @@ table_ref: relation_expr opt_alias_claus
n->lateral = true;
n->subquery = $2;
n->alias = $3;
- /* same comment as above */
- if ($3 == NULL)
- {
- if (IsA($2, SelectStmt) &&
- ((SelectStmt *) $2)->valuesLists)
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("VALUES in FROM must have an alias"),
- errhint("For example, FROM (VALUES ...) [AS] foo."),
- parser_errposition(@2)));
- else
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("subquery in FROM must have an alias"),
- errhint("For example, FROM (SELECT ...) [AS] foo."),
- parser_errposition(@2)));
- }
$$ = (Node *) n;
}
| joined_table
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
new file mode 100644
index c655d18..5a18107
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -404,16 +404,6 @@ transformRangeSubselect(ParseState *psta
Query *query;
/*
- * We require user to supply an alias for a subselect, per SQL92. To relax
- * this, we'd have to be prepared to gin up a unique alias for an
- * unlabeled subselect. (This is just elog, not ereport, because the
- * grammar should have enforced it already. It'd probably be better to
- * report the error here, but we don't have a good error location here.)
- */
- if (r->alias == NULL)
- elog(ERROR, "subquery in FROM must have an alias");
-
- /*
* Set p_expr_kind to show this parse level is recursing to a subselect.
* We can't be nested within any expression, so don't need save-restore
* logic here.
@@ -430,10 +420,14 @@ transformRangeSubselect(ParseState *psta
pstate->p_lateral_active = r->lateral;
/*
- * Analyze and transform the subquery.
+ * Analyze and transform the subquery. Note that if the subquery doesn't
+ * have an alias, it can't be explicitly selected for locking, but locking
+ * might still be required (if there is an all-tables locking clause).
*/
query = parse_sub_analyze(r->subquery, pstate, NULL,
- isLockedRefname(pstate, r->alias->aliasname),
+ isLockedRefname(pstate,
+ r->alias == NULL ? NULL :
+ r->alias->aliasname),
true);
/* Restore state */
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
new file mode 100644
index 926dcbf..4e63556
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1577,7 +1577,11 @@ addRangeTableEntryForRelation(ParseState
* Then, construct and return a ParseNamespaceItem for the new RTE.
*
* This is much like addRangeTableEntry() except that it makes a subquery RTE.
- * Note that an alias clause *must* be supplied.
+ *
+ * If the subquery does not have an alias, the auto-generated relation name in
+ * the returned ParseNamespaceItem will be marked as not visible, and so only
+ * unqualified references to the subquery columns will be allowed, and the
+ * relation name will not conflict with others in the pstate's namespace list.
*/
ParseNamespaceItem *
addRangeTableEntryForSubquery(ParseState *pstate,
@@ -1587,7 +1591,6 @@ addRangeTableEntryForSubquery(ParseState
bool inFromCl)
{
RangeTblEntry *rte = makeNode(RangeTblEntry);
- char *refname = alias->aliasname;
Alias *eref;
int numaliases;
List *coltypes,
@@ -1595,6 +1598,7 @@ addRangeTableEntryForSubquery(ParseState
*colcollations;
int varattno;
ListCell *tlistitem;
+ ParseNamespaceItem *nsitem;
Assert(pstate != NULL);
@@ -1602,7 +1606,7 @@ addRangeTableEntryForSubquery(ParseState
rte->subquery = subquery;
rte->alias = alias;
- eref = copyObject(alias);
+ eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL);
numaliases = list_length(eref->colnames);
/* fill in any unspecified alias columns, and extract column type info */
@@ -1634,7 +1638,7 @@ addRangeTableEntryForSubquery(ParseState
ereport(ERROR,
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("table \"%s\" has %d columns available but %d columns specified",
- refname, varattno, numaliases)));
+ eref->aliasname, varattno, numaliases)));
rte->eref = eref;
@@ -1665,8 +1669,15 @@ addRangeTableEntryForSubquery(ParseState
* Build a ParseNamespaceItem, but don't add it to the pstate's namespace
* list --- caller must do that if appropriate.
*/
- return buildNSItemFromLists(rte, list_length(pstate->p_rtable),
- coltypes, coltypmods, colcollations);
+ nsitem = buildNSItemFromLists(rte, list_length(pstate->p_rtable),
+ coltypes, coltypmods, colcollations);
+
+ /*
+ * Mark it visible as a relation name only if it had a user-written alias.
+ */
+ nsitem->p_rel_visible = (alias != NULL);
+
+ return nsitem;
}
/*
@@ -2520,6 +2531,10 @@ addRangeTableEntryForENR(ParseState *pst
* This is used when we have not yet done transformLockingClause, but need
* to know the correct lock to take during initial opening of relations.
*
+ * Note that refname may be NULL (for a subquery without an alias), in which
+ * case the relation can't be locked by name, but it might still be locked if
+ * a locking clause requests that all tables be locked.
+ *
* Note: we pay no attention to whether it's FOR UPDATE vs FOR SHARE,
* since the table-level lock is the same either way.
*/
@@ -2544,7 +2559,7 @@ isLockedRefname(ParseState *pstate, cons
/* all tables used in query */
return true;
}
- else
+ else if (refname != NULL)
{
/* just the named tables */
ListCell *l2;
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
new file mode 100644
index 26cf4fa..513bf0f
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -11725,7 +11725,11 @@ get_from_clause_item(Node *jtnode, Query
else if (rte->rtekind == RTE_SUBQUERY ||
rte->rtekind == RTE_VALUES)
{
- /* Alias is syntactically required for SUBQUERY and VALUES */
+ /*
+ * For a subquery, always print alias. This makes the output SQL
+ * spec-compliant, even though we allow the alias to be omitted on
+ * input.
+ */
printalias = true;
}
else if (rte->rtekind == RTE_CTE)
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
new file mode 100644
index cf9c759..962ebf6
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -246,8 +246,11 @@ struct ParseState
* visible for qualified references) but it does hide their columns
* (unqualified references to the columns refer to the JOIN, not the member
* tables, so we must not complain that such a reference is ambiguous).
- * Various special RTEs such as NEW/OLD for rules may also appear with only
- * one flag set.
+ * Conversely, a subquery without an alias does not hide the columns selected
+ * by the subquery, but it does hide the auto-generated relation name (so the
+ * subquery columns are visible for unqualified references only). Various
+ * special RTEs such as NEW/OLD for rules may also appear with only one flag
+ * set.
*
* While processing the FROM clause, namespace items may appear with
* p_lateral_only set, meaning they are visible only to LATERAL
diff --git a/src/interfaces/ecpg/preproc/ecpg.addons b/src/interfaces/ecpg/preproc/ecpg.addons
new file mode 100644
index 94f7d4a..e94da2a
--- a/src/interfaces/ecpg/preproc/ecpg.addons
+++ b/src/interfaces/ecpg/preproc/ecpg.addons
@@ -449,12 +449,6 @@ ECPG: into_clauseINTOOptTempTableName bl
$$= cat2_str(mm_strdup("into"), $2);
}
| ecpg_into { $$ = EMPTY; }
-ECPG: table_refselect_with_parensopt_alias_clause addon
- if ($2 == NULL)
- mmerror(PARSE_ERROR, ET_ERROR, "subquery in FROM must have an alias");
-ECPG: table_refLATERAL_Pselect_with_parensopt_alias_clause addon
- if ($3 == NULL)
- mmerror(PARSE_ERROR, ET_ERROR, "subquery in FROM must have an alias");
ECPG: TypenameSimpleTypenameopt_array_bounds block
{ $$ = cat2_str($1, $2.str); }
ECPG: TypenameSETOFSimpleTypenameopt_array_bounds block
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
new file mode 100644
index 45c75ee..63d26d4
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -30,6 +30,12 @@ SELECT * FROM ((SELECT 1 AS x)) ss;
1
(1 row)
+SELECT * FROM ((SELECT 1 AS x)), ((SELECT * FROM ((SELECT 2 AS y))));
+ x | y
+---+---
+ 1 | 2
+(1 row)
+
(SELECT 2) UNION SELECT 2;
?column?
----------
@@ -196,6 +202,69 @@ SELECT f1 AS "Correlated Field"
3
(5 rows)
+-- Subselects without aliases
+SELECT count FROM (SELECT COUNT(DISTINCT name) FROM road);
+ count
+-------
+ 2911
+(1 row)
+
+SELECT COUNT(*) FROM (SELECT DISTINCT name FROM road);
+ count
+-------
+ 2911
+(1 row)
+
+SELECT * FROM (SELECT * FROM int4_tbl), (VALUES (123456)) WHERE f1 = column1;
+ f1 | column1
+--------+---------
+ 123456 | 123456
+(1 row)
+
+CREATE VIEW view_unnamed_ss AS
+SELECT * FROM (SELECT * FROM (SELECT abs(f1) AS a1 FROM int4_tbl)),
+ (SELECT * FROM int8_tbl)
+ WHERE a1 < 10 AND q1 > a1 ORDER BY q1, q2;
+SELECT * FROM view_unnamed_ss;
+ a1 | q1 | q2
+----+------------------+-------------------
+ 0 | 123 | 456
+ 0 | 123 | 4567890123456789
+ 0 | 4567890123456789 | -4567890123456789
+ 0 | 4567890123456789 | 123
+ 0 | 4567890123456789 | 4567890123456789
+(5 rows)
+
+\sv view_unnamed_ss
+CREATE OR REPLACE VIEW public.view_unnamed_ss AS
+ SELECT unnamed_subquery.a1,
+ unnamed_subquery_1.q1,
+ unnamed_subquery_1.q2
+ FROM ( SELECT unnamed_subquery_2.a1
+ FROM ( SELECT abs(int4_tbl.f1) AS a1
+ FROM int4_tbl) unnamed_subquery_2) unnamed_subquery,
+ ( SELECT int8_tbl.q1,
+ int8_tbl.q2
+ FROM int8_tbl) unnamed_subquery_1
+ WHERE unnamed_subquery.a1 < 10 AND unnamed_subquery_1.q1 > unnamed_subquery.a1
+ ORDER BY unnamed_subquery_1.q1, unnamed_subquery_1.q2
+DROP VIEW view_unnamed_ss;
+-- Test matching of locking clause to correct alias
+CREATE VIEW view_unnamed_ss_locking AS
+SELECT * FROM (SELECT * FROM int4_tbl), int8_tbl AS unnamed_subquery
+ WHERE f1 = q1
+ FOR UPDATE OF unnamed_subquery;
+\sv view_unnamed_ss_locking
+CREATE OR REPLACE VIEW public.view_unnamed_ss_locking AS
+ SELECT unnamed_subquery.f1,
+ unnamed_subquery_1.q1,
+ unnamed_subquery_1.q2
+ FROM ( SELECT int4_tbl.f1
+ FROM int4_tbl) unnamed_subquery,
+ int8_tbl unnamed_subquery_1
+ WHERE unnamed_subquery.f1 = unnamed_subquery_1.q1
+ FOR UPDATE OF unnamed_subquery_1
+DROP VIEW view_unnamed_ss_locking;
--
-- Use some existing tables in the regression test
--
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
new file mode 100644
index 94ba91f..4027670
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -13,6 +13,8 @@ SELECT 1 AS zero WHERE 1 IN (SELECT 2);
SELECT * FROM (SELECT 1 AS x) ss;
SELECT * FROM ((SELECT 1 AS x)) ss;
+SELECT * FROM ((SELECT 1 AS x)), ((SELECT * FROM ((SELECT 2 AS y))));
+
(SELECT 2) UNION SELECT 2;
((SELECT 2)) UNION SELECT 2;
@@ -80,6 +82,35 @@ SELECT f1 AS "Correlated Field"
WHERE (f1, f2) IN (SELECT f2, CAST(f3 AS int4) FROM SUBSELECT_TBL
WHERE f3 IS NOT NULL);
+-- Subselects without aliases
+
+SELECT count FROM (SELECT COUNT(DISTINCT name) FROM road);
+SELECT COUNT(*) FROM (SELECT DISTINCT name FROM road);
+
+SELECT * FROM (SELECT * FROM int4_tbl), (VALUES (123456)) WHERE f1 = column1;
+
+CREATE VIEW view_unnamed_ss AS
+SELECT * FROM (SELECT * FROM (SELECT abs(f1) AS a1 FROM int4_tbl)),
+ (SELECT * FROM int8_tbl)
+ WHERE a1 < 10 AND q1 > a1 ORDER BY q1, q2;
+
+SELECT * FROM view_unnamed_ss;
+
+\sv view_unnamed_ss
+
+DROP VIEW view_unnamed_ss;
+
+-- Test matching of locking clause to correct alias
+
+CREATE VIEW view_unnamed_ss_locking AS
+SELECT * FROM (SELECT * FROM int4_tbl), int8_tbl AS unnamed_subquery
+ WHERE f1 = q1
+ FOR UPDATE OF unnamed_subquery;
+
+\sv view_unnamed_ss_locking
+
+DROP VIEW view_unnamed_ss_locking;
+
--
-- Use some existing tables in the regression test
--
On Sat, Jul 9, 2022 at 3:28 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:
On Wed, 6 Jul 2022 at 15:09, Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:I'll post an update in a little while, but first, I found a bug, which
revealed a pre-existing bug in transformLockingClause(). I'll start a
new thread for that, since it'd be good to get that resolved
independently of this patch.Attached is an update with the following changes:
* Docs updated as suggested.
* transformLockingClause() updated to skip subquery and values rtes
without aliases.
* eref->aliasname changed to "unnamed_subquery" for subqueries without
aliases.Regards,
Dean
Hi,
rtename is assigned at the beginning of the loop:
+ char *rtename = rte->eref->aliasname;
It seems the code would be more readable if you keep the assignment in
else block below:
+ else if (rte->rtekind == RTE_SUBQUERY ||
+ rte->rtekind == RTE_VALUES)
continue;
- rtename = rte->join_using_alias->aliasname;
}
- else
- rtename = rte->eref->aliasname;
because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case.
Cheers
On Sat, 9 Jul 2022 at 12:24, Zhihong Yu <zyu@yugabyte.com> wrote:
It seems the code would be more readable if you keep the assignment in else block below:
+ else if (rte->rtekind == RTE_SUBQUERY || + rte->rtekind == RTE_VALUES) continue; - rtename = rte->join_using_alias->aliasname; } - else - rtename = rte->eref->aliasname;because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case.
But then it would need 2 else blocks, one inside the rte->alias ==
NULL block, for when rtekind is not RTE_JOIN, RTE_SUBQUERY or
RTE_VALUES, and another after the block, for when rte->alias != NULL.
I find it more readable this way.
Regards,
Dean
On Sat, Jul 9, 2022 at 5:18 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:
On Sat, 9 Jul 2022 at 12:24, Zhihong Yu <zyu@yugabyte.com> wrote:
It seems the code would be more readable if you keep the assignment in
else block below:
+ else if (rte->rtekind == RTE_SUBQUERY || + rte->rtekind == RTE_VALUES) continue; - rtename = rte->join_using_alias->aliasname; } - else - rtename = rte->eref->aliasname;because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case.
But then it would need 2 else blocks, one inside the rte->alias ==
NULL block, for when rtekind is not RTE_JOIN, RTE_SUBQUERY or
RTE_VALUES, and another after the block, for when rte->alias != NULL.
I find it more readable this way.Regards,
Dean
Hi, Dean:
Thanks for the explanation.
I should have looked closer :-)
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Attached is an update with the following changes:
* Docs updated as suggested.
* transformLockingClause() updated to skip subquery and values rtes
without aliases.
* eref->aliasname changed to "unnamed_subquery" for subqueries without aliases.
This looks good to me. Marked RFC.
regards, tom lane
On Mon, 2 Oct 2023 at 00:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:
I ran into one other place: error messages.
SELECT unnamed_subquery.a
FROM (SELECT 1 AS a)
ERROR: There is an entry for table "unnamed_subquery", but it cannot be
referenced from this part of the query.invalid reference to FROM-clause
entry for table "unnamed_subquery"
Normally, we would find the cited name somewhere in the query. Confusing.
Notably, the same does not happen for "unnamed_subquery_1":
SELECT unnamed_subquery_1.a
FROM (SELECT 1 AS a), (SELECT 1 AS a)
ERROR: missing FROM-clause entry for table "unnamed_subquery_1"
That's the message anybody would expect.
Also makes sense, as "uniquification" only happens in the above quoted
case, and all invisible aliases seem to be "unnamed_subquery" at this
point? But a bit confusing on a different level.
Maybe error messages should not be aware of invisible aliases, and just
complain about "missing FROM-clause entry"?
Not sure whether a fix would be easy, nor whether it would be worth the
effort. Just wanted to document the corner case issue in this thread.
Regards
Erwin
Erwin Brandstetter <brsaweda@gmail.com> writes:
On Mon, 2 Oct 2023 at 00:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:
I ran into one other place: error messages.
SELECT unnamed_subquery.a
FROM (SELECT 1 AS a)
ERROR: There is an entry for table "unnamed_subquery", but it cannot be
referenced from this part of the query.invalid reference to FROM-clause
entry for table "unnamed_subquery"
Yeah, that's exposing more of the implementation than we really want.
Notably, the same does not happen for "unnamed_subquery_1":
SELECT unnamed_subquery_1.a
FROM (SELECT 1 AS a), (SELECT 1 AS a)
ERROR: missing FROM-clause entry for table "unnamed_subquery_1"
Actually, that happens because "unnamed_subquery_1" *isn't* in the
parse tree. As implemented, both RTEs are labeled "unnamed_subquery"
in the parser output, and it's ruleutils that de-duplicates them.
I'm inclined to think we should avoid letting "unnamed_subquery"
appear in the parse tree, too. It might not be a good idea to
try to leave the eref field null, but could we set it to an
empty string instead, that is
- eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL);
+ eref = alias ? copyObject(alias) : makeAlias("", NIL);
and then let ruleutils replace that with "unnamed_subquery"? This
would prevent accessing the subquery name in the way Erwin shows,
because we don't let you write an empty identifier in SQL:
regression=# select "".a from (select 1 as a);
ERROR: zero-length delimited identifier at or near """"
LINE 1: select "".a from (select 1 as a);
^
However, there might then be some parser error messages that
refer to subquery "", so I'm not sure if this is totally
without surprises either.
regards, tom lane
On Mon, 2 Oct 2023 at 01:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Erwin Brandstetter <brsaweda@gmail.com> writes:
On Mon, 2 Oct 2023 at 00:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:I ran into one other place: error messages.
SELECT unnamed_subquery.a
FROM (SELECT 1 AS a)
ERROR: There is an entry for table "unnamed_subquery", but it cannot be
referenced from this part of the query.invalid reference to FROM-clause
entry for table "unnamed_subquery"Yeah, that's exposing more of the implementation than we really want.
Note that this isn't a new issue, specific to unnamed subqueries. The
same thing happens for unnamed joins:
create table foo(a int);
create table bar(a int);
select unnamed_join.a from foo join bar using (a);
ERROR: invalid reference to FROM-clause entry for table "unnamed_join"
LINE 1: select unnamed_join.a from foo join bar using (a);
^
DETAIL: There is an entry for table "unnamed_join", but it cannot be
referenced from this part of the query.
And there's a similar problem with VALUES RTEs:
insert into foo values (1),(2) returning "*VALUES*".a;
ERROR: invalid reference to FROM-clause entry for table "*VALUES*"
LINE 1: insert into foo values (1),(2) returning "*VALUES*".a;
^
DETAIL: There is an entry for table "*VALUES*", but it cannot be
referenced from this part of the query.
I'm inclined to think we should avoid letting "unnamed_subquery"
appear in the parse tree, too. It might not be a good idea to
try to leave the eref field null, but could we set it to an
empty string instead, that is- eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL); + eref = alias ? copyObject(alias) : makeAlias("", NIL);and then let ruleutils replace that with "unnamed_subquery"?
Hmm, I think that there would be other side-effects if we did that --
at least doing it for VALUES RTEs would also require additional
changes to retain current EXPLAIN output. I think perhaps it would be
better to try for a more targeted fix of the parser error reporting.
In searchRangeTableForRel() we try to find any RTE that could possibly
match the RangeVar, but certain kinds of RTE don't naturally have
names, and if they also haven't been given aliases, then they can't
possibly match anywhere in the query (and thus it's misleading to
report that they can't be referred to from specific places).
So I think perhaps it's better to just have searchRangeTableForRel()
exclude these kinds of RTE, if they haven't been given an alias.
Regards,
Dean
Attachments:
improve-error-reporting-for-refs-to-unnamed-rtes.patchapplication/x-patch; name=improve-error-reporting-for-refs-to-unnamed-rtes.patchDownload
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
new file mode 100644
index 864ea9b..0afe177
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -395,6 +395,20 @@ searchRangeTableForRel(ParseState *pstat
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(l);
+ /*
+ * Ignore RTEs that do not automatically take their names from
+ * database objects, and have not been supplied with aliases
+ * (e.g., unnamed joins). Such RTEs cannot possibly match the
+ * RangeVar, and cannot be referenced from anywhere in the query.
+ * Thus, it would be misleading to complain that they can't be
+ * referenced from particular parts of the query.
+ */
+ if (!rte->alias &&
+ (rte->rtekind == RTE_SUBQUERY ||
+ rte->rtekind == RTE_JOIN ||
+ rte->rtekind == RTE_VALUES))
+ continue;
+
if (rte->rtekind == RTE_RELATION &&
OidIsValid(relId) &&
rte->relid == relId)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
new file mode 100644
index 9b8638f..2bbf3bf
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2895,6 +2895,26 @@ select bar.*, unnamed_join.* from
ERROR: FOR UPDATE cannot be applied to a join
LINE 3: for update of bar;
^
+-- Test error reporting of references to internal aliases
+select unnamed_join.* from
+ t1 join t2 on (t1.a = t2.a);
+ERROR: missing FROM-clause entry for table "unnamed_join"
+LINE 1: select unnamed_join.* from
+ ^
+select unnamed_join.* from
+ (select * from t1 join t2 using (a)) ss join t3 on (t3.x = ss.a);
+ERROR: missing FROM-clause entry for table "unnamed_join"
+LINE 1: select unnamed_join.* from
+ ^
+select unnamed_subquery.* from
+ (select * from t1);
+ERROR: missing FROM-clause entry for table "unnamed_subquery"
+LINE 1: select unnamed_subquery.* from
+ ^
+insert into t1 values (1, 10), (2, 20) returning "*VALUES*".*;
+ERROR: missing FROM-clause entry for table "*VALUES*"
+LINE 1: insert into t1 values (1, 10), (2, 20) returning "*VALUES*"....
+ ^
--
-- regression test for 8.1 merge right join bug
--
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
new file mode 100644
index 3e5032b..4d4ebc3
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -669,6 +669,19 @@ select bar.*, unnamed_join.* from
(t1 join t2 using (a) as foo) as bar, t3 as unnamed_join
for update of bar;
+-- Test error reporting of references to internal aliases
+
+select unnamed_join.* from
+ t1 join t2 on (t1.a = t2.a);
+
+select unnamed_join.* from
+ (select * from t1 join t2 using (a)) ss join t3 on (t3.x = ss.a);
+
+select unnamed_subquery.* from
+ (select * from t1);
+
+insert into t1 values (1, 10), (2, 20) returning "*VALUES*".*;
+
--
-- regression test for 8.1 merge right join bug
--
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
On Mon, 2 Oct 2023 at 01:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, that's exposing more of the implementation than we really want.
Note that this isn't a new issue, specific to unnamed subqueries. The
same thing happens for unnamed joins:
True, and we've had few complaints about that. Still, if we can
clean it up without too much effort, let's do so.
So I think perhaps it's better to just have searchRangeTableForRel()
exclude these kinds of RTE, if they haven't been given an alias.
Would we need a new flag in the ParseNamespaceItem data structure,
or will the existing data serve? I see how to do this if we add
a "doesn't really have a name" flag, but it's not clear to me that
we can reliably identify them otherwise. Maybe a test involving
the rtekind and whether the "alias" field is set would do, but
that way seems a bit ugly.
regards, tom lane