Making the subquery alias optional in the FROM clause

Started by Dean Rasheedalmost 4 years ago19 messageshackers
Jump to latest
#1Dean Rasheed
dean.a.rasheed@gmail.com

This was discussed previously in [1]/messages/by-id/1487773980.3143.15.camel@oopsware.de, and there seemed to be general
consensus in favour of it, but no new patch emerged.

Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.

It still generates an eref for the subquery RTE, which has a made-up
relation name, but that is marked as not visible on the
ParseNamespaceItem, so it doesn't conflict with anything else, need
not be unique, and cannot be used for qualified references to the
subquery's columns.

The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:

CREATE OR REPLACE VIEW test_view AS
SELECT *
FROM (SELECT a, b FROM foo),
(SELECT c, d FROM bar)
WHERE a = c;

\sv test_view

CREATE OR REPLACE VIEW public.test_view AS
SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d
FROM ( SELECT foo.a,
foo.b
FROM foo) subquery,
( SELECT bar.c,
bar.d
FROM bar) subquery_1
WHERE subquery.a = subquery_1.c

Regards,
Dean

[1]: /messages/by-id/1487773980.3143.15.camel@oopsware.de

Attachments:

make-subquery-alias-optional.patchtext/x-patch; charset=US-ASCII; name=make-subquery-alias-optional.patchDownload+131-75
#2Julien Rouhaud
rjuju123@gmail.com
In reply to: Dean Rasheed (#1)
Re: Making the subquery alias optional in the FROM clause

On Mon, Jun 27, 2022 at 9:49 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.

Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.

It still generates an eref for the subquery RTE, which has a made-up
relation name, but that is marked as not visible on the
ParseNamespaceItem, so it doesn't conflict with anything else, need
not be unique, and cannot be used for qualified references to the
subquery's columns.

The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:

CREATE OR REPLACE VIEW test_view AS
SELECT *
FROM (SELECT a, b FROM foo),
(SELECT c, d FROM bar)
WHERE a = c;

\sv test_view

CREATE OR REPLACE VIEW public.test_view AS
SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d
FROM ( SELECT foo.a,
foo.b
FROM foo) subquery,
( SELECT bar.c,
bar.d
FROM bar) subquery_1
WHERE subquery.a = subquery_1.c

It doesn't play that well if you have something called subquery though:

CREATE OR REPLACE VIEW test_view AS
SELECT *
FROM (SELECT a, b FROM foo),
(SELECT c, d FROM bar), (select relname from pg_class limit
1) as subquery
WHERE a = c;

\sv test_view
CREATE OR REPLACE VIEW public.test_view AS
SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d,
subquery_2.relname
FROM ( SELECT foo.a,
foo.b
FROM foo) subquery,
( SELECT bar.c,
bar.d
FROM bar) subquery_1,
( SELECT pg_class.relname
FROM pg_class
LIMIT 1) subquery_2
WHERE subquery.a = subquery_1.c

While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery). More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.

#3Isaac Morland
isaac.morland@gmail.com
In reply to: Julien Rouhaud (#2)
Re: Making the subquery alias optional in the FROM clause

On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud <rjuju123@gmail.com> wrote:

More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.

As to forcing SQL-complaint queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries, and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the utility
of a mode where it warns or errors on non-compliant queries, at least in
principle).

As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were allowed)?
If the name is never used, why are we required to supply it?

#4Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Julien Rouhaud (#2)
Re: Making the subquery alias optional in the FROM clause

On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud <rjuju123@gmail.com> wrote:

It doesn't play that well if you have something called subquery though:

[example that changes a user-provided alias]

While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).

It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:

CREATE OR REPLACE VIEW test_view AS
SELECT x.a, y.b
FROM foo AS x,
(SELECT b FROM foo AS x) AS y;

\sv test_view

CREATE OR REPLACE VIEW public.test_view AS
SELECT x.a,
y.b
FROM foo x,
( SELECT x_1.b
FROM foo x_1) y

and similarly it may invent technically unnecessary aliases where
there were none before. The query-deparsing code has never been
alias-preserving, unless you take care to give everything a globally
unique alias.

Regards,
Dean

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Dean Rasheed (#4)
Re: Making the subquery alias optional in the FROM clause

On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:

On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud <rjuju123@gmail.com> wrote:

It doesn't play that well if you have something called subquery though:

[example that changes a user-provided alias]

While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).

It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:

Regardless, is there any reason to not just prefix our made-up aliases with
"pg_" to make it perfectly clear they were generated by the system and are
basically implementation details as opposed to something that appeared in
the originally written query?

I suppose, "because we've haven't until now, so why start" suffices...but
still doing a rename/suffixing because of query rewriting and inventing one
where we made it optional seem different enough to justify implementing
something different.

David J.

#6Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: David G. Johnston (#5)
Re: Making the subquery alias optional in the FROM clause

On Mon, 27 Jun 2022 at 19:43, David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud <rjuju123@gmail.com> wrote:

It doesn't play that well if you have something called subquery though:

[example that changes a user-provided alias]

While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).

It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:

Regardless, is there any reason to not just prefix our made-up aliases with "pg_" to make it perfectly clear they were generated by the system and are basically implementation details as opposed to something that appeared in the originally written query?

I suppose, "because we've haven't until now, so why start" suffices...but still doing a rename/suffixing because of query rewriting and inventing one where we made it optional seem different enough to justify implementing something different.

I think "pg_" would be a bad idea, since it's too easily confused with
things like system catalogs. The obvious precedent we have for a
made-up alias is "unnamed_join", so perhaps "unnamed_subquery" would
be better.

Regards,
Dean

#7Julien Rouhaud
rjuju123@gmail.com
In reply to: Isaac Morland (#3)
Re: Making the subquery alias optional in the FROM clause

Hi,

On Mon, Jun 27, 2022 at 12:03:20PM -0400, Isaac Morland wrote:

On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud <rjuju123@gmail.com> wrote:

More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.

As to forcing SQL-complaint queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries, and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the utility
of a mode where it warns or errors on non-compliant queries, at least in
principle).

Sure, but it doesn't mean that we should support even more non-compliant syntax
without any restraint. In this case, I don't see much benefit as it's not
solving performance problem or something like that.

As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were allowed)?

I think It's a bad habit because as far as I can see it's not supported on
mysql or sqlserver.

If the name is never used, why are we required to supply it?

I'm not saying that I'm thrilled having to do so, but it's also not a huge
trouble. And since it's required I have the habit to automatically put some
random alias if I'm writing some one shot query that indeed doesn't need to use
the alias.

But similarly, I many times relied on the fact that writable CTE are executed
even if not explicitly referenced. So by the same argument shouldn't we allow
something like this?

WITH (INSERT INTO t SELECT * pending WHERE ts < now())
SELECT now() AS last_processing_time;

#8Isaac Morland
isaac.morland@gmail.com
In reply to: Julien Rouhaud (#7)
Re: Making the subquery alias optional in the FROM clause

On Tue, 28 Jun 2022 at 00:32, Julien Rouhaud <rjuju123@gmail.com> wrote:

As to forcing SQL-complaint queries, that ship sailed a long time ago:

Postgres allows but does not enforce the use of SQL-compliant queries,

and

many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the

utility

of a mode where it warns or errors on non-compliant queries, at least in
principle).

Sure, but it doesn't mean that we should support even more non-compliant
syntax
without any restraint. In this case, I don't see much benefit as it's not
solving performance problem or something like that.

It's improving developer performance by eliminating the need to make up
utterly useless names. I don't care if behind the scenes names are
assigned, although it would be even better if the names didn't exist at
all. I just want the computer to do stuff for me that requires absolutely
no human judgement whatsoever.

As to bad habits, I'm having trouble understanding. Why do you think

leaving the alias off a subquery is a bad habit (assuming it were

allowed)?

I think It's a bad habit because as far as I can see it's not supported on
mysql or sqlserver.

So it’s a bad habit to use features of Postgres that aren’t available on
MySQL or SQL Server?

For myself, I don’t care one bit about whether my code will run on those
systems, or Oracle: as far as I’m concerned I write Postgres applications,
not SQL applications. Of course, many people have a need to support other
systems, so I appreciate the care we take to document the differences from
the standard, and I hope we will continue to support standard queries. But
if it’s a bad habit to use Postgres-specific features, why do we create any
of those features?

If the name is never used, why are we required to supply it?

But similarly, I many times relied on the fact that writable CTE are
executed
even if not explicitly referenced. So by the same argument shouldn't we
allow
something like this?

WITH (INSERT INTO t SELECT * pending WHERE ts < now())
SELECT now() AS last_processing_time;

I’m not necessarily opposed to allowing this too. But the part which causes
me annoyance is normal subquery naming.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#1)
Re: Making the subquery alias optional in the FROM clause

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.

As I said in that thread, I'm not super enthused about this, but I was
clearly in the minority so I think it should go forward.

Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.

Hm. Looking at the code surrounding what you touched, I'm reminded
that we allow JOIN nodes to not have an alias, and represent that
situation by rte->alias == NULL. I wonder if it'd be better in the
long run to make alias-less subqueries work similarly, rather than
generating something that after-the-fact will be indistinguishable
from a user-written alias. If that turns out to not work well,
I'd agree with "unnamed_subquery" as the inserted name.

Also, what about VALUES clauses? It seems inconsistent to remove
this restriction for sub-SELECT but not VALUES. Actually it looks
like your patch already does remove that restriction in gram.y,
but you didn't follow through elsewhere.

As far as the docs go, I think it's sufficient to mention the
inconsistency with SQL down at the bottom; we don't need a
redundant in-line explanation.

regards, tom lane

#10Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#9)
Re: Making the subquery alias optional in the FROM clause

On Tue, 5 Jul 2022 at 19:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.

As I said in that thread, I'm not super enthused about this, but I was
clearly in the minority so I think it should go forward.

Cool. Thanks for looking.

Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.

Hm. Looking at the code surrounding what you touched, I'm reminded
that we allow JOIN nodes to not have an alias, and represent that
situation by rte->alias == NULL. I wonder if it'd be better in the
long run to make alias-less subqueries work similarly,

That is what the patch does: transformRangeSubselect() passes a NULL
alias to addRangeTableEntryForSubquery(), which has been modified to
cope with that in a similar way to addRangeTableEntryForJoin() and
other addRangeTableEntryFor...() functions.

So for example, with the following query, this is what the output from
the parser looks like:

SELECT * FROM (SELECT 1);

query->rtable:
rte:
rtekind = RTE_SUBQUERY
alias = NULL
eref = { aliasname = "subquery", colnames = ... }

rather than
generating something that after-the-fact will be indistinguishable
from a user-written alias. If that turns out to not work well,
I'd agree with "unnamed_subquery" as the inserted name.

The result is distinguishable from a user-written alias, because
rte->alias is NULL. I think the confusion is that when I suggested
using "unnamed_subquery", I was referring to rte->eref->aliasname, and
I still think it's a good idea to change that, for consistency with
unnamed joins.

Also, what about VALUES clauses? It seems inconsistent to remove
this restriction for sub-SELECT but not VALUES. Actually it looks
like your patch already does remove that restriction in gram.y,
but you didn't follow through elsewhere.

It does support unnamed VALUES clauses in the FROM list (there's a
regression test exercising that). It wasn't necessary to make any
additional code changes because addRangeTableEntryForValues() already
supported having a NULL alias, and it all just flowed through.

In fact, the grammar forces you to enclose a VALUES clause in the FROM
list in parentheses, so this ends up being an unnamed subquery in the
FROM list as well. For example:

SELECT * FROM (VALUES(1),(2),(3));

produces

query->rtable:
rte:
rtekind = RTE_SUBQUERY
alias = NULL
eref = { aliasname = "subquery", colnames = ... }
subquery->rtable:
rte:
rtekind = RTE_VALUES
alias = NULL
eref = { aliasname = "*VALUES*", colnames = ... }

So it's not really any different from a normal subquery.

As far as the docs go, I think it's sufficient to mention the
inconsistency with SQL down at the bottom; we don't need a
redundant in-line explanation.

OK, fair enough.

I'll post an update in a little while, but first, I found a bug, which
revealed a pre-existing bug in transformLockingClause(). I'll start a
new thread for that, since it'd be good to get that resolved
independently of this patch.

Regards,
Dean

#11Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Dean Rasheed (#10)
Re: Making the subquery alias optional in the FROM clause

On Wed, 6 Jul 2022 at 15:09, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I'll post an update in a little while, but first, I found a bug, which
revealed a pre-existing bug in transformLockingClause(). I'll start a
new thread for that, since it'd be good to get that resolved
independently of this patch.

Attached is an update with the following changes:

* Docs updated as suggested.
* transformLockingClause() updated to skip subquery and values rtes
without aliases.
* eref->aliasname changed to "unnamed_subquery" for subqueries without aliases.

Regards,
Dean

Attachments:

make-subquery-alias-optional-v2.patchtext/x-patch; charset=US-ASCII; name=make-subquery-alias-optional-v2.patchDownload+164-81
#12Zhihong Yu
zyu@yugabyte.com
In reply to: Dean Rasheed (#11)
Re: Making the subquery alias optional in the FROM clause

On Sat, Jul 9, 2022 at 3:28 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:

On Wed, 6 Jul 2022 at 15:09, Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:

I'll post an update in a little while, but first, I found a bug, which
revealed a pre-existing bug in transformLockingClause(). I'll start a
new thread for that, since it'd be good to get that resolved
independently of this patch.

Attached is an update with the following changes:

* Docs updated as suggested.
* transformLockingClause() updated to skip subquery and values rtes
without aliases.
* eref->aliasname changed to "unnamed_subquery" for subqueries without
aliases.

Regards,
Dean

Hi,
rtename is assigned at the beginning of the loop:

+ char *rtename = rte->eref->aliasname;

It seems the code would be more readable if you keep the assignment in
else block below:

+                   else if (rte->rtekind == RTE_SUBQUERY ||
+                            rte->rtekind == RTE_VALUES)
                        continue;
-                   rtename = rte->join_using_alias->aliasname;
                }
-               else
-                   rtename = rte->eref->aliasname;

because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case.

Cheers

#13Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Zhihong Yu (#12)
Re: Making the subquery alias optional in the FROM clause

On Sat, 9 Jul 2022 at 12:24, Zhihong Yu <zyu@yugabyte.com> wrote:

It seems the code would be more readable if you keep the assignment in else block below:

+                   else if (rte->rtekind == RTE_SUBQUERY ||
+                            rte->rtekind == RTE_VALUES)
continue;
-                   rtename = rte->join_using_alias->aliasname;
}
-               else
-                   rtename = rte->eref->aliasname;

because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case.

But then it would need 2 else blocks, one inside the rte->alias ==
NULL block, for when rtekind is not RTE_JOIN, RTE_SUBQUERY or
RTE_VALUES, and another after the block, for when rte->alias != NULL.
I find it more readable this way.

Regards,
Dean

#14Zhihong Yu
zyu@yugabyte.com
In reply to: Dean Rasheed (#13)
Re: Making the subquery alias optional in the FROM clause

On Sat, Jul 9, 2022 at 5:18 AM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:

On Sat, 9 Jul 2022 at 12:24, Zhihong Yu <zyu@yugabyte.com> wrote:

It seems the code would be more readable if you keep the assignment in

else block below:

+                   else if (rte->rtekind == RTE_SUBQUERY ||
+                            rte->rtekind == RTE_VALUES)
continue;
-                   rtename = rte->join_using_alias->aliasname;
}
-               else
-                   rtename = rte->eref->aliasname;

because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case.

But then it would need 2 else blocks, one inside the rte->alias ==
NULL block, for when rtekind is not RTE_JOIN, RTE_SUBQUERY or
RTE_VALUES, and another after the block, for when rte->alias != NULL.
I find it more readable this way.

Regards,
Dean

Hi, Dean:
Thanks for the explanation.

I should have looked closer :-)

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#11)
Re: Making the subquery alias optional in the FROM clause

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

Attached is an update with the following changes:

* Docs updated as suggested.
* transformLockingClause() updated to skip subquery and values rtes
without aliases.
* eref->aliasname changed to "unnamed_subquery" for subqueries without aliases.

This looks good to me. Marked RFC.

regards, tom lane

#16Erwin Brandstetter
brsaweda@gmail.com
In reply to: Dean Rasheed (#1)
Re: Making the subquery alias optional in the FROM clause

On Mon, 2 Oct 2023 at 00:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:

I ran into one other place: error messages.

SELECT unnamed_subquery.a
FROM (SELECT 1 AS a)

ERROR: There is an entry for table "unnamed_subquery", but it cannot be

referenced from this part of the query.invalid reference to FROM-clause
entry for table "unnamed_subquery"

Normally, we would find the cited name somewhere in the query. Confusing.
Notably, the same does not happen for "unnamed_subquery_1":

SELECT unnamed_subquery_1.a
FROM (SELECT 1 AS a), (SELECT 1 AS a)

ERROR: missing FROM-clause entry for table "unnamed_subquery_1"

That's the message anybody would expect.
Also makes sense, as "uniquification" only happens in the above quoted
case, and all invisible aliases seem to be "unnamed_subquery" at this
point? But a bit confusing on a different level.

Maybe error messages should not be aware of invisible aliases, and just
complain about "missing FROM-clause entry"?
Not sure whether a fix would be easy, nor whether it would be worth the
effort. Just wanted to document the corner case issue in this thread.

Regards
Erwin

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erwin Brandstetter (#16)
Re: Making the subquery alias optional in the FROM clause

Erwin Brandstetter <brsaweda@gmail.com> writes:

On Mon, 2 Oct 2023 at 00:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:

I ran into one other place: error messages.
SELECT unnamed_subquery.a
FROM (SELECT 1 AS a)
ERROR: There is an entry for table "unnamed_subquery", but it cannot be
referenced from this part of the query.invalid reference to FROM-clause
entry for table "unnamed_subquery"

Yeah, that's exposing more of the implementation than we really want.

Notably, the same does not happen for "unnamed_subquery_1":
SELECT unnamed_subquery_1.a
FROM (SELECT 1 AS a), (SELECT 1 AS a)
ERROR: missing FROM-clause entry for table "unnamed_subquery_1"

Actually, that happens because "unnamed_subquery_1" *isn't* in the
parse tree. As implemented, both RTEs are labeled "unnamed_subquery"
in the parser output, and it's ruleutils that de-duplicates them.

I'm inclined to think we should avoid letting "unnamed_subquery"
appear in the parse tree, too. It might not be a good idea to
try to leave the eref field null, but could we set it to an
empty string instead, that is

-	eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL);
+	eref = alias ? copyObject(alias) : makeAlias("", NIL);

and then let ruleutils replace that with "unnamed_subquery"? This
would prevent accessing the subquery name in the way Erwin shows,
because we don't let you write an empty identifier in SQL:

regression=# select "".a from (select 1 as a);
ERROR: zero-length delimited identifier at or near """"
LINE 1: select "".a from (select 1 as a);
^

However, there might then be some parser error messages that
refer to subquery "", so I'm not sure if this is totally
without surprises either.

regards, tom lane

#18Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#17)
Re: Making the subquery alias optional in the FROM clause

On Mon, 2 Oct 2023 at 01:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Erwin Brandstetter <brsaweda@gmail.com> writes:

On Mon, 2 Oct 2023 at 00:33, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:

I ran into one other place: error messages.
SELECT unnamed_subquery.a
FROM (SELECT 1 AS a)
ERROR: There is an entry for table "unnamed_subquery", but it cannot be
referenced from this part of the query.invalid reference to FROM-clause
entry for table "unnamed_subquery"

Yeah, that's exposing more of the implementation than we really want.

Note that this isn't a new issue, specific to unnamed subqueries. The
same thing happens for unnamed joins:

create table foo(a int);
create table bar(a int);
select unnamed_join.a from foo join bar using (a);

ERROR: invalid reference to FROM-clause entry for table "unnamed_join"
LINE 1: select unnamed_join.a from foo join bar using (a);
^
DETAIL: There is an entry for table "unnamed_join", but it cannot be
referenced from this part of the query.

And there's a similar problem with VALUES RTEs:

insert into foo values (1),(2) returning "*VALUES*".a;

ERROR: invalid reference to FROM-clause entry for table "*VALUES*"
LINE 1: insert into foo values (1),(2) returning "*VALUES*".a;
^
DETAIL: There is an entry for table "*VALUES*", but it cannot be
referenced from this part of the query.

I'm inclined to think we should avoid letting "unnamed_subquery"
appear in the parse tree, too. It might not be a good idea to
try to leave the eref field null, but could we set it to an
empty string instead, that is

-       eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL);
+       eref = alias ? copyObject(alias) : makeAlias("", NIL);

and then let ruleutils replace that with "unnamed_subquery"?

Hmm, I think that there would be other side-effects if we did that --
at least doing it for VALUES RTEs would also require additional
changes to retain current EXPLAIN output. I think perhaps it would be
better to try for a more targeted fix of the parser error reporting.

In searchRangeTableForRel() we try to find any RTE that could possibly
match the RangeVar, but certain kinds of RTE don't naturally have
names, and if they also haven't been given aliases, then they can't
possibly match anywhere in the query (and thus it's misleading to
report that they can't be referred to from specific places).

So I think perhaps it's better to just have searchRangeTableForRel()
exclude these kinds of RTE, if they haven't been given an alias.

Regards,
Dean

Attachments:

improve-error-reporting-for-refs-to-unnamed-rtes.patchapplication/x-patch; name=improve-error-reporting-for-refs-to-unnamed-rtes.patchDownload+47-0
#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#18)
Re: Making the subquery alias optional in the FROM clause

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On Mon, 2 Oct 2023 at 01:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, that's exposing more of the implementation than we really want.

Note that this isn't a new issue, specific to unnamed subqueries. The
same thing happens for unnamed joins:

True, and we've had few complaints about that. Still, if we can
clean it up without too much effort, let's do so.

So I think perhaps it's better to just have searchRangeTableForRel()
exclude these kinds of RTE, if they haven't been given an alias.

Would we need a new flag in the ParseNamespaceItem data structure,
or will the existing data serve? I see how to do this if we add
a "doesn't really have a name" flag, but it's not clear to me that
we can reliably identify them otherwise. Maybe a test involving
the rtekind and whether the "alias" field is set would do, but
that way seems a bit ugly.

regards, tom lane