Allow an alias to be attached directly to a JOIN ... USING
A small new feature in SQL:2016 allows attaching a table alias to a
JOIN/USING construct:
<named columns join> ::=
USING <left paren> <join column list> <right paren>
[ AS <join correlation name> ]
(The part in brackets is new.)
This seems quite useful, and it seems the code would already support
this if we allow the grammar to accept this syntax.
Patch attached.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-Allow-an-alias-to-be-attached-directly-to-a-JOIN-.-U.patchtext/plain; charset=UTF-8; name=0001-Allow-an-alias-to-be-attached-directly-to-a-JOIN-.-U.patch; x-mac-creator=0; x-mac-type=0Download+69-28
On Tue, Jun 18, 2019 at 2:41 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
A small new feature in SQL:2016 allows attaching a table alias to a
JOIN/USING construct:<named columns join> ::=
USING <left paren> <join column list> <right paren>
[ AS <join correlation name> ](The part in brackets is new.)
This seems quite useful, and it seems the code would already support
this if we allow the grammar to accept this syntax.
Neat. That's a refreshingly short patch to get a sql_features.txt
line bumped to YES.
Patch attached.
It does what it says on the tin.
I see that USING is the important thing here; for (a NATURAL JOIN b)
AS ab or (a JOIN b ON ...) AS ab you still need the parentheses or
(respectively) it means something different (alias for B only) or
doesn't parse. That makes sense.
I noticed that the HINT when you accidentally use a base table name
instead of a table alias is more helpful than the HINT you get when
you use a base table name instead of a join alias. That seems like a
potential improvement that is independent of this syntax change.
--
Thomas Munro
https://enterprisedb.com
Hello Peter,
A small new feature in SQL:2016 allows attaching a table alias to a
JOIN/USING construct:<named columns join> ::=
USING <left paren> <join column list> <right paren>
[ AS <join correlation name> ](The part in brackets is new.)
This seems quite useful, and it seems the code would already support
this if we allow the grammar to accept this syntax.Patch attached.
A few more comments.
Patch v1 applies cleanly, compiles. make check ok. Doc gen ok.
The patch allows an AS clause (alias) attached to a JOIN USING, which seems
to be SQL feature F404, which seems a new feature in SQL:2016.
The feature implementation only involves parser changes, so the underlying
infrastructure seems to be already available.
About the code:
The removal from the grammar of the dynamic type introspection to distinguish
between ON & USING is a relief in itself:-)
About the feature:
When using aliases both on tables and on the unifying using clause, the former
are hidden from view. I cannot say that I understand why, and this makes it
impossible to access some columns in some cases if there is an ambiguity, eg:
postgres=# SELECT t.filler
FROM pgbench_tellers AS t
JOIN pgbench_branches AS b USING (bid) AS x;
ERROR: invalid reference to FROM-clause entry for table "t"
LINE 1: SELECT t.filler FROM pgbench_tellers AS t JOIN pgbench_branc...
^
HINT: There is an entry for table "t", but it cannot be referenced from this
part of the query.
But then:
postgres=# SELECT x.filler
FROM pgbench_tellers AS t
JOIN pgbench_branches AS b USING (bid) AS x;
ERROR: column reference "filler" is ambiguous
LINE 1: SELECT x.filler FROM pgbench_tellers AS t JOIN pgbench_branc...
^
Is there a good reason to forbid several aliases covering the same table?
More precisely, is this behavior expected from the spec or a side effect
of pg implementation?
Given that the executor detects that the underlying alias exists, could it
just let it pass instead of raising an error, and it would simply just
work?
I'm wondering why such an alias could not be attached also to an ON
clause. Having them in one case but not the other looks strange.
About the documentation:
The documentation changes only involves the synopsis. ISTM that maybe aliases
shadowing one another could deserve some caveat. The documentation in its
"alias" paragraph only talks about hidding table and functions names.
Also, the USING paragraph could talk about its optional alias and its
hiding effect.
About tests:
Maybe an alias hidding case could be added.
--
Fabien.
On Tue, Jul 16, 2019 at 8:58 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
About the feature:
When using aliases both on tables and on the unifying using clause, the former
are hidden from view. I cannot say that I understand why, and this makes it
impossible to access some columns in some cases if there is an ambiguity, eg:postgres=# SELECT t.filler
FROM pgbench_tellers AS t
JOIN pgbench_branches AS b USING (bid) AS x;
ERROR: invalid reference to FROM-clause entry for table "t"
LINE 1: SELECT t.filler FROM pgbench_tellers AS t JOIN pgbench_branc...
^
HINT: There is an entry for table "t", but it cannot be referenced from this
part of the query.But then:
postgres=# SELECT x.filler
FROM pgbench_tellers AS t
JOIN pgbench_branches AS b USING (bid) AS x;
ERROR: column reference "filler" is ambiguous
LINE 1: SELECT x.filler FROM pgbench_tellers AS t JOIN pgbench_branc...
^Is there a good reason to forbid several aliases covering the same table?
More precisely, is this behavior expected from the spec or a side effect
of pg implementation?
Indeed, that seems like a problem, and it's a good question. You can
see this on unpatched master with SELECT x.filler FROM
(pgbench_tellers AS t JOIN b USING (bid)) AS x.
I'm moving this to the next CF.
--
Thomas Munro
https://enterprisedb.com
On 2019-Aug-01, Thomas Munro wrote:
Indeed, that seems like a problem, and it's a good question. You can
see this on unpatched master with SELECT x.filler FROM
(pgbench_tellers AS t JOIN b USING (bid)) AS x.
I'm not sure I understand why that problem is a blocker for this patch.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-09-17 19:37, Alvaro Herrera wrote:
On 2019-Aug-01, Thomas Munro wrote:
Indeed, that seems like a problem, and it's a good question. You can
see this on unpatched master with SELECT x.filler FROM
(pgbench_tellers AS t JOIN b USING (bid)) AS x.I'm not sure I understand why that problem is a blocker for this patch.
I tried to analyze the spec for what the behavior should be here, but I
got totally lost. I'll give it another look.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, 17 Sep 2019, Alvaro Herrera wrote:
Indeed, that seems like a problem, and it's a good question. You can
see this on unpatched master with SELECT x.filler FROM
(pgbench_tellers AS t JOIN b USING (bid)) AS x.I'm not sure I understand why that problem is a blocker for this patch.
As discussed on another thread,
/messages/by-id/2aa57950-b1d7-e9b6-0770-fa592d565dda@2ndquadrant.com
the patch does not conform to spec
SQL:2016 Part 2 Foundation Section 7.10 <joined table>
Basically "x" is expected to include *ONLY* joined attributes with USING,
i.e. above only x.bid should exists, and per-table aliases are expected to
still work for other attributes.
ISTM that this patch could be "returned with feedback".
--
Fabien.
On 2019-12-24 19:13, Fabien COELHO wrote:
Indeed, that seems like a problem, and it's a good question. You can
see this on unpatched master with SELECT x.filler FROM
(pgbench_tellers AS t JOIN b USING (bid)) AS x.I'm not sure I understand why that problem is a blocker for this patch.
As discussed on another thread,
/messages/by-id/2aa57950-b1d7-e9b6-0770-fa592d565dda@2ndquadrant.com
the patch does not conform to spec
SQL:2016 Part 2 Foundation Section 7.10 <joined table>
Basically "x" is expected to include *ONLY* joined attributes with USING,
i.e. above only x.bid should exists, and per-table aliases are expected to
still work for other attributes.
I took another crack at this. Attached is a new patch that addresses
the semantic comments from this and the other thread. It's all a bit
tricky, comments welcome.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v2-0001-Allow-an-alias-to-be-attached-to-a-JOIN-.-USING.patchtext/plain; charset=UTF-8; name=v2-0001-Allow-an-alias-to-be-attached-to-a-JOIN-.-USING.patch; x-mac-creator=0; x-mac-type=0Download+258-42
On 30/12/2019 22:25, Peter Eisentraut wrote:
On 2019-12-24 19:13, Fabien COELHO wrote:
Indeed, that seems like a problem, and it's a good question. You can
see this on unpatched master with SELECT x.filler FROM
(pgbench_tellers AS t JOIN b USING (bid)) AS x.I'm not sure I understand why that problem is a blocker for this patch.
As discussed on another thread,
/messages/by-id/2aa57950-b1d7-e9b6-0770-fa592d565dda@2ndquadrant.comthe patch does not conform to spec
SQL:2016 Part 2 Foundation Section 7.10 <joined table>
Basically "x" is expected to include *ONLY* joined attributes with
USING,
i.e. above only x.bid should exists, and per-table aliases are
expected to
still work for other attributes.I took another crack at this. Attached is a new patch that addresses
the semantic comments from this and the other thread. It's all a bit
tricky, comments welcome.
Excellent! Thank you for working on this, Peter.
One thing I notice is that the joined columns are still accessible from
their respective table names when they should not be per spec. That
might be one of those "silly restrictions" that we choose to ignore, but
it should probably be noted somewhere, at the very least in a code
comment if not in user documentation. (This is my reading of SQL:2016 SR
11.a.i)
--
Vik Fearing
Hello Peter,
I took another crack at this. Attached is a new patch that addresses
the semantic comments from this and the other thread. It's all a bit
tricky, comments welcome.
It seems that this patch does not apply anymore after Tom's 5815696.
--
Fabien.
On 2019-12-31 00:07, Vik Fearing wrote:
One thing I notice is that the joined columns are still accessible from
their respective table names when they should not be per spec. That
might be one of those "silly restrictions" that we choose to ignore, but
it should probably be noted somewhere, at the very least in a code
comment if not in user documentation. (This is my reading of SQL:2016 SR
11.a.i)
Here is a rebased patch.
The above comment is valid. One reason I didn't implement it is that it
would create inconsistencies with existing behavior, which is already
nonstandard.
For example,
create table a (id int, a1 int, a2 int);
create table b (id int, b2 int, b3 int);
makes
select a.id from a join b using (id);
invalid. Adding an explicit alias for the common column names doesn't
change that semantically, because an implicit alias also exists if an
explicit one isn't specified.
I agree that some documentation would be in order if we decide to leave
it like this.
Another reason was that it seemed "impossible" to implement it before
Tom's recent refactoring of the parse namespace handling. Now we also
have parse namespace columns tracked separately from range table
entries, so it appears that this would be possible. If we want to do it.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v3-0001-Allow-an-alias-to-be-attached-to-a-JOIN-.-USING.patchtext/plain; charset=UTF-8; name=v3-0001-Allow-an-alias-to-be-attached-to-a-JOIN-.-USING.patch; x-mac-creator=0; x-mac-type=0Download+243-31
On 27 Jan 2020, at 10:19, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2019-12-31 00:07, Vik Fearing wrote:
One thing I notice is that the joined columns are still accessible from
their respective table names when they should not be per spec. That
might be one of those "silly restrictions" that we choose to ignore, but
it should probably be noted somewhere, at the very least in a code
comment if not in user documentation. (This is my reading of SQL:2016 SR
11.a.i)Here is a rebased patch.
This thread has stalled for a bit, let's try to bring it to an end.
Vik: having shown interest in, and been actively reviewing, this patch; do you
have time to review this latest version from Peter during this commitfest?
cheers ./daniel
Peter Eisentraut:
On 2019-12-31 00:07, Vik Fearing wrote:
One thing I notice is that the joined columns are still accessible from
their respective table names when they should not be per spec. That
might be one of those "silly restrictions" that we choose to ignore, but
it should probably be noted somewhere, at the very least in a code
comment if not in user documentation. (This is my reading of SQL:2016 SR
11.a.i)Here is a rebased patch.
The above comment is valid. One reason I didn't implement it is that it
would create inconsistencies with existing behavior, which is already
nonstandard.For example,
create table a (id int, a1 int, a2 int);
create table b (id int, b2 int, b3 int);makes
select a.id from a join b using (id);
invalid. Adding an explicit alias for the common column names doesn't
change that semantically, because an implicit alias also exists if an
explicit one isn't specified.
I just looked through the patch without applying or testing it - but I
couldn't find anything that would indicate that this is not going to
work for e.g. a LEFT JOIN as well. First PG patch I looked at, so tell
me if I missed something there.
So given this:
SELECT x.id FROM a LEFT JOIN b USING (id) AS x
will this return NULL or a.id for rows that don't match in b? This
should definitely be mentioned in the docs and I guess a test wouldn't
be too bad as well?
In any case: If a.id and b.id would not be available anymore, but just
x.id, either the id value itself or the NULL value (indicating the
missing row in b) are lost. So this seems like a no-go.
I agree that some documentation would be in order if we decide to leave
it like this.
Keep it like that!
Hi,
I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.
Cheers,
//Georgios
The new status of this patch is: Waiting on Author
On 2020-11-10 16:15, Georgios Kokolatos wrote:
I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.Cheers,
//GeorgiosThe new status of this patch is: Waiting on Author
Here is a rebased and lightly retouched patch.
--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/
Attachments:
v4-0001-Allow-an-alias-to-be-attached-to-a-JOIN-.-USING.patchtext/plain; charset=UTF-8; name=v4-0001-Allow-an-alias-to-be-attached-to-a-JOIN-.-USING.patch; x-mac-creator=0; x-mac-type=0Download+239-14
On 2020-08-03 19:44, Wolfgang Walther wrote:
So given this:
SELECT x.id FROM a LEFT JOIN b USING (id) AS x
will this return NULL or a.id for rows that don't match in b? This
should definitely be mentioned in the docs and I guess a test wouldn't
be too bad as well?
This issue is independent of the presence of the alias "x", so I don't
think it has to do with this patch.
There is a fair amount of documentation on outer joins, so I expect that
this is discussed there.
--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/
On 11/14/20 3:49 AM, Peter Eisentraut wrote:
On 2020-11-10 16:15, Georgios Kokolatos wrote:
I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.Cheers,
//GeorgiosThe new status of this patch is: Waiting on Author
Here is a rebased and lightly retouched patch.
There don't seem to be any objections to just documenting the slight
divergence from the spec.
So, does it make sense to just document that and proceed?
Regards,
--
-David
david@pgmasters.net
On 05.03.21 18:00, David Steele wrote:
On 11/14/20 3:49 AM, Peter Eisentraut wrote:
On 2020-11-10 16:15, Georgios Kokolatos wrote:
I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.Cheers,
//GeorgiosThe new status of this patch is: Waiting on Author
Here is a rebased and lightly retouched patch.
There don't seem to be any objections to just documenting the slight
divergence from the spec.So, does it make sense to just document that and proceed?
Yeah, I think that is not a problem.
I think Tom's input on the guts of this patch would be most valuable,
since it intersects a lot with the parse namespace refactoring he did.
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
I think Tom's input on the guts of this patch would be most valuable,
since it intersects a lot with the parse namespace refactoring he did.
Yeah, I've been meaning to take a look. I'll try to get it done in
the next couple of days.
regards, tom lane
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
I think Tom's input on the guts of this patch would be most valuable,
since it intersects a lot with the parse namespace refactoring he did.
I really didn't like the way you'd done that :-(. My primary complaint
is that any one ParseNamespaceItem can describe only one table alias,
but here we have the potential for two aliases associated with the same
join:
select * from (t1 join t2 using(a) as tu) tx;
Admittedly that's not hugely useful since tx hides the tu alias, but
it should behave in a sane fashion. (BTW, after reading the SQL spec
again along the way to reviewing this, I am wondering if hiding the
lower aliases is really what we want; though it may be decades too late
to change that.)
However, ParseNamespaceItem as it stands needs some help for this.
It has a wired-in assumption that p_rte->eref describes the table
and column aliases exposed by the nsitem. 0001 below fixes this by
creating a separate p_names field in an nsitem. (There are some
comments in 0001 referencing JOIN USING aliases, but no actual code
for the feature.) That saves one indirection in common code paths,
so it's possibly a win on its own. Then 0002 is your patch rebased
onto that infrastructure, and with some cleanup of my own.
One thing I ran into is that a whole-row Var for the JOIN USING
alias did the wrong thing. It should have only the common columns,
but we were getting all the join columns in examples such as the
row_to_json() test case I added. This is difficult to fix given
the existing whole-row Var infrastructure, unless we want to make a
separate RTE for the JOIN USING alias, which I think is overkill.
What I did about this was to make transformWholeRowRef produce a
ROW() construct --- which is something that a whole-row Var for a
join would be turned into by the planner anyway. I think this is
semantically OK since the USING construct has already nailed down
the number and types of the join's common columns; there's no
prospect of those changing underneath a stored view query. It's
slightly ugly because the ROW() construct will be visible in a
decompiled view instead of "tu.*" like you wrote originally,
but I'm willing to live with that.
Speaking of decompiled views, I feel like ruleutils.c could do with
a little more work to teach it that these aliases are available.
Right now, it resorts to ugly workarounds:
regression=# create table t1 (a int, b int, c int);
CREATE TABLE
regression=# create table t2 (a int, x int, y int);
CREATE TABLE
regression=# create view vvv as select tj.a, t1.b from t1 full join t2 using(a) as tj, t1 as tx;
CREATE VIEW
regression=# \d+ vvv
View "public.vvv"
Column | Type | Collation | Nullable | Default | Storage | Description
--------+---------+-----------+----------+---------+---------+-------------
a | integer | | | | plain |
b | integer | | | | plain |
View definition:
SELECT a,
t1.b
FROM t1
FULL JOIN t2 USING (a) AS tj,
t1 tx(a_1, b, c);
That's not wrong, but it could likely be done better if ruleutils
realized it could use the tj alias to reference the column, instead
of having to force unqualified "a" to be a globally unique name.
I ran out of steam to look into that, though, and it's probably
something that could be improved later.
One other cosmetic thing is that this:
regression=# select tu.* from (t1 join t2 using(a) as tu) tx;
ERROR: missing FROM-clause entry for table "tu"
LINE 1: select tu.* from (t1 join t2 using(a) as tu) tx;
^
is a relatively dumb error message, compared to
regression=# select t1.* from (t1 join t2 using(a) as tu) tx;
ERROR: invalid reference to FROM-clause entry for table "t1"
LINE 1: select t1.* from (t1 join t2 using(a) as tu) tx;
^
HINT: There is an entry for table "t1", but it cannot be referenced from this part of the query.
I didn't look into why that isn't working, but maybe errorMissingRTE
needs to trawl all of the ParseNamespaceItems not just the RTEs.
Anyway, since these remaining gripes are cosmetic, I'll mark this RFC.
regards, tom lane