Targetlist lost when CTE join <targetlist lost when CTE join>

Started by Zhang Mingliover 2 years ago6 messages
#1Zhang Mingli
zmlpostgres@gmail.com

Hi,

Mini repo

create table t1(c1 int, c2 int);
CREATE TABLE
create table t2(c1 int, c2 int);
CREATE TABLE
explain with cte1 as (insert into t2 values (1, 2) returning *) select * from cte1 join t1 using(c1);
 QUERY PLAN
----------------------------------------------------------------
 Hash Join (cost=0.04..41.23 rows=11 width=12)
 Hash Cond: (t1.c1 = cte1.c1)
 CTE cte1
 -> Insert on t2 (cost=0.00..0.01 rows=1 width=8)
 -> Result (cost=0.00..0.01 rows=1 width=8)
 -> Seq Scan on t1 (cost=0.00..32.60 rows=2260 width=8)
 -> Hash (cost=0.02..0.02 rows=1 width=8)
 -> CTE Scan on cte1 (cost=0.00..0.02 rows=1 width=8)
(8 rows)

with cte1 as (insert into t2 values (1, 2) returning *) select * from cte1 join t1 using(c1);
 c1 | c2 | c2
----+----+----
(0 rows)

truncate t2;
TRUNCATE TABLE
with cte1 as (insert into t2 values (1, 2) returning *) select cte1.*, t1.* from cte1 join t1 using(c1);
 c1 | c2 | c1 | c2
----+----+----+----
(0 rows)

Table t1 and  t2 both has 2 columns: c1, c2, when CTE join select *, the result target list seems to lost one’s column c1.
But it looks good when select cte1.* and t1.* explicitly .

Is it a bug?

Regards,
Zhang Mingli

#2Zhang Mingli
zmlpostgres@gmail.com
In reply to: Zhang Mingli (#1)
Re: Targetlist lost when CTE join <targetlist lost when CTE join>

Hi,

Explain verbose, seems HashJoin node drop that column.

gpadmin=# explain(verbose) with cte1 as (insert into t2 values (1, 2) returning *) select * from cte1 join t1 using(c1);
 QUERY PLAN
-------------------------------------------------------------------
 Hash Join (cost=0.04..41.23 rows=11 width=12)
 Output: cte1.c1, cte1.c2, t1.c2
 Hash Cond: (t1.c1 = cte1.c1)
 CTE cte1
 -> Insert on public.t2 (cost=0.00..0.01 rows=1 width=8)
 Output: t2.c1, t2.c2
 -> Result (cost=0.00..0.01 rows=1 width=8)
 Output: 1, 2
 -> Seq Scan on public.t1 (cost=0.00..32.60 rows=2260 width=8)
 Output: t1.c1, t1.c2
 -> Hash (cost=0.02..0.02 rows=1 width=8)
 Output: cte1.c1, cte1.c2
 -> CTE Scan on cte1 (cost=0.00..0.02 rows=1 width=8)
 Output: cte1.c1, cte1.c2
(14 rows)

Regards,
Zhang Mingli

#3Julien Rouhaud
rjuju123@gmail.com
In reply to: Zhang Mingli (#1)
Re: Targetlist lost when CTE join <targetlist lost when CTE join>

Hi,

On Wed, Jun 28, 2023 at 04:52:34PM +0800, Zhang Mingli wrote:

Mini repo

create table t1(c1 int, c2 int);
CREATE TABLE
create table t2(c1 int, c2 int);
CREATE TABLE
explain with cte1 as (insert into t2 values (1, 2) returning *) select * from cte1 join t1 using(c1);
 QUERY PLAN
----------------------------------------------------------------
 Hash Join (cost=0.04..41.23 rows=11 width=12)
 Hash Cond: (t1.c1 = cte1.c1)
 CTE cte1
 -> Insert on t2 (cost=0.00..0.01 rows=1 width=8)
 -> Result (cost=0.00..0.01 rows=1 width=8)
 -> Seq Scan on t1 (cost=0.00..32.60 rows=2260 width=8)
 -> Hash (cost=0.02..0.02 rows=1 width=8)
 -> CTE Scan on cte1 (cost=0.00..0.02 rows=1 width=8)
(8 rows)

with cte1 as (insert into t2 values (1, 2) returning *) select * from cte1 join t1 using(c1);
 c1 | c2 | c2
----+----+----
(0 rows)

truncate t2;
TRUNCATE TABLE
with cte1 as (insert into t2 values (1, 2) returning *) select cte1.*, t1.* from cte1 join t1 using(c1);
 c1 | c2 | c1 | c2
----+----+----+----
(0 rows)

Table t1 and  t2 both has 2 columns: c1, c2, when CTE join select *, the result target list seems to lost one’s column c1.
But it looks good when select cte1.* and t1.* explicitly .

Is it a bug?

This is working as intended. When using a USING clause you "merge" both
columns so the final target list only contain one version of the merged
columns, which doesn't happen if you use e.g. ON instead. I'm assuming that
what the SQL standard says, but I don't have a copy to confirm.

#4Zhang Mingli
zmlpostgres@gmail.com
In reply to: Julien Rouhaud (#3)
Re: Targetlist lost when CTE join <targetlist lost when CTE join>

Hi

Regards,
Zhang Mingli
On Jun 28, 2023, 17:17 +0800, Julien Rouhaud <rjuju123@gmail.com>, wrote:

This is working as intended. When using a USING clause you "merge" both
columns so the final target list only contain one version of the merged
columns, which doesn't happen if you use e.g. ON instead. I'm assuming that
what the SQL standard says, but I don't have a copy to confirm.

Thanks. You’r right.

Have a test:

gpadmin=# with cte1 as (insert into t2 values (1, 2) returning *) select * from cte1 join t1 on t1.c1 = cte1.c1;
 c1 | c2 | c1 | c2
----+----+----+----
(0 rows)

#5Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#3)
Re: Targetlist lost when CTE join <targetlist lost when CTE join>

On Wed, Jun 28, 2023 at 05:17:14PM +0800, Julien Rouhaud wrote:

Table t1 and  t2 both has 2 columns: c1, c2, when CTE join select *, the result target list seems to lost one’s column c1.
But it looks good when select cte1.* and t1.* explicitly .

Is it a bug?

This is working as intended. When using a USING clause you "merge" both
columns so the final target list only contain one version of the merged
columns, which doesn't happen if you use e.g. ON instead. I'm assuming that
what the SQL standard says, but I don't have a copy to confirm.

I forgot to mention that this is actually documented:

https://www.postgresql.org/docs/current/queries-table-expressions.html

Furthermore, the output of JOIN USING suppresses redundant columns: there is no
need to print both of the matched columns, since they must have equal values.
While JOIN ON produces all columns from T1 followed by all columns from T2,
JOIN USING produces one output column for each of the listed column pairs (in
the listed order), followed by any remaining columns from T1, followed by any
remaining columns from T2.

#6Zhang Mingli
zmlpostgres@gmail.com
In reply to: Julien Rouhaud (#5)
Re: Targetlist lost when CTE join <targetlist lost when CTE join>

HI,

On Jun 28, 2023, 17:26 +0800, Julien Rouhaud <rjuju123@gmail.com>, wrote:

On Wed, Jun 28, 2023 at 05:17:14PM +0800, Julien Rouhaud wrote:

Table t1 and  t2 both has 2 columns: c1, c2, when CTE join select *, the result target list seems to lost one’s column c1.
But it looks good when select cte1.* and t1.* explicitly .

Is it a bug?

This is working as intended. When using a USING clause you "merge" both
columns so the final target list only contain one version of the merged
columns, which doesn't happen if you use e.g. ON instead. I'm assuming that
what the SQL standard says, but I don't have a copy to confirm.

I forgot to mention that this is actually documented:

https://www.postgresql.org/docs/current/queries-table-expressions.html

Furthermore, the output of JOIN USING suppresses redundant columns: there is no
need to print both of the matched columns, since they must have equal values.
While JOIN ON produces all columns from T1 followed by all columns from T2,
JOIN USING produces one output column for each of the listed column pairs (in
the listed order), followed by any remaining columns from T1, followed by any
remaining columns from T2.

Thanks for your help.

Regards,
Zhang Mingli