New CORRESPONDING clause design

Started by Surafel Temesgenabout 9 years ago38 messageshackers
Jump to latest
#1Surafel Temesgen
surafel3000@gmail.com

I am new here and I really want to contribute, I have read same resource
that help understanding database system and postgresql. I would like to
start implementing sql syntax corresponding by clause because I believe
implementing sql syntax gives an opportunity to familiarize many part of
postgresql source code. Previous implementation is here
</messages/by-id/CAJZSWkWN3YwQ01C3+cq0+eyZ1DmK=69_6vryrsVGMC=+fWrSZA@mail.gmail.com&gt;and
have an issue on explain query and break cases on unlabeled NULLs
To repeat what a corresponding by clause means
Corresponding clause either contains a BY(...) clause or not. If it
doesn't have a BY(...) clause the usage is as follows.

SELECT 1 a, 2 b, 3 c UNION CORRESPONDING SELECT 4 b, 5 d, 6 c, 7 f;

with output:
b c
-----
2 3
4 6

i.e. matching column names are filtered and are only output from the
whole set operation clause.

If we introduce a BY(...) clause, then column names are further
intersected with that BY clause:

SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(b) SELECT 4 b, 5 d, 6 c, 7 f;

with output:

b
--
2
4
My design is
*In parsing stage*
1. Check at least one common column name appear in queries
2. If corresponding column list is not specified, then make corresponding
list from common column name in queries target lists in the order
that those column names appear in first query
3. If corresponding column list is specified, then check that every column
name
in The corresponding column list appears in column names of both queries
*In planner*
1. Take projection columen name from corresponding list
2. Reorder left and right queries target lists according to corresponding
list
3. For each query, project columens as many as number of corresponding
columen.
Continue with normal set operation process

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Surafel Temesgen (#1)
Re: New CORRESPONDING clause design

On Tue, Jan 17, 2017 at 12:37 AM, Surafel Temsgen <surafel3000@gmail.com> wrote:

I am new here and I really want to contribute, I have read same resource
that help understanding database system and postgresql. I would like to
start implementing sql syntax corresponding by clause because I believe
implementing sql syntax gives an opportunity to familiarize many part of
postgresql source code. Previous implementation is here and have an issue on
explain query and break cases on unlabeled NULLs
To repeat what a corresponding by clause means
Corresponding clause either contains a BY(...) clause or not. If it
doesn't have a BY(...) clause the usage is as follows.

This is great stuff. Does the syntax only apply to UNION? I would
imagine it would also apply to INTERSECT/EXCEPT? What about UNION
ALL?

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3David Fetter
david@fetter.org
In reply to: Merlin Moncure (#2)
Re: New CORRESPONDING clause design

On Tue, Jan 17, 2017 at 08:20:25AM -0600, Merlin Moncure wrote:

On Tue, Jan 17, 2017 at 12:37 AM, Surafel Temsgen <surafel3000@gmail.com> wrote:

I am new here and I really want to contribute, I have read same resource
that help understanding database system and postgresql. I would like to
start implementing sql syntax corresponding by clause because I believe
implementing sql syntax gives an opportunity to familiarize many part of
postgresql source code. Previous implementation is here and have an issue on
explain query and break cases on unlabeled NULLs
To repeat what a corresponding by clause means
Corresponding clause either contains a BY(...) clause or not. If it
doesn't have a BY(...) clause the usage is as follows.

This is great stuff. Does the syntax only apply to UNION? I would
imagine it would also apply to INTERSECT/EXCEPT? What about UNION
ALL?

My draft working standard from 2011 says in 7IWD-02-Foundation section 7.13 <query expression>:

a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall not be specified more than once
and within the columns of T2, equivalent <column name>s shall not be specified more than
once.
ii) At least one column of T1 shall have a <column name> that is the <column name> of some
column of T2.
iii) Case:
1) If <corresponding column list> is not specified, then let SL be a <select list> of those
<column name>s that are <column name>s of both T1 and T2 in the order that those
<column name>s appear in T1.
2) If <corresponding column list> is specified, then let SL be a <select list> of those <column
name>s explicitly appearing in the <corresponding column list> in the order that these
<column name>s appear in the <corresponding column list>. Every <column name> in
the <corresponding column list> shall be a <column name> of both T1 and T2.
iv) The <query term> or <query expression body> is equivalent to:
( SELECT SL FROM TN1 ) OP ( SELECT SL FROM TN2 )

Earlier, it defines ( UNION | EXCEPT ) [ ALL | DISTINCT ] to have a
meaning in this context, to wit:

<query expression body> ::=
<query term>
| <query expression body> UNION [ ALL | DISTINCT ]
[ <corresponding spec> ] <query term>
| <query expression body> EXCEPT [ ALL | DISTINCT ]
[ <corresponding spec> ] <query term>

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Surafel Temesgen (#1)
Re: New CORRESPONDING clause design

Surafel Temsgen <surafel3000@gmail.com> writes:

My design is
*In parsing stage*
1. Check at least one common column name appear in queries
2. If corresponding column list is not specified, then make corresponding
list from common column name in queries target lists in the order
that those column names appear in first query
3. If corresponding column list is specified, then check that every column
name
in The corresponding column list appears in column names of both queries
*In planner*
1. Take projection columen name from corresponding list
2. Reorder left and right queries target lists according to corresponding
list
3. For each query, project columens as many as number of corresponding
columen.

FWIW, I think you need to do most of that work in the parser, not the
planner. The parser certainly has to determine the actual output column
names and types of the setop construct, and we usually consider that the
parsing stage is expected to fully determine the semantics of the query.
So I'd envision the parser as emitting some explicit representation of
the column mapping for each side, rather than expecting the planner to
determine for itself what maps to what.

It probably does make sense for the actual implementation to involve
inserting projection nodes below the setop. I'm pretty sure the executor
code for setops expects to just pass input tuples through without
projection. You could imagine changing that, but it would add a penalty
for non-CORRESPONDING setops, which IMO would be the wrong tradeoff.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Surafel Temesgen
surafel3000@gmail.com
In reply to: Tom Lane (#4)
Re: New CORRESPONDING clause design

Here is the implementation of the clause with the slight change, instead of
doing column mapping for each side of leaf Queries in planner I make the
projection nodes output to corresponding column lists only.

This patch compiles and tests successfully with master branch on
ubuntu-15.10-desktop-amd64.It also includes documentation and new
regression tests in union.sql.

Regards

Surafel Temesgen

On Tue, Jan 17, 2017 at 8:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Surafel Temsgen <surafel3000@gmail.com> writes:

My design is
*In parsing stage*
1. Check at least one common column name appear in queries
2. If corresponding column list is not specified, then make corresponding
list from common column name in queries target lists in the order
that those column names appear in first query
3. If corresponding column list is specified, then check that every

column

name
in The corresponding column list appears in column names of both

queries

*In planner*
1. Take projection columen name from corresponding list
2. Reorder left and right queries target lists according to corresponding
list
3. For each query, project columens as many as number of corresponding
columen.

FWIW, I think you need to do most of that work in the parser, not the
planner. The parser certainly has to determine the actual output column
names and types of the setop construct, and we usually consider that the
parsing stage is expected to fully determine the semantics of the query.
So I'd envision the parser as emitting some explicit representation of
the column mapping for each side, rather than expecting the planner to
determine for itself what maps to what.

It probably does make sense for the actual implementation to involve
inserting projection nodes below the setop. I'm pretty sure the executor
code for setops expects to just pass input tuples through without
projection. You could imagine changing that, but it would add a penalty
for non-CORRESPONDING setops, which IMO would be the wrong tradeoff.

regards, tom lane

Attachments:

corresponding_clause_v1.patchapplication/octet-stream; name=corresponding_clause_v1.patchDownload+1142-186
#6Robert Haas
robertmhaas@gmail.com
In reply to: Surafel Temesgen (#5)
Re: New CORRESPONDING clause design

On Thu, Feb 16, 2017 at 8:28 PM, Surafel Temsgen <surafel3000@gmail.com> wrote:

Here is the implementation of the clause with the slight change, instead of
doing column mapping for each side of leaf Queries in planner I make the
projection nodes output to corresponding column lists only.

This patch compiles and tests successfully with master branch on
ubuntu-15.10-desktop-amd64.It also includes documentation and new regression
tests in union.sql.

You should probably add this to https://commitfest.postgresql.org/ so
that it doesn't get forgotten about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#1)
Re: New CORRESPONDING clause design

Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags <optional> in documentation
2. bad name "changeTargetEntry" - should be makeTargetEntry?
3. Why you removed lot of asserts in prepunion.c? These asserts should be
valid still
4. make_coresponding_target has wrong formatting
5. error "%s queries with a CORRESPONDING clause must have at least one
column with the same name" has wrong formatting, you can show position
6. previous issue is repeated - look on formatting ereport function,
please, you can use DETAIL and HINT fields
7. corresponding clause should to contain column list (I am looking to
ANSI/SQL 99) - you are using expr_list, what has not sense and probably it
has impact on all implementation.
8. typo orderCorrespondingLsit(List *targetlist)
9. I miss more tests for CORRESPONDING BY
10. if I understand to this feature, this query should to work

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4
a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If it is your first patch to Postgres, then it is perfect work!

The @7 is probably most significant - I dislike a expression list there.
name_list should be better there.

Regards

Pavel

#8Surafel Temesgen
surafel3000@gmail.com
In reply to: Pavel Stehule (#7)
Re: New CORRESPONDING clause design

Hi ,

Here is a patch corrected as your feedback except missed tests case because
corresponding by clause is implemented on the top of set operation and you
can’t do that to set operation without corresponding by clause too

Eg

postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

ERROR: each UNION query must have the same number of columns

LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

^

postgres=# create table t1(a int, b int, c int);

CREATE TABLE

postgres=# create table t2(a int, b int);

CREATE TABLE

postgres=# select * from t1 union select * from t2;

ERROR: each UNION query must have the same number of columns

LINE 1: select * from t1 union select * from t2;

Thanks

Surafel

On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Show quoted text

Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags <optional> in documentation
2. bad name "changeTargetEntry" - should be makeTargetEntry?
3. Why you removed lot of asserts in prepunion.c? These asserts should be
valid still
4. make_coresponding_target has wrong formatting
5. error "%s queries with a CORRESPONDING clause must have at least one
column with the same name" has wrong formatting, you can show position
6. previous issue is repeated - look on formatting ereport function,
please, you can use DETAIL and HINT fields
7. corresponding clause should to contain column list (I am looking to
ANSI/SQL 99) - you are using expr_list, what has not sense and probably it
has impact on all implementation.
8. typo orderCorrespondingLsit(List *targetlist)
9. I miss more tests for CORRESPONDING BY
10. if I understand to this feature, this query should to work

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If it is your first patch to Postgres, then it is perfect work!

The @7 is probably most significant - I dislike a expression list there.
name_list should be better there.

Regards

Pavel

Attachments:

corresponding_clause_v2.patchapplication/octet-stream; name=corresponding_clause_v2.patchDownload+1114-190
#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#8)
Re: New CORRESPONDING clause design

2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Hi ,

Here is a patch corrected as your feedback except missed tests case
because corresponding by clause is implemented on the top of set operation
and you can’t do that to set operation without corresponding by clause too

I don't understand.

The following statement should to work

postgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30
as b, 40 as c;

ERROR: each UNION query must have the same number of columns
LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...

Corresponding clause should to work like projection filter.

Regards

Pavel

Show quoted text

Eg

postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

ERROR: each UNION query must have the same number of columns

LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

^

postgres=# create table t1(a int, b int, c int);

CREATE TABLE

postgres=# create table t2(a int, b int);

CREATE TABLE

postgres=# select * from t1 union select * from t2;

ERROR: each UNION query must have the same number of columns

LINE 1: select * from t1 union select * from t2;

Thanks

Surafel

On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags <optional> in documentation
2. bad name "changeTargetEntry" - should be makeTargetEntry?
3. Why you removed lot of asserts in prepunion.c? These asserts should be
valid still
4. make_coresponding_target has wrong formatting
5. error "%s queries with a CORRESPONDING clause must have at least one
column with the same name" has wrong formatting, you can show position
6. previous issue is repeated - look on formatting ereport function,
please, you can use DETAIL and HINT fields
7. corresponding clause should to contain column list (I am looking to
ANSI/SQL 99) - you are using expr_list, what has not sense and probably it
has impact on all implementation.
8. typo orderCorrespondingLsit(List *targetlist)
9. I miss more tests for CORRESPONDING BY
10. if I understand to this feature, this query should to work

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If it is your first patch to Postgres, then it is perfect work!

The @7 is probably most significant - I dislike a expression list there.
name_list should be better there.

Regards

Pavel

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#9)
Re: New CORRESPONDING clause design

hi

2017-03-09 17:19 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Hi ,

Here is a patch corrected as your feedback except missed tests case
because corresponding by clause is implemented on the top of set operation
and you can’t do that to set operation without corresponding by clause too

I don't understand.

The following statement should to work

postgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30
as b, 40 as c;

ERROR: each UNION query must have the same number of columns
LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...

Corresponding clause should to work like projection filter.

I found a link to postgresql mailing list related to some older try to this
feature implementation

/messages/by-id/CAJZSWkX7C6Wmfo9Py4BaF8vHz_Ofko3AFSOsJPsb17rGmgBuDQ@mail.gmail.com

Show quoted text

Regards

Pavel

Eg

postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

ERROR: each UNION query must have the same number of columns

LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

^

postgres=# create table t1(a int, b int, c int);

CREATE TABLE

postgres=# create table t2(a int, b int);

CREATE TABLE

postgres=# select * from t1 union select * from t2;

ERROR: each UNION query must have the same number of columns

LINE 1: select * from t1 union select * from t2;

Thanks

Surafel

On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags <optional> in documentation
2. bad name "changeTargetEntry" - should be makeTargetEntry?
3. Why you removed lot of asserts in prepunion.c? These asserts should
be valid still
4. make_coresponding_target has wrong formatting
5. error "%s queries with a CORRESPONDING clause must have at least one
column with the same name" has wrong formatting, you can show position
6. previous issue is repeated - look on formatting ereport function,
please, you can use DETAIL and HINT fields
7. corresponding clause should to contain column list (I am looking to
ANSI/SQL 99) - you are using expr_list, what has not sense and probably it
has impact on all implementation.
8. typo orderCorrespondingLsit(List *targetlist)
9. I miss more tests for CORRESPONDING BY
10. if I understand to this feature, this query should to work

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If it is your first patch to Postgres, then it is perfect work!

The @7 is probably most significant - I dislike a expression list there.
name_list should be better there.

Regards

Pavel

#11Surafel Temesgen
surafel3000@gmail.com
In reply to: Pavel Stehule (#10)
Re: New CORRESPONDING clause design

Yes, you are correct it should to work on CORRESPONDING clause case. SQL
20nn standard draft only said each query to be of the same degree in a case
of set operation without corresponding clause. The attached patch is
corrected as such .I add those new test case to regression test too

Regards

Surafel

On Thu, Mar 9, 2017 at 9:49 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Show quoted text

hi

2017-03-09 17:19 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Hi ,

Here is a patch corrected as your feedback except missed tests case
because corresponding by clause is implemented on the top of set operation
and you can’t do that to set operation without corresponding by clause too

I don't understand.

The following statement should to work

postgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30
as b, 40 as c;

ERROR: each UNION query must have the same number of columns
LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...

Corresponding clause should to work like projection filter.

I found a link to postgresql mailing list related to some older try to
this feature implementation

/messages/by-id/CAJZSWkX7C6Wmfo9Py4BaF8vHz_
Ofko3AFSOsJPsb17rGmgBuDQ@mail.gmail.com

Regards

Pavel

Eg

postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

ERROR: each UNION query must have the same number of columns

LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

^

postgres=# create table t1(a int, b int, c int);

CREATE TABLE

postgres=# create table t2(a int, b int);

CREATE TABLE

postgres=# select * from t1 union select * from t2;

ERROR: each UNION query must have the same number of columns

LINE 1: select * from t1 union select * from t2;

Thanks

Surafel

On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags <optional> in documentation
2. bad name "changeTargetEntry" - should be makeTargetEntry?
3. Why you removed lot of asserts in prepunion.c? These asserts should
be valid still
4. make_coresponding_target has wrong formatting
5. error "%s queries with a CORRESPONDING clause must have at least one
column with the same name" has wrong formatting, you can show position
6. previous issue is repeated - look on formatting ereport function,
please, you can use DETAIL and HINT fields
7. corresponding clause should to contain column list (I am looking to
ANSI/SQL 99) - you are using expr_list, what has not sense and probably it
has impact on all implementation.
8. typo orderCorrespondingLsit(List *targetlist)
9. I miss more tests for CORRESPONDING BY
10. if I understand to this feature, this query should to work

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If it is your first patch to Postgres, then it is perfect work!

The @7 is probably most significant - I dislike a expression list
there. name_list should be better there.

Regards

Pavel

Attachments:

corresponding_clause_v3.patchapplication/octet-stream; name=corresponding_clause_v3.patchDownload+1146-201
#12Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#11)
Re: New CORRESPONDING clause design

2017-03-10 10:13 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Yes, you are correct it should to work on CORRESPONDING clause case. SQL
20nn standard draft only said each query to be of the same degree in a case
of set operation without corresponding clause. The attached patch is
corrected as such .I add those new test case to regression test too

Thank you - I will recheck it.

Regards

Pavel

Show quoted text

Regards

Surafel

On Thu, Mar 9, 2017 at 9:49 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

hi

2017-03-09 17:19 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-03-09 13:18 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Hi ,

Here is a patch corrected as your feedback except missed tests case
because corresponding by clause is implemented on the top of set operation
and you can’t do that to set operation without corresponding by clause too

I don't understand.

The following statement should to work

postgres=# select 10 as a, 20 as b union corresponding select 20 as a,
30 as b, 40 as c;

ERROR: each UNION query must have the same number of columns
LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...

Corresponding clause should to work like projection filter.

I found a link to postgresql mailing list related to some older try to
this feature implementation

/messages/by-id/CAJZSWkX7C6Wmfo9Py4Ba
F8vHz_Ofko3AFSOsJPsb17rGmgBuDQ@mail.gmail.com

Regards

Pavel

Eg

postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

ERROR: each UNION query must have the same number of columns

LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;

^

postgres=# create table t1(a int, b int, c int);

CREATE TABLE

postgres=# create table t2(a int, b int);

CREATE TABLE

postgres=# select * from t1 union select * from t2;

ERROR: each UNION query must have the same number of columns

LINE 1: select * from t1 union select * from t2;

Thanks

Surafel

On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule <pavel.stehule@gmail.com

wrote:

Hi

I am sending a review of this interesting feature.

I found following issues, questions:

1. unclosed tags <optional> in documentation
2. bad name "changeTargetEntry" - should be makeTargetEntry?
3. Why you removed lot of asserts in prepunion.c? These asserts should
be valid still
4. make_coresponding_target has wrong formatting
5. error "%s queries with a CORRESPONDING clause must have at least
one column with the same name" has wrong formatting, you can show position
6. previous issue is repeated - look on formatting ereport function,
please, you can use DETAIL and HINT fields
7. corresponding clause should to contain column list (I am looking to
ANSI/SQL 99) - you are using expr_list, what has not sense and probably it
has impact on all implementation.
8. typo orderCorrespondingLsit(List *targetlist)
9. I miss more tests for CORRESPONDING BY
10. if I understand to this feature, this query should to work

postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, 6 c, 8 d;
ERROR: each UNION query must have the same number of columns
LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ...

postgres=# create table t1(a int, b int, c int);
CREATE TABLE
Time: 63,260 ms
postgres=# create table t2(a int, b int);
CREATE TABLE
Time: 57,120 ms
postgres=# select * from t1 union corresponding select * from t2;
ERROR: each UNION query must have the same number of columns
LINE 1: select * from t1 union corresponding select * from t2;

If it is your first patch to Postgres, then it is perfect work!

The @7 is probably most significant - I dislike a expression list
there. name_list should be better there.

Regards

Pavel

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#12)
Re: New CORRESPONDING clause design

Hi

2017-03-10 12:55 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-03-10 10:13 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Yes, you are correct it should to work on CORRESPONDING clause case. SQL
20nn standard draft only said each query to be of the same degree in a case
of set operation without corresponding clause. The attached patch is
corrected as such .I add those new test case to regression test too

Thank you - I will recheck it.

Fast check - it looks well

Regards

Pavel

#14Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#13)
Re: New CORRESPONDING clause design

Hi

2017-03-10 13:49 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

2017-03-10 12:55 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-03-10 10:13 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

Yes, you are correct it should to work on CORRESPONDING clause case. SQL
20nn standard draft only said each query to be of the same degree in a case
of set operation without corresponding clause. The attached patch is
corrected as such .I add those new test case to regression test too

Thank you - I will recheck it.

Fast check - it looks well

I am sending minor update - cleaning formatting and white spaces, error
messages + few more tests

It is working very well.

Maybe correspondingClause needs own node type with attached location. Then
context can be much better positioned.

Regards

Pavel

Show quoted text

Regards

Pavel

Attachments:

corresponding_clause_v4.patchtext/x-patch; charset=US-ASCII; name=corresponding_clause_v4.patchDownload+1167-198
#15Surafel Temesgen
surafel3000@gmail.com
In reply to: Pavel Stehule (#14)
Re: New CORRESPONDING clause design

On Sat, Mar 11, 2017 at 9:01 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

I am sending minor update - cleaning formatting and white spaces, error
messages + few more tests

Thank you very much for your help

Maybe correspondingClause needs own node type with attached location. Then
context can be much better positioned.

I think we can solve it by using your option or using expr_list for
corresponding column and check the syntax manually.

In my opinion, the last option eliminate the introduction of new node for
only the sake of error position.

What did you think about the second option?

Regards

Surafel

#16Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#15)
Re: New CORRESPONDING clause design

2017-03-13 14:13 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

On Sat, Mar 11, 2017 at 9:01 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

I am sending minor update - cleaning formatting and white spaces, error
messages + few more tests

Thank you very much for your help

Maybe correspondingClause needs own node type with attached location.
Then context can be much better positioned.

I think we can solve it by using your option or using expr_list for
corresponding column and check the syntax manually.

In my opinion, the last option eliminate the introduction of new node for
only the sake of error position.

What did you think about the second option?

I don't like it too much - using expr only for location is too misuse.

Some errors are related to just CORRESPONDING without any columns. So using
expr doesn't help here. So parse node CORRESPONDING can solve both issues.

Regards

Pavel

Show quoted text

Regards

Surafel

#17Surafel Temesgen
surafel3000@gmail.com
In reply to: Pavel Stehule (#16)
Re: New CORRESPONDING clause design

hi

Some errors are related to just CORRESPONDING without any columns. So using

expr doesn't help here. So parse node CORRESPONDING can solve both issues.

In current implementation pointing to a node means pointing to a node’s
first element so I don’t think we can be able to point to CORRESPONDING
without any columns

I find out that there is already a node prepare for the case called A_Const.
The attached patch use that node

Regards
Surafel

Attachments:

corresponding_clause_v5.patchapplication/octet-stream; name=corresponding_clause_v5.patchDownload+1178-201
#18Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#17)
Re: New CORRESPONDING clause design

Hi

2017-03-14 16:33 GMT+01:00 Surafel Temesgen <surafel3000@gmail.com>:

hi

Some errors are related to just CORRESPONDING without any columns. So

using expr doesn't help here. So parse node CORRESPONDING can solve both
issues.

In current implementation pointing to a node means pointing to a node’s
first element so I don’t think we can be able to point to CORRESPONDING
without any columns

I find out that there is already a node prepare for the case called
A_Const.
The attached patch use that node

It looks better

I fixed format of comments and some too long lines.

all regress tests passed

I have not any objection - I'll mark this patch as ready for commiter

Regards

Pavel

Show quoted text

Regards
Surafel

Attachments:

corresponding_clause_v6.patchtext/x-patch; charset=US-ASCII; name=corresponding_clause_v6.patchDownload+1211-199
#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#18)
Re: New CORRESPONDING clause design

Pavel Stehule <pavel.stehule@gmail.com> writes:

I have not any objection - I'll mark this patch as ready for commiter

I took a quick look through this and noted that it fails to touch
ruleutils.c, which means that dumping of views containing CORRESPONDING
certainly doesn't work.

Also, the changes in parser/analyze.c seem rather massive and
correspondingly hard to review. Is it possible to rearrange the
patch to reduce the size of that diff? If you can avoid moving
or reindenting existing code, that'd help.

The code in that area seems rather confused, too. For instance,
I'm not sure exactly what orderCorrespondingList() is good for,
but it certainly doesn't look to be doing anything that either its
name or its header comment (or the comments at the call sites) would
suggest. Its input and output tlists are always in the same order.

I'm a little disturbed by the fact that determineMatchingColumns()
is called twice, and more disturbed by the fact that it looks to be
O(N^2). This could be really slow with a lot of columns, couldn't it?

I also think there should be some comments about exactly what matching
semantics we're enforcing. The SQL standard says

a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall
not be specified more than once and within the columns of
T2, equivalent <column name>s shall not be specified more
than once.

That seems unreasonably stringent to me; it ought to be sufficient to
forbid duplicates of the names listed in CORRESPONDING, or the common
column names if there's no BY list. But whichever restriction you prefer,
this code seems to be failing to check it --- I certainly don't see any
new error message about "column name "foo" appears more than once".

I don't think you want to be leaving behind debugging cruft like this:
+ elog(DEBUG4, "%s", ltle->resname);

I'm not impressed by using A_Const for the members of the CORRESPONDING
name list. That's not a clever solution, that's a confusing kluge,
because it's a complete violation of the meaning of A_Const. Elsewhere
we just use lists of String for name lists, and that seems sufficient
here. Personally I'd just use the existing columnList production rather
than rolling your own.

The comments in parsenodes.h about the new fields seem rather confused,
in fact I think they're probably backwards.

Also, I thought the test cases were excessively uninspired and repetitive.
This, for example, seems like about two tests too many:

+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(a) SELECT 4 a, 5 b, 6 c;
+ a 
+---
+ 1
+ 4
+(2 rows)
+
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(b) SELECT 4 a, 5 b, 6 c;
+ b 
+---
+ 2
+ 5
+(2 rows)
+
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(c) SELECT 4 a, 5 b, 6 c;
+ c 
+---
+ 3
+ 6
+(2 rows)

without even considering the fact that these are pretty duplicative of
tests around them. And some of the added tests seem to be just testing
basic UNION functionality, which we already have tests for. I fail to
see the point of adding any of these:

+SELECT 1 AS two UNION CORRESPONDING SELECT 2 two;
+ two 
+-----
+   1
+   2
+(2 rows)
+
+SELECT 1 AS one UNION CORRESPONDING SELECT 1 one;
+ one 
+-----
+   1
+(1 row)
+
+SELECT 1 AS two UNION ALL CORRESPONDING SELECT 2 two;
+ two 
+-----
+   1
+   2
+(2 rows)
+
+SELECT 1 AS two UNION ALL CORRESPONDING SELECT 1 two;
+ two 
+-----
+   1
+   1
+(2 rows)

What I think actually is needed is some targeted testing showing
non-interference with nearby features. As an example, CORRESPONDING
can't safely be implemented by just replacing the tlists of the
input sub-selects with shortened versions, because that would break
situations such as DISTINCT in the sub-selects. I think it'd be a
good idea to have a test along the lines of

SELECT DISTINCT x, y FROM ...
UNION ALL CORRESPONDING
SELECT DISTINCT x, z FROM ...

with values chosen to show that the DISTINCT operators did operate
on x/y and x/z, not just x alone.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#19)
Re: New CORRESPONDING clause design

2017-03-18 17:50 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I have not any objection - I'll mark this patch as ready for commiter

I took a quick look through this and noted that it fails to touch
ruleutils.c, which means that dumping of views containing CORRESPONDING
certainly doesn't work.

it my mistake - this bug I should to see :(

Also, the changes in parser/analyze.c seem rather massive and
correspondingly hard to review. Is it possible to rearrange the
patch to reduce the size of that diff? If you can avoid moving
or reindenting existing code, that'd help.

The code in that area seems rather confused, too. For instance,
I'm not sure exactly what orderCorrespondingList() is good for,
but it certainly doesn't look to be doing anything that either its
name or its header comment (or the comments at the call sites) would
suggest. Its input and output tlists are always in the same order.

I'm a little disturbed by the fact that determineMatchingColumns()
is called twice, and more disturbed by the fact that it looks to be
O(N^2). This could be really slow with a lot of columns, couldn't it?

I also think there should be some comments about exactly what matching
semantics we're enforcing. The SQL standard says

a) If CORRESPONDING is specified, then:
i) Within the columns of T1, equivalent <column name>s shall
not be specified more than once and within the columns of
T2, equivalent <column name>s shall not be specified more
than once.

That seems unreasonably stringent to me; it ought to be sufficient to
forbid duplicates of the names listed in CORRESPONDING, or the common
column names if there's no BY list. But whichever restriction you prefer,
this code seems to be failing to check it --- I certainly don't see any
new error message about "column name "foo" appears more than once".

I don't think you want to be leaving behind debugging cruft like this:
+ elog(DEBUG4, "%s", ltle->resname);

I'm not impressed by using A_Const for the members of the CORRESPONDING
name list. That's not a clever solution, that's a confusing kluge,
because it's a complete violation of the meaning of A_Const. Elsewhere
we just use lists of String for name lists, and that seems sufficient
here. Personally I'd just use the existing columnList production rather
than rolling your own.

The reason was attach a location to name for more descriptive error
message.

Show quoted text

The comments in parsenodes.h about the new fields seem rather confused,
in fact I think they're probably backwards.

Also, I thought the test cases were excessively uninspired and repetitive.
This, for example, seems like about two tests too many:

+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(a) SELECT 4 a, 5 b, 6 c;
+ a
+---
+ 1
+ 4
+(2 rows)
+
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(b) SELECT 4 a, 5 b, 6 c;
+ b
+---
+ 2
+ 5
+(2 rows)
+
+SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(c) SELECT 4 a, 5 b, 6 c;
+ c
+---
+ 3
+ 6
+(2 rows)

without even considering the fact that these are pretty duplicative of
tests around them. And some of the added tests seem to be just testing
basic UNION functionality, which we already have tests for. I fail to
see the point of adding any of these:

+SELECT 1 AS two UNION CORRESPONDING SELECT 2 two;
+ two
+-----
+   1
+   2
+(2 rows)
+
+SELECT 1 AS one UNION CORRESPONDING SELECT 1 one;
+ one
+-----
+   1
+(1 row)
+
+SELECT 1 AS two UNION ALL CORRESPONDING SELECT 2 two;
+ two
+-----
+   1
+   2
+(2 rows)
+
+SELECT 1 AS two UNION ALL CORRESPONDING SELECT 1 two;
+ two
+-----
+   1
+   1
+(2 rows)

What I think actually is needed is some targeted testing showing
non-interference with nearby features. As an example, CORRESPONDING
can't safely be implemented by just replacing the tlists of the
input sub-selects with shortened versions, because that would break
situations such as DISTINCT in the sub-selects. I think it'd be a
good idea to have a test along the lines of

SELECT DISTINCT x, y FROM ...
UNION ALL CORRESPONDING
SELECT DISTINCT x, z FROM ...

with values chosen to show that the DISTINCT operators did operate
on x/y and x/z, not just x alone.

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#20)
#22Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#22)
#24Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#23)
#25Surafel Temesgen
surafel3000@gmail.com
In reply to: Tom Lane (#19)
#26Surafel Temesgen
surafel3000@gmail.com
In reply to: Tom Lane (#19)
#27Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#26)
#28Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#27)
#29Surafel Temesgen
surafel3000@gmail.com
In reply to: Pavel Stehule (#28)
#30Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#29)
#31Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#30)
#32Surafel Temesgen
surafel3000@gmail.com
In reply to: Pavel Stehule (#31)
#33Pavel Stehule
pavel.stehule@gmail.com
In reply to: Surafel Temesgen (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#33)
#35Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#34)
#36Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#35)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#36)
#38David Steele
david@pgmasters.net
In reply to: Tom Lane (#37)