Alias of VALUES RTE in explain plan

yasir.hussain.shah@gmail.com

over 1 year ago

In reply to: Ashutosh Bapat (#1)

Re: Alias of VALUES RTE in explain plan

On Mon, Jul 1, 2024 at 3:17 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

Hi All,
While reviewing Richard's patch for grouping sets, I stumbled upon
following explain output

explain (costs off)
select distinct on (a, b) a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a))
order by a, b;
QUERY PLAN
----------------------------------------------------------------
Unique
-> Sort
Sort Key: "*VALUES*".column1, "*VALUES*".column2
-> HashAggregate
Hash Key: "*VALUES*".column1, "*VALUES*".column2
Hash Key: "*VALUES*".column1
-> Values Scan on "*VALUES*"
Filter: (column1 = column2)
(8 rows)

There is no VALUES.column1 and VALUES.column2 in the query. The alias t.a
and t.b do not appear anywhere in the explain output. I think explain
output should look like
explain (costs off)
select distinct on (a, b) a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a))
order by a, b;
QUERY PLAN
----------------------------------------------------------------
Unique
-> Sort
Sort Key: t.a, t.b
-> HashAggregate
Hash Key: t.a, t.b
Hash Key: t.a
-> Values Scan on "*VALUES*" t
Filter: (a = b)
(8 rows)

I didn't get time to figure out the reason behind this, nor the history.
But I thought I would report it nonetheless.

I have looked into the issue and found that when subqueries are pulled up,
a modifiable copy of the subquery is created for modification in the
pull_up_simple_subquery() function. During this process,
flatten_join_alias_vars() is called to flatten any join alias variables in
the subquery's target list. However at this point, we lose subquery's alias.
If you/hackers agree with my findings, I can provide a working patch soon.

Show quoted text

--
Best Wishes,
Ashutosh Bapat

yasir.hussain.shah@gmail.com

over 1 year ago

In reply to: Yasir (#2)

Re: Alias of VALUES RTE in explain plan

Hi Ashutosh & PG Hackers,

I have fixed the code to produce desired output by adding a few lines in
pull_up_simple_subquery().
Attached patch is divided in 2 files:
- 001-Fix-Alias-VALUES-RTE.patch contains the actual fix.
- 002-Fix-Alias-VALUES-RTE.patch contains the expected output changes
against the actual fix.

I also have verified regression tests, all seems good.

Respected hackers please have a look.

Thanks and regards...

Yasir

On Thu, Aug 15, 2024 at 7:13 PM Yasir <yasir.hussain.shah@gmail.com> wrote:

Show quoted text

On Mon, Jul 1, 2024 at 3:17 PM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:

Hi All,
While reviewing Richard's patch for grouping sets, I stumbled upon
following explain output

explain (costs off)
select distinct on (a, b) a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a))
order by a, b;
QUERY PLAN
----------------------------------------------------------------
Unique
-> Sort
Sort Key: "*VALUES*".column1, "*VALUES*".column2
-> HashAggregate
Hash Key: "*VALUES*".column1, "*VALUES*".column2
Hash Key: "*VALUES*".column1
-> Values Scan on "*VALUES*"
Filter: (column1 = column2)
(8 rows)

There is no VALUES.column1 and VALUES.column2 in the query. The alias t.a
and t.b do not appear anywhere in the explain output. I think explain
output should look like
explain (costs off)
select distinct on (a, b) a, b
from (values (1, 1), (2, 2)) as t (a, b) where a = b
group by grouping sets((a, b), (a))
order by a, b;
QUERY PLAN
----------------------------------------------------------------
Unique
-> Sort
Sort Key: t.a, t.b
-> HashAggregate
Hash Key: t.a, t.b
Hash Key: t.a
-> Values Scan on "*VALUES*" t
Filter: (a = b)
(8 rows)

I didn't get time to figure out the reason behind this, nor the history.
But I thought I would report it nonetheless.

I have looked into the issue and found that when subqueries are pulled up,
a modifiable copy of the subquery is created for modification in the
pull_up_simple_subquery() function. During this process,
flatten_join_alias_vars() is called to flatten any join alias variables
in the subquery's target list. However at this point, we lose
subquery's alias.
If you/hackers agree with my findings, I can provide a working patch soon.

--
Best Wishes,
Ashutosh Bapat

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Yasir (#3)

Re: Alias of VALUES RTE in explain plan

Yasir <yasir.hussain.shah@gmail.com> writes:

I have fixed the code to produce desired output by adding a few lines in
pull_up_simple_subquery().
Attached patch is divided in 2 files:
- 001-Fix-Alias-VALUES-RTE.patch contains the actual fix.
- 002-Fix-Alias-VALUES-RTE.patch contains the expected output changes
against the actual fix.

I was initially skeptical about this, because we've been printing
"*VALUES*" for a decade or more and there have been few complaints.
So I wondered if the change would annoy more people than liked it.
However, after looking at the output for awhile, it is nice that the
columns of the VALUES are referenced with their user-given names
instead of "columnN". I think that's enough of an improvement that
it'll win people over.

However ... I don't like this implementation, not even a little
bit. Table/column aliases are assigned by the parser, and the
planner has no business overriding them. Quite aside from being
a violation of system structural principles, there are practical
reasons not to do it like that:

1. We'd see different results when considering plan trees than
unplanned query trees.

2. At the place where you put this, some planning transformations have
already been done, and that affects the results. That means that
future extensions or restructuring of the planner might change the
results, which seems undesirable.

I think the right way to make this happen is for the parser to
do it, which it can do by passing down the outer query level's
Alias to addRangeTableEntryForValues. There's a few layers of
subroutine calls between, but we can minimize the pain by adding
a ParseState field to carry the Alias. See attached.

My point 2 is illustrated by the fact that my patch produces
different results in a few cases than yours does --- look at
groupingsets.out in particular. I think that's fine, and
the changes that yours makes and mine doesn't look a little
unprincipled. For example, in the tests involving the "gstest1"
view, if somebody wants nice labels on that view's VALUES columns
then the right place to apply those labels is within the view.
Letting a subsequent call of the view inject labels seems pretty
action-at-a-distance-y.

regards, tom lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Tom Lane (#4)

Re: Alias of VALUES RTE in explain plan

I wrote:

However ... I don't like this implementation, not even a little
bit.

I forgot to mention a third problem, which is that reassigning the
alias during subquery pullup means it doesn't happen if subquery
pullup doesn't happen. As an example, with your patch:

regression=# explain verbose select * from (values (1), (2)) v(x);
QUERY PLAN
----------------------------------------------------
Values Scan on v (cost=0.00..0.03 rows=2 width=4)
Output: v.x
(2 rows)

regression=# explain verbose select * from (values (1), (random())) v(x);
QUERY PLAN
-------------------------------------------------------------
Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8)
Output: "*VALUES*".column1
(2 rows)

That's because the volatile function prevents subquery flattening.

regards, tom lane

yasir.hussain.shah@gmail.com

over 1 year ago

In reply to: Tom Lane (#4)

Re: Alias of VALUES RTE in explain plan

On Fri, Oct 25, 2024 at 10:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yasir <yasir.hussain.shah@gmail.com> writes:

I have fixed the code to produce desired output by adding a few lines in
pull_up_simple_subquery().
Attached patch is divided in 2 files:
- 001-Fix-Alias-VALUES-RTE.patch contains the actual fix.
- 002-Fix-Alias-VALUES-RTE.patch contains the expected output changes
against the actual fix.

I was initially skeptical about this, because we've been printing
"*VALUES*" for a decade or more and there have been few complaints.
So I wondered if the change would annoy more people than liked it.
However, after looking at the output for awhile, it is nice that the
columns of the VALUES are referenced with their user-given names
instead of "columnN". I think that's enough of an improvement that
it'll win people over.

Hopefully, yes.

However ... I don't like this implementation, not even a little
bit. Table/column aliases are assigned by the parser, and the
planner has no business overriding them. Quite aside from being

Totally agreed.

a violation of system structural principles, there are practical
reasons not to do it like that:

1. We'd see different results when considering plan trees than
unplanned query trees.

2. At the place where you put this, some planning transformations have
already been done, and that affects the results. That means that
future extensions or restructuring of the planner might change the
results, which seems undesirable.

I think the right way to make this happen is for the parser to
do it, which it can do by passing down the outer query level's
Alias to addRangeTableEntryForValues. There's a few layers of
subroutine calls between, but we can minimize the pain by adding
a ParseState field to carry the Alias. See attached.

Actually, I fixed this problem using two approaches. One at the parser
side, 2nd at the planner.
The one I submitted was the latter one. The first way (attached partially)
I fixed the problem is almost similar to your approach.
Obviously, yours better manages the parent alias.
Why I submitted the 2nd solution was because I wanted to make as few
changes in the code as I could.

Show quoted text

My point 2 is illustrated by the fact that my patch produces
different results in a few cases than yours does --- look at
groupingsets.out in particular. I think that's fine, and
the changes that yours makes and mine doesn't look a little
unprincipled. For example, in the tests involving the "gstest1"
view, if somebody wants nice labels on that view's VALUES columns
then the right place to apply those labels is within the view.
Letting a subsequent call of the view inject labels seems pretty
action-at-a-distance-y.

regards, tom lane

yasir.hussain.shah@gmail.com

over 1 year ago

In reply to: Tom Lane (#5)

Re: Alias of VALUES RTE in explain plan

On Sat, Oct 26, 2024 at 12:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

However ... I don't like this implementation, not even a little
bit.

I forgot to mention a third problem, which is that reassigning the
alias during subquery pullup means it doesn't happen if subquery
pullup doesn't happen. As an example, with your patch:

regression=# explain verbose select * from (values (1), (2)) v(x);
QUERY PLAN
----------------------------------------------------
Values Scan on v (cost=0.00..0.03 rows=2 width=4)
Output: v.x
(2 rows)

regression=# explain verbose select * from (values (1), (random())) v(x);
QUERY PLAN
-------------------------------------------------------------
Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8)
Output: "*VALUES*".column1
(2 rows)

That's because the volatile function prevents subquery flattening.

Yes, that is by design. As I used is_simple_values() so if the values list
is not a simple one, which is not in this case, the alias won't be
reassigned.

Show quoted text

regards, tom lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Yasir (#7)

Re: Alias of VALUES RTE in explain plan

Yasir <yasir.hussain.shah@gmail.com> writes:

On Sat, Oct 26, 2024 at 12:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I forgot to mention a third problem, which is that reassigning the
alias during subquery pullup means it doesn't happen if subquery
pullup doesn't happen.

Yes, that is by design.

By design? How can you claim that's not a bug? The alias(es)
associated with a VALUES clause surely should not vary depending
on whether the clause includes a volatile function --- not to
mention other unobvious reasons for flattening to happen or not.

regards, tom lane

yasir.hussain.shah@gmail.com

over 1 year ago

In reply to: Tom Lane (#8)

Re: Alias of VALUES RTE in explain plan

On Mon, Oct 28, 2024 at 1:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yasir <yasir.hussain.shah@gmail.com> writes:

On Sat, Oct 26, 2024 at 12:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I forgot to mention a third problem, which is that reassigning the
alias during subquery pullup means it doesn't happen if subquery
pullup doesn't happen.

Yes, that is by design.

By design? How can you claim that's not a bug? The alias(es)
associated with a VALUES clause surely should not vary depending
on whether the clause includes a volatile function --- not to
mention other unobvious reasons for flattening to happen or not.

By design of my solution, I was not taking it as a bug. But now, I agree
with your opinion.

Show quoted text

regards, tom lane

#10

lepihov@gmail.com

over 1 year ago

In reply to: Yasir (#9)

Re: Alias of VALUES RTE in explain plan

On 10/28/24 03:15, Yasir wrote:

By design of my solution, I was not taking it as a bug. But now, I agree
with your opinion.

I think the case provided by Ashutosh was initially correct, and nothing
needs to change. Look at the similar case:

EXPLAIN SELECT x,y FROM (
SELECT oid,relname FROM pg_class WHERE relname = 'pg_index') AS
c(x,y) WHERE c.y = 'pg_index';

QUERY PLAN

--------------------------------------------------------------------------------------------
Index Scan using pg_class_relname_nsp_index on pg_class
(cost=0.27..8.29 rows=1 width=68)
Index Cond: (relname = 'pg_index'::name)
(2 rows)

I don't see any reference to the alias c(x,y) in this explain.
Similarly, the flattened VALUES clause shouldn't be referenced under the
alias t(a,b).

--
regards, Andrei Lepikhov

#11

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 1 year ago

In reply to: Andrei Lepikhov (#10)

Re: Alias of VALUES RTE in explain plan

On Mon, Oct 28, 2024 at 10:17 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

On 10/28/24 03:15, Yasir wrote:

By design of my solution, I was not taking it as a bug. But now, I agree
with your opinion.

I think the case provided by Ashutosh was initially correct, and nothing
needs to change. Look at the similar case:

EXPLAIN SELECT x,y FROM (
SELECT oid,relname FROM pg_class WHERE relname = 'pg_index') AS
c(x,y) WHERE c.y = 'pg_index';

QUERY PLAN

--------------------------------------------------------------------------------------------
Index Scan using pg_class_relname_nsp_index on pg_class
(cost=0.27..8.29 rows=1 width=68)
Index Cond: (relname = 'pg_index'::name)
(2 rows)

I don't see any reference to the alias c(x,y) in this explain.
Similarly, the flattened VALUES clause shouldn't be referenced under the
alias t(a,b).

The reason you don't see c(x, y) is because the subquery gets pulled
up and the subquery with c(x, y) no longer exists. If the subquery
doesn't get pulled, you would see c(x, y) in the EXPLAIN plan.

Our syntax doesn't allow an alias to be attached to VALUES(). E.g.
select * from values (1), (2) x(a) is not allowed. Instead we allow
(values (1), (2)) x(a) where values (1), (2) is treated as a subquery.
Since there is no way to attach an alias to VALUES() itself, I think
it's fair to consider the outer alias as the alias of the VALUES
relation. That's what Tom's patch does. The result is useful as well.

The patch looks good to me, except the name of the new member.

CommonTableExpr *p_parent_cte; /* this query's containing CTE */
+ Alias *p_parent_alias; /* parent's alias for this query */

the two "parent"s here mean different things and that might lead one
to assume that the p_parent_alias refers to alias of CTE. The comment
adds to the confusion since it mentions parent. How about renaming it
as p_outer_alias? or something which indicates alias of the outer
query?

--
Best Wishes,
Ashutosh Bapat

#12

lepihov@gmail.com

over 1 year ago

In reply to: Ashutosh Bapat (#11)

Re: Alias of VALUES RTE in explain plan

On 28/10/2024 20:19, Ashutosh Bapat wrote:

On Mon, Oct 28, 2024 at 10:17 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

On 10/28/24 03:15, Yasir wrote:

By design of my solution, I was not taking it as a bug. But now, I agree
with your opinion.

I think the case provided by Ashutosh was initially correct, and nothing
needs to change. Look at the similar case:

EXPLAIN SELECT x,y FROM (
SELECT oid,relname FROM pg_class WHERE relname = 'pg_index') AS
c(x,y) WHERE c.y = 'pg_index';

QUERY PLAN

--------------------------------------------------------------------------------------------
Index Scan using pg_class_relname_nsp_index on pg_class
(cost=0.27..8.29 rows=1 width=68)
Index Cond: (relname = 'pg_index'::name)
(2 rows)

I don't see any reference to the alias c(x,y) in this explain.
Similarly, the flattened VALUES clause shouldn't be referenced under the
alias t(a,b).

The reason you don't see c(x, y) is because the subquery gets pulled
up and the subquery with c(x, y) no longer exists. If the subquery
doesn't get pulled, you would see c(x, y) in the EXPLAIN plan.

My goal is to understand why the implementation follows this pattern. As
I see, previously, we had consistent behaviour, according to which we
removed the pulling-up subquery's alias as well. And I want to know, is
it really the only way to break this behavior? Maybe it is possible to
add the VALUES alias to the grammar. Or is it causing much worse code?

--
regards, Andrei Lepikhov

#13

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Andrei Lepikhov (#12)

Re: Alias of VALUES RTE in explain plan

Andrei Lepikhov <lepihov@gmail.com> writes:

My goal is to understand why the implementation follows this pattern. As
I see, previously, we had consistent behaviour, according to which we
removed the pulling-up subquery's alias as well. And I want to know, is
it really the only way to break this behavior? Maybe it is possible to
add the VALUES alias to the grammar. Or is it causing much worse code?

The problem is standards compliance. Per SQL, to put VALUES into FROM
with an alias you have to write

select * from (values (1,1), (2,2)) as t(a,b);

You can't omit the "extra" parentheses, and you can't put the AS
inside the parentheses.

Under the hood, those extra parentheses are making the VALUES into
a sub-select --- but I seriously doubt that any ordinary users
understand the construct that way. It's just a weird requirement to
put parentheses there. So IMO, when a user writes something like
this, they think they *are* putting an alias on the VALUES clause
itself.

As to your point that subquery aliases aren't generally used by
EXPLAIN, that's true, but consider this variant of your example:

regression=# EXPLAIN VERBOSE SELECT x,y FROM (
SELECT oidx,rname FROM pg_class p(oidx, rname) WHERE rname = 'pg_index') AS
c(x,y) WHERE c.y = 'pg_index';
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Index Scan using pg_class_relname_nsp_index on pg_catalog.pg_class p (cost=0.28..8.29 rows=1 width=68)
Output: p.oidx, p.rname
Index Cond: (p.rname = 'pg_index'::name)
(3 rows)

So aliases attached directly to a relation *are* used by EXPLAIN,
table and column aliases both.

So IMO, making use of an alias that's attached to a VALUES clause
in this way is a natural thing to do from a user's viewpoint.
You have a good point that we should be wary of using subquery
aliases in other ways --- but the proposed patch is specific to
this case.

regards, tom lane

#14

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Ashutosh Bapat (#11)

Re: Alias of VALUES RTE in explain plan

Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:

The patch looks good to me, except the name of the new member.

CommonTableExpr *p_parent_cte; /* this query's containing CTE */
+ Alias *p_parent_alias; /* parent's alias for this query */

the two "parent"s here mean different things and that might lead one
to assume that the p_parent_alias refers to alias of CTE. The comment
adds to the confusion since it mentions parent. How about renaming it
as p_outer_alias? or something which indicates alias of the outer
query?

Hmm, I figured the two "parent" references do mean the same thing,
ie the immediately surrounding syntactic construct. While I won't
fight hard about it, I don't see an advantage in naming the new
field differently. We could make the comment be

/* outer level's alias for this query */

if that helps any.

regards, tom lane

#15

yasir.hussain.shah@gmail.com

over 1 year ago

In reply to: Tom Lane (#14)

Re: Alias of VALUES RTE in explain plan

On Mon, Oct 28, 2024 at 8:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:

The patch looks good to me, except the name of the new member.

CommonTableExpr *p_parent_cte; /* this query's containing CTE */
+ Alias *p_parent_alias; /* parent's alias for this query */

the two "parent"s here mean different things and that might lead one
to assume that the p_parent_alias refers to alias of CTE. The comment
adds to the confusion since it mentions parent. How about renaming it
as p_outer_alias? or something which indicates alias of the outer
query?

Hmm, I figured the two "parent" references do mean the same thing,
ie the immediately surrounding syntactic construct. While I won't
fight hard about it, I don't see an advantage in naming the new
field differently. We could make the comment be

/* outer level's alias for this query */

This seems ok to me.

Show quoted text

if that helps any.

regards, tom lane

#16

lepihov@gmail.com

over 1 year ago

In reply to: Tom Lane (#13)

Re: Alias of VALUES RTE in explain plan

On 10/28/24 22:05, Tom Lane wrote:

Andrei Lepikhov <lepihov@gmail.com> writes:

My goal is to understand why the implementation follows this pattern. As
I see, previously, we had consistent behaviour, according to which we
removed the pulling-up subquery's alias as well. And I want to know, is
it really the only way to break this behavior? Maybe it is possible to
add the VALUES alias to the grammar. Or is it causing much worse code?

So IMO, making use of an alias that's attached to a VALUES clause
in this way is a natural thing to do from a user's viewpoint.
You have a good point that we should be wary of using subquery
aliases in other ways --- but the proposed patch is specific to
this case.

Thanks for the detailed explanation. I agree it make sense.
Also, after skimming the code, I propose some extra tests:

-- Just to cover the ERROR:
EXPLAIN (COSTS OFF, VERBOSE)
SELECT * FROM (VALUES (1),(2),(3),(4)) AS t1(x,y);
ERROR: VALUES lists "t1" have 1 columns available but 2 columns specified

-- New behavior
EXPLAIN (COSTS OFF, VERBOSE)
SELECT * FROM (VALUES (4),(2),(3),(1) ORDER BY t1.x LIMIT 2) AS t1(x);
SELECT * FROM (VALUES (4),(2),(3),(1) ORDER BY t1.x LIMIT 2) AS t1(x);

-- Not mentioned column is assigned with default name
EXPLAIN (COSTS OFF, VERBOSE)
SELECT * FROM (VALUES (4,1),(2,1),(3,1),(1,1) ORDER BY t1.column2,t1.x
LIMIT 2 OFFSET 1) AS t1(x);
SELECT * FROM (VALUES (4,1),(2,1),(3,1),(1,1) ORDER BY t1.column2,t1.x
LIMIT 2 OFFSET 1) AS t1(x);

-- Here it isn't allowed to sort with full reference 't2.x2', but in the
EXPLAIN we see exactly 'Sort Key: t2.x2, t2.y':
EXPLAIN (COSTS OFF, VERBOSE)
SELECT * FROM (VALUES (3,3),(4,4)) AS t2(x2,y)
UNION ALL
SELECT * FROM (VALUES (1,1),(2,2)) AS t1(x1)
ORDER BY x2,y;

The code looks good.

--
regards, Andrei Lepikhov

#17

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 1 year ago

In reply to: Tom Lane (#14)

Re: Alias of VALUES RTE in explain plan

On Mon, Oct 28, 2024 at 8:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:

The patch looks good to me, except the name of the new member.

CommonTableExpr *p_parent_cte; /* this query's containing CTE */
+ Alias *p_parent_alias; /* parent's alias for this query */

the two "parent"s here mean different things and that might lead one
to assume that the p_parent_alias refers to alias of CTE. The comment
adds to the confusion since it mentions parent. How about renaming it
as p_outer_alias? or something which indicates alias of the outer
query?

Hmm, I figured the two "parent" references do mean the same thing,
ie the immediately surrounding syntactic construct. While I won't
fight hard about it, I don't see an advantage in naming the new
field differently. We could make the comment be

/* outer level's alias for this query */

WFM.

--
Best Wishes,
Ashutosh Bapat

#18

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Andrei Lepikhov (#16)

Re: Alias of VALUES RTE in explain plan

Andrei Lepikhov <lepihov@gmail.com> writes:

Thanks for the detailed explanation. I agree it make sense.

Cool, I think we're all agreed then.

Also, after skimming the code, I propose some extra tests:

Most of these are covered well enough by existing tests, aren't
they?

regards, tom lane

#19

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Andrei Lepikhov (#16)

Re: Alias of VALUES RTE in explain plan

Andrei Lepikhov <lepihov@gmail.com> writes:

-- New behavior
EXPLAIN (COSTS OFF, VERBOSE)
SELECT * FROM (VALUES (4),(2),(3),(1) ORDER BY t1.x LIMIT 2) AS t1(x);
SELECT * FROM (VALUES (4),(2),(3),(1) ORDER BY t1.x LIMIT 2) AS t1(x);

After taking a closer look at that, yeah it's new behavior, and
I'm not sure we want to change it. (The existing behavior is that
you'd have to write 'column1' or '"*VALUES*".column1' in the
subquery's ORDER BY.)

This example also violates my argument that the user thinks they
are attaching the alias directly to VALUES. So what I now think
is that we ought to tweak the patch so that the parent alias is
pushed down only when the subquery contains just VALUES, no other
clauses. Per a look at the grammar, ORDER BY, LIMIT, and FOR
UPDATE could conceivably appear alongside VALUES; although
FOR UPDATE would draw "FOR UPDATE cannot be applied to VALUES",
so maybe we needn't worry about it.

Thoughts?

regards, tom lane

#20

lepihov@gmail.com

over 1 year ago

In reply to: Tom Lane (#19)

Re: Alias of VALUES RTE in explain plan

On 10/30/24 00:19, Tom Lane wrote:

Andrei Lepikhov <lepihov@gmail.com> writes:

-- New behavior
EXPLAIN (COSTS OFF, VERBOSE)
SELECT * FROM (VALUES (4),(2),(3),(1) ORDER BY t1.x LIMIT 2) AS t1(x);
SELECT * FROM (VALUES (4),(2),(3),(1) ORDER BY t1.x LIMIT 2) AS t1(x);

After taking a closer look at that, yeah it's new behavior, and
I'm not sure we want to change it. (The existing behavior is that
you'd have to write 'column1' or '"*VALUES*".column1' in the
subquery's ORDER BY.)

This example also violates my argument that the user thinks they
are attaching the alias directly to VALUES. So what I now think
is that we ought to tweak the patch so that the parent alias is
pushed down only when the subquery contains just VALUES, no other
clauses. Per a look at the grammar, ORDER BY, LIMIT, and FOR
UPDATE could conceivably appear alongside VALUES; although
FOR UPDATE would draw "FOR UPDATE cannot be applied to VALUES",
so maybe we needn't worry about it.

Thoughts?

You have written before that a VALUES alias should be a special case
because it's a 'natural thing'. And I buy it. So, it looks natural to
use this alias everywhere in the query without restrictions. That's why
I provided examples in my previous email to check that it is a full
replacement for the '"*VALUES*".columnN'.

--
regards, Andrei Lepikhov

#21

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 1 year ago

In reply to: Tom Lane (#19)

#22