Proposition for autoname columns

Started by Eugen Konkov, over 5 years ago, 22 messages, hackers

kes-kes@yandex.ru

over 5 years ago

Hello Pgsql-hackers,

When selecting data from a json column, the result column is named '?column?':
tucha=# select info->>'suma', docn from document order by id desc limit 5;
?column? | docn
----------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

It would be useful if the column name were assigned automatically based on
the name of the json key, as in the next query:

tucha=# select info->>'suma' as suma, docn from document order by id desc limit 5;
suma | docn
--------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

Would this automatically assigned name for a column from json be useful?

--
Best regards,
Eugen Konkov

Bruce Momjian

bruce@momjian.us

over 5 years ago

In reply to: Eugen Konkov (#1)

Re: Proposition for autoname columns

On Mon, Nov 2, 2020 at 05:05:29PM +0200, Eugen Konkov wrote:

Hello Pgsql-hackers,

When selecting data from json column it named as '?column?'
tucha=# select info->>'suma', docn from document order by id desc limit 5;
?column? | docn
----------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

It would be useful if the name of column will be autoassigned based on
name of json key. Like at next query:

tucha=# select info->>'suma' as suma, docn from document order by id desc limit 5;
suma | docn
--------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

Would it be useful this auto assigned name for column from json?

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

David G. Johnston

david.g.johnston@gmail.com

over 5 years ago

In reply to: Bruce Momjian (#2)

Re: Proposition for autoname columns

On Wed, Nov 11, 2020 at 8:56 AM Bruce Momjian <bruce@momjian.us> wrote:

It would be useful if the name of column will be autoassigned based on
name of json key. Like at next query:

tucha=# select info->>'suma' as suma, docn from document order by id desc limit 5;
suma | docn
--------+------

Would it be useful this auto assigned name for column from json?

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Doing it seems problematic given the nature of SQL and existing means to
assign names to columns. If it can be done I don't see how the output
value would make any difference. What is being asked for is the simple
textual value on the right side of the ->> (and other similar) operators to
be converted into a column name. I could imagine doing this at rewrite time
by saying (in parse terms):

info->>'suma to' becomes info->>'suma to' AS "suma to" (specifically, add AS,
double-quote the literal and stick it after the AS).

If {AS "suma to"} isn't valid syntax for some value of "suma to" just drop
the attempt and move on.

I agree that this feature would be useful.

David J.
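
The rewrite described in the message above can be sketched as follows. This is a minimal illustration in Python, not PostgreSQL code: `suggest_alias` and `add_alias` are hypothetical names, and the only validity rules assumed are that a quoted identifier cannot be empty, cannot contain a NUL character, and that an embedded double quote is written doubled.

```python
from typing import Optional


def suggest_alias(key_literal: str) -> Optional[str]:
    """Return a double-quoted SQL identifier for a JSON key, or None.

    Hypothetical helper: None means "drop the attempt and move on",
    so the column keeps its default name.
    """
    # A zero-length quoted identifier ("") is rejected by PostgreSQL,
    # and identifiers cannot contain the NUL character.
    if key_literal == "" or "\x00" in key_literal:
        return None
    # Inside a quoted identifier, a double quote is written as two.
    return '"' + key_literal.replace('"', '""') + '"'


def add_alias(expr: str, key_literal: str) -> str:
    """Append AS "<key>" to a select-list expression when possible."""
    alias = suggest_alias(key_literal)
    return expr if alias is None else f"{expr} AS {alias}"


print(add_alias("info->>'suma to'", "suma to"))
print(add_alias("info->>''", ""))
```

The second call leaves the expression untouched, which corresponds to the "just drop the attempt" case.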

Eugen Konkov

kes-kes@yandex.ru

over 5 years ago

In reply to: Bruce Momjian (#2)

Re: Proposition for autoname columns

Hello Bruce,

Wednesday, November 11, 2020, 5:56:08 PM, you wrote:

On Mon, Nov 2, 2020 at 05:05:29PM +0200, Eugen Konkov wrote:

Hello Pgsql-hackers,

When selecting data from json column it named as '?column?'
tucha=# select info->>'suma', docn from document order by id desc limit 5;
?column? | docn
----------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

It would be useful if the name of column will be autoassigned based on
name of json key. Like at next query:

tucha=# select info->>'suma' as suma, docn from document order by id desc limit 5;
suma | docn
--------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

Would it be useful this auto assigned name for column from json?

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

cool, thank you.

--
Best regards,
Eugen Konkov

Dagfinn Ilmari Mannsåker

ilmari@ilmari.org

over 5 years ago

In reply to: Bruce Momjian (#2)

Re: Proposition for autoname columns

Bruce Momjian <bruce@momjian.us> writes:

On Mon, Nov 2, 2020 at 05:05:29PM +0200, Eugen Konkov wrote:

Hello Pgsql-hackers,

When selecting data from json column it named as '?column?'
tucha=# select info->>'suma', docn from document order by id desc limit 5;
?column? | docn
----------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

It would be useful if the name of column will be autoassigned based on
name of json key. Like at next query:

tucha=# select info->>'suma' as suma, docn from document order by id desc limit 5;
suma | docn
--------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)

Would it be useful this auto assigned name for column from json?

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

- ilmari
--
- Twitter seems more influential [than blogs] in the 'gets reported in
the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
to a mainstream media article. - Calle Dybedahl

Bruce Momjian

bruce@momjian.us

over 5 years ago

In reply to: Dagfinn Ilmari Mannsåker (#5)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

Yes, the user explicitly calling a function would be much easier to
predict.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

David G. Johnston

david.g.johnston@gmail.com

over 5 years ago

In reply to: Bruce Momjian (#6)

Re: Proposition for autoname columns

On Wed, Nov 11, 2020 at 5:56 PM Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

Yes, the user explicitly calling a function would be much easier to
predict.

For the user an operator and a function are different ways to invoke the
same underlying thing using different syntax. I'm not seeing how this
syntax difference makes this any easier to implement for explicit function
invocation compared to operator function invocation.

David J.

Andrew Dunstan

andrew@dunslane.net

over 5 years ago

In reply to: Bruce Momjian (#6)

Re: Proposition for autoname columns

On 11/11/20 7:55 PM, Bruce Momjian wrote:

On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

Yes, the user explicitly calling a function would be much easier to
predict.

I suspect this is doomed to failure. There is no guarantee that the path
expression is going to be static or constant across rows. Say you have
this table:

x: foo, j: {"foo": 1, "bar": 2}

x: bar, j: {"foo": 3, "bar": 4}

and you say:

select j->>x from mytable;

What should the column be named?

I think we'd be trying to manage a set of corner cases, and all because
someone didn't want to put "as foo" in their query. And if we generate a
column name in some cases and not in others there will be complaints of
inconsistency.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
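
To make the non-constant case concrete, here is a small sketch that models the two rows of the example table as Python dicts (an illustration only; the dict layout is an assumption, not how PostgreSQL stores anything). It evaluates `j->>x` row by row and collects which key each row actually looked up.

```python
# The two rows of "mytable" from the example above, modelled as dicts.
rows = [
    {"x": "foo", "j": {"foo": 1, "bar": 2}},
    {"x": "bar", "j": {"foo": 3, "bar": 4}},
]

# j->>x looks up a different key in each row and returns the value as text.
values = [str(row["j"][row["x"]]) for row in rows]

# The set of keys that contributed to the single output column.
keys_used = {row["x"] for row in rows}

print(values)
print(sorted(keys_used))
```

Because more than one key feeds the same output column, no single key can serve as the column's name.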

Eugen Konkov

kes-kes@yandex.ru

over 5 years ago

In reply to: Andrew Dunstan (#8)

Re: Proposition for autoname columns

Hello Andrew,

Thursday, November 12, 2020, 3:19:39 PM, you wrote:

On 11/11/20 7:55 PM, Bruce Momjian wrote:

On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

Yes, the user explicitly calling a function would be much easier to
predict.

I suspect this is doomed to failure. There is no guarantee that the path
expression is going to be static or constant across rows. Say you have
this table:

x: foo, j: {"foo": 1, "bar": 2}

x: bar j: {"foo": 3, "bar": 4}

and you say:

select j->>x from mytable;
What should the column be named?

Suppose it should be named 'as x'

I think we'd be trying to manage a set of corner cases, and all because
someone didn't want to put "as foo" in their query. And if we generate a
column name in some cases and not in others there will be complaints of
inconsistency.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

--
Best regards,
Eugen Konkov

#10

David G. Johnston

david.g.johnston@gmail.com

over 5 years ago

In reply to: Eugen Konkov (#9)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 7:18 AM Eugen Konkov <kes-kes@yandex.ru> wrote:

Hello Andrew,

Thursday, November 12, 2020, 3:19:39 PM, you wrote:

On 11/11/20 7:55 PM, Bruce Momjian wrote:

select j->>x from mytable;
What should the column be named?

Suppose it should be named 'as x'

I think we'd be trying to manage a set of corner cases, and all because
someone didn't want to put "as foo" in their query. And if we generate a
column name in some cases and not in others there will be complaints of
inconsistency.

Yes, this is suggesting a behavior that is contrary to (but not prohibited
by) the natural expression and expectations of SQL. That said, we already
take a function's name and use it to specify the name of its output column
as opposed to using "?column?" and requiring a user to apply a specific
alias. This is only a step beyond that, choosing the default name for an
operator's output column based upon not the name of the operator (or its
underlying function) but based upon its one (and only possible) right-hand
argument. It is purely a user convenience feature and can be rejected on
those grounds but I'm not seeing any fundamental issue with only having some
operator combinations doing this. It's nice when it works and you are no
worse off than today when it doesn't.

David J.

#11

Andrew Dunstan

andrew@dunslane.net

over 5 years ago

In reply to: Eugen Konkov (#9)

Re: Proposition for autoname columns

On 11/12/20 9:14 AM, Eugen Konkov wrote:

Hello Andrew,

Thursday, November 12, 2020, 3:19:39 PM, you wrote:

On 11/11/20 7:55 PM, Bruce Momjian wrote:

On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

Yes, the user explicitly calling a function would be much easier to
predict.

I suspect this is doomed to failure. There is no guarantee that the path
expression is going to be static or constant across rows. Say you have
this table:
x: foo, j: {"foo": 1, "bar": 2}
x: bar j: {"foo": 3, "bar": 4}
and you say:
select j->>x from mytable;
What should the column be named?

Suppose it should be named 'as x'

So if we then say:

select x, j->>x from mytable;

you want both result columns named x? That seems like a recipe for
serious confusion. I really don't think this proposal has been properly
thought through.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#12

Pavel Stehule

pavel.stehule@gmail.com

over 5 years ago

In reply to: Andrew Dunstan (#11)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 16:59, Andrew Dunstan <andrew@dunslane.net> wrote:

On 11/12/20 9:14 AM, Eugen Konkov wrote:

Hello Andrew,

Thursday, November 12, 2020, 3:19:39 PM, you wrote:

On 11/11/20 7:55 PM, Bruce Momjian wrote:

On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.

Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?

Yes, the user explicitly calling a function would be much easier to
predict.

I suspect this is doomed to failure. There is no guarantee that the path
expression is going to be static or constant across rows. Say you have
this table:
x: foo, j: {"foo": 1, "bar": 2}
x: bar j: {"foo": 3, "bar": 4}
and you say:
select j->>x from mytable;
What should the column be named?

Suppose it should be named 'as x'

So if we then say:

select x, j->>x from mytable;

you want both result columns named x? That seems like a recipe for
serious confusion. I really don't think this proposal has been properly
thought through.

Why? It is consistent: you will get the value of key x, as anybody would
expect, so the values should differ.

Regards

Pavel

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#13

David G. Johnston

david.g.johnston@gmail.com

over 5 years ago

In reply to: Andrew Dunstan (#11)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 8:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:

So if we then say:

select x, j->>x from mytable;

you want both result columns named x? That seems like a recipe for
serious confusion. I really don't think this proposal has been properly
thought through.

IMO it's no worse than today's:

select count(*), count(*) from (values (1), (2)) vals (v);
count | count
2 | 2
David J.

#14

Andrew Dunstan

andrew@dunslane.net

over 5 years ago

In reply to: David G. Johnston (#13)

Re: Proposition for autoname columns

On 11/12/20 11:12 AM, David G. Johnston wrote:

On Thu, Nov 12, 2020 at 8:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:

So if we then say:

select x, j->>x from mytable;

you want both result columns named x? That seems like a recipe for
serious confusion. I really don't think this proposal has been properly
thought through.

IMO It no worse than today's:

select count(*), count(*) from (values (1), (2)) vals (v);
count | count
2 | 2

I guess the difference here is that there's an extra level of
indirection. So

select x, j->>'x', j->>x from mytable

would have 3 result columns all named x.

cheers

andrew

#15

Bruce Momjian

bruce@momjian.us

over 5 years ago

In reply to: Andrew Dunstan (#14)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 11:32:49AM -0500, Andrew Dunstan wrote:

On 11/12/20 11:12 AM, David G. Johnston wrote:

IMO It no worse than today's:

select count(*), count(*) from (values (1), (2)) vals (v);
count | count
2 | 2

I guess the difference here is that there's an extra level of
indirection. So

select x, j->>'x', j->>x from mytable

would have 3 result columns all named x.

Yeah, I feel it would have to be something a user specifically asks for,
and we would have to say it would be the first or a random match of one
of the keys. Ultimately, it might be so awkward as to be useless.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

#16

David G. Johnston

david.g.johnston@gmail.com

over 5 years ago

In reply to: Andrew Dunstan (#14)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 9:32 AM Andrew Dunstan <andrew@dunslane.net> wrote:

On 11/12/20 11:12 AM, David G. Johnston wrote:

On Thu, Nov 12, 2020 at 8:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:

So if we then say:

select x, j->>x from mytable;

you want both result columns named x? That seems like a recipe for
serious confusion. I really don't think this proposal has been properly
thought through.

IMO It no worse than today's:

select count(*), count(*) from (values (1), (2)) vals (v);
count | count
2 | 2

I guess the difference here is that there's an extra level of
indirection. So

select x, j->>'x', j->>x from mytable

would have 3 result columns all named x.

I totally missed the variable reference there - only two of those become
"x", the variable reference stays un-rewritten and thus results in
"?column?", similar to today:

select count(*), count(*) +1 from (values (1), (2)) vals (v);
count | ?column?
2 | 3

The query rewriter would only rewrite these expressions and provide an
expression-related explicit alias clause if the expression is a single
operator (same as single function today) and the right-hand side of the
operator is a constant (meaning the constant is a reasonable representation
of every output value that is going to appear in the result column). If
the RHS is a variable then there is no good name that is known to cover all
output values and thus ?column? (i.e., do not rewrite/provide an alias
clause) is an appropriate choice.

My concerns in this area involve stored views and ruleutils, dump/reload by
extension. Greenfield, this would have been nice, and worth the minimal
complexity given its usefulness in the common case, but is it useful enough
to introduce a whole new default naming mechanism and deal with
dump/restore concerns?

David J.
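
The naming rule proposed in the message above can be modelled with a toy expression tree. This is a hedged sketch in Python, not PostgreSQL's parser: the node classes, `KEY_OPERATORS`, and `default_name` are all hypothetical names, and the model covers only the cases discussed in the thread (column reference, function call, single operator with a constant or variable right-hand side).

```python
from dataclasses import dataclass


@dataclass
class Column:       # a plain column reference, e.g. x
    name: str


@dataclass
class Const:        # a literal, e.g. 'x'
    value: str


@dataclass
class FuncCall:     # a function call, e.g. count(*)
    name: str


@dataclass
class OpExpr:       # a binary operator expression, e.g. j ->> 'x'
    op: str
    left: object
    right: object


# Assumed set of operators that would opt in to key-based naming.
KEY_OPERATORS = {"->", "->>"}


def default_name(item) -> str:
    """Pick the default output column name for one select-list item."""
    if isinstance(item, (Column, FuncCall)):
        # Existing behaviour: columns and functions lend their own name.
        return item.name
    if (isinstance(item, OpExpr) and item.op in KEY_OPERATORS
            and isinstance(item.right, Const)):
        # Proposed behaviour: a constant key names the column.
        return item.right.value
    # Everything else, including a variable right-hand side.
    return "?column?"


j, x = Column("j"), Column("x")
select_list = [x, OpExpr("->>", j, Const("x")), OpExpr("->>", j, x)]
print([default_name(i) for i in select_list])

counts = [FuncCall("count"), OpExpr("+", FuncCall("count"), Const("1"))]
print([default_name(i) for i in counts])
```

Under these assumptions, only the first two items of `select x, j->>'x', j->>x` are named x, and `count(*) + 1` still falls back to ?column?.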

#17

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: David G. Johnston (#16)

Re: Proposition for autoname columns

"David G. Johnston" <david.g.johnston@gmail.com> writes:

The query rewriter would only rewrite these expressions and provide an
expression-related explicit alias clause if the expression is a single
operator (same as single function today) and the right-hand side of the
operator is a constant (meaning the constant is a reasonable representation
of every output value that is going to appear in the result column).

I haven't been paying too close attention to this thread, but it seems
like there is a lot of misapprehension here about how this could
reasonably be implemented. There is zero (not epsilon, but zero)
chance of changing column aliases at rewrite time. Those have to be
assigned in the parser, else we will not understand how to resolve
references to sub-select output columns. Specifically it has to happen
in FigureColname(), which means that resolving non-constant arguments
to constants isn't terribly practical.

Actually, since FigureColname() works on the raw parse tree, I'm not
even sure how you could make this happen in that context, unless you're
willing to say that "j ->> 'x'" resolves as "x" just based on the name
of the operator, without any info about its semantics. That doesn't
seem very cool. Now, in a quick look at the callers, it looks like it'd
be no problem from the callers' standpoint to switch things around to do
colname selection on the parsed tree instead, ie the existing choice is
for FigureColname's benefit not the callers'. But it'd likely cost
a good deal to do it the other way, since now FigureColname would need
to perform catalog lookups to get column and function names.

Maybe you could do something like passing *both* trees to FigureColname,
and let it obtain the actual operator OID from the parsed tree when the
raw tree contains AEXPR_OP. But the recursion in FigureColname would be
difficult to manage because the two trees often don't match one-to-one.

On the whole, I'm on the side of the people who don't want to change this.
The implementation cost seems likely to greatly outweigh the value, plus
it feels more like a wart than a feature.

regards, tom lane

#18

Bruce Momjian

bruce@momjian.us

over 5 years ago

In reply to: Tom Lane (#17)

Re: Proposition for autoname columns

On Thu, Nov 12, 2020 at 01:52:11PM -0500, Tom Lane wrote:

On the whole, I'm on the side of the people who don't want to change this.
The implementation cost seems likely to greatly outweigh the value, plus
it feels more like a wart than a feature.

I think we can mark this as, "We thought about it, and we decided it is
probably not a good idea."

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

#19

David G. Johnston

david.g.johnston@gmail.com

over 5 years ago

In reply to: Bruce Momjian (#18)

Re: Proposition for autoname columns

On Thursday, November 12, 2020, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Nov 12, 2020 at 01:52:11PM -0500, Tom Lane wrote:

On the whole, I'm on the side of the people who don't want to change this.
The implementation cost seems likely to greatly outweigh the value, plus
it feels more like a wart than a feature.

I think we can mark this as, "We thought about it, and we decided it is
probably not a good idea."

David J.

#20

Alvaro Herrera

alvherre@2ndquadrant.com

over 5 years ago

In reply to: Tom Lane (#17)

Re: Proposition for autoname columns

On 2020-Nov-12, Tom Lane wrote:

On the whole, I'm on the side of the people who don't want to change this.
The implementation cost seems likely to greatly outweigh the value, plus
it feels more like a wart than a feature.

I think if Eugen wants to spend some time with it and see how it could
be implemented, then send a patch for consideration, then we could make
a better informed decision. My own opinion is that it's not worth the
trouble, but I'd rather we not stand in his way if he wants to try
(with the disclaimer that we might end up not liking the patch, of course).

#21

Bruce Momjian

bruce@momjian.us

over 5 years ago

In reply to: Alvaro Herrera (#20)

#22

Eugen Konkov

kes-kes@yandex.ru

over 5 years ago

In reply to: Alvaro Herrera (#20)