Proposition for autoname columns
Hello Pgsql-hackers,
When selecting data from a json column, the result column is named '?column?':
tucha=# select info->>'suma', docn from document order by id desc limit 5;
?column? | docn
----------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)
It would be useful if the column name were assigned automatically from
the name of the json key, so that the result looks like that of the next query:
tucha=# select info->>'suma' as suma, docn from document order by id desc limit 5;
suma | docn
--------+------
665.97 | 695
513.51 | 632
665.97 | 4804
492.12 | 4315
332.98 | 1302
(5 rows)
Would such an automatically assigned name for a column from json be useful?
--
Best regards,
Eugen Konkov
On Mon, Nov 2, 2020 at 05:05:29PM +0200, Eugen Konkov wrote:
> It would be useful if the column name were assigned automatically from
> the name of the json key.
> Would such an automatically assigned name for a column from json be useful?
I think we could do it, but it would only work if the column was output
as a single json value, and not a multi-key/value field. I am afraid if
we tried to do it, the result would be too inconsistent to be useful.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Wed, Nov 11, 2020 at 8:56 AM Bruce Momjian <bruce@momjian.us> wrote:
>> It would be useful if the column name were assigned automatically from
>> the name of the json key.
>
> I think we could do it, but it would only work if the column was output
> as a single json value, and not a multi-key/value field. I am afraid if
> we tried to do it, the result would be too inconsistent to be useful.
Doing it seems problematic given the nature of SQL and existing means to
assign names to columns. If it can be done I don't see how the output
value would make any difference. What is being asked for is the simple
textual value on the right side of the ->> (and other similar) operators to
be converted into a column name. I could imagine doing this at rewrite time
by saying (in parse terms):
info->>'suma to' becomes info->>'suma to' AS "suma to" (specifically, add AS,
double-quote the literal and stick it after the AS).
If {AS "suma to"} isn't valid syntax for some value of "suma to" just drop
the attempt and move on.
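A minimal sketch of the textual rule described above, written in Python purely for illustration. The helper names `suggest_alias` and `add_alias` are hypothetical and are not part of PostgreSQL; the only PostgreSQL behavior assumed is that a double quote inside a quoted identifier is written as two double quotes, and that a zero-length quoted identifier is rejected.

```python
from typing import Optional


def suggest_alias(rhs_key: Optional[str]) -> Optional[str]:
    """Return a double-quoted alias for a constant key, or None to keep '?column?'.

    Hypothetical helper: rhs_key is the text of the literal on the right side
    of ->> (None when the right side is not a literal).
    """
    if rhs_key is None:
        return None  # right side is not a constant: make no attempt
    if rhs_key == "" or "\x00" in rhs_key:
        return None  # would not form a valid quoted identifier: drop the attempt
    # Inside a quoted identifier, a double quote is written as two double quotes.
    return '"' + rhs_key.replace('"', '""') + '"'


def add_alias(select_item: str, rhs_key: Optional[str]) -> str:
    """Append AS "<key>" to a select-list item when an alias can be formed."""
    alias = suggest_alias(rhs_key)
    return select_item if alias is None else f"{select_item} AS {alias}"


print(add_alias("info->>'suma to'", "suma to"))
print(add_alias("info->>''", ""))
```

The second call leaves the item unchanged, which corresponds to the "drop the attempt and move on" case.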
I agree that this feature would be useful.
David J.
Hello Bruce,
Wednesday, November 11, 2020, 5:56:08 PM, you wrote:
> I think we could do it, but it would only work if the column was output
> as a single json value, and not a multi-key/value field. I am afraid if
> we tried to do it, the result would be too inconsistent to be useful.
cool, thank you.
--
Best regards,
Eugen Konkov
Bruce Momjian <bruce@momjian.us> writes:
> I think we could do it, but it would only work if the column was output
> as a single json value, and not a multi-key/value field. I am afraid if
> we tried to do it, the result would be too inconsistent to be useful.
Could this be done via the support function, so that the top-level
operator/function in each select list item can return a suggested column
name if the relevant arguments are constants?
- ilmari
--
- Twitter seems more influential [than blogs] in the 'gets reported in
the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
to a mainstream media article. - Calle Dybedahl
On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> I think we could do it, but it would only work if the column was output
>> as a single json value, and not a multi-key/value field. I am afraid if
>> we tried to do it, the result would be too inconsistent to be useful.
>
> Could this be done via the support function, so that the top-level
> operator/function in each select list item can return a suggested column
> name if the relevant arguments are constants?
Yes, the user explicitly calling a function would be much easier to
predict.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Wed, Nov 11, 2020 at 5:56 PM Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:
>> Could this be done via the support function, so that the top-level
>> operator/function in each select list item can return a suggested column
>> name if the relevant arguments are constants?
>
> Yes, the user explicitly calling a function would be much easier to
> predict.
For the user an operator and a function are different ways to invoke the
same underlying thing using different syntax. I'm not seeing how this
syntax difference makes this any easier to implement for explicit function
invocation compared to operator function invocation.
David J.
On 11/11/20 7:55 PM, Bruce Momjian wrote:
> On Thu, Nov 12, 2020 at 12:18:49AM +0000, Dagfinn Ilmari Mannsåker wrote:
>> Could this be done via the support function, so that the top-level
>> operator/function in each select list item can return a suggested column
>> name if the relevant arguments are constants?
>
> Yes, the user explicitly calling a function would be much easier to
> predict.
I suspect this is doomed to failure. There is no guarantee that the path
expression is going to be static or constant across rows. Say you have
this table:
x: foo, j: {"foo": 1, "bar": 2}
x: bar, j: {"foo": 3, "bar": 4}
and you say:
select j->>x from mytable;
What should the column be named?
I think we'd be trying to manage a set of corner cases, and all because
someone didn't want to put "as foo" in their query. And if we generate a
column name in some cases and not in others there will be complaints of
inconsistency.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hello Andrew,
Thursday, November 12, 2020, 3:19:39 PM, you wrote:
> and you say:
> select j->>x from mytable;
> What should the column be named?
Suppose it should be named 'as x'
--
Best regards,
Eugen Konkov
On Thu, Nov 12, 2020 at 7:18 AM Eugen Konkov <kes-kes@yandex.ru> wrote:
>> select j->>x from mytable;
>> What should the column be named?
>
> Suppose it should be named 'as x'
+1
>> I think we'd be trying to manage a set of corner cases, and all because
>> someone didn't want to put "as foo" in their query. And if we generate a
>> column name in some cases and not in others there will be complaints of
>> inconsistency.
Yes, this is suggesting a behavior that is contrary to (but not prohibited
by) the natural expression and expectations of SQL. That said, we already
take a function's name and use it to specify the name of its output column
as opposed to using "?column?" and requiring a user to apply a specific
alias. This is only a step beyond that, choosing the default name for an
operator's output column based upon not the name of the operator (or its
underlying function) but based upon its one (and only possible) right-hand
argument. It is purely a user convenience feature and can be rejected on
those grounds but I'm not seeing any fundamental issue with only having some
operator combinations doing this. It's nice when it works and you are no
worse off than today when it doesn't.
David J.
On 11/12/20 9:14 AM, Eugen Konkov wrote:
>> select j->>x from mytable;
>> What should the column be named?
>
> Suppose it should be named 'as x'
So if we then say:
select x, j->>x from mytable;
you want both result columns named x? That seems like a recipe for
serious confusion. I really don't think this proposal has been properly
thought through.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Thu, Nov 12, 2020 at 16:59 Andrew Dunstan <andrew@dunslane.net> wrote:
> So if we then say:
> select x, j->>x from mytable;
> you want both result columns named x? That seems like a recipe for
> serious confusion. I really don't think this proposal has been properly
> thought through.
Why? It is consistent: you get the value for key x, and everybody expects
that, so the values may well be different.
Regards
Pavel
On Thu, Nov 12, 2020 at 8:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> So if we then say:
> select x, j->>x from mytable;
> you want both result columns named x? That seems like a recipe for
> serious confusion. I really don't think this proposal has been properly
> thought through.
IMO it's no worse than today's:
select count(*), count(*) from (values (1), (2)) vals (v);
count | count
2 | 2
David J.
On 11/12/20 11:12 AM, David G. Johnston wrote:
> IMO it's no worse than today's:
> select count(*), count(*) from (values (1), (2)) vals (v);
> count | count
> 2 | 2
I guess the difference here is that there's an extra level of
indirection. So
select x, j->>'x', j->>x from mytable
would have 3 result columns all named x.
cheers
andrew
On Thu, Nov 12, 2020 at 11:32:49AM -0500, Andrew Dunstan wrote:
> I guess the difference here is that there's an extra level of
> indirection. So
> select x, j->>'x', j->>x from mytable
> would have 3 result columns all named x.
Yeah, I feel it would have to be something a user specifically asks for,
and we would have to say it would be the first or a random match of one
of the keys. Ultimately, it might be so awkward as to be useless.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Thu, Nov 12, 2020 at 9:32 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> I guess the difference here is that there's an extra level of
> indirection. So
> select x, j->>'x', j->>x from mytable
> would have 3 result columns all named x.
I totally missed the variable reference there - only two of those become
"x", the variable reference stays un-rewritten and thus results in
"?column?", similar to today:
select count(*), count(*) +1 from (values (1), (2)) vals (v);
count | ?column?
2 | 3
The query rewriter would only rewrite these expressions and provide an
expression-related explicit alias clause if the expression is a single
operator (same as single function today) and the right-hand side of the
operator is a constant (meaning the constant is a reasonable representation
of every output value that is going to appear in the result column). If
the RHS is a variable then there is no good name that is known to cover all
output values and thus ?column? (i.e., do not rewrite/provide an alias
clause) is an appropriate choice.
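A minimal sketch of that decision, assuming a toy expression tree rather than PostgreSQL's real node types. All class names, `KEY_OPERATORS`, and `default_name` are hypothetical; the sketch only shows which select-list shapes would receive a derived name and which would fall back to "?column?".

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Column:
    name: str


@dataclass
class Const:
    value: str


@dataclass
class FuncCall:
    name: str


@dataclass
class OpExpr:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Column, Const, FuncCall, OpExpr]

# Assumption: the operators whose right-hand side names a json key.
KEY_OPERATORS = {"->", "->>"}


def default_name(expr: Expr) -> str:
    """Pick a default output column name for one select-list item."""
    if isinstance(expr, Column):
        return expr.name  # plain column reference keeps its name
    if isinstance(expr, FuncCall):
        return expr.name  # a single function call is named after the function
    if (isinstance(expr, OpExpr) and expr.op in KEY_OPERATORS
            and isinstance(expr.right, Const)):
        return expr.right.value  # single operator with a constant right side
    return "?column?"  # variable right side or anything more complex


print(default_name(OpExpr("->>", Column("j"), Const("x"))))
print(default_name(OpExpr("->>", Column("j"), Column("x"))))
```

Under this rule `select x, j->>'x', j->>x` would yield two columns named x and one named ?column?.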
My concerns in this area involve stored views and ruleutils, dump/reload by
extension. Greenfield, this would have been nice, and worth the minimal
complexity given its usefulness in the common case, but is it useful enough
to introduce a whole new default naming mechanism and dealing with
dump/restore concerns?
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> The query rewriter would only rewrite these expressions and provide an
> expression-related explicit alias clause if the expression is a single
> operator (same as single function today) and the right-hand side of the
> operator is a constant (meaning the constant is a reasonable representation
> of every output value that is going to appear in the result column).
I haven't been paying too close attention to this thread, but it seems
like there is a lot of misapprehension here about how this could
reasonably be implemented. There is zero (not epsilon, but zero)
chance of changing column aliases at rewrite time. Those have to be
assigned in the parser, else we will not understand how to resolve
references to sub-select output columns. Specifically it has to happen
in FigureColname(), which means that resolving non-constant arguments
to constants isn't terribly practical.
Actually, since FigureColname() works on the raw parse tree, I'm not
even sure how you could make this happen in that context, unless you're
willing to say that "j ->> 'x'" resolves as "x" just based on the name
of the operator, without any info about its semantics. That doesn't
seem very cool. Now, in a quick look at the callers, it looks like it'd
be no problem from the callers' standpoint to switch things around to do
colname selection on the parsed tree instead, ie the existing choice is
for FigureColname's benefit not the callers'. But it'd likely cost
a good deal to do it the other way, since now FigureColname would need
to perform catalog lookups to get column and function names.
Maybe you could do something like passing *both* trees to FigureColname,
and let it obtain the actual operator OID from the parsed tree when the
raw tree contains AEXPR_OP. But the recursion in FigureColname would be
difficult to manage because the two trees often don't match one-to-one.
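The difference between naming from the raw tree alone and naming with both trees can be sketched as follows. This is a toy model in Python, not PostgreSQL code: `RawOp`, `ResolvedOp`, the numeric ids, and both functions are hypothetical stand-ins, and real parse nodes carry far more information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawOp:
    """Raw-tree operator node: only the operator's spelling is known."""
    op_name: str
    right_literal: Optional[str]  # None when the right side is not a literal


@dataclass
class ResolvedOp:
    """Parsed-tree operator node: the concrete operator has been looked up."""
    operator_id: int  # stands in for the operator's OID


JSON_FIELD_AS_TEXT = 1001  # made-up id for the json ->> operator


def colname_from_raw(raw: RawOp) -> str:
    # Decides purely on the operator's name, with no idea of its semantics.
    if raw.op_name == "->>" and raw.right_literal is not None:
        return raw.right_literal
    return "?column?"


def colname_from_both(raw: RawOp, resolved: ResolvedOp) -> str:
    # Uses the resolved operator identity, and the raw tree only for the literal.
    if resolved.operator_id == JSON_FIELD_AS_TEXT and raw.right_literal is not None:
        return raw.right_literal
    return "?column?"


# A user-defined ->> on some other type (made-up id 9999) looks identical in the raw tree.
raw = RawOp("->>", "x")
print(colname_from_raw(raw))
print(colname_from_both(raw, ResolvedOp(9999)))
print(colname_from_both(raw, ResolvedOp(JSON_FIELD_AS_TEXT)))
```

The sketch sidesteps the hard part mentioned above: it pairs one raw node with one resolved node by hand, whereas a recursive walk would have to cope with trees that do not match one-to-one.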
On the whole, I'm on the side of the people who don't want to change this.
The implementation cost seems likely to greatly outweigh the value, plus
it feels more like a wart than a feature.
regards, tom lane
On Thu, Nov 12, 2020 at 01:52:11PM -0500, Tom Lane wrote:
> On the whole, I'm on the side of the people who don't want to change this.
> The implementation cost seems likely to greatly outweigh the value, plus
> it feels more like a wart than a feature.
I think we can mark this as, "We thought about it, and we decided it is
probably not a good idea."
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Thursday, November 12, 2020, Bruce Momjian <bruce@momjian.us> wrote:
> I think we can mark this as, "We thought about it, and we decided it is
> probably not a good idea."
+1
David J.
On 2020-Nov-12, Tom Lane wrote:
> On the whole, I'm on the side of the people who don't want to change this.
> The implementation cost seems likely to greatly outweigh the value, plus
> it feels more like a wart than a feature.
I think if Eugen wants to spend some time with it and see how it could
be implemented, then send a patch for consideration, then we could make
a better informed decision. My own opinion is that it's not worth the
trouble, but I'd rather we not stand in his way if he wants to try
(with the disclaimer that we might end up not liking the patch, of course).
On Thu, Nov 12, 2020 at 04:30:15PM -0300, Álvaro Herrera wrote:
> I think if Eugen wants to spend some time with it and see how it could
> be implemented, then send a patch for consideration, then we could make
> a better informed decision. My own opinion is that it's not worth the
> trouble, but I'd rather we not stand in his way if he wants to try
> (with the disclaimer that we might end up not liking the patch, of course).
I think he would be better outlining how he wants it to behave before
even working on a patch; from our TODO list:
Desirability -> Design -> Implement -> Test -> Review -> Commit
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
> I think if Eugen wants to spend some time with it and see how it could
> be implemented, then send a patch for consideration, then we could make
> a better informed decision. My own opinion is that it's not worth the
> trouble, but I'd rather we not stand in his way if he wants to try
> (with the disclaimer that we might end up not liking the patch, of course).
Sorry, I am not a C/C++ programmer and cannot imagine how to start on a patch.
I do not know the internals of PG. The only useful thing I can offer is the idea
itself, to make the world better.
I suppose that initially there was only ?column?, and later names were implemented for count, sum, etc.
But it would be cool if PG went a step further and named sum( a ) as sum_a instead of just sum.
The purpose of this proposal is not correct name generation; the purpose is to get
more distinct default names:
?column?, ?column?, ?column?, ?column?, ?column?, ?column?, ?column?,
?count?, ?count?, ?count?, ?sum?, ?sum?, ?sum?, ?sum?
?count_a?, ?count_b?, ?count_c?, ?sum_a?, ?sum_b?, ?sum_c?, ?sum_d?
Notice that the last is more robust than the first ;-)
I suppose we could just ignore complex cases and leave them as they are
currently. We could try a very small step in the direction of
improving default names and see from the feedback of many users whether it is
useful or not. Then we can decide whether it is worth implementing a whole
system for default name generation.
Unfortunately, I am no judge of the level at which this should occur: parser, analyzer or so.
I just do not understand those things =(
Thank you.
--
Best regards,
Eugen Konkov