Undocumented array_val[generate_series(...)] functionality?
Hey,
A post over in Reddit had an expression form I've never seen before:
select (array[1,2,3,4]::integer[])[generate_series(1, 3)];
===
1
2
3
Looking at subscripting in the SQL syntax this example doesn't seem to be
documented.
https://www.postgresql.org/docs/current/sql-expressions.html#SQL-EXPRESSIONS-SUBSCRIPTS
Is it documented somewhere else, and, regardless, shouldn't it be
documented in the linked location?
I get this is basically a variant of the slice syntax [n:m], and it is
working as expected, but nothing indicates I can put an SRF inside the
brackets.
David J.
On Jul 11, 2021, at 5:52 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Hey,A post over in Reddit had an expression form I've never seen before:
select (array[1,2,3,4]::integer[])[generate_series(1, 3)];
===
1
2
3Looking at subscripting in the SQL syntax this example doesn't seem to be documented.
https://www.postgresql.org/docs/current/sql-expressions.html#SQL-EXPRESSIONS-SUBSCRIPTS
Is it documented somewhere else, and, regardless, shouldn't it be documented in the linked location?
I get this is basically a variant of the slice syntax [n:m], and it is working as expected, but nothing indicates I can put an SRF inside the brackets.
David J.
Is that expression saying something like select the first second third element from array 1,2,3,4?
Show quoted text
On Mon, 12 Jul 2021 at 11:52, David G. Johnston
<david.g.johnston@gmail.com> wrote:
A post over in Reddit had an expression form I've never seen before:
select (array[1,2,3,4]::integer[])[generate_series(1, 3)];
Looking at subscripting in the SQL syntax this example doesn't seem to be documented.
https://www.postgresql.org/docs/current/sql-expressions.html#SQL-EXPRESSIONS-SUBSCRIPTS
Is it documented somewhere else, and, regardless, shouldn't it be documented in the linked location?
Isn't this implied by "Each subscript is itself an expression"?
There's nothing special here with the SRF. That just produces 3 rows
and passes the subscript as 1, 2 then 3.
David
On Sun, Jul 11, 2021 at 5:43 PM David Rowley <dgrowleyml@gmail.com> wrote:
Isn't this implied by "Each subscript is itself an expression"?
There's nothing special here with the SRF. That just produces 3 rows
and passes the subscript as 1, 2 then 3.
One can indeed infer that if the expression chosen for subscript is an SRF
that the resultant output will also be an SRF. It seems less than helpful
to force the reader to infer that, IMO. Especially when the slice syntax,
which could conceivably return either a sub-array OR a set, the later being
the undocumented reality (our function examples document the actual
outputs, these examples probably should do the same. I would be OK if we
simply added an SRF example and showed the SRF result it produces without
changing the wording - though this doesn't seem like a hard improvement to
make. I'll offer something up if a committer at least agrees it could use
some work).
David J.
On Mon, 12 Jul 2021 at 12:58, David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Sun, Jul 11, 2021 at 5:43 PM David Rowley <dgrowleyml@gmail.com> wrote:
Isn't this implied by "Each subscript is itself an expression"?
There's nothing special here with the SRF. That just produces 3 rows
and passes the subscript as 1, 2 then 3.One can indeed infer that if the expression chosen for subscript is an SRF that the resultant output will also be an SRF.
I'd say in your example the array with the subscript does not become
an SRF anymore than abs() becomes an SRF in the following:
select abs(generate_series(-3,-1));
abs() is simply called once per output value of the generate_series
SRF. That seems fairly equivalent to me to what's going on in your
example case.
David
On Sun, Jul 11, 2021 at 6:06 PM David Rowley <dgrowleyml@gmail.com> wrote:
select abs(generate_series(-3,-1));
abs() is simply called once per output value of the generate_series
SRF. That seems fairly equivalent to me to what's going on in your
example case.
Fair point. Both of these are premised on two related facts:
One, the select generate_series(1,3) function call causes multiple rows to
be generated where there would usually be only one. In short, SRF function
calls and non-SRF function calls exhibit different behaviors on the output.
Two, composition results in an inside-to-outside execution order: the SRF
is evaluated first, the additional rows added, then the outer function (abs
or the subscript function respectively in these examples) is evaluated for
whatever rows are now present in the result.
Is the above something one can learn from our documentation?
Is this syntax we are discouraging users from using and thus intentionally
not documenting it? I do get this impression but, frankly, given the
utility of " [ generate_series(...) ] " I find it hard to recommend
something different to get the same result yet don't have a place to point
and say "here is how and why it works".
David J.
On Mon, 12 Jul 2021 at 02:39, David G. Johnston
<david.g.johnston@gmail.com> wrote:
One, the select generate_series(1,3) function call causes multiple rows to be generated where there would usually be only one.
Yes.
Two, composition results in an inside-to-outside execution order: the SRF is evaluated first, the additional rows added, then the outer function (abs or the subscript function respectively in these examples) is evaluated for whatever rows are now present in the result.
Yes.
Is the above something one can learn from our documentation?
Yes, but only if you know where to look.
Is this syntax we are discouraging users from using and thus intentionally not documenting it?
On the contrary, I would say that this is the expected behaviour, and
that it is documented, though not in the most obvious place:
https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET
That's probably not the first place one would go looking for it, and
might also (wrongly) imply that it only works for functions written in
language SQL.
BTW, this is something that started working in PG10 (in 9.6, an error
is thrown), and I think that it's a result of this release note item,
which matches your conclusions:
Change the implementation of set-returning functions appearing in
a query's SELECT list (Andres Freund)
Set-returning functions are now evaluated before evaluation of
scalar expressions in the SELECT list, much as though they had been
placed in a LATERAL FROM-clause item. This allows saner semantics for
cases where multiple set-returning functions are present. If they
return different numbers of rows, the shorter results are extended to
match the longest result by adding nulls. Previously the results were
cycled until they all terminated at the same time, producing a number
of rows equal to the least common multiple of the functions'
periods. In addition, set-returning functions are now disallowed
within CASE and COALESCE constructs. For more information see Section
37.4.8.
Another interesting consequence of that is that it's possible to do a
similar thing with the array slice syntax, and a pair of
generate_series() calls, for example:
SELECT (array[1,2,3,4]::int[])[generate_series(1,4) : generate_series(2,4)];
array
-------
{1,2}
{2,3}
{3,4}
(4 rows)
Note: there are 4 rows in that result, and the last one is NULL, which
is also consistent with the documentation, and the fact that the array
slice function returns NULL if either subscript is NULL.
I'd agree that there's an opportunity to improve the docs here.
Regards,
Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
BTW, this is something that started working in PG10 (in 9.6, an error
is thrown), and I think that it's a result of this release note item,
which matches your conclusions:
Change the implementation of set-returning functions appearing in
a query's SELECT list (Andres Freund)
Interesting. When I first looked at this thread I figured "oh, that's
always worked, nothing to see here". But you're right, we didn't use
to allow SRFs in subscripts. Still, I'm with David that no new docs
are needed. IMO the former restriction was the surprising thing, and
the current behavior is simply what one would expect from assembling
those parts in that order.
regards, tom lane
On Monday, July 12, 2021, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Still, I'm with David that no new docs
are needed. IMO the former restriction was the surprising thing, and
the current behavior is simply what one would expect from assembling
those parts in that order.
I agree the material in Extending SQL - SQL Functions, plus normal
expectations, mean that we do indeed cover the topic sufficiently. The
location and specifics of the material and use of cross-references could
use some attention though its also not an area that gets too many questions
so people are figuring this all out one way or another (or not finding a
need and just don’t know what they are missing…)
It doesn’t really fit that well in a syntax chapter since the calling
syntax is indeed the same. But the Queries Chapter is where I was looking
this past time (I knew about and read the SQL Functions Chapter previously
but didn’t get there is this skim).
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
I agree the material in Extending SQL - SQL Functions, plus normal
expectations, mean that we do indeed cover the topic sufficiently. The
location and specifics of the material and use of cross-references could
use some attention though its also not an area that gets too many questions
so people are figuring this all out one way or another (or not finding a
need and just don’t know what they are missing…)
Yeah, there is a lot of material in that chapter that is about using
functions, not just writing them. I don't think anyone's entirely
happy about the current factorization of those parts of the manual;
but it's not entirely clear how to make it better, either.
regards, tom lane