Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)
Hi All,
I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
<https://www.postgresql.org/docs/devel/static/functions-window.html#id-1.5.8.26.6.2.2.9.1.1>
functions offer powerful lookup capabilities, eg.
here
1)
https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=0f13c8541191c3018703d2a97aa90bf9
SELECT t.* ,FIRST_value(v1)OVER(PARTITION BY gid ORDER BY v2) fv
,LAST_value(v1)OVER(PARTITION BY gid ORDER BY v2 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING) lv FROM( VALUES (1, 'b', 3),(1, 'd',
1),(1, 'a', 2) ,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid,
v1, v2);
gidv1v2fvlv
1 d 1 d b
1 a 2 d b
1 b 3 d b
2 x 7 x v
2 z 8 x v
2 y 9 x v
2 v 9 x v
but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
Am I missing some workaround here?
Anyway, Oracle is an example where both contexts are possible
<https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions065.htm#SQLRF00641>
and I think this could provide great reference as to what I have in mind.
2) Demonstrating those functions in both contexts there:
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=ed62d9af3fbe6a038e4433625a70540f
a) as Windowing Aggregate:
* WITH t(gid, v1, v2) AS(SELECT 1, 'b', 3 FROM dual UNION ALLSELECT 1,
'd', 1 FROM dual UNION ALLSELECT 1, 'a', 2 FROM dual UNION ALLSELECT 2,
'x', 7 FROM dual UNION ALLSELECT 2, 'y', 9 FROM dual UNION ALLSELECT 2,
'z', 8 FROM dual UNION ALLSELECT 2, 'v', 9 FROM dual --UNION ALL)SELECT t.*
,MIN(v1)KEEP(DENSE_RANK FIRST ORDER BY v2) OVER(PARTITION BY gid)
lkp_first ,MIN(v1)KEEP(DENSE_RANK LAST ORDER BY v2) OVER(PARTITION BY
gid) lkp_lastMin ,MAX(v1)KEEP(DENSE_RANK LAST ORDER BY v2)
OVER(PARTITION BY gid) lkp_lastMax FROM t*;
GIDV1V2LKP_FIRSTLKP_LASTMINLKP_LASTMAX
1 a 2 d b b
1 d 1 d b b
1 b 3 d b b
2 z 8 x v y
2 y 9 x v y
2 v 9 x v y
2 x 7 x v y
b) as a regular aggregate (cf. GROUP BY and two rows only in the result)
* WITH t(gid, v1, v2) AS(SELECT 1, 'b', 3 FROM dual UNION ALLSELECT 1,
'd', 1 FROM dual UNION ALLSELECT 1, 'a', 2 FROM dual UNION ALLSELECT 2,
'x', 7 FROM dual UNION ALLSELECT 2, 'y', 9 FROM dual UNION ALLSELECT 2,
'z', 8 FROM dual UNION ALLSELECT 2, 'v', 9 FROM dual --UNION ALL)SELECT
t.gid ,MIN(v1)KEEP(DENSE_RANK FIRST ORDER BY v2)lkp_first
,MIN(v1)KEEP(DENSE_RANK LAST ORDER BY v2)lkp_lastMin
,MAX(v1)KEEP(DENSE_RANK LAST ORDER BY v2)lkp_lastMax FROM t GROUP BY
t.gid*;
GIDLKP_FIRSTLKP_LASTMINLKP_LASTMAX
1 d b b
2 x v y
Any chances of implementing that?
Thank you,
Kind Regards
~Maciek
matshyeq <matshyeq@gmail.com> writes:
I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
functions offer powerful lookup capabilities, ...
but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
The reason the SQL standard doesn't allow that usage, or similar ones
for other window functions, is that the function results are undefined
unless you specify a particular ordering of the inputs ... which is
exactly what the window syntax provides.
Yeah, in most cases there are other ways where you could get the same
result, but all of them require extra syntax too. So you might as
well use window syntax and be standards-conforming.
Anyway, Oracle is an example where both contexts are possible
Oracle is not exactly a shining example of either SQL spec compliance
or well-thought-out features ...
regards, tom lane
I was hoping the proposal would be judged on its merits - primarily
usefulness, not necessarily conforming to the standards
I would expect postgres overlap with the SQL standard (eg. SQL:2016)
doesn't differ much from other big vendors and has lots of own
deviations/extensions itself.
Not a fan of Oracle myself but have to give it to them the mentioned
versatility is really nice.
In terms of performance it also carries a significant difference:
being forced to work on an unaggregated result and process it further (eg.
with extra subselect filtering on row_number)
vs. simple GROUP BY
Yes it would require particular ordering of the input
but please note PostgreSQL already supports a small set of such Ordered-Set
Aggregate Functions
<https://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE>
(
https://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
)
Currently I can do workaround with something like:
SELECT gid
,RIGHT(MIN(lpad(v2::varchar,10,'0')||v1),-10) pseudo_lkp_first
,RIGHT(MAX(lpad(v2::varchar,10,'0')||v1),-10) pseudo_lkp_last
FROM(
VALUES (1, 'b', 3),(1, 'd', 1),(1, 'a', 2)
,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid, v1, v2)
GROUP BY gid;
but it's neither universal nor readable.
It's fast though.
I was hoping those FIRST/LAST lookup functions could be made available as
some new Ordered-Set Aggregate Functions
<https://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE>
too…
On Tue, 24 Jul 2018 at 23:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
matshyeq <matshyeq@gmail.com> writes:
I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
functions offer powerful lookup capabilities, ...
but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…The reason the SQL standard doesn't allow that usage, or similar ones
for other window functions, is that the function results are undefined
unless you specify a particular ordering of the inputs ... which is
exactly what the window syntax provides.Yeah, in most cases there are other ways where you could get the same
result, but all of them require extra syntax too. So you might as
well use window syntax and be standards-conforming.Anyway, Oracle is an example where both contexts are possible
Oracle is not exactly a shining example of either SQL spec compliance
or well-thought-out features ...regards, tom lane
On Tue, Jul 24, 2018 at 4:16 PM, matshyeq <matshyeq@gmail.com> wrote:
I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
<https://www.postgresql.org/docs/devel/static/functions-window.html#id-1.5.8.26.6.2.2.9.1.1>
functions offer powerful lookup capabilities, eg.
here
1) https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=
0f13c8541191c3018703d2a97aa90bf9SELECT t.* ,FIRST_value(v1)OVER(PARTITION BY gid ORDER BY v2) fv
,LAST_value(v1)OVER(PARTITION BY gid ORDER BY v2 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING) lv FROM( VALUES (1, 'b', 3),(1, 'd',
1),(1, 'a', 2) ,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid,
v1, v2);gidv1v2fvlv
1 d 1 d b
1 a 2 d b
1 b 3 d b
2 x 7 x v
2 z 8 x v
2 y 9 x v
2 v 9 x vbut, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
Am I missing some workaround here?
Why not just define a custom aggregate function that does whatever you
need? I don't think it would be too hard. e.g. for something like
LAST_VALUE() just make the transition type equal to the output type and
save the last value you've seen thus far as the transition value.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Thanks Robert!
I thought this could be nice functionality available out of the box
but yea, sure, that would work for me.
Can I make custom aggregate function that accepts WITHIN GROUP syntax?
Which language would that need to be implemented in? Would you have
examples (url?)
Thank you,
Kind Regards
~Maciek
On Thu, 26 Jul 2018 at 16:22, Robert Haas <robertmhaas@gmail.com> wrote:
Show quoted text
On Tue, Jul 24, 2018 at 4:16 PM, matshyeq <matshyeq@gmail.com> wrote:
I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
<https://www.postgresql.org/docs/devel/static/functions-window.html#id-1.5.8.26.6.2.2.9.1.1>
functions offer powerful lookup capabilities, eg.
here
1)
https://dbfiddle.uk/?rdbms=postgres_9.6&fiddle=0f13c8541191c3018703d2a97aa90bf9SELECT t.* ,FIRST_value(v1)OVER(PARTITION BY gid ORDER BY v2) fv
,LAST_value(v1)OVER(PARTITION BY gid ORDER BY v2 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING) lv FROM( VALUES (1, 'b', 3),(1, 'd',
1),(1, 'a', 2) ,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid,
v1, v2);gidv1v2fvlv
1 d 1 d b
1 a 2 d b
1 b 3 d b
2 x 7 x v
2 z 8 x v
2 y 9 x v
2 v 9 x vbut, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
Am I missing some workaround here?Why not just define a custom aggregate function that does whatever you
need? I don't think it would be too hard. e.g. for something like
LAST_VALUE() just make the transition type equal to the output type and
save the last value you've seen thus far as the transition value.--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
matshyeq <matshyeq@gmail.com> writes:
Can I make custom aggregate function that accepts WITHIN GROUP syntax?
It's possible, but it would have to be written in C, I think; see
src/backend/utils/adt/orderedsetaggs.c for prototype code. You'll
find it a lot easier to write a plain aggregate function and control
its input ordering with "last(foo order by foo)".
regards, tom lane
On 26/07/18 20:31, Tom Lane wrote:
matshyeq <matshyeq@gmail.com> writes:
Can I make custom aggregate function that accepts WITHIN GROUP syntax?
It's possible, but it would have to be written in C, I think; see
src/backend/utils/adt/orderedsetaggs.c for prototype code. You'll
find it a lot easier to write a plain aggregate function and control
its input ordering with "last(foo order by foo)".
We have examples on the wiki.
https://wiki.postgresql.org/wiki/First/last_(aggregate)
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support