Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

Started by matshyeqover 7 years ago7 messages
#1matshyeq
matshyeq@gmail.com

Hi All,

I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
<https://www.postgresql.org/docs/devel/static/functions-window.html#id-1.5.8.26.6.2.2.9.1.1&gt;
functions offer powerful lookup capabilities, eg.
here
1)
https://dbfiddle.uk/?rdbms=postgres_9.6&amp;fiddle=0f13c8541191c3018703d2a97aa90bf9

SELECT t.* ,FIRST_value(v1)OVER(PARTITION BY gid ORDER BY v2) fv
,LAST_value(v1)OVER(PARTITION BY gid ORDER BY v2 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING) lv FROM( VALUES (1, 'b', 3),(1, 'd',
1),(1, 'a', 2) ,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid,
v1, v2);

gidv1v2fvlv
1 d 1 d b
1 a 2 d b
1 b 3 d b
2 x 7 x v
2 z 8 x v
2 y 9 x v
2 v 9 x v

but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
Am I missing some workaround here?

Anyway, Oracle is an example where both contexts are possible
<https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions065.htm#SQLRF00641&gt;
and I think this could provide great reference as to what I have in mind.
2) Demonstrating those functions in both contexts there:
https://dbfiddle.uk/?rdbms=oracle_11.2&amp;fiddle=ed62d9af3fbe6a038e4433625a70540f

a) as Windowing Aggregate:

* WITH t(gid, v1, v2) AS(SELECT 1, 'b', 3 FROM dual UNION ALLSELECT 1,
'd', 1 FROM dual UNION ALLSELECT 1, 'a', 2 FROM dual UNION ALLSELECT 2,
'x', 7 FROM dual UNION ALLSELECT 2, 'y', 9 FROM dual UNION ALLSELECT 2,
'z', 8 FROM dual UNION ALLSELECT 2, 'v', 9 FROM dual --UNION ALL)SELECT t.*
,MIN(v1)KEEP(DENSE_RANK FIRST ORDER BY v2) OVER(PARTITION BY gid)
lkp_first ,MIN(v1)KEEP(DENSE_RANK LAST ORDER BY v2) OVER(PARTITION BY
gid) lkp_lastMin ,MAX(v1)KEEP(DENSE_RANK LAST ORDER BY v2)
OVER(PARTITION BY gid) lkp_lastMax FROM t*;
GIDV1V2LKP_FIRSTLKP_LASTMINLKP_LASTMAX
1 a 2 d b b
1 d 1 d b b
1 b 3 d b b
2 z 8 x v y
2 y 9 x v y
2 v 9 x v y
2 x 7 x v y

b) as a regular aggregate (cf. GROUP BY and two rows only in the result)

* WITH t(gid, v1, v2) AS(SELECT 1, 'b', 3 FROM dual UNION ALLSELECT 1,
'd', 1 FROM dual UNION ALLSELECT 1, 'a', 2 FROM dual UNION ALLSELECT 2,
'x', 7 FROM dual UNION ALLSELECT 2, 'y', 9 FROM dual UNION ALLSELECT 2,
'z', 8 FROM dual UNION ALLSELECT 2, 'v', 9 FROM dual --UNION ALL)SELECT
t.gid ,MIN(v1)KEEP(DENSE_RANK FIRST ORDER BY v2)lkp_first
,MIN(v1)KEEP(DENSE_RANK LAST ORDER BY v2)lkp_lastMin
,MAX(v1)KEEP(DENSE_RANK LAST ORDER BY v2)lkp_lastMax FROM t GROUP BY
t.gid*;
GIDLKP_FIRSTLKP_LASTMINLKP_LASTMAX
1 d b b
2 x v y

Any chances of implementing that?

Thank you,
Kind Regards
~Maciek

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: matshyeq (#1)
Re: Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

matshyeq <matshyeq@gmail.com> writes:

I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
functions offer powerful lookup capabilities, ...
but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…

The reason the SQL standard doesn't allow that usage, or similar ones
for other window functions, is that the function results are undefined
unless you specify a particular ordering of the inputs ... which is
exactly what the window syntax provides.

Yeah, in most cases there are other ways where you could get the same
result, but all of them require extra syntax too. So you might as
well use window syntax and be standards-conforming.

Anyway, Oracle is an example where both contexts are possible

Oracle is not exactly a shining example of either SQL spec compliance
or well-thought-out features ...

regards, tom lane

#3matshyeq
matshyeq@gmail.com
In reply to: Tom Lane (#2)
Re: Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

I was hoping the proposal would be judged on its merits - primarily
usefulness, not necessarily conforming to the standards
I would expect postgres overlap with the SQL standard (eg. SQL:2016)
doesn't differ much from other big vendors and has lots of own
deviations/extensions itself.
Not a fan of Oracle myself but have to give it to them the mentioned
versatility is really nice.
In terms of performance it also carries a significant difference:
being forced to work on an unaggregated result and process it further (eg.
with extra subselect filtering on row_number)
vs. simple GROUP BY

Yes it would require particular ordering of the input
but please note PostgreSQL already supports a small set of such Ordered-Set
Aggregate Functions
<https://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE&gt;
(
https://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
)

Currently I can do workaround with something like:

SELECT gid
,RIGHT(MIN(lpad(v2::varchar,10,'0')||v1),-10) pseudo_lkp_first
,RIGHT(MAX(lpad(v2::varchar,10,'0')||v1),-10) pseudo_lkp_last
FROM(
VALUES (1, 'b', 3),(1, 'd', 1),(1, 'a', 2)
,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid, v1, v2)
GROUP BY gid;

but it's neither universal nor readable.
It's fast though.

I was hoping those FIRST/LAST lookup functions could be made available as
some new Ordered-Set Aggregate Functions
<https://www.postgresql.org/docs/devel/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE&gt;
too…

On Tue, 24 Jul 2018 at 23:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

matshyeq <matshyeq@gmail.com> writes:

I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
functions offer powerful lookup capabilities, ...
but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…

The reason the SQL standard doesn't allow that usage, or similar ones
for other window functions, is that the function results are undefined
unless you specify a particular ordering of the inputs ... which is
exactly what the window syntax provides.

Yeah, in most cases there are other ways where you could get the same
result, but all of them require extra syntax too. So you might as
well use window syntax and be standards-conforming.

Anyway, Oracle is an example where both contexts are possible

Oracle is not exactly a shining example of either SQL spec compliance
or well-thought-out features ...

regards, tom lane

#4Robert Haas
robertmhaas@gmail.com
In reply to: matshyeq (#1)
Re: Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

On Tue, Jul 24, 2018 at 4:16 PM, matshyeq <matshyeq@gmail.com> wrote:

I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
<https://www.postgresql.org/docs/devel/static/functions-window.html#id-1.5.8.26.6.2.2.9.1.1&gt;
functions offer powerful lookup capabilities, eg.
here
1) https://dbfiddle.uk/?rdbms=postgres_9.6&amp;fiddle=
0f13c8541191c3018703d2a97aa90bf9

SELECT t.* ,FIRST_value(v1)OVER(PARTITION BY gid ORDER BY v2) fv
,LAST_value(v1)OVER(PARTITION BY gid ORDER BY v2 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING) lv FROM( VALUES (1, 'b', 3),(1, 'd',
1),(1, 'a', 2) ,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid,
v1, v2);

gidv1v2fvlv
1 d 1 d b
1 a 2 d b
1 b 3 d b
2 x 7 x v
2 z 8 x v
2 y 9 x v
2 v 9 x v

but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
Am I missing some workaround here?

Why not just define a custom aggregate function that does whatever you
need? I don't think it would be too hard. e.g. for something like
LAST_VALUE() just make the transition type equal to the output type and
save the last value you've seen thus far as the transition value.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5matshyeq
matshyeq@gmail.com
In reply to: Robert Haas (#4)
Re: Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

Thanks Robert!
I thought this could be nice functionality available out of the box
but yea, sure, that would work for me.
Can I make custom aggregate function that accepts WITHIN GROUP syntax?
Which language would that need to be implemented in? Would you have
examples (url?)

Thank you,
Kind Regards
~Maciek
On Thu, 26 Jul 2018 at 16:22, Robert Haas <robertmhaas@gmail.com> wrote:

Show quoted text

On Tue, Jul 24, 2018 at 4:16 PM, matshyeq <matshyeq@gmail.com> wrote:

I'd like to throw here an enhancement proposal to discuss/consider.
The FIRST/LAST_value()
<https://www.postgresql.org/docs/devel/static/functions-window.html#id-1.5.8.26.6.2.2.9.1.1&gt;
functions offer powerful lookup capabilities, eg.
here
1)
https://dbfiddle.uk/?rdbms=postgres_9.6&amp;fiddle=0f13c8541191c3018703d2a97aa90bf9

SELECT t.* ,FIRST_value(v1)OVER(PARTITION BY gid ORDER BY v2) fv
,LAST_value(v1)OVER(PARTITION BY gid ORDER BY v2 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING) lv FROM( VALUES (1, 'b', 3),(1, 'd',
1),(1, 'a', 2) ,(2, 'x', 7),(2, 'y', 9),(2, 'z', 8),(2, 'v', 9)) t (gid,
v1, v2);

gidv1v2fvlv
1 d 1 d b
1 a 2 d b
1 b 3 d b
2 x 7 x v
2 z 8 x v
2 y 9 x v
2 v 9 x v

but, given those values are repeating - why can't I simply use this
functions as regular aggregates?
Or can I? It doesn't seem to be possible while I find this use case
actually more common than in windowing context…
Am I missing some workaround here?

Why not just define a custom aggregate function that does whatever you
need? I don't think it would be too hard. e.g. for something like
LAST_VALUE() just make the transition type equal to the output type and
save the last value you've seen thus far as the transition value.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: matshyeq (#5)
Re: Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

matshyeq <matshyeq@gmail.com> writes:

Can I make custom aggregate function that accepts WITHIN GROUP syntax?

It's possible, but it would have to be written in C, I think; see
src/backend/utils/adt/orderedsetaggs.c for prototype code. You'll
find it a lot easier to write a plain aggregate function and control
its input ordering with "last(foo order by foo)".

regards, tom lane

#7Vik Fearing
vik.fearing@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: Enhancement request: enable FIRST/LAST_value() also as a regular aggregate (not only as windowing function)

On 26/07/18 20:31, Tom Lane wrote:

matshyeq <matshyeq@gmail.com> writes:

Can I make custom aggregate function that accepts WITHIN GROUP syntax?

It's possible, but it would have to be written in C, I think; see
src/backend/utils/adt/orderedsetaggs.c for prototype code. You'll
find it a lot easier to write a plain aggregate function and control
its input ordering with "last(foo order by foo)".

We have examples on the wiki.
https://wiki.postgresql.org/wiki/First/last_(aggregate)
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support