first_value/last_value

Started by Tom Smithalmost 10 years ago10 messagesgeneral
Jump to latest
#1Tom Smith
tomsmith1989sk@gmail.com

Hello:

Is there a plan for 9.7 to enable using the two aggregate function
as non-window function? i.e. enabling getting the first/last row
in single sql without using window features.
There is actually a C-extension for first()/last().
I am wondering if 9.7 would make them built-in function like max/min

Thanks

#2Adam Brusselback
adambrusselback@gmail.com
In reply to: Tom Smith (#1)
Re: first_value/last_value

Is there a reason you can't do that now with a limit 1/order by/union all?
Just have it ordered one way on the first query and the other on the
bottom. That will give you two rows that are the first / last in your set
based on whatever column you order on.
On May 18, 2016 8:47 PM, "Tom Smith" <tomsmith1989sk@gmail.com> wrote:

Show quoted text

Hello:

Is there a plan for 9.7 to enable using the two aggregate function
as non-window function? i.e. enabling getting the first/last row
in single sql without using window features.
There is actually a C-extension for first()/last().
I am wondering if 9.7 would make them built-in function like max/min

Thanks

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Smith (#1)
Re: first_value/last_value

On Wednesday, May 18, 2016, Tom Smith <tomsmith1989sk@gmail.com> wrote:

Hello:

Is there a plan for 9.7 to enable using the two aggregate function
as non-window function? i.e. enabling getting the first/last row
in single sql without using window features.
There is actually a C-extension for first()/last().
I am wondering if 9.7 would make them built-in function like max/min

The "distinct on" capability provides this for at least some use cases.

Given this is first I've even seen a gripe on this, and the window versions
have existed for something like 6 years, I'd have to say this isn't high on
anyone's radar.

David J.

#4Tom Smith
tomsmith1989sk@gmail.com
In reply to: Adam Brusselback (#2)
Re: first_value/last_value

I need to use both in single sql.

On Wed, May 18, 2016 at 9:08 PM, Adam Brusselback <adambrusselback@gmail.com

Show quoted text

wrote:

Is there a reason you can't do that now with a limit 1/order by/union all?
Just have it ordered one way on the first query and the other on the
bottom. That will give you two rows that are the first / last in your set
based on whatever column you order on.
On May 18, 2016 8:47 PM, "Tom Smith" <tomsmith1989sk@gmail.com> wrote:

Hello:

Is there a plan for 9.7 to enable using the two aggregate function
as non-window function? i.e. enabling getting the first/last row
in single sql without using window features.
There is actually a C-extension for first()/last().
I am wondering if 9.7 would make them built-in function like max/min

Thanks

#5Adam Brusselback
adambrusselback@gmail.com
In reply to: Tom Smith (#4)
Re: first_value/last_value

Here is an example that works in a single query. Since you have two
different orders you want the data back in, you need to use subqueries to
get the proper data back, but it works, and is very fast.

CREATE TEMPORARY TABLE foo AS
SELECT generate_series as bar
FROM generate_series(1, 1000000);

CREATE INDEX idx_foo_bar ON foo (bar);

SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;

#6Melvin Davidson
melvin6925@gmail.com
In reply to: Adam Brusselback (#5)
Re: first_value/last_value

On Wed, May 18, 2016 at 10:36 PM, Adam Brusselback <
adambrusselback@gmail.com> wrote:

Here is an example that works in a single query. Since you have two
different orders you want the data back in, you need to use subqueries to
get the proper data back, but it works, and is very fast.

CREATE TEMPORARY TABLE foo AS
SELECT generate_series as bar
FROM generate_series(1, 1000000);

CREATE INDEX idx_foo_bar ON foo (bar);

SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;

Seems to me SELECT min(<column>), max(<column>) FROM deja.vu ; would also
work.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#7Tom Smith
tomsmith1989sk@gmail.com
In reply to: Melvin Davidson (#6)
Re: first_value/last_value

It would really save all the troubles for many people if postgresql has a
built-in first/last function along with sum/avg.
There is already a C extension and a wiki sample and implemented for
window function.
I am curious why these two functions were not added along their window
implementation counter part,
for completness and consistency

On Wed, May 18, 2016 at 10:42 PM, Melvin Davidson <melvin6925@gmail.com>
wrote:

Show quoted text

On Wed, May 18, 2016 at 10:36 PM, Adam Brusselback <
adambrusselback@gmail.com> wrote:

Here is an example that works in a single query. Since you have two
different orders you want the data back in, you need to use subqueries to
get the proper data back, but it works, and is very fast.

CREATE TEMPORARY TABLE foo AS
SELECT generate_series as bar
FROM generate_series(1, 1000000);

CREATE INDEX idx_foo_bar ON foo (bar);

SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;

Seems to me SELECT min(<column>), max(<column>) FROM deja.vu ; would also
work.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#8Matija Lesar
matija.lesar@gmail.com
In reply to: Tom Smith (#7)
Re: first_value/last_value

On 19 May 2016 at 05:04, Tom Smith <tomsmith1989sk@gmail.com> wrote:

It would really save all the troubles for many people if postgresql has a
built-in first/last function along with sum/avg.
There is already a C extension and a wiki sample and implemented for
window function.
I am curious why these two functions were not added along their window
implementation counter part,
for completness and consistency

On Wed, May 18, 2016 at 10:42 PM, Melvin Davidson <melvin6925@gmail.com>
wrote:

On Wed, May 18, 2016 at 10:36 PM, Adam Brusselback <
adambrusselback@gmail.com> wrote:

Here is an example that works in a single query. Since you have two
different orders you want the data back in, you need to use subqueries to
get the proper data back, but it works, and is very fast.

CREATE TEMPORARY TABLE foo AS
SELECT generate_series as bar
FROM generate_series(1, 1000000);

CREATE INDEX idx_foo_bar ON foo (bar);

SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;

Seems to me SELECT min(<column>), max(<column>) FROM deja.vu ; would
also work.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

You can always create your aggregate function for this.

Here is example for getting non null first and last value:

CREATE OR REPLACE FUNCTION firstval_sfunc(anyelement, anyelement)
returns anyelement
language sql
as $BODY$
SELECT CASE WHEN $1 is NULL THEN $2 ELSE $1 END;
$BODY$;

CREATE AGGREGATE myfirstval(anyelement) (
SFUNC = firstval_sfunc,
STYPE = anyelement
);

CREATE OR REPLACE FUNCTION lastval_sfunc(anyelement, anyelement)
returns anyelement
language sql
as $BODY$
SELECT CASE WHEN $2 is NULL THEN $1 ELSE $2 END;
$BODY$;

CREATE AGGREGATE mylastval(anyelement) (
SFUNC = lastval_sfunc,
STYPE = anyelement
);

Outputs:

select myfirstval(b), mylastval(b) from
unnest(array[3,2,null,12,-1]::int[]) b;
myfirstval | mylastval
------------+-----------
3 | -1

select myfirstval(b order by b), mylastval(b order by b) from
unnest(array[3,2,null,12,-1]::int[]) b;
myfirstval | mylastval
------------+-----------
-1 | 12

select myfirstval(b), mylastval(b) from generate_series(10,20000) as b;
myfirstval | mylastval
------------+-----------
10 | 20000

select myfirstval(b), mylastval(b) from
unnest(array['c','b','t','x']::text[]) b;
myfirstval | mylastval
------------+-----------
c | x

Bye,
Matija Lesar

#9Alexey Bashtanov
bashtanov@imap.cc
In reply to: Tom Smith (#7)
Re: first_value/last_value

Hello Tom,

I agree such functions are very useful, as they allow you to use
ordinary aggregation functions such as sum/max/avg
along with first/last ones (traditionally served by DISTINCT ON or
LIMIT) in the same group-by node
which improves performance and readability.

The first/last extension you mentioned
http://pgxn.org/dist/first_last_agg/ has a slight disadvantage:
it relies on postgres providing it sorted data if user asks for defined
ordering.
This makes first/last functions not supporting hashAgg, only groupAgg
which usually requires explicit sort.

If you find first_last_agg performance poor take a look at argm
extension http://pgxn.org/dist/argm/ .
It provides similar functionality but works faster as it does not sort
or make postgres core sort,
but only chooses the first row within each group.

As for the original question unfortunately the way from an extension to
postgres core is not too easy
and normally requires an extension to become popular and to be included
in postgres distribution as a contrib module first.

Regards,
Alexey Bashtanov

Show quoted text

On 19/05/16 04:04, Tom Smith wrote:

It would really save all the troubles for many people if postgresql
has a built-in first/last function along with sum/avg.
There is already a C extension and a wiki sample and implemented for
window function.
I am curious why these two functions were not added along their window
implementation counter part,
for completness and consistency

On Wed, May 18, 2016 at 10:42 PM, Melvin Davidson
<melvin6925@gmail.com <mailto:melvin6925@gmail.com>> wrote:

On Wed, May 18, 2016 at 10:36 PM, Adam Brusselback
<adambrusselback@gmail.com <mailto:adambrusselback@gmail.com>> wrote:

Here is an example that works in a single query. Since you
have two different orders you want the data back in, you need
to use subqueries to get the proper data back, but it works,
and is very fast.

CREATE TEMPORARY TABLE foo AS
SELECT generate_series as bar
FROM generate_series(1, 1000000);

CREATE INDEX idx_foo_bar ON foo (bar);

SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;

Seems to me SELECT min(<column>), max(<column>) FROM deja.vu ;
would also work.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Alexey Bashtanov (#9)
Re: first_value/last_value

On Thu, May 19, 2016 at 7:10 AM, Alexey Bashtanov <bashtanov@imap.cc> wrote:

As for the original question unfortunately the way from an extension to
postgres core is not too easy
and normally requires an extension to become popular and to be included in
postgres distribution as a contrib module first.

​Do you have a specific example of a patch being written for this feature
and not being accepted into core simply due to lack of popularity?
Generalizing here doesn't do anyone any favors - every patch and feature is
unique.

​I'll agree that somewhat marginal features are often passed on, especially
if there is any non-trivial bike-shedding, but I do suspect that if someone
took the time to write a patch for this and usher it through the commit
process it would stand a good chance of being accepted. I don't recall a
request for this feature recently (last couple of years) let alone someone
putting forth an actual patch. That the core committers have not chosen to
work on it is not in itself an indication of their opinion on whether the
feature is worth adding.

There are no coding standards for extensions - there are for core. One
cannot assume that a functioning extension can simply be dropped in.

David J.