Unique - first

Started by Robert Jamesover 12 years ago5 messagesgeneral

srobertjames@gmail.com

over 12 years ago

I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.

How can I do that? I can imagine doing it with window functions, but
also that regular SQL should be able to do it too.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Thomas Kellerer

spam_eater@gmx.net

over 12 years ago

In reply to: Robert James (#1)

Re: Unique - first

Robert James wrote on 27.10.2013 14:04:

I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.

How can I do that? I can imagine doing it with window functions, but
also that regular SQL should be able to do it too.

Window functions *are* "regular" SQL ;)

select x,y,z
from (
select x,y,z,
min(y) over (partition by x) as min_y
from the_table
) t
where y = min_y;

Instead of min() you could also use row_number() or dense_rank() to find the minimum value.

A solution without window functions could be something like:

select t1.x, t1.y, t1.z
from table t1
join (select t2.x, min(t2.y) as min_y
from the_table t2
group by t2.x
) mt on mt.x = t1.x and mt.min_y = t1.y;

But I'm pretty sure the solution with the window function will perform better.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Marcin Mańk

marcin@maniek.info

over 12 years ago

In reply to: Robert James (#1)

Re: Unique - first

On Sun, Oct 27, 2013 at 2:04 PM, Robert James <srobertjames@gmail.com>
wrote:

I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.

You can use distinct on (which is a Postgresql extension to the SQL
standard):

select distinct on(x) x, y, z
from the_table
order by x, z

Regards
Marcin Mańk

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Thomas Kellerer (#2)

Re: Unique - first

Thomas Kellerer <spam_eater@gmx.net> writes:

Robert James wrote on 27.10.2013 14:04:

I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.

How can I do that? I can imagine doing it with window functions, but
also that regular SQL should be able to do it too.

Window functions *are* "regular" SQL ;)

Indeed. The only other easy way I know of involves SELECT DISTINCT ON
(see the "weather reports" example in the SELECT reference page); but
that is most definitely not standard SQL, it is a Postgres-ism.

A solution without window functions could be something like:

select t1.x, t1.y, t1.z
from table t1
join (select t2.x, min(t2.y) as min_y
from the_table t2
group by t2.x
) mt on mt.x = t1.x and mt.min_y = t1.y;

Note that this doesn't work unless x and y form a primary key, else you
get multiple join rows (or no join rows, if one is NULL). In any case,
it's unlikely to be fast.

I think I've seen some even more esoteric solutions that use only
SQL-92-era features, but lack of caffeine prevents me from recalling them.
In any case, there's a good reason why we invented SELECT DISTINCT ON:
this is just not easy to do in minimal SQL.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Rowan Collins

rowan.collins@gmail.com

over 12 years ago

In reply to: Robert James (#1)

Re: Unique - first

On 27/10/2013 13:04, Robert James wrote:

I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.

How can I do that? I can imagine doing it with window functions, but
also that regular SQL should be able to do it too.

My personal favourite approach, assuming you can rely on Postgres 9.0 or
higher, is to define a first() aggregate as shown here:
http://wiki.postgresql.org/wiki/First/last_%28aggregate%29

Once created, this can be used with the order_by_clause of the aggregate
expression as shown here:
http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES
(That's why it requires 9.0, since earlier versions had no such clause).

So in your case, you could run

SELECT
x,
first(y order by z) as y_with_lowest_z
FROM
xyzzy
GROUP BY
x

I find this a lot easier to understand than window functions; I've no
idea how its performance compares.

--
Rowan Collins
[IMSoP]

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general