Thread: Unique - first
I have a table (x,y,z) - I'd like to take the rows with unique x values - but, when more than one row have the same x value, I want the one with the minimal z value. How can I do that? I can imagine doing it with window functions, but also that regular SQL should be able to do it too.
Robert James wrote on 27.10.2013 14:04: > I have a table (x,y,z) - I'd like to take the rows with unique x > values - but, when more than one row have the same x value, I want the > one with the minimal z value. > > How can I do that? I can imagine doing it with window functions, but > also that regular SQL should be able to do it too. > > Window functions *are* "regular" SQL ;) select x,y,z from ( select x,y,z, min(y) over (partition by x) as min_y from the_table ) t where y = min_y; Instead of min() you could also use row_number() or dense_rank() to find the minimum value. A solution without window functions could be something like: select t1.x, t1.y, t1.z from table t1 join (select t2.x, min(t2.y) as min_y from the_table t2 group by t2.x ) mt on mt.x = t1.x and mt.min_y = t1.y; But I'm pretty sure the solution with the window function will perform better.
Thomas Kellerer <spam_eater@gmx.net> writes: > Robert James wrote on 27.10.2013 14:04: >> I have a table (x,y,z) - I'd like to take the rows with unique x >> values - but, when more than one row have the same x value, I want the >> one with the minimal z value. >> >> How can I do that? I can imagine doing it with window functions, but >> also that regular SQL should be able to do it too. > Window functions *are* "regular" SQL ;) Indeed. The only other easy way I know of involves SELECT DISTINCT ON (see the "weather reports" example in the SELECT reference page); but that is most definitely not standard SQL, it is a Postgres-ism. > A solution without window functions could be something like: > select t1.x, t1.y, t1.z > from table t1 > join (select t2.x, min(t2.y) as min_y > from the_table t2 > group by t2.x > ) mt on mt.x = t1.x and mt.min_y = t1.y; Note that this doesn't work unless x and y form a primary key, else you get multiple join rows (or no join rows, if one is NULL). In any case, it's unlikely to be fast. I think I've seen some even more esoteric solutions that use only SQL-92-era features, but lack of caffeine prevents me from recalling them. In any case, there's a good reason why we invented SELECT DISTINCT ON: this is just not easy to do in minimal SQL. regards, tom lane
On 27/10/2013 13:04, Robert James wrote: > I have a table (x,y,z) - I'd like to take the rows with unique x > values - but, when more than one row have the same x value, I want the > one with the minimal z value. > > How can I do that? I can imagine doing it with window functions, but > also that regular SQL should be able to do it too. My personal favourite approach, assuming you can rely on Postgres 9.0 or higher, is to define a first() aggregate as shown here: http://wiki.postgresql.org/wiki/First/last_%28aggregate%29 Once created, this can be used with the order_by_clause of the aggregate expression as shown here: http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES (That's why it requires 9.0, since earlier versions had no such clause). So in your case, you could run SELECT x, first(y order by z) as y_with_lowest_z FROM xyzzy GROUP BY x I find this a lot easier to understand than window functions; I've no idea how its performance compares. -- Rowan Collins [IMSoP]
On Sun, Oct 27, 2013 at 2:04 PM, Robert James <srobertjames@gmail.com> wrote:
I have a table (x,y,z) - I'd like to take the rows with unique x
values - but, when more than one row have the same x value, I want the
one with the minimal z value.
You can use distinct on (which is a Postgresql extension to the SQL standard):
select distinct on(x) x, y, z
from the_table
order by x, z
Regards
Marcin Mańk