Thread: Top n queries and GROUP BY
All, This is a straight SQL question, maybe not appropriate for a performance list, but... I have a simple stock holdings setup: => select * from t1; nam | co | num -----+-----------+------ joe | ibm | 600 abe | ibm | 1500 joe | cisco | 1200 abe | cisco | 800 joe | novell | 500 joe | microsoft | 200 What I would like to see is a Top-n-holdings-by-name", e.g, for n=2: nam | co | num ----------+--------+----- joe | cisco | 1200 joe | ibm | 600 abe | ibm | 1500 abe | cisco | 800 I can get part of the way by using a LIMIT clause in a subquery, e.g, => select 'abe', a.co, a.num from (select co, num from t1 where nam='abe' order by num desc limit 2) as a; ?column? | co | num ----------+-------+------ abe | ibm | 1500 abe | cisco | 800 but I can't figure out a correlated subquery (or GROUP BY arrangement or anything else) that will cycle through the names. I vaguely remember that these kinds or queries are hard to do in standard SQL, but I was hoping that PG, with its extensions... Thanks, Rich Cullingford rculling@sysd.com
Rich Cullingford wrote: > All, > This is a straight SQL question, maybe not appropriate for a performance > list, but... > > I have a simple stock holdings setup: > > => select * from t1; > nam | co | num > -----+-----------+------ > joe | ibm | 600 > abe | ibm | 1500 > joe | cisco | 1200 > abe | cisco | 800 > joe | novell | 500 > joe | microsoft | 200 > > What I would like to see is a Top-n-holdings-by-name", e.g, for n=2: > > nam | co | num > ----------+--------+----- > joe | cisco | 1200 > joe | ibm | 600 > abe | ibm | 1500 > abe | cisco | 800 > > I can get part of the way by using a LIMIT clause in a subquery, e.g, > > => select 'abe', a.co, a.num from (select co, num from t1 where > nam='abe' order by num desc limit 2) as a; > ?column? | co | num > ----------+-------+------ > abe | ibm | 1500 > abe | cisco | 800 > > but I can't figure out a correlated subquery (or GROUP BY arrangement or > anything else) that will cycle through the names. I vaguely remember > that these kinds or queries are hard to do in standard SQL, but I was > hoping that PG, with its extensions... I forgot about row subqueries; for n=3, for example: => SELECT * FROM t1 WHERE (nam,co,num) IN (SELECT nam,co,num FROM t1 b where b.nam=t1.nam order by num desc limit 3) order by nam, num desc; nam | co | num -----+--------+------ abe | ibm | 1500 abe | cisco | 800 joe | cisco | 1200 joe | ibm | 600 joe | novell | 500 (5 rows) Seems to work... Thanks all, Rich Cullingford rculling@sysd.com
In article <3FB8F98D.1010707@sysd.com>, Rich Cullingford <rculling@sysd.com> writes: > All, > This is a straight SQL question, maybe not appropriate for a performance > list, but... > I have a simple stock holdings setup: > => select * from t1; > nam | co | num > -----+-----------+------ > joe | ibm | 600 > abe | ibm | 1500 > joe | cisco | 1200 > abe | cisco | 800 > joe | novell | 500 > joe | microsoft | 200 > What I would like to see is a Top-n-holdings-by-name", e.g, for n=2: > nam | co | num > ----------+--------+----- > joe | cisco | 1200 > joe | ibm | 600 > abe | ibm | 1500 > abe | cisco | 800 > I can get part of the way by using a LIMIT clause in a subquery, e.g, > => select 'abe', a.co, a.num from (select co, num from t1 where > nam='abe' order by num desc limit 2) as a; > ?column? | co | num > ----------+-------+------ > abe | ibm | 1500 > abe | cisco | 800 > but I can't figure out a correlated subquery (or GROUP BY arrangement or > anything else) that will cycle through the names. I vaguely remember > that these kinds or queries are hard to do in standard SQL, but I was > hoping that PG, with its extensions... How about an outer join? SELECT x1.nam, x1.co, x1.num FROM t1 x1 LEFT JOIN t1 x2 ON x2.nam = x1.nam AND x2.num > x1.num GROUP BY x1.nam, x1.co, x1.num HAVING count(*) < 2 ORDER BY x1.nam, x1.num DESC