Re: Top n queries and GROUP BY - Mailing list pgsql-performance

From Harald Fuchs
Subject Re: Top n queries and GROUP BY
Date
Msg-id pusmkl627a.fsf@srv.protecting.net
Whole thread Raw
In response to Top n queries and GROUP BY  (Rich Cullingford <rculling@sysd.com>)
List pgsql-performance
In article <3FB8F98D.1010707@sysd.com>,
Rich Cullingford <rculling@sysd.com> writes:

> All,
> This is a straight SQL question, maybe not appropriate for a performance
> list, but...

> I have a simple stock holdings setup:

> => select * from t1;
>   nam |    co     | num
> -----+-----------+------
>   joe | ibm       |  600
>   abe | ibm       | 1500
>   joe | cisco     | 1200
>   abe | cisco     |  800
>   joe | novell    |  500
>   joe | microsoft |  200

> What I would like to see is a Top-n-holdings-by-name", e.g, for n=2:

>   nam      |  co    | num
> ----------+--------+-----
>   joe      | cisco  |  1200
>   joe      | ibm    |   600
>   abe      | ibm    |  1500
>   abe      | cisco  |   800

> I can get part of the way by using a LIMIT clause in a subquery, e.g,

> => select 'abe', a.co, a.num from (select co, num from t1 where
> nam='abe' order by num desc limit 2) as a;
>   ?column? |  co   | num
> ----------+-------+------
>   abe      | ibm   | 1500
>   abe      | cisco |  800

> but I can't figure out a correlated subquery (or GROUP BY arrangement or
> anything else) that will cycle through the names. I vaguely remember
> that these kinds or queries are hard to do in standard SQL, but I was
> hoping that PG, with its extensions...

How about an outer join?

  SELECT x1.nam, x1.co, x1.num
  FROM t1 x1
  LEFT JOIN t1 x2 ON x2.nam = x1.nam AND x2.num > x1.num
  GROUP BY x1.nam, x1.co, x1.num
  HAVING count(*) < 2
  ORDER BY x1.nam, x1.num DESC

pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: duration logging setting in 7.4
Next
From: Laurent Martelli
Date:
Subject: Re: Join on incompatible types