On Fri, Dec 16, 2005 at 03:42:34PM -0000, Michael Farewell wrote:
> I need to write a query which returns each company together with the
> highest product_count and its associated product type so the result should
> look like this:
There are a few ways to do this, a couple of which were mentioned
recently in another thread. One is to use PostgreSQL's non-standard
DISTINCT ON:
SELECT DISTINCT ON (company_name) b_id, company_name, product_count, product_type
FROM foo
ORDER BY company_name, product_count DESC;
Here's another way that uses standard SQL syntax:
SELECT b_id, company_name, product_count, product_type
FROM foo AS a
WHERE NOT EXISTS ( SELECT 1 FROM foo AS b WHERE b.company_name = a.company_name AND b.product_count >
a.product_count
)
ORDER BY company_name;
These queries aren't quite the same, however: the first will return
exactly one row per company, while the second will return multiple
rows per company if multiple rows have the maximum product count.
For example, suppose you have this data, where two rows for company 1
have that company's maximum product count:
b_id | company_name | product_count | product_type
------+--------------+---------------+-------------- 1 | company 1 | 1 | a 1 | company 1 |
2 | b 1 | company 1 | 2 | c 2 | company 2 | 3 | d 2 | company 2 |
4 | e
The first query (using DISTINCT ON) will return the following,
picking the row for company 1 based on whichever row among the
maximums is ordered first (which you can control with ORDER BY;
otherwise the row chosen will be indeterminate):
b_id | company_name | product_count | product_type
------+--------------+---------------+-------------- 1 | company 1 | 2 | b 2 | company 2 |
4 | e
The second query will return both of the maximum rows for company 1:
b_id | company_name | product_count | product_type
------+--------------+---------------+-------------- 1 | company 1 | 2 | b 1 | company 1 |
2 | c 2 | company 2 | 4 | e
--
Michael Fuhr