Thread: Re: [SQL] Counting the number of distinct rows returned

Re: [SQL] Counting the number of distinct rows returned

From

"Sejin Oh"

Date:

19 August 1999, 09:58:58

yeah... GROUP BY should do..

try this:

SELECT COUNT(*) FROM sales_by_region GROUP BY user_id;

Regards,

-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Drew Whittle <drew@albatross.co.nz>
Cc: pgsql-sql@postgreSQL.org <pgsql-sql@postgreSQL.org>
Date: Thursday, August 19, 1999 9:35 AM
Subject: Re: [SQL] Counting the number of distinct rows returned


>Drew Whittle <drew@albatross.co.nz> writes:
>> How can I return the number of records (ie a count(user_id)) when using a
>> statement like this:
>
>> select distinct user_id from sales_by_region;
>
>count() with a GROUP BY will do what you want, I think --- see other
>thread with Doug Younger for an example.
>
>> I have tried things like:
>> select count(distinct user_id) from sales_by_region (parse error)
>
>Unfortunately we don't handle count(distinct ...) --- we should, since
>it's in SQL92.
>
> regards, tom lane
>
>************
>Check out "PostgreSQL Wearables" @ http://www.pgsql.com

Re: [SQL] Counting the number of distinct rows returned

From

Herouth Maoz

Date:

22 August 1999, 15:59:41

At 16:56 +0300 on 19/08/1999, Sejin Oh wrote:

> yeah... GROUP BY should do..
>
> try this:
>
> SELECT COUNT(*) FROM sales_by_region GROUP BY user_id;

No, no, it returns the exact opposite of what he wanted. Imagine a table
that has the values

fld
===
2
1
2
1
3

Doing select distinct on this table would return

fld
===
2
1
3

Therefore the number Drew needed is 3. But SELECT COUNT(*) with GROUP BY
would give:

fld  count
===  =====
2    2
1    2
3    1

That is, for each distinct value, it will give the frequency of that value
in the table. This is not the number of distinct values, it is the number
of times each distinct value appears.

I have once already posted a (kinda kludgy) solution for this in the
absence of COUNT DISTINCT in PostgreSQL:

SELECT count(*)
FROM the_table t1
WHERE int4( oid ) =
( SELECT max( int4(oid) ) FROM the_table t2 WHERE t1.fld = t2.fld );

The inner select gives you the maximal oid of the row that has a given
field value. The outer select selects only rows whose oid is this maximal
one. Thus it selects only one row for each distinct value, and so the count
is COUNT DISTINCT. If the original SELECT DISTINCT selected several fields,
more fields should be added to the WHERE clause of the subselect.

A more efficient solution would probably be:

SELECT DISTINCT( fld )
INTO TABLE tmp_tbl
FROM the_table;

SELECT COUNT( * ) -- This is the query that returns the needed result
FROM tmp_tbl;

DROP TABLE tmp_tbl;

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma