Thread: counting groups

counting groups

From
Holger Klawitter
Date:
Hi there,

is there any way to count groups in Postgres ? The following
(simplified) statement should return the number of attributes
which come up more than once.

SELECT count(distinct attr)
FROM table
WHERE condition
GROUP BY attr
HAVING count(*)>1

But it returns just the right number of 1's.
There must be a nicer way than selection INTO a temporary table
and counting from there ...

Regards,
Mit freundlichem Gruß,
    Holger Klawitter
--
Holger Klawitter                                    +49 (0)251 484 0637
holger@klawitter.de                            http://www.klawitter.de/

Re: counting groups

From
Holger Klawitter
Date:
Hi I found the solution myself,

but perhaps someone is interested as well.

SELECT count(*)
FROM table
WHERE attr IN (
    SELECT attr
    FROM table
    WHERE condition
    GROUP BY attr
    HAVING count(*)>1
);

--
Regards,
Mit freundlichem Gruß,
    Holger Klawitter
--
Holger Klawitter                                    +49 (0)251 484 0637
holger@klawitter.de                            http://www.klawitter.de/


Re: counting groups

From
Tom Lane
Date:
Holger Klawitter <holger@klawitter.de> writes:
> is there any way to count groups in Postgres ? The following
> (simplified) statement should return the number of attributes
> which come up more than once.

> SELECT count(distinct attr)
> FROM table
> WHERE condition
> GROUP BY attr
> HAVING count(*)>1

> But it returns just the right number of 1's.

As it should, since the count() in the SELECT list is evaluated over
each group, and there naturally is going to be only one distinct
value of attr in each group.  A single level of SQL query only does
one pass of grouping and aggregation.

In 7.1 it'll be possible to solve this sort of problem by using
table subqueries:

SELECT count(*) FROM
(SELECT attr
 FROM table
 WHERE condition
 GROUP BY attr
 HAVING count(*)>1) tab;

which is a little cleaner and a lot more efficient than a WHERE-IN
kind of solution.  Doesn't help you much today (unless you're willing to
run development-tip code), but something to keep in mind for later.

BTW, in your proposed workaround

SELECT count(*)
FROM table
WHERE attr IN (
        SELECT attr
        FROM table
        WHERE condition
        GROUP BY attr
        HAVING count(*)>1
);

I think you still need "SELECT count(distinct attr)" at the outer level.
As is, you'll get the total number of appearances of the attr values
that appear multiple times.

            regards, tom lane