Thread: DISTINCT & COUNT

DISTINCT & COUNT

From
"Moray McConnachie"
Date:
I'm sure there is an easy answer to this, but it's Friday evening
here...

I have table A, with field Cid, and table C, whose primary key is Cid.
I have table B which contains a subset of the IDs of table A, Aid.

I need to produce a *count* of items in C for which certain criteria
in A are true, among them that only items in both B and A can be
considered (equijoin B to A).

At the moment, I am doing

CREATE TABLE C (Cid int4 primary key,cname text);
INSERT INTO C VALUES (1,'I am 1');
INSERT INTO C VALUES (2,'I am 2');
INSERT INTO C VALUES (3,'I am 3');

CREATE TABLE A (Aid int4 primary key,somefield int4,otherfield
int4,Cid int4 not null);
INSERT INTO A VALUES(1,1,2,1);
INSERT INTO A VALUES(2,1,3,1);
INSERT INTO A VALUES(3,1,2,1);
INSERT INTO A VALUES(4,1,2,2);
INSERT INTO A VALUES(5,1,3,2);
INSERT INTO A VALUES(6,1,2,3);
INSERT INTO A VALUES(7,1,2,3);

CREATE TABLE B (Aid int4,Bid int4 primary key);
INSERT INTO B VALUES (1,1);
INSERT INTO B VALUES (2,2);
INSERT INTO B VALUES (3,3);
INSERT INTO B VALUES (4,4);
INSERT INTO B VALUES (5,5);

SELECT Count(Cid) FROM C WHERE EXISTS(SELECT A.Aid FROM A,B WHERE
A.Cid=C.Cid AND B.Aid=A.Aid AND A.somefield=1 AND A.otherfield=2);

This gives the correct answer, i.e.2.

There must be something more elegant (and faster), but my experiments
with COUNT, WHERE and DISTINCT were not a success.

Thanks,
Moray

----------------------------------------------------------------------
----------------
Moray.McConnachie@computing-services.oxford.ac.uk



Re: [SQL] DISTINCT & COUNT

From
Stuart Rison
Date:
OK, I got this one:

SELECT count(*) 
FROM a,b,c
WHERE a.aid=b.aid --equijoin table A and B on aid (only aids in A and B)
AND a.somefield=1 and a.otherfield=2 --apply your select on A 
AND c.cid=a.cid; -- equijoin C and A on cid

Unfortunately you get a count of 3...

Why?  Because there are three rows in table A that satisfy your
requirements although they represent only two cid's.

The solution?

SELECT COUNT(DISTINCT c.cid)
FROM a,b,c
WHERE a.aid=b.aid
AND a.somefield=1 and a.otherfield=2
AND c.cid=a.cid;  

Unfortunately, although legal SQL92, this is currently not supported by
postgreSQL!

You can however do:

SELECT DISTINCT c.cid
FROM a,b,c
WHERE a.aid=b.aid
AND a.somefield=1 and a.otherfield=2
AND c.cid=a.cid;

and if your are using a interface (e.g. DBI::DBD::Pg) then you can get
the number of tuples returned...

not very satisfactory but the only alternative I can think of is the one
you propose using EXISTS.

HTH,

S.

On Fri, 26 Nov 1999, Moray McConnachie wrote:

> I'm sure there is an easy answer to this, but it's Friday evening
> here...
> 
> I have table A, with field Cid, and table C, whose primary key is Cid.
> I have table B which contains a subset of the IDs of table A, Aid.
> 
> I need to produce a *count* of items in C for which certain criteria
> in A are true, among them that only items in both B and A can be
> considered (equijoin B to A).
> 
> At the moment, I am doing
> 
> CREATE TABLE C (Cid int4 primary key,cname text);
> INSERT INTO C VALUES (1,'I am 1');
> INSERT INTO C VALUES (2,'I am 2');
> INSERT INTO C VALUES (3,'I am 3');
> 
> CREATE TABLE A (Aid int4 primary key,somefield int4,otherfield
> int4,Cid int4 not null);
> INSERT INTO A VALUES(1,1,2,1);
> INSERT INTO A VALUES(2,1,3,1);
> INSERT INTO A VALUES(3,1,2,1);
> INSERT INTO A VALUES(4,1,2,2);
> INSERT INTO A VALUES(5,1,3,2);
> INSERT INTO A VALUES(6,1,2,3);
> INSERT INTO A VALUES(7,1,2,3);
> 
> CREATE TABLE B (Aid int4,Bid int4 primary key);
> INSERT INTO B VALUES (1,1);
> INSERT INTO B VALUES (2,2);
> INSERT INTO B VALUES (3,3);
> INSERT INTO B VALUES (4,4);
> INSERT INTO B VALUES (5,5);
> 
> SELECT Count(Cid) FROM C WHERE EXISTS(SELECT A.Aid FROM A,B WHERE
> A.Cid=C.Cid AND B.Aid=A.Aid AND A.somefield=1 AND A.otherfield=2);
> 
> This gives the correct answer, i.e.2.
> 
> There must be something more elegant (and faster), but my experiments
> with COUNT, WHERE and DISTINCT were not a success.
> 
> Thanks,
> Moray
> 
> ----------------------------------------------------------------------
> ----------------
> Moray.McConnachie@computing-services.oxford.ac.uk
> 
> 
> ************
> 

Stuart C. G. Rison
Department of Biochemistry and Molecular Biology
6th floor, Darwin Building, University College London (UCL)
Gower Street, London, WC1E 6BT, United Kingdom
Tel. 0207 504 2303, Fax. 0207 380 7193
e-mail: rison@biochem.ucl.ac.uk



Re: [SQL] DISTINCT & COUNT

From
Tom Lane
Date:
Stuart Rison <rison@biochemistry.ucl.ac.uk> writes:
> You can however do:

> SELECT DISTINCT c.cid
> FROM a,b,c
> WHERE a.aid=b.aid
> AND a.somefield=1 and a.otherfield=2
> AND c.cid=a.cid;

> and if your are using a interface (e.g. DBI::DBD::Pg) then you can get
> the number of tuples returned...

Or do the SELECT into a temp table, followed by fetching count(*) on the
temp table.  Ugly, but avoids transferring what might be a lot of rows
to the frontend.

DISTINCT within aggregate functions is on the TODO list, but I don't
know when it will happen.
        regards, tom lane