Thread: Aggregating by unique values

Aggregating by unique values

From

Lee Hachadoorian

Date:

14 December 2010, 01:11:55

Hello,

I'm trying to count customers who have received services by ZIP code,
but I want to count each customer only once even though customers may
have received services on multiple dates, and therefore appear in the
table multiple times. There *is* a separate customers table, but because
of dirty data, I cannot rely on it.

The best I can come up with is:

SELECT   zip, count(*) AS count_serviced
FROM   (SELECT DISTINCT zip, id FROM customer_service_date) a
GROUP BY   zip
;

The table (with some irrelevant fields dropped) is:

CREATE TABLE customer_service_date
( id integer, address character varying, city character varying, state character varying, zip character varying,
service_datedate
 
)
;

The table is missing a primary key field, but it would be (id,
service_date) if it had one.

Any suggestions to improve this?

Thanks,
--Lee

-- 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

Re: Aggregating by unique values

From

Filip Rembiałkowski

Date:

14 December 2010, 04:23:12

try<br /><br />select zip, count(distinct id) from customer_service_date group by zip;<br /><br /><div
class="gmail_quote">2010/12/14Lee Hachadoorian <span dir="ltr"><<a
href="mailto:lee.hachadoorian@gmail.com">lee.hachadoorian@gmail.com</a>></span><br/><blockquote class="gmail_quote"
style="margin:0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">Hello,<br /><br /> I'm
tryingto count customers who have received services by ZIP code,<br /> but I want to count each customer only once even
thoughcustomers may<br /> have received services on multiple dates, and therefore appear in the<br /> table multiple
times.There *is* a separate customers table, but because<br /> of dirty data, I cannot rely on it.<br /><br /> The best
Ican come up with is:<br /><br /> SELECT<br />    zip, count(*) AS count_serviced<br /> FROM<br />    (SELECT DISTINCT
zip,id FROM customer_service_date) a<br /> GROUP BY<br />    zip<br /> ;<br /><br /> The table (with some irrelevant
fieldsdropped) is:<br /><br /> CREATE TABLE customer_service_date<br /> (<br />  id integer,<br />  address character
varying,<br/>  city character varying,<br />  state character varying,<br />  zip character varying,<br />
 service_datedate<br /> )<br /> ;<br /><br /> The table is missing a primary key field, but it would be (id,<br />
service_date)if it had one.<br /><br /> Any suggestions to improve this?<br /><br /> Thanks,<br /> --Lee<br /><br />
--<br/> Lee Hachadoorian<br /> PhD Student, Geography<br /> Program in Earth & Environmental Sciences<br /> CUNY
GraduateCenter<br /><font color="#888888"><br /><br /> --<br /> Sent via pgsql-sql mailing list (<a
href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a>)<br/> To make changes to your subscription:<br /><a
href="http://www.postgresql.org/mailpref/pgsql-sql"target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br
/></font></blockquote></div><br/>

Re: Aggregating by unique values

From

Lee Hachadoorian

Date:

14 December 2010, 13:31:12

Perfect. Thanks, --Lee<br /><br /> On 12/14/2010 03:23 AM, Filip Rembiałkowski wrote: <blockquote
cite="mid:AANLkTinwDzmxMG92_YeG+9+3=J_z1+vwU2DkfspQCia4@mail.gmail.com"type="cite">try<br /><br /> select zip,
count(distinctid) from customer_service_date group by zip;<br /><br /><div class="gmail_quote">2010/12/14 Lee
Hachadoorian<span dir="ltr"><<a href="mailto:lee.hachadoorian@gmail.com"
moz-do-not-send="true">lee.hachadoorian@gmail.com</a>></span><br/><blockquote class="gmail_quote" style="margin: 0pt
0pt0pt         0.8ex; border-left: 1px solid rgb(204, 204, 204);         padding-left: 1ex;">Hello,<br /><br /> I'm
tryingto count customers who have received services by ZIP code,<br /> but I want to count each customer only once even
thoughcustomers may<br /> have received services on multiple dates, and therefore appear in the<br /> table multiple
times.There *is* a separate customers table, but because<br /> of dirty data, I cannot rely on it.<br /><br /> The best
Ican come up with is:<br /><br /> SELECT<br />    zip, count(*) AS count_serviced<br /> FROM<br />    (SELECT DISTINCT
zip,id FROM customer_service_date) a<br /> GROUP BY<br />    zip<br /> ;<br /><br /> The table (with some irrelevant
fieldsdropped) is:<br /><br /> CREATE TABLE customer_service_date<br /> (<br />  id integer,<br />  address character
varying,<br/>  city character varying,<br />  state character varying,<br />  zip character varying,<br />
 service_datedate<br /> )<br /> ;<br /><br /> The table is missing a primary key field, but it would be (id,<br />
service_date)if it had one.<br /><br /> Any suggestions to improve this?<br /><br /> Thanks,<br /> --Lee<br /><br />
--<br/> Lee Hachadoorian<br /> PhD Student, Geography<br /> Program in Earth & Environmental Sciences<br /> CUNY
GraduateCenter<br /><font color="#888888"><br /><br /> --<br /> Sent via pgsql-sql mailing list (<a
href="mailto:pgsql-sql@postgresql.org"moz-do-not-send="true">pgsql-sql@postgresql.org</a>)<br /> To make changes to
yoursubscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-sql" moz-do-not-send="true"
target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br/></font></blockquote></div><br /></blockquote><br
/><preclass="moz-signature" cols="72">-- 
 
Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center
</pre>