Home > mailing lists

Re: How to make this CTE also print rows with 0 as count? - Mailing list pgsql-sql

From	David Johnston
Subject	Re: How to make this CTE also print rows with 0 as count?
Date	October 6, 2012 17:31:04
Msg-id	00fa01cda3e8$4c86df60$e5949e20$@yahoo.com Whole thread Raw
In response to	How to make this CTE also print rows with 0 as count? (air <mojaveranger7@gmail.com>)
List	pgsql-sql

Tree view

> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-
> owner@postgresql.org] On Behalf Of air
> Sent: Saturday, October 06, 2012 8:48 AM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] How to make this CTE also print rows with 0 as count?
> 
> I have a CTE based query, to which I pass about 2600 4-tuple
> latitude/longitude values using joins - these latitude longitude 4-tuples
have
> been ID tagged and held in a second table called coordinates. These top
left
> and bottom right latitude / longitude values are passed into the CTE in
order
> to display the amount of requests (hourly) made within those coordinates
> for given two timestamps).- I am able to get the total requests per day
within
> the timestamps given, that is, the total count of user requests on every
> specified day. (E.g. user opts to see every Wednesday or Wednesday AND
> Thursday etc. - between hours 11:55 and 22:04 between dates January 1 and
> 31, 2012 for every latitude/longitude 4-tuples I pass.) But I cannot view
the
> rows with count 0. My query is as below:
> 
> 
> 
> WITH v AS (
>    SELECT '2012-01-1 11:55:11'::timestamp AS _from
>          ,'2012-01-31 22:02:21'::timestamp AS _to
>    )
> , q AS (
>    SELECT c.coordinates_id
>         , date_trunc('hour', t.calltime) AS stamp
>         , count(*) AS zcount
>    FROM   v
>    JOIN   mytable t ON  t.calltime BETWEEN v._from AND v._to
>                    AND (t.calltime::time >= v._from::time AND
>                         t.calltime::time <= v._to::time) AND (extract(DOW
from
> t.calltime) = 3)
>    JOIN   coordinates c ON (t.lat, t.lon)
>                    BETWEEN (c.bottomrightlat, c.topleftlon)
>                        AND (c.topleftlat, c.bottomrightlon)
>    GROUP BY c.coordinates_id, date_trunc('hour', t.calltime)
>    )
> , cal AS (
>    SELECT generate_series('2011-2-2 00:00:00'::timestamp
>                         , '2012-4-1 05:00:00'::timestamp
>                         , '1 hour'::interval) AS stamp
>    FROM v
>    )
> SELECT q.coordinates_id, cal.stamp::date, sum(q.zcount) AS zcount
> FROM   v, cal
> LEFT   JOIN q USING (stamp)
> WHERE  extract(hour from cal.stamp) >= extract(hour from v._from)
> AND    extract(hour from cal.stamp) <= extract(hour from v._to)
> AND    extract(DOW from cal.stamp) = 3
> AND    cal.stamp >= v._from
> AND    cal.stamp <= v._to
> GROUP  BY q.coordinates_id, cal.stamp::date ORDER  BY q.coordinates_id,
> stamp;
> 
> 
> 
> 
> The output I get when I execute this query is basically like this
(normally I
> have about 10354 rows returned excluding the rows with 0 zcount, just
> providing two coordinates for sake of similarity):
> 
> coordinates_id  | stamp      | zcount
> 1               ;"2012-01-04";      2
> 1               ;"2012-01-11";      3
> 1               ;"2012-01-18";      2
> 2               ;"2012-01-04";      2
> 2               ;"2012-01-11";      3
> 2               ;"2012-01-18";      2
> 
> 
> 
> 
> However, it should be like this where all rows with zcount 0 should also
be
> printed out along with rows that have nonzero zcounts -E.g. January 25
with
> zcount 0 for the two coordinates with ID 1 and 2 should also be printed in
this
> small portion of example-:
> 
> coordinates_id  | stamp      | zcount
> 1               ;"2012-01-04";      2
> 1               ;"2012-01-11";      3
> 1               ;"2012-01-18";      2
> 1               ;"2012-01-25";      0
> 2               ;"2012-01-04";      2
> 2               ;"2012-01-11";      3
> 2               ;"2012-01-18";      2
> 2               ;"2012-01-25";      0
> 
> 
> 
> 
> How can I achieve this? Thanks in advance.
> 
> 

Food for thought, generally when you want "everything including the zeros"
you want to build the "master" set without any values, build out the "values
only" dataset, then LEFT JOIN them and use COALESCE to generate values for
the missing data.

So:

SELECT id, stamp, COALESCE(datavalues.zcount, 0) AS zcount
FROM (cal CROSS JOIN id_master) master
LEFT JOIN datavalues USING (id, stamp)

Also, the mixing of multiple FROM relations and JOINs is confusing.  In
particular is the fact the JOIN takes precedence over the "," in FROM

"A JOIN clause combines two FROM items. Use parentheses if necessary to
determine the order of nesting. In the absence of parentheses, JOINs nest
left-to-right. In any case JOIN binds more tightly than the commas
separating FROM items."

http://www.postgresql.org/docs/9.2/interactive/sql-select.html

Your query is equivalent to:

SELECT ... 
FROM v CROSS JOIN (cal LEFT JOIN q USING stamp)
WHERE ...

Anyway, the "create master, left join data, coalesce" methodology is one
that I find to be easy to understand and implement.

HTH,

David J.

pgsql-sql by date:

From: air
Date: 06 October 2012, 12:56:05
Subject: Re: Calling the CTE for multiple inputs

From: Gavin Flower
Date: 08 October 2012, 11:36:31
Subject: Re: checking the gaps in intervals

Re: How to make this CTE also print rows with 0 as count? - Mailing list pgsql-sql

Previous

Next