Re: Need magical advice for counting NOTHING - Mailing list pgsql-sql

From Andreas
Subject Re: Need magical advice for counting NOTHING
Date
Msg-id 4A6B6F26.9040303@gmx.net
Whole thread Raw
In response to Re: Need magical advice for counting NOTHING  (nha <lyondif02@free.fr>)
List pgsql-sql
nha schrieb:
> Hello,
>
> Le 23/07/09 10:23, Glenn Maynard a écrit :
>   
>> On Thu, Jul 23, 2009 at 1:01 AM, Andreas<maps.on@gmx.net> wrote:
>>     
>>> SELECT user_name, log_type_fk, COUNT(log_type_fk)
>>> FROM log
>>> JOIN users ON (user_id = user_fk)
>>> WHERE (ts IS BETWEEN  sometime  AND   another)
>>> GROUP BY user_name, log_type_fk
>>> ORDER BY user_name, log_type_fk
>>>       
>> [...] 
>> SELECT user_name, log_type.log_type, sum((log_type_fk IS NOT
>> NULL)::integer) AS count
>> FROM users JOIN log_type ON (true) LEFT JOIN log ON (user_id = user_fk
>> AND log.log_type_fk = log_type.log_type)
>> GROUP BY user_name, log_type.log_type
>> ORDER BY user_name, log_type.log_type;
>> [...]
>>     
>
> In the same direction as Andreas Krestchmer and Glenn Maynard (thanks to
> the latter for DDL assumed statements), the following query should also
> suit:
>
> SELECT user_name, log_type, COUNT(log_type_fk)
> FROM (users CROSS JOIN log_type)
> LEFT JOIN log ON (user_id = user_fk AND log_type_id = log_type_fk)
> WHERE (ts IS BETWEEN sometime AND another)
> GROUP BY user_name, log_type
> ORDER BY user_name, log_type
>   
I tried it and found it looses the COUNT() = 0 lines because of the date 
filtering at this position.
ts is a columns of log. The tricky lines are those log_types that aren't 
in log so (ts IS NULL) here and NULL can't be compared to timestamps so 
the interesting lines get filtered out of the LEFT JOIN's result and 
can't be counted.

The folowing aproach works AFAIK.

SELECT user_name, log_type_id, COUNT(log_type_fk)
FROM (users CROSS JOIN log_type)
LEFT JOIN
(    SELECT user_fk, log_type_fk    FROM log    WHERE (ts BETWEEN sometime AND another)
) AS x
ON (user_id = user_fk AND log_type_id = log_type_fk)
GROUP BY user_name, log_type_id
ORDER BY user_name, log_type_id

This subselect gives a really considerable speed up, too.

> While it is formally assumed that user_id and log_type_id are
> respectively keys for users and log_type tables, it is semantically
> admitted here that user_name identifies user_id in users table and
> log_type identifies log_type_id in log_type table.
Actually I didn't consider this key issue, yet.   :}
But, as the result gets stuffed in the crosstab function, the formally strict key log_type_id would do as column 2 as
well,or maybe better being a number instead of a varchar. 
 
I'll even change the first column to user_id and wrap another JOIN users around so I don't get messed up by users with
thesame name.
 


Thank you and Andreas Krestchmer and Glenn Maynard for giving me a new view on things.
:)



pgsql-sql by date:

Previous
From: nha
Date:
Subject: Re: Need magical advice for counting NOTHING
Next
From: Glenn Maynard
Date:
Subject: Re: Bit by "commands ignored until end of transaction block" again