Re: two count columns? - Mailing list pgsql-sql

From Greg Stark
Subject Re: two count columns?
Date
Msg-id 87mzgp1oc0.fsf@stark.xeocode.com
Whole thread Raw
In response to two count columns?  (Jan Danielsson <jan.danielsson@gmail.com>)
List pgsql-sql
Jan Danielsson <jan.danielsson@gmail.com> writes:

> select from_ip, count(from_ip) as entries, count(select * from log where
> ...) as tot_entries, max(ts)::timestamp(0) as last_access from log where
> to_port=22 and direction='in' group by from_ip

select from_ip,       count(from_ip) as entries,       (select count(*)          from log as l        where l.from_ip =
log.from_ip     ) as tot_entries,      max(ts) as last_access from logwhere to_port=22  and direction='ingroup by
from_ip

expect it to be pretty slow though. For every from_ip it has to look up every
other entry with that from_ip.

> Thankful for any hints or tips.

There is a trick you could use to make it faster but it gets cumbersome and
pretty tricky to use when you're doing more than one thing at a time:

select from_ip      sum(case when to_port=22 and direction='in' then 1 else 0 end) as entries,      count(*) as
tot_entries,     max(case when to_port=22 and direction='in' then ts::timestamp(0) else null end) as last_access from
loggroupby from_iphaving entries > 0
 


Note that in either case you might want to look at ANALYZE results for the
query and try raising work_mem for this query using SET until you see the plan
using a hash aggregate. If it can use a hash aggregate for your query (more
likely for the first query than the second) without swapping it'll be faster
than sorting.

-- 
greg



pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: two count columns?
Next
From: "Nalin Bakshi"
Date:
Subject: Postgres for Dummies - a new request