Basically, I have one table A with a set of start times and durations, and
another table B with just a set of times. What I want is a result that, for
each time in table B, counts the number of intervals in table A that contain it.
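In other words, the logic I'm after is equivalent to this small Python sketch
(the table names and data are made up for illustration):

```python
from datetime import datetime, timedelta

# Stand-ins for the two tables: "internetusage" rows are
# (starttime, duration in seconds); "times" is a list of minute marks.
usage = [
    (datetime(2001, 7, 1, 10, 0), 900),   # 10:00 for 15 minutes
    (datetime(2001, 7, 1, 10, 10), 600),  # 10:10 for 10 minutes
]
minutes = [datetime(2001, 7, 1, 10, m) for m in (5, 12, 30)]

# For each time in B, count the intervals in A that contain it --
# the same condition the SQL expresses with BETWEEN.
counts = {
    t: sum(1 for start, dur in usage
           if start <= t <= start + timedelta(seconds=dur))
    for t in minutes
}
# 10:05 falls in one interval, 10:12 in two, 10:30 in none.
```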
The query itself is not difficult, but the data set is large. The plan looks
like this:
Aggregate  (cost=17365.57..17671.74 rows=3062 width=28)
  ->  Group  (cost=17365.57..17595.20 rows=30618 width=28)
        ->  Sort  (cost=17365.57..17365.57 rows=30618 width=28)
              ->  Nested Loop  (cost=0.00..14718.41 rows=30618 width=28)
                    ->  Seq Scan on internetusage  (cost=0.00..596.04 rows=191 width=16)
                    ->  Seq Scan on times  (cost=0.00..23.40 rows=1440 width=12)
The problem is that it runs out of memory (after using around 500MB). I
imagine this is because it has to materialize the entire join result in the
sort step. Can someone tell me why it doesn't use a plan like this one?
Aggregate
  ->  Group
        ->  Merge Join
              ->  Sort
                    ->  Seq Scan on internetusage
              ->  Sort
                    ->  Seq Scan on times
This would use a fraction of the memory, and in this case it would also be
faster, since it could start producing output sooner. Using LIMIT 1 has no
effect. The version is 7.0.2.
If later versions fix this, let me know. Workarounds would be appreciated
as well. The actual query follows:
select date_trunc('day',starttime) + minute as date,
count(iuid)
from times, internetusage
where date_trunc('month',starttime) = '2001-07-01'
and minute between starttime::time
and starttime::time + (duration || 'seconds')::interval
group by date, starttime, minute;
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.