Thread: Two optimization questions
Hi, I think it can be done better than I did and I want to learn... 1. I have a table that registers the history of messages: output_message_history(id, event_type, event_time) I need those ID-s from the table where there is one 'MESSAGE SENT' event and one 'MESSAGE SUBMITTED' event and there are no more events on that message. select id from output_message_history group by content_id having sum(case when event_type='MESSAGE SENT' then 1 else 0 end) = 1 and sum(case when event_type='MESSAGE SUBMITTED' then 1 else 0 end) = 1 and count(*) = 2 Can it be done another, more optimal way? 2. I have to create reports like this one: time count 12:00 3 12:01 2 12:02 4 12:03 0 <--- now this one is problematic for me 12:04 5 So I need something like this: select date_trunc('minute', crd), count(*) from subscriber where crd between '2006-09-08' and '2006-09-12' group by date_trunc('minute', crd) But the output of this query won't show minutes with 0 count. I searched the archives and found an example using a view to solve this problem, but creating such a view for grouping by minutes or seconds doesn't seem to be nice solution. Any ideas how to solve this problem? Thanks for any answers, Zizi
On 9/12/06, Mezei Zoltán <mezei.zoltan@telefor.hu> wrote:
You could split it into sub-queries but would that make the performance better or worse? I guess it depends on how much data is there, and what frequency you have ot the event_type's but indexing the event_type column would help. This may be worth a try - use EXPLAIN to see which is better.
SELECT id
FROM output_message_history
WHERE NOT content_id IN (
SELECT content_id -- distinct(content_id)
FROM output_message_history
WHERE NOT event_type IN ('MESSAGE SENT', 'MESSAGE SUBMITTED')
)
GROUP BY content_id
HAVING
sum(case when event_type='MESSAGE SENT' then 1 else 0 end) = 1
AND
sum(case when event_type='MESSAGE SUBMITTED' then 1 else 0 end) = 1
Questions like this come up frequently and there are some nice solutions:
See if this gives you some insight:
select
'2006-01-15'::date + s.inc
from
generate_series(0, ('2006-02-20'::date - '2006-01-15'::date)::integer) as s(inc)
See http://www.postgresql.org/docs/8.1/static/functions-srf.html and http://www.postgresql.org/docs/8.0/interactive/functions-datetime.html for more.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================
Hi,
I think it can be done better than I did and I want to learn...
1. I have a table that registers the history of messages:
output_message_history(id, event_type, event_time)
I need those ID-s from the table where there is one 'MESSAGE SENT' event
and one 'MESSAGE SUBMITTED' event and there are no more events on that
message.
select id
from output_message_history
group by content_id
having
sum(case when event_type='MESSAGE SENT' then 1 else 0 end) = 1
and
sum(case when event_type='MESSAGE SUBMITTED' then 1 else 0 end) = 1
and count(*) = 2
Can it be done another, more optimal way?
You could split it into sub-queries but would that make the performance better or worse? I guess it depends on how much data is there, and what frequency you have ot the event_type's but indexing the event_type column would help. This may be worth a try - use EXPLAIN to see which is better.
SELECT id
FROM output_message_history
WHERE NOT content_id IN (
SELECT content_id -- distinct(content_id)
FROM output_message_history
WHERE NOT event_type IN ('MESSAGE SENT', 'MESSAGE SUBMITTED')
)
GROUP BY content_id
HAVING
sum(case when event_type='MESSAGE SENT' then 1 else 0 end) = 1
AND
sum(case when event_type='MESSAGE SUBMITTED' then 1 else 0 end) = 1
2. I have to create reports like this one:
time count
12:00 3
12:01 2
12:02 4
12:03 0 <--- now this one is problematic for me
12:04 5
So I need something like this:
select date_trunc('minute', crd), count(*) from subscriber
where crd between '2006-09-08' and '2006-09-12'
group by date_trunc('minute', crd)
But the output of this query won't show minutes with 0 count. I searched
the archives and found an example using a view to solve this problem,
but creating such a view for grouping by minutes or seconds doesn't seem
to be nice solution. Any ideas how to solve this problem?
See if this gives you some insight:
select
'2006-01-15'::date + s.inc
from
generate_series(0, ('2006-02-20'::date - '2006-01-15'::date)::integer) as s(inc)
See http://www.postgresql.org/docs/8.1/static/functions-srf.html and http://www.postgresql.org/docs/8.0/interactive/functions-datetime.html for more.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================
Aaron Bono wrote: > You could split it into sub-queries but would that make the > performance better or worse? I guess it depends on how much data is > there, and what frequency you have ot the event_type's but indexing > the event_type column would help. This may be worth a try - use > EXPLAIN to see which is better. > Yep, I tought of that, but the solution with having is faster now. I think it is because about 30% of the rows are selected in the subquery now. Later we will have more records, and the subselect should select only about 1-2% of the rows. At that time, I think your solution will be faster. > Questions like this come up frequently and there are some nice solutions: > > See if this gives you some insight: > > select > '2006-01-15'::date + s.inc > from > generate_series(0, ('2006-02-20'::date - > '2006-01-15'::date)::integer) as s(inc) Sup, that's just what I needed. Thank you. Zizi