*very* slow query to summarize data for a month ... - Mailing list pgsql-performance
From | Marc G. Fournier |
---|---|
Subject | *very* slow query to summarize data for a month ... |
Date | |
Msg-id | 20031110160144.C727@ganymede.hub.org Whole thread Raw |
Responses |
Re: *very* slow query to summarize data for a month ...
Re: *very* slow query to summarize data for a month ... Re: *very* slow query to summarize data for a month ... Re: *very* slow query to summarize data for a month ... |
List | pgsql-performance |
Table structure is simple: CREATE TABLE traffic_logs ( company_id bigint, ip_id bigint, port integer, bytes bigint, runtime timestamp without time zone ); runtime is 'day of month' ... I need to summarize the month, per company, with a query as: explain analyze SELECT ts.company_id, company_name, SUM(ts.bytes) AS total_traffic FROM company c, traffic_logs ts WHERE c.company_id = ts.company_id AND month_trunc(ts.runtime) = '2003-10-01' GROUP BY company_name,ts.company_id; and the explain looks like: QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=32000.94..32083.07 rows=821 width=41) (actual time=32983.36..47586.17 rows=144 loops=1) -> Group (cost=32000.94..32062.54 rows=8213 width=41) (actual time=32957.40..42817.88 rows=462198 loops=1) -> Sort (cost=32000.94..32021.47 rows=8213 width=41) (actual time=32957.38..36261.31 rows=462198 loops=1) Sort Key: c.company_name, ts.company_id -> Merge Join (cost=31321.45..31466.92 rows=8213 width=41) (actual time=13983.07..22642.14 rows=462198 loops=1) Merge Cond: ("outer".company_id = "inner".company_id) -> Sort (cost=24.41..25.29 rows=352 width=25) (actual time=5.52..7.40 rows=348 loops=1) Sort Key: c.company_id -> Seq Scan on company c (cost=0.00..9.52 rows=352 width=25) (actual time=0.02..2.78 rows=352loops=1) -> Sort (cost=31297.04..31317.57 rows=8213 width=16) (actual time=13977.49..16794.41 rows=462198 loops=1) Sort Key: ts.company_id -> Index Scan using tl_month on traffic_logs ts (cost=0.00..30763.02 rows=8213 width=16) (actualtime=0.29..5562.25 rows=462198 loops=1) Index Cond: (month_trunc(runtime) = '2003-10-01 00:00:00'::timestamp without time zone) Total runtime: 47587.82 msec (14 rows) the problem is that we're only taking a few months worth of data, so I don't think there is much of a way of 'improve performance' on this, but figured I'd ask quickly before I do something rash ... Note that without the month_trunc() index, the Total runtime more then doubles: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=39578.63..39660.76 rows=821 width=41) (actual time=87805.47..101251.35 rows=144 loops=1) -> Group (cost=39578.63..39640.23 rows=8213 width=41) (actual time=87779.56..96824.56 rows=462198 loops=1) -> Sort (cost=39578.63..39599.17 rows=8213 width=41) (actual time=87779.52..90781.48 rows=462198 loops=1) Sort Key: c.company_name, ts.company_id -> Merge Join (cost=38899.14..39044.62 rows=8213 width=41) (actual time=64073.98..72783.68 rows=462198 loops=1) Merge Cond: ("outer".company_id = "inner".company_id) -> Sort (cost=24.41..25.29 rows=352 width=25) (actual time=64.66..66.55 rows=348 loops=1) Sort Key: c.company_id -> Seq Scan on company c (cost=0.00..9.52 rows=352 width=25) (actual time=1.76..61.70 rows=352loops=1) -> Sort (cost=38874.73..38895.27 rows=8213 width=16) (actual time=64009.26..66860.71 rows=462198 loops=1) Sort Key: ts.company_id -> Seq Scan on traffic_logs ts (cost=0.00..38340.72 rows=8213 width=16) (actual time=5.02..-645982.04rows=462198 loops=1) Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without timezone) Total runtime: 101277.17 msec (14 rows)
pgsql-performance by date: