Re: Index isn't used during a join. - Mailing list pgsql-performance
From | Robert Creager |
---|---|
Subject | Re: Index isn't used during a join. |
Date | |
Msg-id | 20060111072659.7c1772ad@thunder.logicalchaos.org Whole thread Raw |
In response to | Re: Index isn't used during a join. (Michael Fuhr <mike@fuhr.org>) |
Responses |
Re: Index isn't used during a join.
(Robert Creager <Robert_Creager@LogicalChaos.org>)
Re: Index isn't used during a join. (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-performance |
When grilled further on (Wed, 11 Jan 2006 00:56:55 -0700), Michael Fuhr <mike@fuhr.org> confessed: > On Tue, Jan 10, 2006 at 10:10:55PM -0700, Robert Creager wrote: > > The query is now correct, but still is slow because of lack of > > index usage. I don't know how to structure the query correctly to > > use the index. > > Have you tried adding restrictions on doy in the WHERE clause? > Something like this, I think: I cannot. That's what I thought I would get from the join. The query shown will always have two days involved, and onlygrows from there. The data is graphed at http://www.logicalchaos.org/weather/index.html, and I'm looking at adding historicaldata to the graphs. Opps, never mind. You hit the nail on the head: weather-# SELECT *, unmunge_time( time_group ) AS time, weather-# EXTRACT( doy FROM unmunge_time( time_group ) ) weather-# FROM minute."windspeed" weather-# JOIN doy_agg ON( EXTRACT( doy FROM unmunge_time( time_group ) ) = doy ) weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval ) weather-# AND doy BETWEEN EXTRACT( doy FROM now() - '24 hour'::interval) weather-# AND EXTRACT( doy FROM now() ) weather-# ORDER BY time_group; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=21914.09..21914.10 rows=1 width=48) (actual time=76.595..76.662 rows=286 loops=1) Sort Key: windspeed.time_group -> Hash Join (cost=21648.19..21914.08 rows=1 width=48) (actual time=64.656..75.562 rows=286 loops=1) Hash Cond: (date_part('doy'::text, unmunge_time("outer".time_group)) = "inner".doy) -> Bitmap Heap Scan on windspeed (cost=2.27..267.40 rows=74 width=28) (actual time=0.585..1.111 rows=286 loops=1) Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval)) -> Bitmap Index Scan on minute_windspeed_unmunge_index (cost=0.00..2.27 rows=74 width=0) (actual time=0.566..0.566rows=287 loops=1) Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval)) -> Hash (cost=21645.92..21645.92 rows=3 width=20) (actual time=63.849..63.849 rows=2 loops=1) -> HashAggregate (cost=21645.84..21645.89 rows=3 width=20) (actual time=63.832..63.834 rows=2 loops=1) -> Bitmap Heap Scan on readings (cost=59.21..21596.85 rows=6532 width=20) (actual time=15.174..53.249rows=7613 loops=1) Recheck Cond: ((date_part('doy'::text, "when") >= date_part('doy'::text, (now() - '24:00:00'::interval)))AND (date_part('doy'::text, "when") <= date_part('doy'::text, now()))) -> Bitmap Index Scan on readings_doy_index (cost=0.00..59.21 rows=6532 width=0) (actual time=12.509..12.509rows=10530 loops=1) Index Cond: ((date_part('doy'::text, "when") >= date_part('doy'::text, (now() - '24:00:00'::interval)))AND (date_part('doy'::text, "when") <= date_part('doy'::text, now()))) Total runtime: 77.177 ms What I had thought is that PG would (could?) be smart enough to realize that one query was restricted, and apply that restrictionto the other based on the join. I know it works in other cases (using indexes on both tables using the join)... > > Something else occurred to me: do you (or will you) have more than > one year of data? If so then matching on doy could be problematic > unless you also check for the year, or unless you want to match > more than one year. Yes and yes. I'm doing both aggregate by day of the year for all data, and aggregate by day of year within each year. Theexamples are: weather=# select * from doy_agg where doy = extract( doy from now() ); doy | avg_windspeed | max_windspeed -----+------------------+--------------- 11 | 6.14058239764748 | 69 (1 row) weather=# select * from doy_day_agg where extract( doy from day ) = extract( doy from now() ); day | avg_windspeed | max_windspeed ---------------------+------------------+--------------- 2004-01-11 00:00:00 | 5.03991313397539 | 17 2006-01-11 00:00:00 | 18.532050716667 | 69 2005-01-11 00:00:00 | 3.6106763448041 | 13 Thanks for your help Michael. Cheers, Rob -- 07:07:30 up 3 days, 23:34, 9 users, load average: 2.29, 2.44, 2.43 Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006
Attachment
pgsql-performance by date: