Re: Index isn't used during a join. - Mailing list pgsql-performance

From Robert Creager
Subject Re: Index isn't used during a join.
Whole thread Raw
In response to Re: Index isn't used during a join.  (Michael Fuhr)
Responses Re: Index isn't used during a join.  (Robert Creager)
Re: Index isn't used during a join.  (Tom Lane)
List pgsql-performance
When grilled further on (Wed, 11 Jan 2006 00:56:55 -0700),
Michael Fuhr <> confessed:

> On Tue, Jan 10, 2006 at 10:10:55PM -0700, Robert Creager wrote:
> > The query is now correct, but still is slow because of lack of
> > index usage.  I don't know how to structure the query correctly to
> > use the index.
> Have you tried adding restrictions on doy in the WHERE clause?
> Something like this, I think:

I cannot.  That's what I thought I would get from the join.  The query shown will always have two days involved, and
onlygrows from there.  The data is graphed at, and I'm looking at adding
historicaldata to the graphs. 

Opps, never mind.  You hit the nail on the head:

weather-# SELECT *, unmunge_time( time_group ) AS time,
weather-# EXTRACT( doy FROM unmunge_time( time_group ) )
weather-# FROM minute."windspeed"
weather-# JOIN doy_agg ON( EXTRACT( doy FROM unmunge_time( time_group ) ) = doy )
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# AND doy BETWEEN EXTRACT( doy FROM now() - '24 hour'::interval)
weather-#             AND EXTRACT( doy FROM now() )
weather-# ORDER BY time_group;
                                                                                                   QUERY PLAN

 Sort  (cost=21914.09..21914.10 rows=1 width=48) (actual time=76.595..76.662 rows=286 loops=1)
   Sort Key: windspeed.time_group
   ->  Hash Join  (cost=21648.19..21914.08 rows=1 width=48) (actual time=64.656..75.562 rows=286 loops=1)
         Hash Cond: (date_part('doy'::text, unmunge_time("outer".time_group)) = "inner".doy)
         ->  Bitmap Heap Scan on windspeed  (cost=2.27..267.40 rows=74 width=28) (actual time=0.585..1.111 rows=286
               Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
               ->  Bitmap Index Scan on minute_windspeed_unmunge_index  (cost=0.00..2.27 rows=74 width=0) (actual
time=0.566..0.566rows=287 loops=1) 
                     Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
         ->  Hash  (cost=21645.92..21645.92 rows=3 width=20) (actual time=63.849..63.849 rows=2 loops=1)
               ->  HashAggregate  (cost=21645.84..21645.89 rows=3 width=20) (actual time=63.832..63.834 rows=2 loops=1)
                     ->  Bitmap Heap Scan on readings  (cost=59.21..21596.85 rows=6532 width=20) (actual
time=15.174..53.249rows=7613 loops=1) 
                           Recheck Cond: ((date_part('doy'::text, "when") >= date_part('doy'::text, (now() -
'24:00:00'::interval)))AND (date_part('doy'::text, "when") <= date_part('doy'::text, now()))) 
                           ->  Bitmap Index Scan on readings_doy_index  (cost=0.00..59.21 rows=6532 width=0) (actual
time=12.509..12.509rows=10530 loops=1) 
                                 Index Cond: ((date_part('doy'::text, "when") >= date_part('doy'::text, (now() -
'24:00:00'::interval)))AND (date_part('doy'::text, "when") <= date_part('doy'::text, now()))) 
 Total runtime: 77.177 ms

What I had thought is that PG would (could?) be smart enough to realize that one query was restricted, and apply that
restrictionto the other based on the join.  I know it works in other cases (using indexes on both tables using the

> Something else occurred to me: do you (or will you) have more than
> one year of data?  If so then matching on doy could be problematic
> unless you also check for the year, or unless you want to match
> more than one year.

Yes and yes.  I'm doing both aggregate by day of the year for all data, and aggregate by day of year within each year.
Theexamples are: 

weather=# select * from doy_agg where doy = extract( doy from now() );
 doy |  avg_windspeed   | max_windspeed
  11 | 6.14058239764748 |            69
(1 row)

weather=# select * from doy_day_agg where extract( doy from day ) = extract( doy from now() );
         day         |  avg_windspeed   | max_windspeed
 2004-01-11 00:00:00 | 5.03991313397539 |            17
 2006-01-11 00:00:00 |  18.532050716667 |            69
 2005-01-11 00:00:00 |  3.6106763448041 |            13

Thanks for your help Michael.


 07:07:30 up 3 days, 23:34,  9 users,  load average: 2.29, 2.44, 2.43
Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006


pgsql-performance by date:

From: Josh Berkus
Subject: Re: [ADMIN] Assimilation of these "versus" and hardware threads
From: Robert Creager
Subject: Re: Index isn't used during a join.