Re: Index isn't used during a join. - Mailing list pgsql-performance

From Robert Creager
Subject Re: Index isn't used during a join.
Date
Msg-id 20060110221055.053596ab@thunder.logicalchaos.org
Whole thread Raw
In response to Re: Index isn't used during a join.  (Michael Fuhr <mike@fuhr.org>)
Responses Re: Index isn't used during a join.
List pgsql-performance
Ok, I'm back, and in a little better shape.

The query is now correct, but still is slow because of lack of index usage.  I don't know how to structure the query
correctlyto use the index. 

Taken individually:

weather=# explain analyze select * from doy_agg where doy = extract( doy from now() );
                                                               QUERY PLAN
                

----------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=13750.67..13750.71 rows=2 width=20) (actual time=123.134..123.135 rows=1 loops=1)
   ->  Bitmap Heap Scan on readings  (cost=25.87..13720.96 rows=3962 width=20) (actual time=6.384..116.559 rows=4175
loops=1)
         Recheck Cond: (date_part('doy'::text, "when") = date_part('doy'::text, now()))
         ->  Bitmap Index Scan on readings_doy_index  (cost=0.00..25.87 rows=3962 width=0) (actual time=5.282..5.282
rows=4215loops=1) 
               Index Cond: (date_part('doy'::text, "when") = date_part('doy'::text, now()))
 Total runtime: 123.366 ms

produces the data:

weather=# select * from doy_agg where doy = extract( doy from now() );
 doy |  avg_windspeed   | max_windspeed
-----+------------------+---------------
  10 | 8.53403056583666 |            59

and:

weather=# EXPLAIN ANALYZE
weather-# SELECT *,
weather-# unmunge_time( time_group ) AS time
weather-# FROM minute."windspeed"
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# ORDER BY time_group;
                                                                   QUERY PLAN
                         

-------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=595.33..595.77 rows=176 width=28) (actual time=4.762..4.828 rows=283 loops=1)
   Sort Key: time_group
   ->  Bitmap Heap Scan on windspeed  (cost=2.62..588.76 rows=176 width=28) (actual time=0.901..3.834 rows=283 loops=1)
         Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
         ->  Bitmap Index Scan on minute_windspeed_unmunge_index  (cost=0.00..2.62 rows=176 width=0) (actual
time=0.745..0.745rows=284 loops=1) 
               Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
 Total runtime: 5.108 ms

produces:

 time_group |    min_reading    | max_reading |    avg_reading    |        time
------------+-------------------+-------------+-------------------+---------------------
 1136869500 |               0.8 |           6 |  2.62193548387097 | 2006-01-09 22:05:00
 1136869800 |                 0 |           3 | 0.406021505376343 | 2006-01-09 22:10:00
 1136870100 |                 0 |           5 |              1.68 | 2006-01-09 22:15:00
...

But I want the composite of the two queries, and I'm stuck on:

weather=# EXPLAIN ANALYZE
weather-# SELECT *,
weather-# unmunge_time( time_group ) AS time
weather-# FROM minute."windspeed"
weather-# JOIN doy_agg ON( EXTRACT( doy FROM unmunge_time( time_group ) ) = doy )
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# ORDER BY time_group;
                                                                         QUERY PLAN
                                     

-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=153627.67..153628.48 rows=322 width=48) (actual time=10637.681..10637.748 rows=286 loops=1)
   Sort Key: windspeed.time_group
   ->  Merge Join  (cost=153604.82..153614.26 rows=322 width=48) (actual time=10633.375..10636.728 rows=286 loops=1)
         Merge Cond: ("outer"."?column5?" = "inner".doy)
         ->  Sort  (cost=594.89..595.33 rows=176 width=28) (actual time=5.539..5.612 rows=286 loops=1)
               Sort Key: date_part('doy'::text, unmunge_time(windspeed.time_group))
               ->  Bitmap Heap Scan on windspeed  (cost=2.62..588.32 rows=176 width=28) (actual time=0.918..4.637
rows=286loops=1) 
                     Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
                     ->  Bitmap Index Scan on minute_windspeed_unmunge_index  (cost=0.00..2.62 rows=176 width=0)
(actualtime=0.739..0.739 rows=287 loops=1) 
                           Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
         ->  Sort  (cost=153009.93..153010.84 rows=366 width=20) (actual time=10627.699..10627.788 rows=295 loops=1)
               Sort Key: doy_agg.doy
               ->  HashAggregate  (cost=152984.28..152990.69 rows=366 width=20) (actual time=10625.649..10626.601
rows=366loops=1) 
                     ->  Seq Scan on readings  (cost=0.00..145364.93 rows=1015914 width=20) (actual
time=0.079..8901.123rows=1015917 loops=1) 
 Total runtime: 10638.298 ms

Where:

weather=# \d doy_agg
            View "public.doy_agg"
    Column     |       Type       | Modifiers
---------------+------------------+-----------
 doy           | double precision |
 avg_windspeed | double precision |
 max_windspeed | integer          |
View definition:
 SELECT doy_readings.doy, avg(doy_readings.windspeedaverage1) AS avg_windspeed, max(doy_readings.windspeedmax1) AS
max_windspeed
   FROM ONLY doy_readings
  GROUP BY doy_readings.doy;

which I don't want because of the full scan on readings.

I can easily do the two queries seperately in the script utilizing this data, but want to do it in the db itself.  I
figureI'm just not seeing how to combine the two queries effectively. 

Thoughts?

Thanks,
Rob

--
 22:08:50 up 3 days, 14:35,  9 users,  load average: 2.71, 2.48, 2.51
Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006

Attachment

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Left Join Performance vs Inner Join Performance
Next
From: Ron
Date:
Subject: Re: help tuning queries on large database