Sub-optimal plan chosen - Mailing list pgsql-performance

From bricklen
Subject Sub-optimal plan chosen
Date
Msg-id 33b743250909100734wfc96a8ct9d56379c3ae0d289@mail.gmail.com
Whole thread Raw
Responses Re: Sub-optimal plan chosen  (tv@fuzzy.cz)
Re: Sub-optimal plan chosen  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
In the following query, We are seeing a sub-optimal plan being chosen. The following results are after running the query several times (after each change).

dev1=# select version();
                                                  version
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)



dev1=# EXPLAIN ANALYZE SELECT SUM (revenue) as revenue FROM statsdaily WHERE ofid = 38 AND date >= '2009-09-01' AND date <= '2999-01-01';
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=11796.19..11796.20 rows=1 width=8) (actual time=28.598..28.599 rows=1 loops=1)
   ->  Index Scan using statsdaily_unique_idx on statsdaily  (cost=0.00..11783.65 rows=5017 width=8) (actual time=0.043..25.374 rows=3125 loops=1)
         Index Cond: ((date >= '2009-09-01'::date) AND (date <= '2999-01-01'::date) AND (ofid = 38))
 Total runtime: 28.650 ms


dev1=# set enable_indexscan to off;


dev1=# EXPLAIN ANALYZE SELECT SUM (revenue) as revenue FROM statsdaily WHERE ofid = '38' AND date >= '2009-09-01' AND date <= '2999-01-01';
                                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=13153.47..13153.48 rows=1 width=8) (actual time=7.746..7.747 rows=1 loops=1)
   ->  Bitmap Heap Scan on statsdaily  (cost=3622.22..13140.92 rows=5017 width=8) (actual time=0.941..4.865 rows=3125 loops=1)
         Recheck Cond: ((ofid = 38) AND (date >= '2009-09-01'::date))
         Filter: (date <= '2999-01-01'::date)
         ->  Bitmap Index Scan on statsdaily_ofid_sept2009_idx  (cost=0.00..3620.97 rows=5046 width=0) (actual time=0.551..0.551 rows=3125 loops=1)
               Index Cond: (ofid = 38)
 Total runtime: 7.775 ms


default_statistics_target = 100 (tried with 500, no change). Vacuum analyzed before initial query, and after each change to default_statistics_target.


The same query, with a different "ofid", will occasionally get the more optimal plan -- I assume that the distribution of data is the differentiator there.

Is there any other data I can provide to shed some light on this?

Thanks!

pgsql-performance by date:

Previous
From: Kevin Kempter
Date:
Subject: Re: partitioning max() sql not using index
Next
From: tv@fuzzy.cz
Date:
Subject: Re: Sub-optimal plan chosen