Thread: MIN() performance regression 8.0 -> 8.1

MIN() performance regression 8.0 -> 8.1

From
Paul Lindner
Date:
Hi,

I believe I've found a performance regression between 8.0 and 8.1 for
some cases.  For some queries it appears that the old MIN() ran much
faster than the new optimized MIN().

The following set of query plans illustrates the problem.  (schema
names changed to protect the innocent...)

Basically it appears that the planner favors a Filter instead of an
Index Cond unless I significantly increase the cpu_tuple_cost.  Also
note that the filter adds a IS NOT NULL condition for a column already
defined as IS NOT NULL.

[[Note: Results were obtained by running three hours worth of SELECT traffic through an 8.0 and 8.1 instance, then
analyzingthe before and after for speedups and regressions, scripts available on request.]] 


*** The table:

typepad_20051127=# \d mm_moo_summary;              Table "public.moo_summary"       Column         |           Type
     | Modifiers 
-----------------------+--------------------------+-----------moo_summary_id      | integer                  | not
nullmoo_summary_b_id   | integer                  |moo_summary_set_id  | integer                  |moo_summary_hits
|integer                  |moo_summary_date    | timestamp with time zone | not null 
Indexes:   "mm_moo_summary_pkey" PRIMARY KEY, btree (moo_summary_id)   "mm_moo_summary_b_id" btree (moo_summary_b_id)
"mm_moo_summary_date"btree (moo_summary_date)   "mm_moo_summary_set_id" btree (moo_summary_set_id) 


*** 8.0 Query Plans:

explain SELECT MIN(moo_summary_date) FROM mm_moo_summary WHERE (moo_summary_b_id = '215222')
Aggregate  (cost=5730.96..5730.96 rows=1 width=8)  ->  Index Scan using mm_moo_summary_b_id on mm_moo_summary
(cost=0.00..5725.62rows=2137 width=8)        Index Cond: (moo_summary_b_id = 215222) 

explain SELECT moo_summary_date FROM mm_moo_summary WHERE (moo_summary_b_id = '215222') order by moo_summary_date limit
1
Limit  (cost=0.00..195.96 rows=1 width=8)  ->  Index Scan using mm_moo_summary_date on mm_moo_summary
(cost=0.00..418775.56rows=2137 width=8)        Filter: (moo_summary_b_id = 21522) 

explain SELECT moo_summary_date FROM mm_moo_summary WHERE (moo_summary_b_id = '215222') order by moo_summary_date
Sort  (cost=5843.81..5849.15 rows=2137 width=8)  Sort Key: moo_summary_date  ->  Index Scan using mm_moo_summary_b_id
onmm_moo_summary  (cost=0.00..5725.62 rows=2137 width=8)        Index Cond: (moo_summary_b_id = 21522) 


*** 8.1 Query Plan:

explain SELECT MIN(moo_summary_date) FROM mm_moo_summary WHERE (moo_summary_b_id = '215222')
Result  (cost=1988.44..1988.45 rows=1 width=0)  InitPlan    ->  Limit  (cost=0.00..1988.44 rows=1 width=8)          ->
IndexScan using mm_moo_summary_date on mm_moo_summary  (cost=0.00..4239343.73 rows=2132 width=8)                Filter:
((moo_summary_b_id= 215222) AND (moo_summary_date IS NOT NULL)) 

typepad_20051127=# set cpu_tuple_cost = .55;

SET cpu_tuple_cost = .55;
EXPLAIN SELECT MIN(stats_summary_date) from mm_moo_summary where moo_summary_b_id = 21522;
                 QUERY PLAN                                                   
-------------------------------------------------------------------------------------------------------------Aggregate
(cost=5524.70..5525.25rows=1 width=8)  ->  Index Scan using mm_moo_summary_blog_id on mm_moo_summary
(cost=0.00..5519.37rows=2132 width=8)        Index Cond: (moo_summary_b_id = 21522) 
(3 rows)


--
Paul Lindner        ||||| | | | |  |  |  |   |   |
lindner@inuus.com

Re: MIN() performance regression 8.0 -> 8.1

From
Tom Lane
Date:
Paul Lindner <lindner@inuus.com> writes:
> I believe I've found a performance regression between 8.0 and 8.1 for
> some cases.  For some queries it appears that the old MIN() ran much
> faster than the new optimized MIN().

When you are complaining about planner mistakes, showing EXPLAIN rather
than EXPLAIN ANALYZE output is an entirely ineffective way of making
your point.

I suspect the problem is that there is a correlation between the
moo_summary_date and moo_summary_b_id columns, which the planner
will not realize --- is that the case?
        regards, tom lane