MIN() performance regression 8.0 -> 8.1 - Mailing list pgsql-hackers

From Paul Lindner
Subject MIN() performance regression 8.0 -> 8.1
Date
Msg-id 20051204155715.GC10317@inuus.com
Whole thread Raw
Responses Re: MIN() performance regression 8.0 -> 8.1
List pgsql-hackers
Hi,

I believe I've found a performance regression between 8.0 and 8.1 for
some cases.  For some queries it appears that the old MIN() ran much
faster than the new optimized MIN().

The following set of query plans illustrates the problem.  (schema
names changed to protect the innocent...)

Basically it appears that the planner favors a Filter instead of an
Index Cond unless I significantly increase the cpu_tuple_cost.  Also
note that the filter adds a IS NOT NULL condition for a column already
defined as IS NOT NULL.

[[Note: Results were obtained by running three hours worth of SELECT traffic through an 8.0 and 8.1 instance, then
analyzingthe before and after for speedups and regressions, scripts available on request.]] 


*** The table:

typepad_20051127=# \d mm_moo_summary;              Table "public.moo_summary"       Column         |           Type
     | Modifiers 
-----------------------+--------------------------+-----------moo_summary_id      | integer                  | not
nullmoo_summary_b_id   | integer                  |moo_summary_set_id  | integer                  |moo_summary_hits
|integer                  |moo_summary_date    | timestamp with time zone | not null 
Indexes:   "mm_moo_summary_pkey" PRIMARY KEY, btree (moo_summary_id)   "mm_moo_summary_b_id" btree (moo_summary_b_id)
"mm_moo_summary_date"btree (moo_summary_date)   "mm_moo_summary_set_id" btree (moo_summary_set_id) 


*** 8.0 Query Plans:

explain SELECT MIN(moo_summary_date) FROM mm_moo_summary WHERE (moo_summary_b_id = '215222')
Aggregate  (cost=5730.96..5730.96 rows=1 width=8)  ->  Index Scan using mm_moo_summary_b_id on mm_moo_summary
(cost=0.00..5725.62rows=2137 width=8)        Index Cond: (moo_summary_b_id = 215222) 

explain SELECT moo_summary_date FROM mm_moo_summary WHERE (moo_summary_b_id = '215222') order by moo_summary_date limit
1
Limit  (cost=0.00..195.96 rows=1 width=8)  ->  Index Scan using mm_moo_summary_date on mm_moo_summary
(cost=0.00..418775.56rows=2137 width=8)        Filter: (moo_summary_b_id = 21522) 

explain SELECT moo_summary_date FROM mm_moo_summary WHERE (moo_summary_b_id = '215222') order by moo_summary_date
Sort  (cost=5843.81..5849.15 rows=2137 width=8)  Sort Key: moo_summary_date  ->  Index Scan using mm_moo_summary_b_id
onmm_moo_summary  (cost=0.00..5725.62 rows=2137 width=8)        Index Cond: (moo_summary_b_id = 21522) 


*** 8.1 Query Plan:

explain SELECT MIN(moo_summary_date) FROM mm_moo_summary WHERE (moo_summary_b_id = '215222')
Result  (cost=1988.44..1988.45 rows=1 width=0)  InitPlan    ->  Limit  (cost=0.00..1988.44 rows=1 width=8)          ->
IndexScan using mm_moo_summary_date on mm_moo_summary  (cost=0.00..4239343.73 rows=2132 width=8)                Filter:
((moo_summary_b_id= 215222) AND (moo_summary_date IS NOT NULL)) 

typepad_20051127=# set cpu_tuple_cost = .55;

SET cpu_tuple_cost = .55;
EXPLAIN SELECT MIN(stats_summary_date) from mm_moo_summary where moo_summary_b_id = 21522;
                 QUERY PLAN                                                   
-------------------------------------------------------------------------------------------------------------Aggregate
(cost=5524.70..5525.25rows=1 width=8)  ->  Index Scan using mm_moo_summary_blog_id on mm_moo_summary
(cost=0.00..5519.37rows=2132 width=8)        Index Cond: (moo_summary_b_id = 21522) 
(3 rows)


--
Paul Lindner        ||||| | | | |  |  |  |   |   |
lindner@inuus.com

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: pg_restore [archiver] file offset in dump file is too
Next
From: Tom Lane
Date:
Subject: Er ... does port/snprintf.c actually work?