Optimizer badness in 7.0 beta - Mailing list pgsql-hackers

From Brian Hirt
Subject Optimizer badness in 7.0 beta
Date
Msg-id 20000305032655.A29257@loopy.berkhirt.com
Whole thread Raw
Responses Re: [HACKERS] Optimizer badness in 7.0 beta  (Peter Eisentraut <peter_e@gmx.net>)
Re: [HACKERS] Optimizer badness in 7.0 beta  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hello,

I just downloaded the 7.0 beta to test it with my database to make sure
there will be no unexpected problems when I upgrade my production site.
I've run into a problem that I hope you can help me with.  I dumped my 
6.5.2 database and loaded it into 7.0.  Lot's of queries are now taking 
much much longer.  I have included the plans from one of the queries.  
In 7.0, the query takes 94 seconds compared to less than a second for
it to run on 6.5.2.  All of the data is exactly the same, the indexes are the 
same.  I thought maybe the indexes had bad statistics, so I "vaccum analyze" 
both the 6.5.2 database and the 7.0 database and ran again on both just to 
be on the safe side.  Still, same problem.  I know that there were problems 
with IN clauses optimizing and the preferred method is to use an exists 
statement.  However, I wouldn't expect this kind of change in performance.  
It does appear that 7.0 is trying to be smarter by using an index in the 
SubPlan, but for some reason it's being a hog.

Some more information that may be useful, the table 'game' has about 1000
rows and the table game_developer has about 15000 rows.  There is an
index on game_developer(developer_id)  

Other than these types of queries, everything else seems to be working 
okay.  I logged about 500 different queries that run against my database,
removed the ones that exhibit the behaviour above and ran a little
benchmark.  The run times between 6.5.2 and 7.0.0, for the types of
queries I'm running, are almost identical.   I was hoping that the new
improved optimizer would bring a great speed improvement, but I'm not
seeing it.  My guess is that most of the queries that I'm running are
small and there's a fixed cost associated with running each one -- the
actual work they perform is pretty small.  Possibly more time is being
spent optimizing the plan and is offsetting the improved execution time
on smaller queries.

-brian


-- PG 7.0 --
NOTICE:  QUERY PLAN:

Sort  (cost=383940.72..383940.72 rows=905 width=59) ->  Seq Scan on game  (cost=0.00..383896.28 rows=905 width=59)
SubPlan         ->  Unique  (cost=0.00..808.88 rows=0 width=4)               ->  Index Scan using
game_developer_game_indexon game_developer  (cost=0.00..808.87 rows=4 width=4)
 

EXPLAIN

-- PG 6.5.2 --
NOTICE:  QUERY PLAN:

Sort  (cost=99.32 rows=872 width=59) ->  Seq Scan on game  (cost=99.32 rows=872 width=59)       SubPlan         ->
Unique (cost=578.53 rows=2 width=4)               ->  Sort  (cost=578.53 rows=2 width=4)                     ->  Seq
Scanon game_developer  (cost=578.53 rows=2 width=4)
 

EXPLAIN

Query:

select
creation_timestamp,approved,moby_user_id,copyright_year,game_title,game_url,company_line,credits_complete,game_id
 
from game 
where approved = 1 
and game_id in (    select         distinct game_id    from         game_developer    where         developer_id = 3) 
order by copyright_year desc,game_title;
-- 
The world's most ambitious and comprehensive PC game database project.
                     http://www.mobygames.com


pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: DROP TABLE inside a transaction block
Next
From: Peter Eisentraut
Date:
Subject: pg_pwd trigger to be removed