Re: [HACKERS] Optimizer badness in 7.0 beta - Mailing list pgsql-hackers
From | Brian Hirt |
---|---|
Subject | Re: [HACKERS] Optimizer badness in 7.0 beta |
Date | |
Msg-id | 20000305150132.B29257@loopy.berkhirt.com Whole thread Raw |
In response to | Re: [HACKERS] Optimizer badness in 7.0 beta (Peter Eisentraut <peter_e@gmx.net>) |
Responses |
Re: [HACKERS] Optimizer badness in 7.0 beta
|
List | pgsql-hackers |
Peter, Actually, the query you supply will not work, I'll get duplicate rows because the relationship between game and game_developer is a one to many. Of course you had no way of knowing that from the information I supplied. I could throw a distinct in there to get the results, but that really feels like bad form because of the large amount of duplicate rows. In any case, the original query I supplied is generated SQL, created by a Database to Object persistance layer and cannot by design have multiple tables in the from clause, so restrictions to the table seleted from must be in the form of a qualifier. I realize that the query in question could be written better. My concern was the huge difference in performance between 6.5 and 7.0 on this type of query. Other people may be bitten by this one, so I wanted to bring it up. I've been able to easily to work around this problem, it just seems wrong that the difference in execution time is so far off from the previous release. I dont know too much about the PG internals, but when I used sybase, it would usually execute the sub-select independently and stuff the results into a temp table and then do another query, joining to the results of the sub-select. In a situation like this one, worst case without indexes you would get a table scan for the sub-select and then a merge join with a sequential scan on the temp table and a sequential scan on the other table (example below). Using that approach, with no indexes, the query still executes in a fraction of a second. It just seems that a query on tables as small as I'm describing should never take as long as it did. It seems like a problem with the optimizer, but if people are happy with currenty functionality that's fine with me also. -brian For Example: SELECT DISTINCT game_id INTO temp tmp_res FROM game_developer WHERE developer_id = 3 SELECT * FROM game, tmp_res WHERE game.game_id = tmp_res.game_id AND game.approved = 1 ORDER BY copyright_year desc, game_title On Sun, Mar 05, 2000 at 03:15:45PM +0100, Peter Eisentraut wrote: > This query can be rewritten as > > SELECT creation_timestamp, etc. > FROM game, game_developer > WHERE game.game_id = game_developer.game_id > AND approved = 1 AND developer_id = 3 > ORDER BY copyright_year desc, game_title > > The way you're writing it you're almost asking it to be slow. :) > > Of course that still doesn't explain why it's now 94sec versus formerly 1 > but I'm sure Tom Lane will enlighten us all very soon. :) > > > Brian Hirt writes: > > > select > > creation_timestamp, > [snip] > > from > > game > > where > > approved = 1 > > and > > game_id in ( > > select > > distinct game_id > > from > > game_developer > > where > > developer_id = 3) > > order by > > copyright_year desc, > > game_title; > > > -- > Peter Eisentraut Sernanders väg 10:115 > peter_e@gmx.net 75262 Uppsala > http://yi.org/peter-e/ Sweden > > > > ************ -- The world's most ambitious and comprehensive PC game database project. http://www.mobygames.com
pgsql-hackers by date: