Thread: Cost-based optimizers
A vaguely interesting interview with IBM and MS guys about cost-based optimizers:

http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297

Chris
> http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297

I saw it in print; the only thing that seemed interesting about it was the recommendation that query optimization be biased towards the notion of "stable plans": query plans that may not be the most "aggressively fast," but which don't fall apart into hideous performance if the estimates are a little bit off.

-- 
output = ("cbbrowne" "@" "ntlug.org") http://linuxdatabases.info/info/lsf.html
Rules of the Evil Overlord #114. "I will never accept a challenge from
the hero." <http://www.eviloverlord.com/>
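A minimal Python sketch of what biasing towards "stable plans" could mean: score each candidate plan by its worst case over a range of plausible cardinalities instead of its cost at the single point estimate. The two plans and their cost formulas are invented for illustration; no actual optimizer is being described here.

    # Two toy plans with made-up cost curves.
    def nested_loop_cost(rows):
        return rows * rows * 0.01   # cheap for tiny inputs, explodes as rows grow

    def hash_join_cost(rows):
        return 1000 + rows * 0.1    # higher startup cost, but degrades gracefully

    plans = {"nested loop": nested_loop_cost, "hash join": hash_join_cost}

    estimate = 100                  # the optimizer's point estimate
    scenarios = [estimate // 10, estimate, estimate * 100]   # what if we're off?

    # "Aggressively fast" choice: cheapest at the point estimate.
    aggressive = min(plans, key=lambda p: plans[p](estimate))

    # "Stable" choice: cheapest worst case across the uncertainty range.
    stable = min(plans, key=lambda p: max(plans[p](r) for r in scenarios))

    print("aggressive:", aggressive)   # nested loop: wins if the estimate is right
    print("stable:", stable)           # hash join: never falls apart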
> I saw it in print; the only thing that seemed interesting about it was
> the recommendation that query optimization be biased towards the
> notion of "stable plans": query plans that may not be the most
> "aggressively fast," but which don't fall apart into hideous
> performance if the estimates are a little bit off.

And the answer is interesting as well:

"I think we have to approach it in two ways. One is that you have to be able to execute good plans, and during the execution of a plan you want to notice when the actual data is deviating dramatically from what you expected. If you expected five rows and you've got a million, chances are your plan is not going to do well because you chose it based on the assumption of five. Thus, being able to correct mid-course is an area of enhancement for query optimizers that IBM is pursuing."

Hmmm, dynamic re-planning!

Chris
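A rough Python sketch of that mid-course correction idea, using a toy executor that counts rows as they stream past and abandons the plan once the estimate is clearly wrong. The names (ReplanNeeded, run_query) are hypothetical, and nothing here reflects actual DB2 or PostgreSQL internals:

    class ReplanNeeded(Exception):
        pass

    def execute(plan, rows, estimated_rows, slack=10):
        # Toy executor: apply plan() to each row, watching the cardinality.
        produced = 0
        for row in rows:
            produced += 1
            if produced > estimated_rows * slack:
                # Expected five rows, got a million: stop and re-plan.
                raise ReplanNeeded("estimated %d, saw %d+"
                                   % (estimated_rows, produced))
            yield plan(row)

    def run_query(rows, estimated_rows):
        rows = list(rows)   # materialize so we can restart after a re-plan
        try:
            return list(execute(lambda r: r, rows, estimated_rows))
        except ReplanNeeded as err:
            print("re-planning:", err)
            # Second attempt with the observed cardinality; a real system
            # would switch join strategy here rather than rerun the same plan.
            return list(execute(lambda r: r, rows, len(rows)))

    print(len(run_query(range(1000), estimated_rows=5)))   # re-plans, prints 1000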
Chris,

On 12/12/05 8:44 PM, "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> wrote:
> assumption of five. Thus, being able to correct mid-course is an area of
> enhancement for query optimizers that IBM is pursuing."
>
> Hmmm, dynamic re-planning!

I recently interviewed someone who is in the research group working on this at IBM. From what he said, it seems this is pretty far from making its way into the production codebase. Apparently there is too much history in DB2's optimizer, and the perception is that the risk/payoff ratio is too high.

- Luke
On Tue, Dec 13, 2005 at 12:44:50PM +0800, Christopher Kings-Lynne wrote:
> And the answer is interesting as well:
>
> "I think we have to approach it in two ways. One is that you have to be
> able to execute good plans, and during the execution of a plan you want
> to notice when the actual data is deviating dramatically from what you
> expected. If you expected five rows and you've got a million, chances
> are your plan is not going to do well because you chose it based on the
> assumption of five. Thus, being able to correct mid-course is an area of
> enhancement for query optimizers that IBM is pursuing."

Well, now that we have savepoints, it would actually be possible for a plan to notice while running that it's producing more or fewer rows than expected and to abort, re-plan, and start again.

Of course, this is another can of worms. To do this you would have to be able to have the failed query provide hints to the planner telling it where it went wrong. Now, it may be possible to provide (via a post-mortem of an execution) a list of actual selectivities like:

  table1.field1 = value           (selectivity 5%)
  func2(table2.field2) = value    (selectivity 1%)

However, the biggest errors in selectivity occur when joining two tables. Off the top of my head I can't think of any way to manage those other than storing the entire expression being tested...

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
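A minimal Python sketch of that post-mortem idea: key observed selectivities by the text of the expression tested, and let the planner prefer an observed value over its estimate the next time it sees the same expression. As noted above, storing whole expression strings is also the only obvious way to cover join predicates. All names and numbers are made up:

    observed = {}   # expression text -> actual selectivity from past runs

    def record_post_mortem(expr, rows_in, rows_out):
        # Called after a query finishes (or fails), with what actually happened.
        observed[expr] = rows_out / rows_in

    def selectivity(expr, default_estimate):
        # Planner hook: use an observed value if we have one.
        return observed.get(expr, default_estimate)

    # One bad run later, the executor reports what it actually saw:
    record_post_mortem("table1.field1 = value", rows_in=100000, rows_out=5000)
    record_post_mortem("table1.id = table2.fk", rows_in=100000, rows_out=90000)

    print(selectivity("table1.field1 = value", 0.33))        # 0.05, corrected
    print(selectivity("func2(table2.field2) = value", 0.33)) # 0.33, no data yet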
> Of course, this is another can of worms. To do this you would have to be
> able to have the failed query provide hints to the planner telling it
> where it went wrong. Now, it may be possible to provide (via a
> post-mortem of an execution) a list of actual selectivities like:

Just being able to provide hints to go with EXPLAIN ANALYZE results would be useful.
>> Of course, this is another can of worms. To do this you would have to be
>> able to have the failed query provide hints to the planner telling it
>> where it went wrong. Now, it may be possible to provide (via a
>> post-mortem of an execution) a list of actual selectivities like:
>
> Just being able to provide hints to go with EXPLAIN ANALYZE results
> would be useful.

We'd probably get some mileage out of collecting statistics equivalent to EXPLAIN ANALYZE output (actual versus estimated) and, over some set of such statistics, trying to improve parameter usage in the optimizer. That of course requires collecting those stats...

-- 
output = reverse("moc.liamg" "@" "enworbbc") http://linuxdatabases.info/info/emacs.html
"...Yet terrible as Unix addiction is, there are worse fates. If Unix is
the heroin of operating systems, then VMS is barbiturate addiction, the
Mac is MDMA, and MS-DOS is sniffing glue. (Windows is filling your sinuses
with lucite and letting it set.) You owe the Oracle a twelve-step program."
--The Usenet Oracle
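A minimal Python sketch of that collection scheme: log (estimated, actual) row counts per plan node, EXPLAIN ANALYZE-style, and measure the systematic error. The node names and numbers are made up, and a real system would feed the measured bias back into a cost or selectivity parameter rather than just printing it:

    import math

    log = []   # (node_type, estimated_rows, actual_rows) from instrumented runs

    def record(node_type, estimated, actual):
        log.append((node_type, estimated, actual))

    def mean_log_error(node_type):
        # Average error in orders of magnitude; positive means we
        # systematically underestimate that node type's output.
        errs = [math.log10(actual / estimated)
                for (t, estimated, actual) in log if t == node_type]
        return sum(errs) / len(errs)

    record("Seq Scan",  estimated=100, actual=120)
    record("Hash Join", estimated=50,  actual=5000)
    record("Hash Join", estimated=10,  actual=800)

    print("Hash Join bias: %.1f orders of magnitude"
          % mean_log_error("Hash Join"))   # ~2.0: join selectivity needs tuning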