Thread: Cost-based optimizers
A vaguely interesting interview with IBM and MS guys about cost-based optimizers:

http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297

Chris
> http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297

I saw it in print; the only thing that seemed interesting about it was the recommendation that query optimization be biased towards the notion of "stable plans": query plans that may not be the most "aggressively fast," but which don't fall apart into hideous performance if the estimates are a little bit off.

-- 
output = ("cbbrowne" "@" "ntlug.org") http://linuxdatabases.info/info/lsf.html
Rules of the Evil Overlord #114. "I will never accept a challenge from
the hero." <http://www.eviloverlord.com/>
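A minimal Python sketch of what biasing towards "stable plans" could mean: score each candidate plan by its worst case over a range of plausible cardinalities instead of its cost at the single point estimate. The two plans and their cost formulas are invented for illustration; no actual optimizer is being described here.

    # Two toy plans with made-up cost curves.
    def nested_loop_cost(rows):
        return rows * rows * 0.01   # cheap for tiny inputs, explodes as rows grow

    def hash_join_cost(rows):
        return 1000 + rows * 0.1    # higher startup cost, but degrades gracefully

    plans = {"nested loop": nested_loop_cost, "hash join": hash_join_cost}

    estimate = 100                  # the optimizer's point estimate
    scenarios = [estimate // 10, estimate, estimate * 100]   # what if we're off?

    # "Aggressively fast" choice: cheapest at the point estimate.
    aggressive = min(plans, key=lambda p: plans[p](estimate))

    # "Stable" choice: cheapest worst case across the uncertainty range.
    stable = min(plans, key=lambda p: max(plans[p](r) for r in scenarios))

    print("aggressive:", aggressive)   # nested loop: wins if the estimate is right
    print("stable:", stable)           # hash join: never falls apart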
> I saw it in print; the only thing that seemed interesting about it was
> the recommendation that query optimization be biased towards the
> notion of "stable plans": query plans that may not be the most
> "aggressively fast," but which don't fall apart into hideous
> performance if the estimates are a little bit off.

And the answer is interesting as well:

"I think we have to approach it in two ways. One is that you have to be able to execute good plans, and during the execution of a plan you want to notice when the actual data is deviating dramatically from what you expected. If you expected five rows and you've got a million, chances are your plan is not going to do well because you chose it based on the assumption of five. Thus, being able to correct mid-course is an area of enhancement for query optimizers that IBM is pursuing."

Hmmm, dynamic re-planning!

Chris
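A rough Python sketch of that mid-course correction idea, using a toy executor that counts rows as they stream past and abandons the plan once the estimate is clearly wrong. The names (ReplanNeeded, run_query) are hypothetical, and nothing here reflects actual DB2 or PostgreSQL internals:

    class ReplanNeeded(Exception):
        pass

    def execute(plan, rows, estimated_rows, slack=10):
        # Toy executor: apply plan() to each row, watching the cardinality.
        produced = 0
        for row in rows:
            produced += 1
            if produced > estimated_rows * slack:
                # Expected five rows, got a million: stop and re-plan.
                raise ReplanNeeded("estimated %d, saw %d+"
                                   % (estimated_rows, produced))
            yield plan(row)

    def run_query(rows, estimated_rows):
        rows = list(rows)   # materialize so we can restart after a re-plan
        try:
            return list(execute(lambda r: r, rows, estimated_rows))
        except ReplanNeeded as err:
            print("re-planning:", err)
            # Second attempt with the observed cardinality; a real system
            # would switch join strategy here rather than rerun the same plan.
            return list(execute(lambda r: r, rows, len(rows)))

    print(len(run_query(range(1000), estimated_rows=5)))   # re-plans, prints 1000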
Chris,

On 12/12/05 8:44 PM, "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> wrote:
> assumption of five. Thus, being able to correct mid-course is an area of
> enhancement for query optimizers that IBM is pursuing."
>
> Hmmm, dynamic re-planning!

I recently interviewed someone who is in the research group working on this at IBM. From what he said, it seems this is pretty far from making its way into the production codebase. Apparently there is too much history in DB2's optimizer, and the perception is that the risk/payoff ratio is too high.

- Luke
On Tue, Dec 13, 2005 at 12:44:50PM +0800, Christopher Kings-Lynne wrote:
> And the answer is interesting as well:
>
> "I think we have to approach it in two ways. One is that you have to be
> able to execute good plans, and during the execution of a plan you want
> to notice when the actual data is deviating dramatically from what you
> expected. If you expected five rows and you've got a million, chances
> are your plan is not going to do well because you chose it based on the
> assumption of five. Thus, being able to correct mid-course is an area of
> enhancement for query optimizers that IBM is pursuing."

Well, now that we have savepoints, it would actually be possible for a plan to notice while running that it's producing more or fewer rows than expected and to abort, re-plan, and start again.

Of course, this is another can of worms. To do this you would have to be able to have the failed query provide hints to the planner telling it where it went wrong. Now, it may be possible to provide (via a post-mortem of an execution) a list of actual selectivities like:

  table1.field1 = value           (selectivity 5%)
  func2(table2.field2) = value    (selectivity 1%)

However, the biggest errors in selectivity occur when joining two tables. Off the top of my head I can't think of any way to manage those other than storing the entire expression being tested...

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
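A minimal Python sketch of that post-mortem idea: key observed selectivities by the text of the expression tested, and let the planner prefer an observed value over its estimate the next time it sees the same expression. As noted above, storing whole expression strings is also the only obvious way to cover join predicates. All names and numbers are made up:

    observed = {}   # expression text -> actual selectivity from past runs

    def record_post_mortem(expr, rows_in, rows_out):
        # Called after a query finishes (or fails), with what actually happened.
        observed[expr] = rows_out / rows_in

    def selectivity(expr, default_estimate):
        # Planner hook: use an observed value if we have one.
        return observed.get(expr, default_estimate)

    # One bad run later, the executor reports what it actually saw:
    record_post_mortem("table1.field1 = value", rows_in=100000, rows_out=5000)
    record_post_mortem("table1.id = table2.fk", rows_in=100000, rows_out=90000)

    print(selectivity("table1.field1 = value", 0.33))        # 0.05, corrected
    print(selectivity("func2(table2.field2) = value", 0.33)) # 0.33, no data yet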
> Of course, this is another can of worms. To do this you would have to be
> able to have the failed query provide hints to the planner telling it
> where it went wrong. Now, it may be possible to provide (via a
> post-mortem of an execution) a list of actual selectivities like:

Just being able to provide hints to go with EXPLAIN ANALYZE results would be useful.
>> Of course, this is another can of worms. To do this you would have to be
>> able to have the failed query provide hints to the planner telling it
>> where it went wrong. Now, it may be possible to provide (via a
>> post-mortem of an execution) a list of actual selectivities like:
>
> Just being able to provide hints to go with EXPLAIN ANALYZE results
> would be useful.

We'd probably get some mileage out of collecting statistics equivalent to EXPLAIN ANALYZE output (actual versus estimated) and, over some set of such statistics, trying to improve parameter usage in the optimizer. That of course requires collecting those stats...

-- 
output = reverse("moc.liamg" "@" "enworbbc") http://linuxdatabases.info/info/emacs.html
"...Yet terrible as Unix addiction is, there are worse fates. If Unix is
the heroin of operating systems, then VMS is barbiturate addiction, the
Mac is MDMA, and MS-DOS is sniffing glue. (Windows is filling your sinuses
with lucite and letting it set.) You owe the Oracle a twelve-step program."
--The Usenet Oracle
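A minimal Python sketch of that collection scheme: log (estimated, actual) row counts per plan node, EXPLAIN ANALYZE-style, and measure the systematic error. The node names and numbers are made up, and a real system would feed the measured bias back into a cost or selectivity parameter rather than just printing it:

    import math

    log = []   # (node_type, estimated_rows, actual_rows) from instrumented runs

    def record(node_type, estimated, actual):
        log.append((node_type, estimated, actual))

    def mean_log_error(node_type):
        # Average error in orders of magnitude; positive means we
        # systematically underestimate that node type's output.
        errs = [math.log10(actual / estimated)
                for (t, estimated, actual) in log if t == node_type]
        return sum(errs) / len(errs)

    record("Seq Scan",  estimated=100, actual=120)
    record("Hash Join", estimated=50,  actual=5000)
    record("Hash Join", estimated=10,  actual=800)

    print("Hash Join bias: %.1f orders of magnitude"
          % mean_log_error("Hash Join"))   # ~2.0: join selectivity needs tuning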