Cost-based optimizers

Started by Christopher Kings-Lynne about 20 years ago · 7 messages
#1 Christopher Kings-Lynne
chriskl@familyhealth.com.au

A vaguely interesting interview with IBM and MS guys about cost-based
optimizers.

http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297

Chris

#2 Christopher Browne
cbbrowne@acm.org
In reply to: Christopher Kings-Lynne (#1)
Re: Cost-based optimizers

http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297

I saw it in print; the only thing that seemed interesting about it was
the recommendation that query optimization be biased towards the
notion of "stable plans," query plans that may not be the most
"aggressively fast," but which don't fall apart into hideous
performance if the estimates are a little bit off.
--
output = ("cbbrowne" "@" "ntlug.org")
http://linuxdatabases.info/info/lsf.html
Rules of the Evil Overlord #114. "I will never accept a challenge from
the hero." <http://www.eviloverlord.com/>

#3 Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Christopher Browne (#2)
Re: Cost-based optimizers

I saw it in print; the only thing that seemed interesting about it was
the recommendation that query optimization be biased towards the
notion of "stable plans," query plans that may not be the most
"aggressively fast," but which don't fall apart into hideous
performance if the estimates are a little bit off.

And the answer is interesting as well:

"I think we have to approach it in two ways. One is that you have to be
able to execute good plans, and during the execution of a plan you want
to notice when the actual data is deviating dramatically from what you
expected. If you expected five rows and you've got a million, chances
are your plan is not going to do well because you chose it based on the
assumption of five. Thus, being able to correct mid-course is an area of
enhancement for query optimizers that IBM is pursuing."

Hmmm dynamic re-planning!

Chris
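The mid-course correction idea above can be sketched in miniature: pick a plan from the cardinality estimate, count rows as they actually stream through, and re-plan once the estimate proves off by a large factor. This is a toy, self-contained Python sketch under assumed names (`choose_plan`, `execute`), not PostgreSQL or DB2 code.

```python
def choose_plan(estimated_rows):
    # Toy cost model: small inputs favour a nested loop,
    # large inputs favour a hash join.
    return "nested_loop" if estimated_rows < 1000 else "hash_join"

def execute(rows, estimated_rows, max_error=100.0):
    plan = choose_plan(estimated_rows)
    seen = 0
    for _ in rows:
        seen += 1
        if seen > estimated_rows * max_error:
            # "Expected five rows and got a million": abort and re-plan
            # using the cardinality we now know to be real.
            plan = choose_plan(len(rows))
            break
    return plan

# A good estimate keeps the initial plan; a badly wrong one triggers a switch.
print(execute(list(range(10)), estimated_rows=10))        # nested_loop
print(execute(list(range(1_000_000)), estimated_rows=5))  # hash_join
```

A real executor would also have to decide how much already-completed work survives the abort, which is where most of the engineering risk lives.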

#4 Luke Lonergan
llonergan@greenplum.com
In reply to: Christopher Kings-Lynne (#3)
Re: Cost-based optimizers

Chris,

On 12/12/05 8:44 PM, "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
wrote:

assumption of five. Thus, being able to correct mid-course is an area of
enhancement for query optimizers that IBM is pursuing."

Hmmm dynamic re-planning!

I recently interviewed someone who is in the research group working on this
at IBM. From what he said, it seems this is pretty far from making its
way into the production codebase. Apparently there is too much history in
DB2's optimizer and the perception is that the risk / payoff is too high.

- Luke

#5 Martijn van Oosterhout
kleptog@svana.org
In reply to: Christopher Kings-Lynne (#3)
Re: Cost-based optimizers

On Tue, Dec 13, 2005 at 12:44:50PM +0800, Christopher Kings-Lynne wrote:

And the answer is interesting as well:

"I think we have to approach it in two ways. One is that you have to be
able to execute good plans, and during the execution of a plan you want
to notice when the actual data is deviating dramatically from what you
expected. If you expected five rows and you've got a million, chances
are your plan is not going to do well because you chose it based on the
assumption of five. Thus, being able to correct mid-course is an area of
enhancement for query optimizers that IBM is pursuing."

Well, now we have savepoints, it would actually be possible for a plan
to notice while running that it's producing more or less than
expected and to abort, replan and start again.

Of course, this is another can of worms. To do this you would have to be
able to have the failed query provide hints to the planner telling it
where it went wrong. Now, it may be possible to provide (via
post-mortem of an execution) a list of actual selectivities like:

table1.field1 = value (selectivity 5%)
func2(table2.field2) = value (selectivity 1%)

However, the biggest errors in selectivity occur when joining two
tables. Off the top of my head I can't think of any way to manage those
other than storing the entire expression being tested...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
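The post-mortem idea in Martijn's message can be illustrated with a small self-contained sketch: replay each predicate over the rows actually scanned and report the observed selectivities. The `observed_selectivities` helper is invented for illustration, not an existing PostgreSQL facility.

```python
def observed_selectivities(rows, predicates):
    # predicates: mapping of expression label -> callable(row) -> bool.
    # Returns the fraction of scanned rows each predicate accepted.
    total = len(rows)
    return {
        label: sum(1 for r in rows if pred(r)) / total
        for label, pred in predicates.items()
    }

# 1000 scanned rows: field1 == 0 holds for 50 of them (5%),
# field2 == 0 for 10 of them (1%), matching the example selectivities above.
rows = [{"field1": i % 20, "field2": i % 100} for i in range(1000)]
sel = observed_selectivities(rows, {
    "table1.field1 = value": lambda r: r["field1"] == 0,
    "func2(table2.field2) = value": lambda r: r["field2"] == 0,
})
```

As the message notes, this only covers single-table predicates; join selectivity errors would need the whole join expression recorded, not just per-column counts.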

#6 Rod Taylor
pg@rbt.ca
In reply to: Martijn van Oosterhout (#5)
Re: Cost-based optimizers

Of course, this is another can of worms. To do this you would have to be
able to have the failed query provide hints to the planner telling it
where it went wrong. Now, it may be possible to provide (via
post-mortem of an execution) a list of actual selectivities like:

Just being able to provide hints to go with EXPLAIN ANALYZE results
would be useful.

--

#7 Christopher Browne
cbbrowne@acm.org
In reply to: Christopher Kings-Lynne (#1)
Re: Cost-based optimizers

Of course, this is another can of worms. To do this you would have to be
able to have the failed query provide hints to the planner telling it
where it went wrong. Now, it may be possible to provide (via
post-mortem of an execution) a list of actual selectivities like:

Just being able to provide hints to go with EXPLAIN ANALYZE results
would be useful.

We'd probably get some mileage out of collecting statistics equivalent
to EXPLAIN ANALYZE (actual versus estimated), and, over some set of
such statistics, try to improve parameter usage in the optimizer.

That of course requires collecting those stats...
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/emacs.html
"...Yet terrible as Unix addiction is, there are worse fates. If Unix
is the heroin of operating systems, then VMS is barbiturate addiction,
the Mac is MDMA, and MS-DOS is sniffing glue. (Windows is filling your
sinuses with lucite and letting it set.) You owe the Oracle a
twelve-step program." --The Usenet Oracle
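The statistics-collection idea in the last message can be sketched as well: accumulate (estimated, actual) row counts per operator type across many runs, then derive a correction factor the optimizer could apply to future estimates. The names here are invented for illustration; this is not a description of any existing optimizer feedback loop.

```python
from statistics import geometric_mean

def correction_factors(samples):
    # samples: {operator: [(estimated_rows, actual_rows), ...]}
    # Geometric mean of actual/estimated damps outliers in either direction.
    return {
        op: geometric_mean(actual / est for est, actual in pairs)
        for op, pairs in samples.items()
    }

factors = correction_factors({
    "seq_scan":  [(100, 100), (200, 210)],   # estimates already close
    "hash_join": [(5, 500), (10, 1000)],     # consistently underestimated 100x
})
# A factor near 1.0 means the estimates are trustworthy; hash_join's factor
# of ~100 suggests scaling its future cardinality estimates up accordingly.
```

The hard part, as the message says, is collecting those stats in the first place without the overhead of running everything under EXPLAIN ANALYZE.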