Re: GEQO: ERX - Mailing list pgsql-hackers

From Robert Haas
Subject Re: GEQO: ERX
Date
Msg-id 603c8f070905031303t770724d6h674e4ba049acc65c@mail.gmail.com
Whole thread Raw
In response to Re: GEQO: ERX  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: GEQO: ERX  (Tobias Zahn <tobias-zahn@arcor.de>)
List pgsql-hackers
On Sat, May 2, 2009 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Tobias Zahn <tobias-zahn@arcor.de> writes:
>> I didn't not get any response to my initial message below. Now I am
>> wondering if nobody is into the optimizer or if my question was just too
>> stupid. Could you please give me some clues? Your help would really be
>> appreciated.
>
> Well, nobody's into GEQO very much.  I took a quick look and didn't
> think that deleting the ERX support would save anything noticeable,
> but you're welcome to try it if you think different.
>
> The real problem with working on GEQO, in my humble opinion, is that
> it's throwing good effort after bad.  That module doesn't need marginal
> fixing, it needs throwing away and rewriting from scratch.  Bad enough
> that it's convoluted and full of dead (experimental?) code; but I don't
> even believe that it's based on a good analogy.   The planning problem
> is not all that much like traveling salesman problems, so heuristics
> designed for TSP are of pretty questionable usefulness to start with.
> That complaint could have been refuted if the module performed well,
> but in fact if you check the archives you'll find many many complaints
> about it --- its ability to find good plans seems to be mostly dependent
> on luck.
>
> My knowledge of AI search algorithms is about 20 years obsolete, but
> last I heard simulated annealing had overtaken genetic algorithms for
> many purposes.  It might be interesting to try a rewrite based on SA;
> or maybe there's something better out there now.

There's a 1997 article on this topic that's pretty interesting.

Heuristic and randomized optimization for the join ordering problem
http://reference.kfupm.edu.sa/content/h/e/heuristic_and_randomized_optimization_fo_87585.pdf

Here's the basic conclusion:

"If good solutions are of highest importance, Two-Phase Optimization,
the algorithm that performed best in our experiments, is a very good
choice; other Simulated Annealing variants, for instance Toured
Simulated Annealing (TSA, LVZ93]), that we did not implement, are
likely to achieve quite similar results. The 'pure' Simulated
Annealing algorithm has a much higher running time without yielding
significantly better solutions. If short running time is more
important, Iterative Improvement (IIIO), the genetic algo- rithm
(BushyGenetic), and, to a lesser extent, Two-Phase Optimization (2PO)
are feasible alternatives."

I'm not sure if there's anything more recent out there.

...Robert


pgsql-hackers by date:

Previous
From: justin
Date:
Subject: Re: windows doesn't notice backend death
Next
From: Andrew Dunstan
Date:
Subject: Re: windows doesn't notice backend death