Re: GEQO: ERX - Mailing list pgsql-hackers

From Tobias Zahn
Subject Re: GEQO: ERX
Date
Msg-id 4A0B2A21.1000908@arcor.de
Whole thread Raw
In response to Re: GEQO: ERX  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: GEQO: ERX  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hello,
thank you for posting the paper, it was quite interesting to read. I
think it would be a good idea to give the two-phase optimization a try.
As far as I know and understand the current (geqo) optimizer source,
many important parts are already there. For example, we can calculate
the costs of a given join order. Therefore, it would only be necessary
to write an algorithm witch chooses the right input for the cost function.
I would be interested in your opinion.

Regards,
Tobias

Robert Haas schrieb:
> On Sat, May 2, 2009 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Tobias Zahn <tobias-zahn@arcor.de> writes:
>>> I didn't not get any response to my initial message below. Now I am
>>> wondering if nobody is into the optimizer or if my question was just too
>>> stupid. Could you please give me some clues? Your help would really be
>>> appreciated.
>> Well, nobody's into GEQO very much.  I took a quick look and didn't
>> think that deleting the ERX support would save anything noticeable,
>> but you're welcome to try it if you think different.
>>
>> The real problem with working on GEQO, in my humble opinion, is that
>> it's throwing good effort after bad.  That module doesn't need marginal
>> fixing, it needs throwing away and rewriting from scratch.  Bad enough
>> that it's convoluted and full of dead (experimental?) code; but I don't
>> even believe that it's based on a good analogy.   The planning problem
>> is not all that much like traveling salesman problems, so heuristics
>> designed for TSP are of pretty questionable usefulness to start with.
>> That complaint could have been refuted if the module performed well,
>> but in fact if you check the archives you'll find many many complaints
>> about it --- its ability to find good plans seems to be mostly dependent
>> on luck.
>>
>> My knowledge of AI search algorithms is about 20 years obsolete, but
>> last I heard simulated annealing had overtaken genetic algorithms for
>> many purposes.  It might be interesting to try a rewrite based on SA;
>> or maybe there's something better out there now.
> 
> There's a 1997 article on this topic that's pretty interesting.
> 
> Heuristic and randomized optimization for the join ordering problem
> http://reference.kfupm.edu.sa/content/h/e/heuristic_and_randomized_optimization_fo_87585.pdf
> 
> Here's the basic conclusion:
> 
> "If good solutions are of highest importance, Two-Phase Optimization,
> the algorithm that performed best in our experiments, is a very good
> choice; other Simulated Annealing variants, for instance Toured
> Simulated Annealing (TSA, LVZ93]), that we did not implement, are
> likely to achieve quite similar results. The 'pure' Simulated
> Annealing algorithm has a much higher running time without yielding
> significantly better solutions. If short running time is more
> important, Iterative Improvement (IIIO), the genetic algo- rithm
> (BushyGenetic), and, to a lesser extent, Two-Phase Optimization (2PO)
> are feasible alternatives."
> 
> I'm not sure if there's anything more recent out there.
> 
> ...Robert
> 



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: New trigger option of pg_standby
Next
From: Tom Lane
Date:
Subject: Re: libxml incompatibility