Thread: GSoC idea - Simulated annealing to search for query plans
Dear Hackers,
I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of Computer Science at WUT, Poland. Last year I've been a bit into evolutionary algorithms and during my research I found out about GEQO in Postgres. I also found out that there are plans to try a different attempt to find optimal query plans and thought it could be a good thing to use it as a project for GSoC.I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. I believe this would be potentially beneficial to Postgres to check if such a heuristic could really choose better query plans than GEQO does. Judging by the results that simulated annealing gives on the travelling salesman problem, it looks like a simpler and potentially more effective way of combinatorial optimization.
As deliverables of such a project I would provide a simple implementation of basic simulated annealing optimizer and some form of quantitative comparison with GEQO.
On 02/26/2015 01:59 PM, Grzegorz Parka wrote: > Dear Hackers, > > I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of > Computer Science at WUT, Poland. Last year I've been a bit into > evolutionary algorithms and during my research I found out about GEQO in > Postgres. I also found out that there are plans to try a different > attempt to find optimal query plans and thought it could be a good thing > to use it as a project for GSoC. > > I'm interested in one of old TODO items related to the optimizer - > 'Consider compressed annealing to search for query plans'. I believe > this would be potentially beneficial to Postgres to check if such a > heuristic could really choose better query plans than GEQO does. Judging > by the results that simulated annealing gives on the travelling salesman > problem, it looks like a simpler and potentially more effective way of > combinatorial optimization. > > As deliverables of such a project I would provide a simple > implementation of basic simulated annealing optimizer and some form of > quantitative comparison with GEQO. You might look at the earlier attempt to make the GEQO replacement "pluggable". That project failed to complete sufficiently to be a feature though, but it did enough to show that our current GEQO implementation was suboptimal. I'm currently searching for this project ... it was a GSOC project, but I think before they required posting to Google Code. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > On 02/26/2015 01:59 PM, Grzegorz Parka wrote: >> I'm interested in one of old TODO items related to the optimizer - >> 'Consider compressed annealing to search for query plans'. > You might look at the earlier attempt to make the GEQO replacement > "pluggable". That project failed to complete sufficiently to be a > feature though, but it did enough to show that our current GEQO > implementation was suboptimal. > I'm currently searching for this project ... it was a GSOC project, but > I think before they required posting to Google Code. I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though. regards, tom lane
On 2015-02-26 20:23:33 -0500, Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: > > On 02/26/2015 01:59 PM, Grzegorz Parka wrote: > >> I'm interested in one of old TODO items related to the optimizer - > >> 'Consider compressed annealing to search for query plans'. > > > You might look at the earlier attempt to make the GEQO replacement > > "pluggable". That project failed to complete sufficiently to be a > > feature though, but it did enough to show that our current GEQO > > implementation was suboptimal. > > > I'm currently searching for this project ... it was a GSOC project, but > > I think before they required posting to Google Code. > > I seem to recall somebody demo'ing a simulated-annealing GEQO replacement > at PGCon a couple years back. It never got to the point of being a > submitted patch though. Yea, it was Jan Urbański (CCed). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><div class="gmail_extra"><br />On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <<a href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />><br />> On 2015-02-26 20:23:33 -0500,Tom Lane wrote:<br />> > Josh Berkus <<a href="mailto:josh@agliodbs.com">josh@agliodbs.com</a>> writes:<br/>> > > On 02/26/2015 01:59 PM, Grzegorz Parka wrote:<br />> > >> I'm interested in one ofold TODO items related to the optimizer -<br />> > >> 'Consider compressed annealing to search for query plans'.<br/>> ><br />> > > You might look at the earlier attempt to make the GEQO replacement<br />> >> "pluggable". That project failed to complete sufficiently to be a<br />> > > feature though, but it didenough to show that our current GEQO<br />> > > implementation was suboptimal.<br />> ><br />> >> I'm currently searching for this project ... it was a GSOC project, but<br />> > > I think before theyrequired posting to Google Code.<br />> ><br />> > I seem to recall somebody demo'ing a simulated-annealingGEQO replacement<br />> > at PGCon a couple years back. It never got to the point of being a<br/>> > submitted patch though.<br />><br />> Yea, it was Jan Urbański (CCed).<br />><br /><br />And theproject link: <a href="https://github.com/wulczer/saio">https://github.com/wulczer/saio</a><br /><br />Regards,<br /><br/>--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>
On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote: > > On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com > <mailto:andres@2ndquadrant.com>> wrote: >> >> On 2015-02-26 20:23:33 -0500, Tom Lane wrote: >> > Josh Berkus <josh@agliodbs.com <mailto:josh@agliodbs.com>> writes: >> > > On 02/26/2015 01:59 PM, Grzegorz Parka wrote: >> > >> I'm interested in one of old TODO items related to the optimizer - >> > >> 'Consider compressed annealing to search for query plans'. >> > >> > > You might look at the earlier attempt to make the GEQO replacement >> > > "pluggable". That project failed to complete sufficiently to be a >> > > feature though, but it did enough to show that our current GEQO >> > > implementation was suboptimal. >> > >> > > I'm currently searching for this project ... it was a GSOC > project, but >> > > I think before they required posting to Google Code. >> > >> > I seem to recall somebody demo'ing a simulated-annealing GEQO > replacement >> > at PGCon a couple years back. It never got to the point of being a >> > submitted patch though. >> >> Yea, it was Jan Urbański (CCed). >> > > And the project link: https://github.com/wulczer/saio So what w'ere saying, Grzegorz, is that we would love to see someone pick this up and get it to the point of making it a feature as a GSOC project. I think if you can start from where Jan left off, you could actually complete it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote: > Dear Hackers, > > I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of > Computer Science at WUT, Poland. Last year I've been a bit into > evolutionary algorithms and during my research I found out about GEQO in > Postgres. I also found out that there are plans to try a different attempt > to find optimal query plans and thought it could be a good thing to use it > as a project for GSoC. > > I'm interested in one of old TODO items related to the optimizer - > 'Consider compressed annealing to search for query plans'. I believe this > would be potentially beneficial to Postgres to check if such a heuristic > could really choose better query plans than GEQO does. Judging by the > results that simulated annealing gives on the travelling salesman problem, > it looks like a simpler and potentially more effective way of combinatorial > optimization. > > As deliverables of such a project I would provide a simple implementation > of basic simulated annealing optimizer and some form of quantitative > comparison with GEQO. > > I see that this may be considerably bigger than most of GSoC projects, but > I would like to know your opinion. Do you think that this would be > beneficial enough to make a proper GSoC project? I would also like to know > if you have any additional ideas about this project. > > Best regards, > Grzegorz Parka Hi, I think someone already mentioned it, but it would be very neat if the optimizer could be pluggable. Then many different algorithms could be evaluated more easily. Regards, Ken
Josh Berkus writes: > On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote: >> >> On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com >> <mailto:andres@2ndquadrant.com>> wrote: >>> >>> On 2015-02-26 20:23:33 -0500, Tom Lane wrote: >>> > >>> > I seem to recall somebody demo'ing a simulated-annealing GEQO >> replacement >>> > at PGCon a couple years back. It never got to the point of being a >>> > submitted patch though. >>> >>> Yea, it was Jan Urbański (CCed). >>> >> >> And the project link: https://github.com/wulczer/saio > > So what w'ere saying, Grzegorz, is that we would love to see someone > pick this up and get it to the point of making it a feature as a GSOC > project. I think if you can start from where Jan left off, you could > actually complete it. Sorry, late to the party. Yes, I wrote a GEQO replacement that used simulated annealing for my Master thesis. It got to a point where it was generating plans similar to what GEQO outputs for small queries and better plans for very large ones. The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub repo contains source for the PGCon presentation, which gives a higher-level overview. The big problem turned out to be writing the step function that generates a new join order from a previous one. Look for the "Simulated Annealing challenges" and "Moves generator" chapters in my thesis, which are the only interesting ones :) If you'd like to pick up where I left, I'd be more than happy to help in any ways I can. Best, Jan
Thank you a lot for your feedback. I searched a lot about GEQO,
but I didn't find information about any earlier attempts.
I'm happy to know that this is important for Postgres.
but I didn't find information about any earlier attempts.
I'm happy to know that this is important for Postgres.
I'm really interested in this project, so I just need to estimate if I can handle it.
Now I will spend some time with SAIO and GEQO to find it out.
Best,
Grzegorz
2015-02-27 16:29 GMT+01:00 Jan Urbański <wulczer@wulczer.org>:
Josh Berkus writes:
> On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:
>>
>> On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
>> <mailto:andres@2ndquadrant.com>> wrote:
>>>
>>> On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
>>> >
>>> > I seem to recall somebody demo'ing a simulated-annealing GEQO
>> replacement
>>> > at PGCon a couple years back. It never got to the point of being a
>>> > submitted patch though.
>>>
>>> Yea, it was Jan Urbański (CCed).
>>>
>>
>> And the project link: https://github.com/wulczer/saio
>
> So what w'ere saying, Grzegorz, is that we would love to see someone
> pick this up and get it to the point of making it a feature as a GSOC
> project. I think if you can start from where Jan left off, you could
> actually complete it.
Sorry, late to the party.
Yes, I wrote a GEQO replacement that used simulated annealing for my Master
thesis. It got to a point where it was generating plans similar to what GEQO
outputs for small queries and better plans for very large ones.
The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub
repo contains source for the PGCon presentation, which gives a higher-level
overview.
The big problem turned out to be writing the step function that generates a new
join order from a previous one. Look for the "Simulated Annealing challenges"
and "Moves generator" chapters in my thesis, which are the only interesting
ones :)
If you'd like to pick up where I left, I'd be more than happy to help in any
ways I can.
Best,
Jan
I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.
Does it mean just to make the join order optimizer pluggable? If yes then it is already pluggable as an extension. Is this the desired approach so far?
2015-02-27 15:03 GMT+01:00 ktm@rice.edu <ktm@rice.edu>:
Hi,On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote:
> Dear Hackers,
>
> I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
> Computer Science at WUT, Poland. Last year I've been a bit into
> evolutionary algorithms and during my research I found out about GEQO in
> Postgres. I also found out that there are plans to try a different attempt
> to find optimal query plans and thought it could be a good thing to use it
> as a project for GSoC.
>
> I'm interested in one of old TODO items related to the optimizer -
> 'Consider compressed annealing to search for query plans'. I believe this
> would be potentially beneficial to Postgres to check if such a heuristic
> could really choose better query plans than GEQO does. Judging by the
> results that simulated annealing gives on the travelling salesman problem,
> it looks like a simpler and potentially more effective way of combinatorial
> optimization.
>
> As deliverables of such a project I would provide a simple implementation
> of basic simulated annealing optimizer and some form of quantitative
> comparison with GEQO.
>
> I see that this may be considerably bigger than most of GSoC projects, but
> I would like to know your opinion. Do you think that this would be
> beneficial enough to make a proper GSoC project? I would also like to know
> if you have any additional ideas about this project.
>
> Best regards,
> Grzegorz Parka
I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.
Regards,
Ken
Grzegorz Parka <grzegorz.parka@gmail.com> writes: > I'm thinking of testing and improving SAIO as an extension before reaching > a satisfactory quality of code and returned plans. > Would then the destination be the /contrib and then main source tree or > would it ever stay as an extension? I'd like to push it into the main tree, replacing GEQO, if as we suspect it turns out to dominate GEQO on speed/plan quality. There's no reason not to test it in the manner you describe though. regards, tom lane