Thread: GSoC idea - Simulated annealing to search for query plans

GSoC idea - Simulated annealing to search for query plans

From
Grzegorz Parka
Date:
Dear Hackers,

I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of Computer Science at WUT, Poland. Last year I've been a bit into evolutionary algorithms and during my research I found out about GEQO in Postgres. I also found out that there are plans to try a different attempt to find optimal query plans and thought it could be a good thing to use it as a project for GSoC.

I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. I believe this would be potentially beneficial to Postgres to check if such a heuristic could really choose better query plans than GEQO does. Judging by the results that simulated annealing gives on the travelling salesman problem, it looks like a simpler and potentially more effective way of combinatorial optimization.

As deliverables of such a project I would provide a simple implementation of basic simulated annealing optimizer and some form of quantitative comparison with GEQO.

I see that this may be considerably bigger than most of GSoC projects, but I would like to know your opinion. Do you think that this would be beneficial enough to make a proper GSoC project? I would also like to know if you have any additional ideas about this project.

Best regards,
Grzegorz Parka

Re: GSoC idea - Simulated annealing to search for query plans

From
Josh Berkus
Date:
On 02/26/2015 01:59 PM, Grzegorz Parka wrote:
> Dear Hackers,
> 
> I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
> Computer Science at WUT, Poland. Last year I've been a bit into
> evolutionary algorithms and during my research I found out about GEQO in
> Postgres. I also found out that there are plans to try a different
> attempt to find optimal query plans and thought it could be a good thing
> to use it as a project for GSoC.
> 
> I'm interested in one of old TODO items related to the optimizer -
> 'Consider compressed annealing to search for query plans'. I believe
> this would be potentially beneficial to Postgres to check if such a
> heuristic could really choose better query plans than GEQO does. Judging
> by the results that simulated annealing gives on the travelling salesman
> problem, it looks like a simpler and potentially more effective way of
> combinatorial optimization.
> 
> As deliverables of such a project I would provide a simple
> implementation of basic simulated annealing optimizer and some form of
> quantitative comparison with GEQO.

You might look at the earlier attempt to make the GEQO replacement
"pluggable".  That project failed to complete sufficiently to be a
feature though, but it did enough to show that our current GEQO
implementation was suboptimal.

I'm currently searching for this project ... it was a GSOC project, but
I think before they required posting to Google Code.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: GSoC idea - Simulated annealing to search for query plans

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> On 02/26/2015 01:59 PM, Grzegorz Parka wrote:
>> I'm interested in one of old TODO items related to the optimizer -
>> 'Consider compressed annealing to search for query plans'.

> You might look at the earlier attempt to make the GEQO replacement
> "pluggable".  That project failed to complete sufficiently to be a
> feature though, but it did enough to show that our current GEQO
> implementation was suboptimal.

> I'm currently searching for this project ... it was a GSOC project, but
> I think before they required posting to Google Code.

I seem to recall somebody demo'ing a simulated-annealing GEQO replacement
at PGCon a couple years back.  It never got to the point of being a
submitted patch though.
        regards, tom lane



Re: GSoC idea - Simulated annealing to search for query plans

From
Andres Freund
Date:
On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > On 02/26/2015 01:59 PM, Grzegorz Parka wrote:
> >> I'm interested in one of old TODO items related to the optimizer -
> >> 'Consider compressed annealing to search for query plans'.
>
> > You might look at the earlier attempt to make the GEQO replacement
> > "pluggable".  That project failed to complete sufficiently to be a
> > feature though, but it did enough to show that our current GEQO
> > implementation was suboptimal.
>
> > I'm currently searching for this project ... it was a GSOC project, but
> > I think before they required posting to Google Code.
>
> I seem to recall somebody demo'ing a simulated-annealing GEQO replacement
> at PGCon a couple years back.  It never got to the point of being a
> submitted patch though.

Yea, it was Jan Urbański (CCed).

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GSoC idea - Simulated annealing to search for query plans

From
Fabrízio de Royes Mello
Date:
<div dir="ltr"><div class="gmail_extra"><br />On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <<a
href="mailto:andres@2ndquadrant.com">andres@2ndquadrant.com</a>>wrote:<br />><br />> On 2015-02-26 20:23:33
-0500,Tom Lane wrote:<br />> > Josh Berkus <<a href="mailto:josh@agliodbs.com">josh@agliodbs.com</a>>
writes:<br/>> > > On 02/26/2015 01:59 PM, Grzegorz Parka wrote:<br />> > >> I'm interested in one
ofold TODO items related to the optimizer -<br />> > >> 'Consider compressed annealing to search for query
plans'.<br/>> ><br />> > > You might look at the earlier attempt to make the GEQO replacement<br />>
>> "pluggable".  That project failed to complete sufficiently to be a<br />> > > feature though, but it
didenough to show that our current GEQO<br />> > > implementation was suboptimal.<br />> ><br />>
>> I'm currently searching for this project ... it was a GSOC project, but<br />> > > I think before
theyrequired posting to Google Code.<br />> ><br />> > I seem to recall somebody demo'ing a
simulated-annealingGEQO replacement<br />> > at PGCon a couple years back.  It never got to the point of being
a<br/>> > submitted patch though.<br />><br />> Yea, it was Jan Urbański (CCed).<br />><br /><br />And
theproject link: <a href="https://github.com/wulczer/saio">https://github.com/wulczer/saio</a><br /><br />Regards,<br
/><br/>--<br />Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a
href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a
href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a
href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a
href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a
href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>

Re: GSoC idea - Simulated annealing to search for query plans

From
Josh Berkus
Date:
On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:
>
> On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
> <mailto:andres@2ndquadrant.com>> wrote:
>>
>> On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
>> > Josh Berkus <josh@agliodbs.com <mailto:josh@agliodbs.com>> writes:
>> > > On 02/26/2015 01:59 PM, Grzegorz Parka wrote:
>> > >> I'm interested in one of old TODO items related to the optimizer -
>> > >> 'Consider compressed annealing to search for query plans'.
>> >
>> > > You might look at the earlier attempt to make the GEQO replacement
>> > > "pluggable".  That project failed to complete sufficiently to be a
>> > > feature though, but it did enough to show that our current GEQO
>> > > implementation was suboptimal.
>> >
>> > > I'm currently searching for this project ... it was a GSOC
> project, but
>> > > I think before they required posting to Google Code.
>> >
>> > I seem to recall somebody demo'ing a simulated-annealing GEQO
> replacement
>> > at PGCon a couple years back.  It never got to the point of being a
>> > submitted patch though.
>>
>> Yea, it was Jan Urbański (CCed).
>>
>
> And the project link: https://github.com/wulczer/saio

So what w'ere saying, Grzegorz, is that we would love to see someone
pick this up and get it to the point of making it a feature as a GSOC
project.  I think if you can start from where Jan left off, you could
actually complete it.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: GSoC idea - Simulated annealing to search for query plans

From
"ktm@rice.edu"
Date:
On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote:
> Dear Hackers,
> 
> I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
> Computer Science at WUT, Poland. Last year I've been a bit into
> evolutionary algorithms and during my research I found out about GEQO in
> Postgres. I also found out that there are plans to try a different attempt
> to find optimal query plans and thought it could be a good thing to use it
> as a project for GSoC.
> 
> I'm interested in one of old TODO items related to the optimizer -
> 'Consider compressed annealing to search for query plans'. I believe this
> would be potentially beneficial to Postgres to check if such a heuristic
> could really choose better query plans than GEQO does. Judging by the
> results that simulated annealing gives on the travelling salesman problem,
> it looks like a simpler and potentially more effective way of combinatorial
> optimization.
> 
> As deliverables of such a project I would provide a simple implementation
> of basic simulated annealing optimizer and some form of quantitative
> comparison with GEQO.
> 
> I see that this may be considerably bigger than most of GSoC projects, but
> I would like to know your opinion. Do you think that this would be
> beneficial enough to make a proper GSoC project? I would also like to know
> if you have any additional ideas about this project.
> 
> Best regards,
> Grzegorz Parka

Hi,

I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.

Regards,
Ken



Re: GSoC idea - Simulated annealing to search for query plans

From
Jan Urbański
Date:
Josh Berkus writes:

> On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:
>> 
>> On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
>> <mailto:andres@2ndquadrant.com>> wrote:
>>>
>>> On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
>>> >
>>> > I seem to recall somebody demo'ing a simulated-annealing GEQO
>> replacement
>>> > at PGCon a couple years back.  It never got to the point of being a
>>> > submitted patch though.
>>>
>>> Yea, it was Jan Urbański (CCed).
>>>
>> 
>> And the project link: https://github.com/wulczer/saio
>
> So what w'ere saying, Grzegorz, is that we would love to see someone
> pick this up and get it to the point of making it a feature as a GSOC
> project.  I think if you can start from where Jan left off, you could
> actually complete it.

Sorry, late to the party.

Yes, I wrote a GEQO replacement that used simulated annealing for my Master
thesis. It got to a point where it was generating plans similar to what GEQO
outputs for small queries and better plans for very large ones.

The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub
repo contains source for the PGCon presentation, which gives a higher-level
overview.

The big problem turned out to be writing the step function that generates a new
join order from a previous one. Look for the "Simulated Annealing challenges"
and "Moves generator" chapters in my thesis, which are the only interesting
ones :)

If you'd like to pick up where I left, I'd be more than happy to help in any
ways I can.

Best,
Jan



Re: GSoC idea - Simulated annealing to search for query plans

From
Grzegorz Parka
Date:
Thank you a lot for your feedback. I searched a lot about GEQO,
but I didn't find information about any earlier attempts.
I'm happy to know that this is important for Postgres.

I'm really interested in this project, so I just need to estimate if I can handle it.
Now I will spend some time with SAIO and GEQO to find it out.

Best,
Grzegorz

2015-02-27 16:29 GMT+01:00 Jan Urbański <wulczer@wulczer.org>:

Josh Berkus writes:

> On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote:
>>
>> On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund <andres@2ndquadrant.com
>> <mailto:andres@2ndquadrant.com>> wrote:
>>>
>>> On 2015-02-26 20:23:33 -0500, Tom Lane wrote:
>>> >
>>> > I seem to recall somebody demo'ing a simulated-annealing GEQO
>> replacement
>>> > at PGCon a couple years back.  It never got to the point of being a
>>> > submitted patch though.
>>>
>>> Yea, it was Jan Urbański (CCed).
>>>
>>
>> And the project link: https://github.com/wulczer/saio
>
> So what w'ere saying, Grzegorz, is that we would love to see someone
> pick this up and get it to the point of making it a feature as a GSOC
> project.  I think if you can start from where Jan left off, you could
> actually complete it.

Sorry, late to the party.

Yes, I wrote a GEQO replacement that used simulated annealing for my Master
thesis. It got to a point where it was generating plans similar to what GEQO
outputs for small queries and better plans for very large ones.

The thesis itself is here: https://wulczer.org/saio.pdf and the linked GitHub
repo contains source for the PGCon presentation, which gives a higher-level
overview.

The big problem turned out to be writing the step function that generates a new
join order from a previous one. Look for the "Simulated Annealing challenges"
and "Moves generator" chapters in my thesis, which are the only interesting
ones :)

If you'd like to pick up where I left, I'd be more than happy to help in any
ways I can.

Best,
Jan

Re: GSoC idea - Simulated annealing to search for query plans

From
Grzegorz Parka
Date:
I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.

Does it mean just to make the join order optimizer pluggable? If yes then it is already pluggable as an extension. Is this the desired approach so far?
I'm thinking of testing and improving SAIO as an extension before reaching a satisfactory quality of code and returned plans.

Would then the destination be the /contrib and then main source tree or would it ever stay as an extension?

Best regards,
Grzegorz

2015-02-27 15:03 GMT+01:00 ktm@rice.edu <ktm@rice.edu>:
On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote:
> Dear Hackers,
>
> I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
> Computer Science at WUT, Poland. Last year I've been a bit into
> evolutionary algorithms and during my research I found out about GEQO in
> Postgres. I also found out that there are plans to try a different attempt
> to find optimal query plans and thought it could be a good thing to use it
> as a project for GSoC.
>
> I'm interested in one of old TODO items related to the optimizer -
> 'Consider compressed annealing to search for query plans'. I believe this
> would be potentially beneficial to Postgres to check if such a heuristic
> could really choose better query plans than GEQO does. Judging by the
> results that simulated annealing gives on the travelling salesman problem,
> it looks like a simpler and potentially more effective way of combinatorial
> optimization.
>
> As deliverables of such a project I would provide a simple implementation
> of basic simulated annealing optimizer and some form of quantitative
> comparison with GEQO.
>
> I see that this may be considerably bigger than most of GSoC projects, but
> I would like to know your opinion. Do you think that this would be
> beneficial enough to make a proper GSoC project? I would also like to know
> if you have any additional ideas about this project.
>
> Best regards,
> Grzegorz Parka

Hi,

I think someone already mentioned it, but it would be very neat if the
optimizer could be pluggable. Then many different algorithms could be
evaluated more easily.

Regards,
Ken

Re: GSoC idea - Simulated annealing to search for query plans

From
Tom Lane
Date:
Grzegorz Parka <grzegorz.parka@gmail.com> writes:
> I'm thinking of testing and improving SAIO as an extension before reaching
> a satisfactory quality of code and returned plans.

> Would then the destination be the /contrib and then main source tree or
> would it ever stay as an extension?

I'd like to push it into the main tree, replacing GEQO, if as we suspect
it turns out to dominate GEQO on speed/plan quality.  There's no reason
not to test it in the manner you describe though.
        regards, tom lane