Thread: SDP query optimizer

SDP query optimizer

From
Adriano Lange
Date:
Hi all,

I have developed a new query optimizer for PostgreSQL and I would like
to share it with the community. The optimizer's name is Sampling and
Dynamic Programming (SDP). I put it into a plugin developed some years
ago, named LJQO:

https://github.com/alange0001/ljqo.git

This plugin was configured to compile only against PostgreSQL 9.2.
However, I guess it may be easily adjusted for other versions of PostgreSQL.

I would be glad for any feedback about SDP or even about LJQO.

I have some numbers about the SDP in comparison with GEQO. If
interested, see a diff between the two ".out2" files attached. The
schema and query are from a previous email posted by Andres Freund in
this list.

Thanks for attention.
Adriano Lange

Attachment

Re: SDP query optimizer

From
Josh Berkus
Date:
Adriano,

> I have developed a new query optimizer for PostgreSQL and I would like
> to share it with the community. The optimizer's name is Sampling and
> Dynamic Programming (SDP). I put it into a plugin developed some years
> ago, named LJQO:

Woah!  Way cool.

As a warning, we're in the closing throes of version 9.3 right now, so
if you code/ideas doesn't get the attention it deserves, that's why.

There is an incomplete project from a few years back to make the
non-exhaustive query planner pluggable so that we could use different
algorithms.  Unfortunately, it was never finished and merged with the
core code.  Your planner is yet another reason it would be great to
complete this.

Keep up the good work!

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: SDP query optimizer

From
Ants Aasma
Date:
On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001@gmail.com> wrote:
> I have developed a new query optimizer for PostgreSQL and I would like to
> share it with the community. The optimizer's name is Sampling and Dynamic
> Programming (SDP). I put it into a plugin developed some years ago, named
> LJQO:
>
> https://github.com/alange0001/ljqo.git

Looks great. Do you happen to have any papers or insight into why SDP
works as well as it does? It isn't immediately clear to me why the
cheapest left-deep tree is a good heuristic start point for the
dynamic programming phase.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de



Re: SDP query optimizer

From
Adriano Lange
Date:
Hi,

On 22-03-2013 21:22, Josh Berkus wrote:
> Woah!  Way cool.
>
> As a warning, we're in the closing throes of version 9.3 right now, so
> if you code/ideas doesn't get the attention it deserves, that's why.

Ok. No problem. :-)

> There is an incomplete project from a few years back to make the
> non-exhaustive query planner pluggable so that we could use different
> algorithms.  Unfortunately, it was never finished and merged with the
> core code.  Your planner is yet another reason it would be great to
> complete this.

Yes. I looked at the Julius and Tomas' project in pgFoundry [1] some 
years ago, but it was inactive. Therefore, I decided to start a new one.

[1] - http://pgfoundry.org/projects/optimizer/

Anyway, good work for all of you.

--
Adriano Lange




Re: SDP query optimizer

From
Adriano Lange
Date:
On 22-03-2013 21:46, Ants Aasma wrote:
> On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001@gmail.com> wrote:
>> I have developed a new query optimizer for PostgreSQL and I would like to
>> share it with the community. The optimizer's name is Sampling and Dynamic
>> Programming (SDP). I put it into a plugin developed some years ago, named
>> LJQO:
>>
>> https://github.com/alange0001/ljqo.git
>
> Looks great. Do you happen to have any papers or insight into why SDP
> works as well as it does? It isn't immediately clear to me why the
> cheapest left-deep tree is a good heuristic start point for the
> dynamic programming phase.

Hi Ants Aasma,

I think it is difficult to say that SDP is completely better than GEQO.
There is a huge variety of situations where these kinds of optimizers
can be used.

I have made a series of experiments on SDP in a separate environment.
The main objective was to parallelize the optimizer. Thus, I called it
PSDP. I developed this environment from scratch in c++ and I used a
simplified version of the cost model used by PostgreSQL. Therefore, the
SDP is a sequential version of an experimental parallel optimizer.

For PSDP in its experimental environment, I have a comparison between
the costs obtained by the sampling phase and the costs obtained by the
complete PSDP. See the figure grafico-scaled_costs-dp_psdp attached. I
think that the dynamic programming phase can bring some robustness to
the optimizer. However, I still need to perform these same experiments
on PostgreSQL.

Considering the heuristic used in sampling phase, I have made another
series of experiments on PostgreSQL 9.0 comparing this sampling
(Sampling-L) with both a sampling on bushy trees (Sampling-B) and the
sampling used by GEQO. See as example the file "sampling...eps" also
attached. By these experiments I observed that Sampling-L was
statistically more efficient to get good join orders in a short time.


Regards,
Adriano Lange

Attachment

Re: SDP query optimizer

From
Andres Freund
Date:
On 2013-03-22 20:35:43 -0300, Adriano Lange wrote:
> Hi all,
> 
> I have developed a new query optimizer for PostgreSQL and I would like to
> share it with the community. The optimizer's name is Sampling and Dynamic
> Programming (SDP). I put it into a plugin developed some years ago, named
> LJQO:
> 
> https://github.com/alange0001/ljqo.git
> 
> This plugin was configured to compile only against PostgreSQL 9.2. However,
> I guess it may be easily adjusted for other versions of PostgreSQL.
> 
> I would be glad for any feedback about SDP or even about LJQO.
> 
> I have some numbers about the SDP in comparison with GEQO. If interested,
> see a diff between the two ".out2" files attached. The schema and query are
> from a previous email posted by Andres Freund in this list.

I just want to mention that unless you skew the statistics for the individual
tables from their empty/default state this mostly measures a pretty degenerate
case where optima are very rare and not very differentiated. Thats a useful
thing to test, but not to have as the target to optimize for.
So it might be interesting to run that thing with some table
stats/contents stats set up.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: SDP query optimizer

From
Adriano Lange
Date:
On 23-03-2013 10:15, Andres Freund wrote:
> I just want to mention that unless you skew the statistics for the individual
> tables from their empty/default state this mostly measures a pretty degenerate
> case where optima are very rare and not very differentiated. Thats a useful
> thing to test, but not to have as the target to optimize for.
> So it might be interesting to run that thing with some table
> stats/contents stats set up.


Yes, the search space obtained from this experiment may be very simpler 
than a real case. Beyond this experiment, I can construct classical 
queries used to evaluate this kind of algorithm, as stars, cliques, 
chains and cycles. Beyond these queries I have no idea how can I further 
test it.

Regards,

Adriano Lange