Re: SDP query optimizer - Mailing list pgsql-hackers

From Adriano Lange
Subject Re: SDP query optimizer
Date
Msg-id 514D958F.7060200@gmail.com
Whole thread Raw
In response to Re: SDP query optimizer  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
On 22-03-2013 21:46, Ants Aasma wrote:
> On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001@gmail.com> wrote:
>> I have developed a new query optimizer for PostgreSQL and I would like to
>> share it with the community. The optimizer's name is Sampling and Dynamic
>> Programming (SDP). I put it into a plugin developed some years ago, named
>> LJQO:
>>
>> https://github.com/alange0001/ljqo.git
>
> Looks great. Do you happen to have any papers or insight into why SDP
> works as well as it does? It isn't immediately clear to me why the
> cheapest left-deep tree is a good heuristic start point for the
> dynamic programming phase.

Hi Ants Aasma,

I think it is difficult to say that SDP is completely better than GEQO.
There is a huge variety of situations where these kinds of optimizers
can be used.

I have made a series of experiments on SDP in a separate environment.
The main objective was to parallelize the optimizer. Thus, I called it
PSDP. I developed this environment from scratch in c++ and I used a
simplified version of the cost model used by PostgreSQL. Therefore, the
SDP is a sequential version of an experimental parallel optimizer.

For PSDP in its experimental environment, I have a comparison between
the costs obtained by the sampling phase and the costs obtained by the
complete PSDP. See the figure grafico-scaled_costs-dp_psdp attached. I
think that the dynamic programming phase can bring some robustness to
the optimizer. However, I still need to perform these same experiments
on PostgreSQL.

Considering the heuristic used in sampling phase, I have made another
series of experiments on PostgreSQL 9.0 comparing this sampling
(Sampling-L) with both a sampling on bushy trees (Sampling-B) and the
sampling used by GEQO. See as example the file "sampling...eps" also
attached. By these experiments I observed that Sampling-L was
statistically more efficient to get good join orders in a short time.


Regards,
Adriano Lange

Attachment

pgsql-hackers by date:

Previous
From: Adriano Lange
Date:
Subject: Re: SDP query optimizer
Next
From: Ants Aasma
Date:
Subject: Re: Page replacement algorithm in buffer cache