Thread: optimizer questions

optimizer questions

From
hector Corrada Bravo
Date:
Hello everyone,

I am working with the Postgres optimizer for the first time, so bear with me...

I want to extend the optimizer to deal with aggregate queries a bit
better. The idea is from an old paper by Chaudhuri and Shim in VLDB
94. The gist of it is that when computing aggregates over the result
of joining multiple tables, under some conditions the aggregate can be
pushed into the join tree to reduce the size of join operands making
resulting plans cheaper.

So here is my problem, due to the path/plan separation of the Postgres
optimizer, this is not trivial (joins are decided in path, aggregates
decided in plan). As it stands, aggregate nodes can only appear as the
top node of subqueries.

Before I start trying this (creating aggregate paths seems the
reasonable thing to do) I would like your counsel.

1) Regardless of the optimization problem, is the executor able to
execute aggregate nodes within join trees (that is, not as the result
of subqueries)?

2) Has anyone tried something like this before?

3) For debugging purposes: Has anyone figured out a way to feed
hand-crafted plans to the executor? Setting up some of the data
structures (PlannerInfo, target lists) etc. does not look trivial. By
this I mean, beyond giving explicit join clauses in queries.

4) Any other suggestions?

Thank you very much,
Hector


Re: optimizer questions

From
Martijn van Oosterhout
Date:
On Tue, Feb 14, 2006 at 10:35:12AM -0600, hector Corrada Bravo wrote:
> Hello everyone,
>
> I am working with the Postgres optimizer for the first time, so bear with me...
>
> I want to extend the optimizer to deal with aggregate queries a bit
> better. The idea is from an old paper by Chaudhuri and Shim in VLDB
> 94. The gist of it is that when computing aggregates over the result
> of joining multiple tables, under some conditions the aggregate can be
> pushed into the join tree to reduce the size of join operands making
> resulting plans cheaper.
>
> So here is my problem, due to the path/plan separation of the Postgres
> optimizer, this is not trivial (joins are decided in path, aggregates
> decided in plan). As it stands, aggregate nodes can only appear as the
> top node of subqueries.

Well, the basic approach in the beginning should probably be to change
the planner to modify the plan so you get a new plan with the
aggregation in the right place.

> Before I start trying this (creating aggregate paths seems the
> reasonable thing to do) I would like your counsel.
>
> 1) Regardless of the optimization problem, is the executor able to
> execute aggregate nodes within join trees (that is, not as the result
> of subqueries)?

An Aggregate is just a node type that can be stacked above any other
node. Ofcourse, it requires the input rows to be in the appropriate
order but other than that.

> 2) Has anyone tried something like this before?

No idea.

> 3) For debugging purposes: Has anyone figured out a way to feed
> hand-crafted plans to the executor? Setting up some of the data
> structures (PlannerInfo, target lists) etc. does not look trivial. By
> this I mean, beyond giving explicit join clauses in queries.

Nope sorry.

> 4) Any other suggestions?

Well, I honestly have no idea what kind of transformations you have in
mind so not really. However, you should probably realise that in
PostgreSQL aggregates are not special and users can make their own.
Whatever you think of needs to take that into account.

Have a ncie day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: optimizer questions

From
Tom Lane
Date:
hector Corrada Bravo <hcorrada@gmail.com> writes:
> 1) Regardless of the optimization problem, is the executor able to
> execute aggregate nodes within join trees (that is, not as the result
> of subqueries)?

Sure.

> 3) For debugging purposes: Has anyone figured out a way to feed
> hand-crafted plans to the executor? Setting up some of the data
> structures (PlannerInfo, target lists) etc. does not look trivial. By
> this I mean, beyond giving explicit join clauses in queries.

It's not really very practical --- the data structures are too complex
to create by hand in any reasonable way.  You can probably test by
entering modified queries that do the aggregations in sub-selects,
though.

The most practical way to implement something like this is probably to
restructure the query during the "prep" stage, pushing the aggregates
down into sub-queries automatically.  The 8.1 min/max optimization code
does something related, although I'll freely admit that's a bit of a hack.
        regards, tom lane


Re: optimizer questions

From
Jens-Wolfhard Schicke
Date:
--On Dienstag, Februar 14, 2006 10:35:12 -0600 hector Corrada Bravo
<hcorrada@gmail.com> wrote:

> Before I start trying this (creating aggregate paths seems the
> reasonable thing to do) I would like your counsel.
>
> 1) Regardless of the optimization problem, is the executor able to
> execute aggregate nodes within join trees (that is, not as the result
> of subqueries)?
>
> 2) Has anyone tried something like this before?
I did and failed because I did not quite understand how the Postgres
internal variables should be initialized.

My approach was to supply other join pathes if one of the two tables was an
aggregate.

Mit freundlichem Gruß
Jens Schicke
--
Jens Schicke              j.schicke@asco.de
asco GmbH              http://www.asco.de
Mittelweg 7              Tel 0531/3906-127
38106 Braunschweig          Fax 0531/3906-400