Thread: min/max planner optimization

min/max planner optimization

From

Gregory Stark

Date:

27 October 2007, 09:54:09

In investigating the planner changes necessary for the append node planning I
described in my other email I noticed something else I find strange.

The min/max optimization which builds an "ORDER BY ... LIMIT 1" type of plan
for min or max works by explicitly building an index path to scan a plain
relation. It has comments saying it's not possible to do this optimization for
most joins and let alone more complex things like subqueries, grouped queries,
or set operations.

I don't understand why it wouldn't work to do it for any arbitrary path for
any query at all as long as it has the correct ordering.

I would have expected the way to do this would be to detect that the min/max
optimization might be applicable early on, in which case we teach
pathkeys_useful_for_ordering about the ordering which would be necessary for
it.

Then when we're finding useful indexes we should automatically accumulate
paths to produce that order. And when we come to produce the final plan we
check if we have an already-ordered path which will produce the required
column and has a startup cost less than the total cost of the cheapest path.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

Re: min/max planner optimization

From

Tom Lane

Date:

27 October 2007, 10:46:10

Gregory Stark <stark@enterprisedb.com> writes:
> I don't understand why it wouldn't work to do it for any arbitrary path for
> any query at all as long as it has the correct ordering.

It might work, but the resulting plan would be uniformly inferior to the
regular aggregate code.  The only case where the optimization is a win
is where you have a zero-startup-cost subplan, and the only way to get
sorted output with zero startup cost is an indexscan.  Anything
involving a sort will be a loser, or at best break-even if the sort
figures out it can use top-N.

> I would have expected the way to do this would be to detect that the min/max
> optimization might be applicable early on, in which case we teach
> pathkeys_useful_for_ordering about the ordering which would be necessary for
> it.

Possibly.  I did it the way I did mostly to minimize the side-effects on
the rest of the planner.  But I think if we did it like that, we'd waste
cycles considering ordered-output paths that are really not of any value
in light of the comments above.
        regards, tom lane

Re: min/max planner optimization

From

Gregory Stark

Date:

27 October 2007, 12:00:43

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> The only case where the optimization is a win is where you have a
> zero-startup-cost subplan, and the only way to get sorted output with zero
> startup cost is an indexscan.

Sure but there could be other nodes above the index scan which preserve the
order. In particular nested loop and merge joins. Unique also preserves the
order but I can't see how it could be useful here. And of course potentially
Append nodes in the future...

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com

Re: min/max planner optimization

From

"Gokulakannan Somasundaram"

Date:

27 October 2007, 12:18:43

Hi,
I don't know whether this input would be useful. But what i could observe from the behavior of MIN/MAX is

It goes to the proper page, but starts the page scan in a opposite way. Say for example you want the min value, it goes to the first leaf page, but starts from the last tuple and comes to the top. For Max, it goes to the last page and starts scanning from top and reaches the bottom. If some extra information can be passed on to the scan, saying whether it is a min/max oper, then we can tune this part. I don't know how much we will save from this in-memory operation. But it would definitely do lesser work.

Thanks,
Gokul.

On 10/27/07, Gregory Stark <stark@enterprisedb.com> wrote:

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> The only case where the optimization is a win is where you have a
> zero-startup-cost subplan, and the only way to get sorted output with zero
> startup cost is an indexscan.

Sure but there could be other nodes above the index scan which preserve the
order. In particular nested loop and merge joins. Unique also preserves the
order but I can't see how it could be useful here. And of course potentially
Append nodes in the future...

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Groups.
(www.alliedgroups.com )