Thread: min/max planner optimization
In investigating the planner changes necessary for the append node planning I described in my other email I noticed something else I find strange. The min/max optimization which builds an "ORDER BY ... LIMIT 1" type of plan for min or max works by explicitly building an index path to scan a plain relation. It has comments saying it's not possible to do this optimization for most joins and let alone more complex things like subqueries, grouped queries, or set operations. I don't understand why it wouldn't work to do it for any arbitrary path for any query at all as long as it has the correct ordering. I would have expected the way to do this would be to detect that the min/max optimization might be applicable early on, in which case we teach pathkeys_useful_for_ordering about the ordering which would be necessary for it. Then when we're finding useful indexes we should automatically accumulate paths to produce that order. And when we come to produce the final plan we check if we have an already-ordered path which will produce the required column and has a startup cost less than the total cost of the cheapest path. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > I don't understand why it wouldn't work to do it for any arbitrary path for > any query at all as long as it has the correct ordering. It might work, but the resulting plan would be uniformly inferior to the regular aggregate code. The only case where the optimization is a win is where you have a zero-startup-cost subplan, and the only way to get sorted output with zero startup cost is an indexscan. Anything involving a sort will be a loser, or at best break-even if the sort figures out it can use top-N. > I would have expected the way to do this would be to detect that the min/max > optimization might be applicable early on, in which case we teach > pathkeys_useful_for_ordering about the ordering which would be necessary for > it. Possibly. I did it the way I did mostly to minimize the side-effects on the rest of the planner. But I think if we did it like that, we'd waste cycles considering ordered-output paths that are really not of any value in light of the comments above. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > The only case where the optimization is a win is where you have a > zero-startup-cost subplan, and the only way to get sorted output with zero > startup cost is an indexscan. Sure but there could be other nodes above the index scan which preserve the order. In particular nested loop and merge joins. Unique also preserves the order but I can't see how it could be useful here. And of course potentially Append nodes in the future... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Hi,
I don't know whether this input would be useful. But what i could observe from the behavior of MIN/MAX is
It goes to the proper page, but starts the page scan in a opposite way. Say for example you want the min value, it goes to the first leaf page, but starts from the last tuple and comes to the top. For Max, it goes to the last page and starts scanning from top and reaches the bottom. If some extra information can be passed on to the scan, saying whether it is a min/max oper, then we can tune this part. I don't know how much we will save from this in-memory operation. But it would definitely do lesser work.
Thanks,
Gokul.
--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Groups.
(www.alliedgroups.com )
I don't know whether this input would be useful. But what i could observe from the behavior of MIN/MAX is
It goes to the proper page, but starts the page scan in a opposite way. Say for example you want the min value, it goes to the first leaf page, but starts from the last tuple and comes to the top. For Max, it goes to the last page and starts scanning from top and reaches the bottom. If some extra information can be passed on to the scan, saying whether it is a min/max oper, then we can tune this part. I don't know how much we will save from this in-memory operation. But it would definitely do lesser work.
Thanks,
Gokul.
On 10/27/07, Gregory Stark <stark@enterprisedb.com> wrote:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> The only case where the optimization is a win is where you have a
> zero-startup-cost subplan, and the only way to get sorted output with zero
> startup cost is an indexscan.
Sure but there could be other nodes above the index scan which preserve the
order. In particular nested loop and merge joins. Unique also preserves the
order but I can't see how it could be useful here. And of course potentially
Append nodes in the future...
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Groups.
(www.alliedgroups.com )