Thread: left-deep plans?
Presently the planner considers left-deep, right-deep, and bushy plans (i.e. it will consider plans in which the outer operand of a join is a join, the inner operand is a join, or both operands are joins). It is a fairly standard heuristic in the literature to restrict the search to left-deep plans, on the grounds that this significantly reduces the set of plans to consider, and the more efficient plans are _usually_ found in the set of left-deep plans (since we can do pipelining more efficiently). Has there been any thought about applying this optimization? (I doubt it would be wise to unconditionally restrict the search to left-deep plans, but there may be situations in which applying this heuristic would allow the regular planner to be used instead of GEQO. Perhaps a GUC variable?) -Neil
Neil Conway <neilc@samurai.com> writes: > Presently the planner considers left-deep, right-deep, and bushy plans > (i.e. it will consider plans in which the outer operand of a join is a > join, the inner operand is a join, or both operands are joins). It is a > fairly standard heuristic in the literature to restrict the search to > left-deep plans, on the grounds that this significantly reduces the set > of plans to consider, and the more efficient plans are _usually_ found > in the set of left-deep plans (since we can do pipelining more > efficiently). Has there been any thought about applying this optimization? Yes, and it's been rejected. The notion is obviously bogus; it amounts to assuming that every database is a star schema with only one core table. The left-deep vs right-deep case is more tricky, since on its face that's redundant; but I believe we have things fixed so that we aren't considering redundant plans wholesale. (Note the elimination of match_unsorted_inner in joinpath.c.) Once we get into GEQO territory, we are using the left-deep-only heuristic because that's the only kind of plan GEQO can construct. But at that point you've already given up any notion of exhaustive search. regards, tom lane
Tom Lane wrote: > Yes, and it's been rejected. The notion is obviously bogus; it amounts > to assuming that every database is a star schema with only one core table. Interesting; yes, I suppose that's true. > Once we get into GEQO territory, we are using the left-deep-only > heuristic because that's the only kind of plan GEQO can construct. > But at that point you've already given up any notion of exhaustive > search. I think most applications would prefer an exhaustive, deterministic search of a subset of the search space over a non-exhaustive, non-deterministic search of the same subset, given approximately the same performance. In other words, if confining the search to left-deep plans allows people to use the normal planner in situations where they would normally be forced to use GEQO to get acceptable performance, I think that would be a win. Speaking of which, why does GEQO restrict its search to left-deep plans only? -Neil
Neil Conway <neilc@samurai.com> writes: > Tom Lane wrote: >> Once we get into GEQO territory, we are using the left-deep-only >> heuristic because that's the only kind of plan GEQO can construct. > I think most applications would prefer an exhaustive, deterministic > search of a subset of the search space over a non-exhaustive, > non-deterministic search of the same subset, given approximately the > same performance. I am not by any means standing up to defend GEQO as being the best way to do partial searches ;-). Just saying that in the regime where we can hope to do complete searches, we shouldn't exclude bushy plans. > Speaking of which, why does GEQO restrict its search to left-deep plans > only? Well, because it's really a traveling-salesman algorithm, and it models the "find a good join tree" problem as "find a good tour". I've commented before that I don't believe this is a particularly good model --- intuitively it doesn't seem that the cost functions have the same structure. But I've not had time to look for a better heuristic algorithm. Just one of the many things on the TODO list ... regards, tom lane
Kenneth Marshall wrote: > GEQO is an attempt to provide a near-optimal join order without using > an exhaustive search. "An exhaustive, deterministic search of a subset > of the search space" has a non-zero probability of finding only a local > minimum in execution time. I'm not sure what you mean. By an "exhaustive, deterministic search of a subset of the search space", I was referring to using the normal planner, but restricting the search space to only left-deep plans. Since GEQO will also only consider left-deep plans, ISTM there is no issue of "local minima" that does not apply to an equal degree to GEQO itself. > Since purely random selection, with learning, works so well, it may be > worth developing a new random join-order optimization algorithm Yeah, I agree. I've read a few papers on randomized algorithms for join order selection, but none of them really seemed to be clear winners. Do you have a reference for the paper you read? -Neil
On Tue, Feb 22, 2005 at 05:40:40PM +1100, Neil Conway wrote: > Tom Lane wrote: > >Yes, and it's been rejected. The notion is obviously bogus; it amounts > >to assuming that every database is a star schema with only one core table. > > Interesting; yes, I suppose that's true. > > >Once we get into GEQO territory, we are using the left-deep-only > >heuristic because that's the only kind of plan GEQO can construct. > >But at that point you've already given up any notion of exhaustive > >search. > > I think most applications would prefer an exhaustive, deterministic > search of a subset of the search space over a non-exhaustive, > non-deterministic search of the same subset, given approximately the > same performance. In other words, if confining the search to left-deep > plans allows people to use the normal planner in situations where they > would normally be forced to use GEQO to get acceptable performance, I > think that would be a win. > GEQO is an attempt to provide a near-optimal join order without using an exhaustive search. "An exhaustive, deterministic search of a subset of the search space" has a non-zero probability of finding only a local minimum in execution time. Most users really are only concerned with final execution time and if the "subset" is mis-chosen the resulting exhaustive search would completely and possibly disasterously miss the optimal execution time. In a paper on join-order optimization that I read recently, it was shown that non-uniform sampling, with the application of cost-based pruning of the search space, provided good results with quick convergence to within a delta of the optimum plan. This has the nice property that you can balence result quality with the amount of time spent optimizing the join order. The piece missing from the current GECO algorithm is to ensure that once the cost of a join plan is larger (or a piece of a plan), never visit that plan again. The memory/learning piece is what provided the ability to explore much more of the desirable (low execution time) search space. Since purely random selection, with learning, works so well, it may be worth developing a new random join-order optimization algorithm, particularly since most of the "genetic" features of the GECO we are not using. Ken
On Wed, Feb 23, 2005 at 10:02:22AM +1100, Neil Conway wrote: > Kenneth Marshall wrote: > >GEQO is an attempt to provide a near-optimal join order without using > >an exhaustive search. "An exhaustive, deterministic search of a subset > >of the search space" has a non-zero probability of finding only a local > >minimum in execution time. > > I'm not sure what you mean. By an "exhaustive, deterministic search of a > subset of the search space", I was referring to using the normal > planner, but restricting the search space to only left-deep plans. Since > GEQO will also only consider left-deep plans, ISTM there is no issue of > "local minima" that does not apply to an equal degree to GEQO itself. That is just a direct quote from your text. That is true, that this will have the same "local minima" problem as GEQO. > > >Since purely random selection, with learning, works so well, it may be > >worth developing a new random join-order optimization algorithm > > Yeah, I agree. I've read a few papers on randomized algorithms for join > order selection, but none of them really seemed to be clear winners. Do > you have a reference for the paper you read? > Here is one article that I found relevent in my literature search to find research on join-order optimization. It is appealing in that it should be fairly straightforward to implement and would allow us to consider bushy plans: http://www.cwi.nl/htbin/ins1/publications?request=pdfgz&key=WaPe:BNCOD:00 Ken