Thread: *_collapse_limit, geqo_threshold
I think we should try to do something about join_collapse_limit, from_collapse_limit, and geqo_threshold for 8.5.

http://archives.postgresql.org/message-id/9134.1243289706@sss.pgh.pa.us
http://archives.postgresql.org/message-id/603c8f070905251800g5b86d2dav26eca7f417d15dbf@mail.gmail.com

I'm still of the opinion that join_collapse_limit is a loaded foot-gun, because I don't think that users will expect that a join specified this way:

SELECT ... FROM a JOIN b ON Pab JOIN c ON Pac JOIN d ON Pad ...

will behave differently than one specified this way:

SELECT ... FROM a, b, c, d WHERE Pab AND Pac AND Pad ...

The whole purpose of join_collapse_limit in the first instance is to prevent planning time from getting out of control, but I don't see how we can view it as a very effective safety valve when it depends so heavily on which syntax is used. If the planning time for an N-way join is excessive, then we're going to have a problem with excessive planning time whenever the second syntax is selected, and I don't see any reason to believe that users see the second syntax as "dangerous" in terms of planning time but the first syntax as "safer".

One possibility would be to remove join_collapse_limit entirely, but that would eliminate one possibly useful piece of functionality that it currently enables: namely, the ability to exactly specify the join order by setting join_collapse_limit to 1. So one possibility would be to rename the variable to something like explicit_join_order and make it a Boolean; another possibility would be to change the default value to INT_MAX. The approach I've taken in the attached patch is to make 0 mean "unlimited" and make that the default value. I don't have a strong feeling about whether that's better than the other two options, although it seems cleaner to me or I'd not have written the patch that way. We could also consider adopting this same approach for from_collapse_limit, though for some reason that behavior seems marginally less pathological to me.

At any rate, regardless of whether this patch (or one of the other approaches mentioned above) is adopted for 8.5, I think we should raise the default values for whatever is left. The defaults basically haven't been modified since they were put in, and my experience is that even queries with 10 to 15 joins perform acceptably for OLTP workloads, which are exactly the workloads where query planning time is most likely to be an issue. So I would propose raising each of the limits by 4 (to 12 for from_collapse_limit and join_collapse_limit, if we don't unlimit them entirely, and to 16 for geqo_threshold).

I'm interested in hearing from anyone who has practical experience with tuning these variables, or any ideas on what we should test to get a better idea as to how to set them.

Thanks,

...Robert
Attachment
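To make the asymmetry Robert describes concrete, here is a minimal sketch; the tables a, b, c, d and their join columns are invented for illustration, and the limit is set artificially low so the effect shows up with only four relations.

    -- Explicit-JOIN form: the number of items the planner may reorder is
    -- capped by join_collapse_limit.
    EXPLAIN
    SELECT *
    FROM a
    JOIN b ON b.a_id = a.id
    JOIN c ON c.a_id = a.id
    JOIN d ON d.a_id = a.id;

    -- Comma-separated form: semantically identical, but join_collapse_limit
    -- never applies; the planner searches the full set of join orders
    -- (or hands the problem to GEQO past geqo_threshold).
    EXPLAIN
    SELECT *
    FROM a, b, c, d
    WHERE b.a_id = a.id
      AND c.a_id = a.id
      AND d.a_id = a.id;

    -- Exaggerate the limit to see the difference: only the first form is
    -- now forced to follow (roughly) the written join order.
    SET join_collapse_limit = 2;
    EXPLAIN
    SELECT *
    FROM a
    JOIN b ON b.a_id = a.id
    JOIN c ON c.a_id = a.id
    JOIN d ON d.a_id = a.id;
    RESET join_collapse_limit;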
Robert Haas <robertmhaas@gmail.com> wrote:

> I'm interested in hearing from anyone who has practical experience
> with tuning these variables, or any ideas on what we should test to
> get a better idea as to how to set them.

I don't remember any clear resolution to the wild variations in plan time mentioned here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php

I think it would be prudent to try to figure out why small changes in the query caused the large changes in the plan times Andres was seeing. Has anyone else ever seen such behavior? Can we get examples? (It should be enough to get the statistics and the schema, since this is about planning time, not run time.)

My own experience is that when we investigate a complaint about a query not performing to user or application programmer expectations, we have sometimes found that boosting these values has helped. We boost them overall (in postgresql.conf) without ever having seen a downside. We currently have geqo disabled and set both collapse limits to 20. We should probably just set them both to several hundred and not wait until some query with more than 20 tables performs badly, but I'm not sure we have any of those yet.

In short, my experience is that when setting these higher has made any difference at all, it has always generated a plan that saved more time than the extra planning required. Well, I'd bet that there has been an increase in the plan time of some queries which wound up with the same plan anyway, but the difference has never been noticeable; the net effect has been a plus for us.

I guess the question is whether there is anyone who has had a contrary experience. (There must have been some benchmarks to justify adding geqo at some point?)

-Kevin
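For reference, the session-level equivalent of the settings Kevin describes (he applies them cluster-wide in postgresql.conf) would be:

    SET geqo = off;
    SET from_collapse_limit = 20;
    SET join_collapse_limit = 20;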
On Jul 7, 2009, at 9:31 AM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

> Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I'm interested in hearing from anyone who has practical experience
>> with tuning these variables, or any ideas on what we should test to
>> get a better idea as to how to set them.
>
> I don't remember any clear resolution to the wild variations in plan
> time mentioned here:
>
> http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
>
> I think it would be prudent to try to figure out why small changes in
> the query caused the large changes in the plan times Andres was
> seeing. Has anyone else ever seen such behavior? Can we get
> examples? (It should be enough to get the statistics and the schema,
> since this is about planning time, not run time.)

Well, there's not really enough information there to figure out specifically what was happening, but from 10,000 feet, join_collapse_limit and from_collapse_limit constrain the join order. If the estimates are all accurate, setting them to a value < infinity will either leave the plans unchanged or make them worse. If it's making them better, then the estimates are off and the join order constraint happens to be preventing the planner from considering the case that really hurts you. But that's mostly luck.

> My own experience is that when we investigate a complaint about a
> query not performing to user or application programmer expectations,
> we have sometimes found that boosting these values has helped. We
> boost them overall (in postgresql.conf) without ever having seen a
> downside. We currently have geqo disabled and set both collapse
> limits to 20. We should probably just set them both to several
> hundred and not wait until some query with more than 20 tables
> performs badly, but I'm not sure we have any of those yet.
>
> In short, my experience is that when setting these higher has made any
> difference at all, it has always generated a plan that saved more time
> than the extra planning required. Well, I'd bet that there has been
> an increase in the plan time of some queries which wound up with the
> same plan anyway, but the difference has never been noticeable; the net
> effect has been a plus for us.

You have a big dataset AIUI, so the right values for you might be too high for some people with, say, OLTP workloads.

> I guess the question is whether there is anyone who has had a contrary
> experience. (There must have been some benchmarks to justify adding
> geqo at some point?)

GEQO or something like it is certainly needed for very large planning problems. The non-GEQO planner takes exponential time in the size of the problem, so at some point that's going to get ugly. But triggering it at the level we do now seems unnecessarily pessimistic about what constitutes too much planning.

...Robert
Hi Kevin, Hi all,

On Tuesday 07 July 2009 16:31:14 Kevin Grittner wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
> > I'm interested in hearing from anyone who has practical experience
> > with tuning these variables, or any ideas on what we should test to
> > get a better idea as to how to set them.
>
> I don't remember any clear resolution to the wild variations in plan
> time mentioned here:
>
> http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
>
> I think it would be prudent to try to figure out why small changes in
> the query caused the large changes in the plan times Andres was
> seeing. Has anyone else ever seen such behavior? Can we get
> examples? (It should be enough to get the statistics and the schema,
> since this is about planning time, not run time.)

I don't think it is surprising that small changes to those variables change the plan time widely on a complex query. For example, an increase by one in from_collapse_limit can completely change the shape of the query before optimization, due to more inlining. I don't know the exact behaviour in the case where more joins exist than join_collapse_limit allows, but it is not hard to imagine that this can also dramatically change the plan complexity. As there were quite a few different views involved, all the changes to the *_limit variables could have triggered plan changes in different parts of the query.

I plan to revisit the issue you referenced, by the way. It's just that first there was the release phase, and then I could not motivate myself to investigate a bit more...

The mail you referenced contains a completely bogus and ugly query that shows similar symptoms, by the way. I guess the variations would be even bigger if differently sized views/subqueries were used.

> My own experience is that when we investigate a complaint about a
> query not performing to user or application programmer expectations,
> we have sometimes found that boosting these values has helped. We
> boost them overall (in postgresql.conf) without ever having seen a
> downside. We currently have geqo disabled and set both collapse
> limits to 20. We should probably just set them both to several
> hundred and not wait until some query with more than 20 tables
> performs badly, but I'm not sure we have any of those yet.

I have not found consistently better results with geqo enabled. Some queries are better, others worse; often the comparison is not reliably reproducible. (The possibility to set geqo to some "known" starting value would be nice for such comparisons.)

I cannot reasonably plan some queries with join_collapse_limit set to 20 -- at least not without setting the geqo limit very low and geqo_effort to a low value. So I would definitely not agree that removing j_c_l is a good idea.

Andres
Andres Freund <andres@anarazel.de> writes: > I cannot reasonably plan some queries with join_collapse_limit set to 20. At > least not without setting the geqo limit very low and a geqo_effort to a low > value. > So I would definitely not agree that removing j_c_l is a good idea. Can you show some specific examples? All of this discussion seems like speculation in a vacuum ... regards, tom lane
On Tuesday 07 July 2009 17:40:50 Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I cannot reasonably plan some queries with join_collapse_limit set to 20.
> > At least not without setting the geqo limit very low and a geqo_effort to
> > a low value.
> > So I would definitely not agree that removing j_c_l is a good idea.
> Can you show some specific examples? All of this discussion seems like
> speculation in a vacuum ...

I still may not publish the original schema (and I still have not heard any reasonable reasons for that), but the crazy query in the referenced email shows similar problems and has a somewhat similar structure. If that is not enough I will try to design a schema that is similar enough to the original yet different enough to publish. That will take a day or two though.

Andres
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > I guess the question is whether there is anyone who has had a contrary > experience. (There must have been some benchmarks to justify adding > geqo at some point?) The CVS history shows that geqo was integrated on 1997-02-19, which I think means that it must have been developed against Postgres95 (or even earlier Berkeley releases?). That was certainly before any of the current community's work on the optimizer began. A quick look at the code as it stood on that date suggests that the regular optimizer's behavior for large numbers of rels was a lot worse than it is today --- notably, it looks like it would consider a whole lot more Cartesian-product joins than we do now; especially if you had "bushy" mode turned on, which you'd probably have to do to find good plans in complicated cases. There were also a bunch of enormous inefficiencies that we've whittled down over time, such as the mechanisms for comparing pathkeys or the use of integer Lists to represent relid sets. So while I don't doubt that geqo was absolutely essential when it was written, it's fair to question whether it still provides a real win. And we could definitely stand to take another look at the default thresholds. regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes: > One possibility would be to remove join_collapse_limit entirely, but > that would eliminate one possibily-useful piece of functionality that > it current enables: namely, the ability to exactly specify the join > order by setting join_collapse_limit to 1. So one possibility would > be to rename the variable something like explicit_join_order and make > it a Boolean; another possibility would be to change the default value > to INT_MAX. As the person who put in those thresholds, I kind of prefer going over to the boolean definition. That was the alternative that we considered; the numeric thresholds were used instead because they were easy to implement and seemed to possibly offer more control. But I'm not convinced that anyone has really used them profitably. I agree that the ability to use JOIN syntax to specify the join order exactly (with join_collapse_limit=1) is the only really solid use-case anyone has proposed for either threshold. I'm interested in Andreas' comment that he has use-cases where using the collapse_limit is better than allowing geqo to take over for very large problems ... but I think we need to see those use-cases and see if there's a better fix. regards, tom lane
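The one use case Tom concedes -- pinning the join order exactly -- looks like this in practice; a rough sketch with invented table names:

    -- With join_collapse_limit = 1 the planner keeps the textual JOIN
    -- order: t1 is joined to t2 first, and that result to t3.
    BEGIN;
    SET LOCAL join_collapse_limit = 1;
    EXPLAIN
    SELECT *
    FROM t1
    JOIN t2 ON t2.t1_id = t1.id
    JOIN t3 ON t3.t2_id = t2.id;
    COMMIT;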
On Tue, Jul 7, 2009 at 5:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So while I don't doubt that geqo was absolutely essential when it was
> written, it's fair to question whether it still provides a real win.
> And we could definitely stand to take another look at the default
> thresholds

The whole point of these parameters is to save time planning large complex queries -- which are rarely going to be the kind of short, simple, fast-to-execute OLTP queries where planning time makes a big difference. The larger and more complex the query, the more likely it is to be a long-running DSS or OLAP style query where shaving one percent off the runtime would be worth spending many seconds planning.

I propose that there's a maximum reasonable planning time within which a programmer would normally expect the database to be able to come up with a plan for virtually any query. Personally I would be surprised if a plain EXPLAIN took more than, say, 30s -- perhaps even something more like 10s. We should benchmark the planner on increasingly large sets of relations on a typical developer machine and set geqo to whatever value the planner can handle in that length of time. I suspect even at 10s you're talking about substantially larger values than the current default.

--
greg
http://mit.edu/~gsstark/resume.pdf
Greg Stark <gsstark@mit.edu> writes: > We should benchmark the planner on increasingly large sets of > relations on a typical developer machine and set geqo to whatever > value the planner can handle in that length of time. I suspect even at > 10s you're talking about substantially larger values than the current > default. The problem is to find some realistic "benchmark" cases. That's one reason why I was pestering Andreas to see his actual use cases ... regards, tom lane
On Tuesday 07 July 2009 19:45:44 Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > We should benchmark the planner on increasingly large sets of
> > relations on a typical developer machine and set geqo to whatever
> > value the planner can handle in that length of time. I suspect even at
> > 10s you're talking about substantially larger values than the current
> > default.
> The problem is to find some realistic "benchmark" cases. That's one
> reason why I was pestering Andreas to see his actual use cases ...

I will start writing a reduced/altered schema tomorrow then...

Andres

PS: It's "Andres" btw ;-)
On Jul 7, 2009, at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> One possibility would be to remove join_collapse_limit entirely, but >> that would eliminate one possibily-useful piece of functionality that >> it current enables: namely, the ability to exactly specify the join >> order by setting join_collapse_limit to 1. So one possibility would >> be to rename the variable something like explicit_join_order and make >> it a Boolean; another possibility would be to change the default >> value >> to INT_MAX. > > As the person who put in those thresholds, I kind of prefer going over > to the boolean definition. I'm OK with that, but out of conservatism suggested changing the default to unlimited in this release. If by chance there is something we're missing and these parameters are doing someone any good, we can suggest that they set them back to the old values rather than telling them to use a private build. If on the other hand we don't get any complaints, we can remove them with greater confidence in a future release. But maybe that's too conservative. Now, here's another thought: if we think it's reasonable for people to want to explicitly specify the join order, a GUC isn't really the best fit, because it's all or nothing. Maybe we'd be better off dropping the GUCs entirely and adding some other bit of syntax that forces the join order, but only for that particular join. > That was the alternative that we considered; > the numeric thresholds were used instead because they were easy to > implement and seemed to possibly offer more control. But I'm not > convinced that anyone has really used them profitably. I agree that > the ability to use JOIN syntax to specify the join order exactly (with > join_collapse_limit=1) is the only really solid use-case anyone has > proposed for either threshold. I'm interested in Andreas' comment > that > he has use-cases where using the collapse_limit is better than > allowing > geqo to take over for very large problems ... but I think we need to > see > those use-cases and see if there's a better fix. > > regards, tom lane Agreed. ...Robert
On 7 Jul 2009, at 19:37, Greg Stark wrote:
> I propose that there's a maximum reasonable planning time

It sounds so much like the planner_effort GUC that has been talked about in the past...

http://archives.postgresql.org/pgsql-performance/2009-05/msg00137.php

...except this time you want to measure it in seconds. The problem with measuring it in seconds is that when the time has elapsed, it's not easy to switch from the classic planner to geqo without starting over from scratch. Would it be possible to start geqo from the current planner state?

Another idea would be to have more complex metrics for deciding when to run geqo, that is, guesstimate the query planning difficulty very quickly, based on more than just the number of relations in the FROM: presence of subqueries, UNION, EXISTS, IN, or branches in the WHERE clause, number of operators and index support for them, maybe some information from the stats too...

The idea would be to:
- set an effort threshold above which we'd better run geqo (GUC, disabling possible)
- if the threshold is enabled, compute the metric
- if metric >= threshold, use geqo; if not, the classic planner
- maybe default to disabling the threshold

It seems it'd be easier to set the new GUC on a per-query basis... The obvious problem with this approach is that computing the metric will take some time better spent planning queries, but maybe we could have a fast path for easy queries, which will look a lot like $subject.

Regards,
--
dim

I hope this will give readers better ideas than its bare content...
On 7 Jul 2009, at 21:16, Robert Haas wrote:
> Now, here's another thought: if we think it's reasonable for people
> to want to explicitly specify the join order, a GUC isn't really the
> best fit, because it's all or nothing. Maybe we'd be better off
> dropping the GUCs entirely and adding some other bit of syntax that
> forces the join order, but only for that particular join.

MySQL calls them Straight Joins:

http://www.mysqlperformanceblog.com/2006/12/28/mysql-session-variables-and-hints/

I'm not sure our best move here would be in this direction :)
--
dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > Another idea would be to have more complex metrics for deciding when > to run geqo, that is guesstimate the query planning difficulty very > quickly, based on more than just the number of relations in the from: > presence of subqueries, UNION, EXISTS, IN, or branches in where > clause, number of operators and index support for them, maybe some > information from the stats too... Pointless, since GEQO is only concerned with examining alternative join orderings. I see no reason whatever to think that number-of-relations isn't the correct variable to test to decide whether to use it. regards, tom lane
Robert Haas <robertmhaas@gmail.com> wrote: > if we think it's reasonable for people to want to explicitly specify > the join order Regardless of the syntax (GUC or otherwise), that is an optimizer hint. I thought we were trying to avoid those. Although -- we do have all those enable_* GUC values which are also optimizer hints; perhaps this should be another of those? enable_join_reorder? -Kevin
On 7 Jul 2009, at 21:45, Tom Lane wrote:
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> Another idea would be to have more complex metrics for deciding when
>> to run geqo
>
> Pointless, since GEQO is only concerned with examining alternative join
> orderings. I see no reason whatever to think that number-of-relations
> isn't the correct variable to test to decide whether to use it.

Oh. It seems I prefer showing my ignorance rather than learning enough first. Writing mails is so much easier...

Sorry for the noise,
--
dim
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Although -- we do have all those enable_* GUC values which are also > optimizer hints; perhaps this should be another of those? > enable_join_reorder? Not a bad suggestion, especially since turning it off would usually be considered just about as bad an idea as turning off the other ones. regards, tom lane
On Jul 7, 2009, at 3:03 PM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov > wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> if we think it's reasonable for people to want to explicitly specify >> the join order > > Regardless of the syntax (GUC or otherwise), that is an optimizer > hint. I thought we were trying to avoid those. I guess my point is that there's not a lot of obvious benefit in allowing the functionality to exist but handicapping it so that it's useful in as few cases as possible. If the consensus is that we want half a feature (but not more or less than half), that's OK with me, but it's not obvious to me why we should choose to want that. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > I guess my point is that there's not a lot of obvious benefit in > allowing the functionality to exist but handicapping it so that it's > useful in as few cases as possible. If the consensus is that we want > half a feature (but not more or less than half), that's OK with me, > but it's not obvious to me why we should choose to want that. Well, the question to my mind is whether the collapse_threshold GUCs in their current form actually represent a feature ;-). They were put in pretty much entirely on speculation that someone might find them useful. Your argument is that they are not only useless but a foot-gun, and so far we haven't got any clear contrary evidence. If we accept that argument then we should take them out, not just change the default. My own thought is that from_collapse_limit has more justification, since it basically acts to stop a subquery from being flattened when that would make the parent query too complex, and that seems like a more understandable and justifiable behavior than treating JOIN syntax specially. But I'm fine with removing join_collapse_limit or reducing it to a boolean. regards, tom lane
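A sketch of what from_collapse_limit actually gates, with an invented schema: the sub-SELECT below is simple enough to be flattened (pulled up) into the parent, and the GUC decides whether pulling it up -- which would grow the parent's FROM list from three items to four -- is allowed.

    SET from_collapse_limit = 2;   -- artificially low, to make the effect visible
    EXPLAIN
    SELECT *
    FROM orders o,
         customers c,
         (SELECT li.order_id, p.name
            FROM line_items li
            JOIN products p ON p.id = li.product_id) AS sub
    WHERE o.customer_id = c.id
      AND sub.order_id = o.id;
    RESET from_collapse_limit;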
Robert Haas <robertmhaas@gmail.com> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov wrote: >> Robert Haas <robertmhaas@gmail.com> wrote: >> >>> if we think it's reasonable for people to want to explicitly >>> specify the join order >> >> Regardless of the syntax (GUC or otherwise), that is an optimizer >> hint. I thought we were trying to avoid those. > > I guess my point is that there's not a lot of obvious benefit in > allowing the functionality to exist but handicapping it so that it's > useful in as few cases as possible. If the consensus is that we > want half a feature (but not more or less than half), that's OK with > me, but it's not obvious to me why we should choose to want that. Ever since I've been following these lists, there have been a pretty steady dribble of requests for optimizer hints of one type or another. The standard response has always been, "If you have a query which is optimizing poorly, show us, and we'll try to fix the optimizer so that it doesn't need hints to do a good job." The enable_* GUCs effectively *are* optimizer hints, but they are strongly discouraged for anything but diagnostic purposes. I guess I don't see any reason to consider this issue as being different. If implementing them this way seems to handicap them, I guess that's probably to discourage their use, which seems reasonable to me; I bet people would shoot themselves in the foot much more often than they would help themselves with such options, we'd lose information which might help to improve the optimizer, and the list would probably get a pretty steady stream of "slow queries" which would turn out to be (after digging deep enough) caused by people using hints that caused a sub-optimal plan to be chosen. -Kevin
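The existing enable_* switches Kevin refers to are used in exactly that diagnostic spirit: flipped briefly in one session to see what plan the planner would otherwise pick, then discarded. A minimal sketch (query and tables are stand-ins):

    BEGIN;
    SET LOCAL enable_nestloop = off;   -- temporarily forbid nested-loop joins
    EXPLAIN
    SELECT *
    FROM orders o
    JOIN customers c ON c.id = o.customer_id;
    ROLLBACK;   -- the SET LOCAL disappears with the transaction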
Tom Lane wrote:
> My own thought is that from_collapse_limit has more justification,
> since it basically acts to stop a subquery from being flattened when
> that would make the parent query too complex, and that seems like a
> more understandable and justifiable behavior than treating JOIN
> syntax specially.

Isn't that what we use OFFSET 0 for? That one has also the nice property that you can actually specify which subquery you want to prevent from being flattened.

Personally I have never seen a case where the collapse_limits were useful tools.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> My own thought is that from_collapse_limit has more justification,
>> since it basically acts to stop a subquery from being flattened when
>> that would make the parent query too complex, and that seems like a
>> more understandable and justifiable behavior than treating JOIN
>> syntax specially.

> Isn't that what we use OFFSET 0 for? That one has also the nice
> property that you can actually specify which subquery you want to
> prevent from being flattened.

Well, if you want to modify your queries to prevent long planning times, that'd be one way to do it. It doesn't seem like a generally useful answer to me though. For example, typically the subquery would actually be a view that might be used in various contexts. If you stick an OFFSET in it then you disable flattening in all those contexts, likely not the best answer.

> Personally I have never seen a case where the collapse_limits were
> useful tools.

I'm not convinced they're useful either.

regards, tom lane
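For completeness, the OFFSET 0 trick Alvaro mentions, sketched with the same invented schema: a LIMIT/OFFSET clause makes a subquery ineligible for pull-up, so it is planned separately regardless of the collapse limits -- with the caveat Tom raises that a fence baked into a view applies everywhere the view is used.

    EXPLAIN
    SELECT *
    FROM orders o
    JOIN (SELECT li.order_id, p.name
            FROM line_items li
            JOIN products p ON p.id = li.product_id
          OFFSET 0) AS sub          -- optimization fence: blocks flattening
      ON sub.order_id = o.id;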
On Jul 7, 2009, at 4:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I guess my point is that there's not a lot of obvious benefit in >> allowing the functionality to exist but handicapping it so that it's >> useful in as few cases as possible. If the consensus is that we want >> half a feature (but not more or less than half), that's OK with me, >> but it's not obvious to me why we should choose to want that. > > Well, the question to my mind is whether the collapse_threshold GUCs > in > their current form actually represent a feature ;-). They were put > in pretty much entirely on speculation that someone might find them > useful. Your argument is that they are not only useless but a foot- > gun, > and so far we haven't got any clear contrary evidence. If we accept > that argument then we should take them out, not just change the > default. > > My own thought is that from_collapse_limit has more justification, > since it basically acts to stop a subquery from being flattened when > that would make the parent query too complex, and that seems like a > more understandable and justifiable behavior than treating JOIN > syntax specially. But I'm fine with removing join_collapse_limit > or reducing it to a boolean. That's pretty much where I am with it, too. The feature I was referring to was not the collapse limits, but the ability to explicitly specify the join order, which perhaps could be a useful tool for reducing planning time or coping with bad estimates if you could do it for only some of the joins in the query, but which we're instead proposing to keep as an all-or-nothing flag. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Jul 7, 2009, at 4:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> My own thought is that from_collapse_limit has more justification, > That's pretty much where I am with it, too. The feature I was > referring to was not the collapse limits, but the ability to > explicitly specify the join order, which perhaps could be a useful > tool for reducing planning time or coping with bad estimates if you > could do it for only some of the joins in the query, but which we're > instead proposing to keep as an all-or-nothing flag. It's pretty much all-or-nothing now: the GUC does not give you any sort of useful control over *which* joins are reorderable. regards, tom lane
On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Jul 7, 2009, at 4:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> My own thought is that from_collapse_limit has more justification,
>
>> That's pretty much where I am with it, too. The feature I was
>> referring to was not the collapse limits, but the ability to
>> explicitly specify the join order, which perhaps could be a useful
>> tool for reducing planning time or coping with bad estimates if you
>> could do it for only some of the joins in the query, but which we're
>> instead proposing to keep as an all-or-nothing flag.
>
> It's pretty much all-or-nothing now: the GUC does not give you any sort
> of useful control over *which* joins are reorderable.

Yes. So the way I see it, the options are:

1. We can remove join_collapse_limit completely and provide no substitute. In this case, the ability to explicitly specify the join order will be gone.

2. We can remove join_collapse_limit but provide a different, Boolean GUC instead, like enable_join_reordering. In this case, we're not actually reducing the number of GUCs, just the size of the foot-gun.

3. We can remove join_collapse_limit and provide an alternative way to explicitly specify the join order that is more flexible. This both reduces the number of GUCs and arguably provides some useful functionality that we don't have now.

It sounds like your vote is for #2, which, as I say, seems like a feature with one arm tied behind its back, but hey, what do I know? Accepting that as the consensus in the absence of contrary votes, we still need to decide what to do about from_collapse_limit and geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or geqo_threshold, because the basic algorithm is clearly exponential time and eventually you have to start worrying about that, but we could raise the value. What to do about from_collapse_limit is less clear to me.

...Robert
Tom Lane wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> I guess the question is whether there is anyone who has had a contrary >> experience. (There must have been some benchmarks to justify adding >> geqo at some point?) > > The CVS history shows that geqo was integrated on 1997-02-19, which > I think means that it must have been developed against Postgres95 > So while I don't doubt that geqo was absolutely essential when it was > written, it's fair to question whether it still provides a real win. > And we could definitely stand to take another look at the default > thresholds. Well there is a TODO item about implementing an alternative to GEQO (which is being treated more and more as the underdog of the project): http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us Would people be interested in someone working on that item? Cheers, Jan
On Tue, Jul 07, 2009 at 09:31:14AM -0500, Kevin Grittner wrote:
> I don't remember any clear resolution to the wild variations in plan
> time mentioned here:
>
> http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
>
> I think it would be prudent to try to figure out why small changes in
> the query caused the large changes in the plan times Andres was
> seeing. Has anyone else ever seen such behavior? Can we get
> examples? (It should be enough to get the statistics and the schema,
> since this is about planning time, not run time.)

With joins between statistically indistinguishable columns, I see planning times change by a factor of ~4 for each join added or removed (postgres 8.3). Varying join_collapse_limit in the neighborhood of the actual number of joins has a similar effect. See attachment with annotated timings. The example uses a single table joined to itself, but using distinct tables with identical contents yields the same figures.

The exponential factor seems smaller for real queries. I have a query of sixteen joins that takes 71s to plan deterministically; it looks like this:

SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4

For the real query, removing one join drops plan time to 26s, and removing two drops the time to 11s. I don't have a good theory for the multiplier changing from 4 for the trivial demonstration to ~2.5 for this real query. Re-enabling geqo drops plan time to .5s. These tests used default_statistics_target = 1000, but dropping that to 100 does not change anything dramatically.

> I guess the question is whether there is anyone who has had a contrary
> experience. (There must have been some benchmarks to justify adding
> geqo at some point?)

I have queries with a few more joins (19-21), and I cancelled attempts to plan them deterministically after 600+ seconds and 10+ GiB of memory usage. Even with geqo_effort = 10, they plan within 5-15s with good results.

All that being said, I've never encountered a situation where a value other than 1 or <inf> for *_collapse_limit appeared optimal.

nm
Attachment
Hi,

When I was first familiarizing myself with PostgreSQL, I took a walk through its documentation on GEQO and similar processes in the literature. One big advantage of GEQO is that you can trade off planning time for plan optimization. I do agree that it should be updated, but there were definite cases in the literature where the planning time for exhaustive searches could take orders of magnitude more time than the differences in the execution times of the differing plans.

My two cents,
Ken

On Wed, Jul 08, 2009 at 09:43:12AM -0400, Noah Misch wrote:
> On Tue, Jul 07, 2009 at 09:31:14AM -0500, Kevin Grittner wrote:
> > I don't remember any clear resolution to the wild variations in plan
> > time mentioned here:
> >
> > http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
> >
> > I think it would be prudent to try to figure out why small changes in
> > the query caused the large changes in the plan times Andres was
> > seeing. Has anyone else ever seen such behavior? Can we get
> > examples? (It should be enough to get the statistics and the schema,
> > since this is about planning time, not run time.)
>
> With joins between statistically indistinguishable columns, I see planning times
> change by a factor of ~4 for each join added or removed (postgres 8.3). Varying
> join_collapse_limit in the neighborhood of the actual number of joins has a
> similar effect. See attachment with annotated timings. The example uses a
> single table joined to itself, but using distinct tables with identical contents
> yields the same figures.
>
> The exponential factor seems smaller for real queries. I have a query of
> sixteen joins that takes 71s to plan deterministically; it looks like this:
>
> SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
> JOIN t t0 ON fact.key = t.key AND t.x = MCV0
> LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
> JOIN t t2 ON fact.key = t.key AND t.x = MCV2
> LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
> LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
> LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
> LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
> LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
>
> For the real query, removing one join drops plan time to 26s, and removing two
> drops the time to 11s. I don't have a good theory for the multiplier changing
> from 4 for the trivial demonstration to ~2.5 for this real query. Re-enabling
> geqo drops plan time to .5s. These tests used default_statistics_target = 1000,
> but dropping that to 100 does not change anything dramatically.
>
> > I guess the question is whether there is anyone who has had a contrary
> > experience. (There must have been some benchmarks to justify adding
> > geqo at some point?)
>
> I have queries with a few more joins (19-21), and I cancelled attempts to plan
> them deterministically after 600+ seconds and 10+ GiB of memory usage. Even
> with geqo_effort = 10, they plan within 5-15s with good results.
>
> All that being said, I've never encountered a situation where a value other than
> 1 or <inf> for *_collapse_limit appeared optimal.
>
> nm
>
> SET geqo = off;
> SET join_collapse_limit = 100;
> CREATE TEMP TABLE t AS SELECT * FROM generate_series(1, 1000) f(n); ANALYZE t;
>
> --- Vary join count
> -- 242.4s
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11 NATURAL JOIN t t12;
> -- 31.2s
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11;
> -- 8.1s
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
> -- 2.0s
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09;
> -- 0.5s
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08;
>
> --- Vary join_collapse_limit
> -- 8.1s
> SET join_collapse_limit = 100;
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
> -- 8.0s
> SET join_collapse_limit = 11;
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
> -- 2.2s
> SET join_collapse_limit = 10;
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
> -- 0.5s
> SET join_collapse_limit = 9;
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
> -- 0.1s
> SET join_collapse_limit = 8;
> EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's pretty much all-or-nothing now: the GUC does not give you any sort
>> of useful control over *which* joins are reorderable.

> Yes. So the way I see it, the options are:
> 1. We can remove join_collapse_limit completely and provide no
> substitute. In this case, the ability to explicitly specify the join
> order will be gone.
> 2. We can remove join_collapse_limit but provide a different, Boolean
> GUC instead, like enable_join_reordering. In this case, we're not
> actually reducing the number of GUCs, just the size of the foot-gun.
> 3. We can remove join_collapse_limit and provide an alternative way to
> explicitly specify the join order that is more flexible. This both
> reduces the number of GUCs and arguably provides some useful
> functionality that we don't have now.
> It sounds like your vote is for #2, which, as I say, seems like a
> feature with one arm tied behind its back, but hey, what do I know?

Well, the reason I'm not voting for #3 is that it looks like a lot of work to implement something that would basically be a planner hint, which I'm generally against; furthermore, it's a hint that there's been no demand for. (We're not even certain that anyone is using the ability to *fully* specify the join order, much less wanting some undetermined compromise between manual and automatic control.) And anyway I didn't hear anyone volunteering to do it. So the realistic alternatives are #1, #2, or "do nothing"; and out of those I like #2.

> Accepting that as the consensus in the absence of contrary votes, we
> still need to decide what to do about from_collapse_limit and
> geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or
> geqo_threshold, because the basic algorithm is clearly exponential
> time and eventually you have to start worrying about that, but we
> could raise the value. What to do about from_collapse_limit is
> less clear to me.

I do not think there is a good argument for eliminating geqo_threshold. There might well be an argument for cranking up its default value; but that would take some hard data, which seems lacking at the moment.

I'm on the fence about from_collapse_limit. The argument for having it seems to be that there might be cases where not folding a subquery is preferable to folding it and then taking your chances with GEQO. But I'm not really convinced there are any.

It occurs to me that one way to make GEQO less scary would be to take out the nondeterminism by resetting its random number generator for each query. You might get a good plan or an awful one, but at least it'd be the same one each time. DBAs like predictability.

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > It occurs to me that one way to make GEQO less scary would be to > take out the nondeterminism by resetting its random number generator > for each query. You might get a good plan or an awful one, but at > least it'd be the same one each time. DBAs like predictability. +1 The biggest reason that I've tended to avoid geqo is that I would never know when it might do something really stupid with a query one time out of some large number, leading to mysterious complaints which could eat a lot of time. For a moment it seemed logical to suggest a session GUC for the seed, so if you got a bad plan you could keep rolling the dice until you got one you liked; but my right-brain kept sending shivers down my spine to suggest just how uncomfortable it was with that idea.... -Kevin
On Wed, Jul 08, 2009 at 04:13:11PM -0500, Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > It occurs to me that one way to make GEQO less scary would be to
> > take out the nondeterminism by resetting its random number generator
> > for each query. You might get a good plan or an awful one, but at
> > least it'd be the same one each time. DBAs like predictability.
>
> +1 The biggest reason that I've tended to avoid geqo is that I would
> never know when it might do something really stupid with a query one
> time out of some large number, leading to mysterious complaints which
> could eat a lot of time.
>
> For a moment it seemed logical to suggest a session GUC for the seed,
> so if you got a bad plan you could keep rolling the dice until you got
> one you liked; but my right-brain kept sending shivers down my spine
> to suggest just how uncomfortable it was with that idea....
>
> -Kevin

+1

I like the idea of a session GUC for the random number seed. If we can come up with a way to prune the search space more aggressively, GEQO (or a GEQO 2) will be much less prone to generating a bad plan. I also think that a session variable would make it easier to test GEQO by removing the nondeterminism.

Ken
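A sketch of the workflow Kevin and Ken are gesturing at, written against a hypothetical session-level seed GUC (called geqo_seed here purely for illustration; no such knob exists in the release under discussion) and a hypothetical many-way-join view:

    SET geqo = on;
    SET geqo_seed = 0.0;     -- fixed seed: the same plan every time
    EXPLAIN SELECT count(*) FROM big_join_view;

    SET geqo_seed = 0.37;    -- don't like the plan?  roll the dice again
    EXPLAIN SELECT count(*) FROM big_join_view;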
Noah Misch <noah@leadboat.com> writes: > With joins between statistically indistinguishable columns, I see planning times > change by a factor of ~4 for each join added or removed (postgres 8.3). Varying > join_collapse_limit in the neighborhood of the actual number of joins has a > similar effect. See attachment with annotated timings. The example uses a > single table joined to itself, but using distinct tables with identical contents > yields the same figures. Hey, some hard data! Thanks for doing that. > The expontential factor seems smaller for real queries. I have a query of > sixteen joins that takes 71s to plan deterministically; it looks like this: > SELECT 1 FROM fact JOIN dim0 ... JOIN dim6 > JOIN t t0 ON fact.key = t.key AND t.x = MCV0 > LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1 > JOIN t t2 ON fact.key = t.key AND t.x = MCV2 > LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0 > LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1 > LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2 > LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3 > LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4 I'm confused here --- I think you must have over-anonymized your query. Surely the ON conditions for the left joins should be referencing t3, t4, etc? > For the real query, removing one join drops plan time to 26s, and > removing two drops the time to 11s. I don't have a good theory for > the multiplier changing from 4 for the trivial demonstration to ~2.5 > for this real query. The rule of thumb that says that an n-way join requires 2^n work is only true if we consider every single combination of possible joins, which normally we don't. The planner prefers join paths that follow join clauses, and will only consider clauseless joins when it has no other choice. I believe that real queries tend to be pretty non-flat in this space and so the number of join paths to consider is a lot less than 2^n. Your synthesized query, on the other hand, allows any relation to be joined to any other --- it might not look that way, but after creating derived equalities there will be a potential join clause linking every relation to every other one. So I think you were testing the worst case, and I'm not surprised that more-typical queries would show a slower growth curve. This is why I don't trust benchmarking the planner on artificial queries. Still, I appreciate your work, because it gives us at least some hard data to go on. regards, tom lane
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > For a moment it seemed logical to suggest a session GUC for the seed, > so if you got a bad plan you could keep rolling the dice until you got > one you liked; but my right-brain kept sending shivers down my spine > to suggest just how uncomfortable it was with that idea.... If memory serves, we actually had exactly that at some point. But I think the reason it got taken out was that it interfered with the behavior of the random() function for everything else. We'd have to give GEQO its own private random number generator. regards, tom lane
On Wed, Jul 08, 2009 at 05:46:02PM -0400, Tom Lane wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> > For a moment it seemed logical to suggest a session GUC for the seed,
> > so if you got a bad plan you could keep rolling the dice until you got
> > one you liked; but my right-brain kept sending shivers down my spine
> > to suggest just how uncomfortable it was with that idea....
>
> If memory serves, we actually had exactly that at some point. But I
> think the reason it got taken out was that it interfered with the
> behavior of the random() function for everything else. We'd have to
> give GEQO its own private random number generator.
>
> regards, tom lane

A separate random number generator for GEQO makes a lot of sense.

Cheers,
Ken
On Jul 8, 2009, at 3:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: >>> It's pretty much all-or-nothing now: the GUC does not give you any >>> sort >>> of useful control over *which* joins are reorderable. > >> Yes. So the way I see it, the options are: > >> 1. We can remove join_collapse_limit completely and provide no >> substitute. In this case, the ability to explicitly specify the join >> order will be gone. > >> 2. We can remove join_collapse_limit but provide a different, Boolean >> GUC instead, like enable_join_reordering. In this case, we're not >> actually reducing the number of GUCs, just the size of the foot-gun. > >> 3. We can remove join_collapse_limit and provide an alternative way >> to >> explicitly specify the join order that is more flexible. This both >> reduces the number of GUCs and arguably provides some useful >> functionality that we don't have now. > >> It sounds like your vote is for #2, which, as I say, seems like a >> feature with one arm tied behind its back, but hey, what do I know? > > Well, the reason I'm not voting for #3 is that it looks like a lot of > work to implement something that would basically be a planner hint, > which I'm generally against; furthermore, it's a hint that there's > been > no demand for. (We're not even certain that anyone is using the > ability > to *fully* specify the join order, much less wanting some undetermined > compromise between manual and automatic control.) And anyway I didn't > hear anyone volunteering to do it. So the realistic alternatives are > #1, #2, or "do nothing"; and out of those I like #2. That was my first reaction too, but now I'm wondering whether we shouldn't just do #1. #2 is a planner hint, too, just not a very good one. If, as you suggest, it isn't actually useful, then why keep it at all? (On the other hand, if someone thinks they need it, it would be interesting to know the use case, and think about the best way to address it.) > >> Accepting that as the consensus in the absence of contrary votes, we >> still need to decide what to do about from_collapse_threshold and >> geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or >> geqo_threshold, because the basic algorithm is clearly exponential >> time and eventually you have to start worrying about that, but we >> could raise the value. What to do about from_collapse_threshold is >> less clear to me. > > I do not think there is a good argument for eliminating > geqo_threshold. > There might well be an argument for cranking up its default value; > but that would take some hard data, which seems lacking at the moment. > > I'm on the fence about from_collapse_threshold. The argument for > having > it seems to be that there might be cases where not folding a subquery > is preferable to folding it and then taking your chances with GEQO. > But I'm not really convinced there are any. Me either. You could probably get the same effect in other ways if you actually needed it, like OFFSET 0 or wrapping the subquery in a SRF. I'm leaning more and more toward thinking we should just nuke it. > It occurs to me that one way to make GEQO less scary would be to take > out the nondeterminism by resetting its random number generator for > each query. You might get a good plan or an awful one, but at least > it'd be the same one each time. DBAs like predictability. Hmm, that doesn't sound appealing to me, but I'm only a DBA at need. ...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> That was my first reaction too, but now I'm wondering whether we
> shouldn't just do #1. #2 is a planner hint, too, just not a very good
> one. If, as you suggest, it isn't actually useful, then why keep it
> at all? (On the other hand, if someone thinks they need it, it would
> be interesting to know the use case, and think about the best way to
> address it.)

Well, I can cite one reasonably plausible use case: when you have an umpteen-way join you need to execute a lot, and you don't want to pay for an exhaustive search, but GEQO doesn't reliably find a good plan. What you can do is let the system do an exhaustive search once to find the best plan, then you rearrange the query to specify that join order via JOINs, and turn off join collapsing. Presto, good plan every time with very little planning time expended.

Now, your answer will probably be that we should provide some better mechanism for re-using a previously identified plan structure. No doubt that would be ideal, but the amount of effort required to get there is nontrivial, and I'm not at all convinced it would be repaid in usefulness. Whereas what I describe above is something that costs us nearly nothing to provide. The usefulness might be marginal too, but on the basis of cost/benefit ratio it's a clear win.

>> It occurs to me that one way to make GEQO less scary would be to take
>> out the nondeterminism by resetting its random number generator for
>> each query. You might get a good plan or an awful one, but at least
>> it'd be the same one each time. DBAs like predictability.

> Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.

I was imagining a GUC that would make the reset optional, in case anyone really does want to have unstable plans. I don't have much doubt about what typical users will prefer though.

regards, tom lane
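Tom's two-step workflow, sketched end to end with invented tables (a real case would involve an umpteen-way join, but the mechanics are the same):

    -- Step 1: pay for the exhaustive search once, offline, and read the
    -- chosen join order out of the EXPLAIN output.
    SET geqo = off;
    SET join_collapse_limit = 100;
    EXPLAIN
    SELECT *
    FROM orders o, customers c, line_items li
    WHERE o.customer_id = c.id
      AND li.order_id = o.id;

    -- Step 2: rewrite the query so the JOIN syntax spells out that order,
    -- and disable collapsing; planning is now cheap and the plan is stable.
    SET join_collapse_limit = 1;
    EXPLAIN
    SELECT *
    FROM line_items li
    JOIN orders o ON o.id = li.order_id
    JOIN customers c ON c.id = o.customer_id;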
On Wed, Jul 08, 2009 at 09:26:35PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > That was my first reaction too, but now I'm wondering whether we
> > shouldn't just do #1. #2 is a planner hint, too, just not a very good
> > one. If, as you suggest, it isn't actually useful, then why keep it
> > at all? (On the other hand, if someone thinks they need it, it would
> > be interesting to know the use case, and think about the best way to
> > address it.)
>
> Well, I can cite one reasonably plausible use case: when you have an
> umpteen-way join you need to execute a lot, and you don't want to pay
> for an exhaustive search, but GEQO doesn't reliably find a good plan.
> What you can do is let the system do an exhaustive search once to find
> the best plan, then you rearrange the query to specify that join order
> via JOINs, and turn off join collapsing. Presto, good plan every time
> with very little planning time expended.
>
> Now, your answer will probably be that we should provide some better
> mechanism for re-using a previously identified plan structure. No
> doubt that would be ideal, but the amount of effort required to get
> there is nontrivial, and I'm not at all convinced it would be repaid
> in usefulness. Whereas what I describe above is something that costs
> us nearly nothing to provide. The usefulness might be marginal too,
> but on the basis of cost/benefit ratio it's a clear win.

This sounds like planner hints to me. The argument against hinting, AIUI, is that although the plan you've guaranteed via hints may be a good one today, when the data change a bit your carefully crafted plan happens to become a bad one, but you're no longer around to change the hints accordingly. Entire stored plans, or predetermined seeds for GEQO's random number generator, all boil down to saying, "I want you to use this plan henceforth and forever."

> >> It occurs to me that one way to make GEQO less scary would be to take
> >> out the nondeterminism by resetting its random number generator for
> >> each query. You might get a good plan or an awful one, but at least
> >> it'd be the same one each time. DBAs like predictability.
>
> > Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.
>
> I was imagining a GUC that would make the reset optional, in case anyone
> really does want to have unstable plans. I don't have much doubt about
> what typical users will prefer though.

Do we know that GEQO plans are, in reality, less stable than the usual planner's? Certainly on paper it appears they could be, but the mailing lists are full of emails about "this query's plan changed and performance suddenly tanked; how do I fix it?" so I'm unconvinced this is a problem unique to GEQO. Which in turn boils down to "we need real world data to look at".

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Joshua Tolley <eggyknap@gmail.com> writes: > This sounds like planner hints to me. The argument against hinting, AIUI, is > that although the plan you've guaranteed via hints may be a good one today, > when the data change a bit your carefully crafted plan happens to become a bad > one, but you're no longer around to change the hints accordingly. That's one argument against them. Another one is that time put into developing a planner hints mechanism is time that would be better spent on fixing the underlying planner problem. However, that second argument doesn't apply to something like join_collapse_limit, whose implementation is pretty nearly a one-liner (as are the various existing enable_whatever switches). Again, it's all about cost/benefit ratio. > Do we know that GEQO plans are, in reality, less stable than than usual > planner? Yes, we do. There aren't that many examples in the archives, but that likely is because join_collapse_limit and from_collapse_limit are by default set to try to prevent use of GEQO. The most recent clear-cut example I can find is http://archives.postgresql.org/pgsql-general/2008-10/msg01449.php wherein the guy has apparently written a "flat" FROM of 17 tables, and so neither collapse_limit will kick in. If we get rid of the collapse_limits as proposed in this thread, you'll start seeing those sorts of complaints all over the place, unless we make GEQO deterministic. regards, tom lane
Noah Misch <noah@leadboat.com> writes: > Describing in those terms illuminates much. While the concepts do suggest 2^N > worst-case planning cost, my artificial test case showed a rigid 4^N pattern; > what could explain that? Well, the point of the 2^N concept is just that adding one more relation multiplies the planning work by a constant factor. It's useful data that you find the factor to be about 4, but I wouldn't have expected the model to tell us that. regards, tom lane
On Jul 8, 2009, at 8:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> That was my first reaction too, but now I'm wondering whether we >> shouldn't just do #1. #2 is a planner hint, too, just not a very >> good >> one. If, as you suggest, it isn't actually useful, then why keep it >> at all? (On the other hand, if someone thinks they need it, it would >> be interesting to know the use case, and think about the best way to >> address it.) > > Well, I can cite one reasonably plausible use case: when you have an > umpteen-way join you need to execute a lot, and you don't want to pay > for an exhaustive search, but GEQO doesn't reliably find a good plan. > What you can do is let the system do an exhaustive search once to find > the best plan, then you rearrange the query to specify that join order > via JOINs, and turn off join collapsing. Presto, good plan every time > with very little planning time expended. > > Now, your answer will probably be that we should provide some better > mechanism for re-using a previously identified plan structure. No > doubt that would be ideal, but the amount of effort required to get > there is nontrivial, and I'm not at all convinced it would be repaid > in usefulness. Whereas what I describe above is something that costs > us nearly nothing to provide. The usefulness might be marginal too, > but on the basis of cost/benefit ratio it's a clear win. I don't think that would be my answer because plan caching sounds hard. It would be nice to have, though. :-) But it's clearly a planner hint, however you slice it. >>> It occurs to me that one way to make GEQO less scary would be to >>> take >>> out the nondeterminism by resetting its random number generator for >>> each query. You might get a good plan or an awful one, but at least >>> it'd be the same one each time. DBAs like predictability. > >> Hmm, that doesn't sound appealing to me, but I'm only a DBA at need. > > I was imagining a GUC that would make the reset optional, in case > anyone > really does want to have unstable plans. I don't have much doubt > about > what typical users will prefer though. OK. ...Robert
Hi, Robert Haas <robertmhaas@gmail.com> writes: > On Jul 8, 2009, at 8:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Now, your answer will probably be that we should provide some better >> mechanism for re-using a previously identified plan structure. No >> doubt that would be ideal, but the amount of effort required to get >> there is nontrivial, and I'm not at all convinced it would be repaid >> in usefulness. Whereas what I describe above is something that costs >> us nearly nothing to provide. The usefulness might be marginal too, >> but on the basis of cost/benefit ratio it's a clear win. > > I don't think that would be my answer because plan caching sounds hard. It > would be nice to have, though. :-) In fact I think marshalling plans (and unmarshalling them) would rather be the easy part of it; what looks very hard is the problem already mentioned here in another context (gathering statistics on views): http://archives.postgresql.org/pgsql-performance/2009-06/msg00118.php How to match a random user query with the stored plans with as little analysis as possible? > But it's clearly a planner hint, however you slice it. Agreed. Regards, -- dim
On Wed, Jul 8, 2009 at 8:26 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> That was my first reaction too, but now I'm wondering whether we >> shouldn't just do #1. #2 is a planner hint, too, just not a very good >> one. If, as you suggest, it isn't actually useful, then why keep it >> at all? (On the other hand, if someone thinks they need it, it would >> be interesting to know the use case, and think about the best way to >> address it.) > > Well, I can cite one reasonably plausible use case: when you have an > umpteen-way join you need to execute a lot, and you don't want to pay > for an exhaustive search, but GEQO doesn't reliably find a good plan. > What you can do is let the system do an exhaustive search once to find > the best plan, then you rearrange the query to specify that join order > via JOINs, and turn off join collapsing. Presto, good plan every time > with very little planning time expended. In the Oracle world they use the hint "ORDERED" for this purpose. As the Oracle optimizer has gotten smarter over the years I've seen less need to use it, but overall, compared to other Oracle hints, it does not seem to be extremely sensitive to data / index / stats changes. When you think about why such ordering might work, that makes sense to me: small tables can be used early to prune large tables later on. Typically, these smaller tables are static config info type data. E.g. pick species, then choose which of the 10 million pathology samples you have match that species. > > Now, your answer will probably be that we should provide some better > mechanism for re-using a previously identified plan structure. No > doubt that would be ideal, but the amount of effort required to get > there is nontrivial, and I'm not at all convinced it would be repaid > in usefulness. Whereas what I describe above is something that costs > us nearly nothing to provide. The usefulness might be marginal too, > but on the basis of cost/benefit ratio it's a clear win. > Again Oracle has a mechanism for doing this. Don't know the details, but our DBA would if anyone cares... -- Peter Hunsberger
On Wed, Jul 08, 2009 at 05:23:16PM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > The expontential factor seems smaller for real queries. I have a query of > > sixteen joins that takes 71s to plan deterministically; it looks like this: > > > SELECT 1 FROM fact JOIN dim0 ... JOIN dim6 > > JOIN t t0 ON fact.key = t.key AND t.x = MCV0 > > LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1 > > JOIN t t2 ON fact.key = t.key AND t.x = MCV2 > > LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0 > > LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1 > > LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2 > > LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3 > > LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4 > > I'm confused here --- I think you must have over-anonymized your query. > Surely the ON conditions for the left joins should be referencing t3, > t4, etc? Yes. Aside from that error, I picked the wrong features to emphasize and gloss over. Only two relations (amidst `dim0 ... dim6') resemble dimensions, having an implied relative position on the plan tree. The other fourteen relations share a join key. That explains the distinctive plan cost; this query resembles the utterly unconstrained artificial query to a larger degree than most. > > For the real query, removing one join drops plan time to 26s, and > > removing two drops the time to 11s. I don't have a good theory for > > the multiplier changing from 4 for the trivial demonstration to ~2.5 > > for this real query. > > The rule of thumb that says that an n-way join requires 2^n work is only > true if we consider every single combination of possible joins, which > normally we don't. The planner prefers join paths that follow join > clauses, and will only consider clauseless joins when it has no other > choice. I believe that real queries tend to be pretty non-flat in this > space and so the number of join paths to consider is a lot less than 2^n. > Your synthesized query, on the other hand, allows any relation to be > joined to any other --- it might not look that way, but after creating > derived equalities there will be a potential join clause linking every > relation to every other one. So I think you were testing the worst case, > and I'm not surprised that more-typical queries would show a slower > growth curve. Describing in those terms illuminates much. While the concepts do suggest 2^N worst-case planning cost, my artificial test case showed a rigid 4^N pattern; what could explain that? Thanks, nm
Robert Haas <robertmhaas@gmail.com> wrote: > If, as you suggest, it isn't actually useful, then why keep it at > all? I was imagining that someone who has a query which is taking a long time to run, and asserts that it would run faster if only the optimizer would arrange the joins a certain way, could test that theory, as part of the diagnostic process. (In other words, for similar reasons that the other enable_* GUC settings exist.) I would certainly not want to use it in production, and wouldn't expect that would normally be a very good idea. -Kevin
On Tuesday 07 July 2009 17:40:50 Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > I cannot reasonably plan some queries with join_collapse_limit set to 20. > > At least not without setting the geqo limit very low and a geqo_effort to > > a low value. > > So I would definitely not agree that removing j_c_l is a good idea. > Can you show some specific examples? All of this discussion seems like > speculation in a vacuum ... As similar wishes have come up multiple times now, I started to create a schema I can present which is sufficiently similar to show the same effects. I had to cut down the complexity of the schema considerably - both for easier understanding and easier writing of the demo schema. I also have a moderately complex demo query similar to really used ones. Autogenerated (GUI) queries do not use views like I did in the example one, but it seemed easier to play around with query size this way. Also the real queries often have far more conditions than the one I present here. Also I have not "tuned" the queries here in any way; the join order is not optimized (like in the real application), but I don't think that matters for the purpose of this discussion. The queries themselves only sketch what they are intended for and query many fictional datapoints, but again I don't think this is a problem. Is it helpful this way? Some numbers about query_2.sql are attached. Short overview: - a low from_collapse_limit is deadly - a high from_collapse_limit is not costly here - geqo_effort basically changes nothing - geqo changes basically nothing - with a higher join_collapse_limit (12) geqo=on costs quite a bit! (factor 20!). I double checked. At other times I get 'failed to make a valid plan' The numbers are all from 8.5 as of today. Some explanations about the schema: - It uses surrogate keys everywhere as the real schema employs some form of row level, label based access checking (covert channel issues) - The real schema uses partitions - I don't think they would be interesting here? - It's definitely not the most beautiful schema I have seen, but I have to admit that I cannot think of a much nicer one which serves the different purposes as well: somewhat complex queries; new "information_set"s and "information"s are added frequently; automated and manual data entry has to work with such additions; the GUI query tool needs to work in the face of such changes - I have seen similar schemas multiple times now. - The original schema employs materialized views in parts (both for execution and planning speed) - The queries are crazy, but people definitely create/use them. Andres
Resending to list due to failure: 451 4.3.5 Server configuration problem Joshua Tolley <eggyknap@gmail.com> wrote: > Entire stored plans, or predetermined seeds for GEQO's random number > generator all boil down to saying, "I want you to use this plan > henceforth and forever." A predetermined seed for geqo's random number generator only guarantees that you get the same plan for the same query as long as the statistics remain the same. After the next ANALYZE, you may still get a different plan. > Do we know that GEQO plans are, in reality, less stable than than > usual planner? Yeah, I had a complaint of a slow query. When I ran EXPLAIN ANALYZE on it I ran it several times (as I usually do, to establish what happens with and without caching). I got different plans. So I ran the query a bunch of times and got a decent plan about 90% of the time, and an awful one about 10% of the time. It was clearly a geqo effect, so I didn't post on it. (Why post? The product was clearly operating as intended.) There was an easy solution; I disabled geqo. I'm sure it's useful to some; we haven't missed it. -Kevin
On Thursday 09 July 2009 17:37:41 Kevin Grittner wrote: > Resending to list due to failure: > 451 4.3.5 Server configuration problem > > Joshua Tolley <eggyknap@gmail.com> wrote: > > Entire stored plans, or predetermined seeds for GEQO's random number > > generator all boil down to saying, "I want you to use this plan > > henceforth and forever." > > A predetermined seed for geqo's random number generator only > guarantees that you get the same plan for the same query as long as > the statistics remain the same. After the next ANALYZE, you may still > get a different plan. > > > Do we know that GEQO plans are, in reality, less stable than than > > usual planner? Yes, definitely. E.g., in the referenced schema (somewhere else in the thread) I even have queries which only sometimes fail to plan. Andres
On Wed, Jul 08, 2009 at 10:28:02AM -0500, Robert Haas wrote: > On Jul 8, 2009, at 8:43 AM, Noah Misch <noah@leadboat.com> wrote: >> The expontential factor seems smaller for real queries. I have a query of >> sixteen joins that takes 71s to plan deterministically; it looks like this: >> >> SELECT 1 FROM fact JOIN dim0 ... JOIN dim6 >> JOIN t t0 ON fact.key = t.key AND t.x = MCV0 >> LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1 >> JOIN t t2 ON fact.key = t.key AND t.x = MCV2 >> LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0 >> LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1 >> LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2 >> LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3 >> LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4 > > Very interesting! I am guessing that the problem here is that all the > inner joins commute, as do all the left joins, so there are a lot of > possibilities to explore. I also suspect that the actual join order > doesn't matter much, so it's a good candidate for GEQO. Even if you had > some restriction clauses against your dimension/t tables, that would > probably just mean that you want to do those joins first, and the rest > afterwards, which still seems like it ought to be OK for GEQO. > > But, in many queries this size, the join order is more constrained. > Some of the later joins depend on the tables added by the earlier ones, > rather than all referring back to the base table. Is there some way we > could estimate the number of join offerings we'll have to consider > relatively cheaply? That might be a better metric than # of tables. Observing the pathological case taking plan time proportional to 4^N, apply geqo_threshold as "use GEQO when comparing more than geqo_threshold * 4^N join order possibilities"? I'm not sure whether that figure is available (early enough) to factor into the decision. Such a change would probably imply a lower default geqo_threshold, around 9 to 11. The number of typical queries subject to GEQO would nonetheless decrease. nm
On Wed, Jul 8, 2009 at 4:57 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Well, the reason I'm not voting for #3 is that it looks like a lot of > work to implement something that would basically be a planner hint, > which I'm generally against; furthermore, it's a hint that there's been > no demand for. (We're not even certain that anyone is using the ability > to *fully* specify the join order, much less wanting some undetermined > compromise between manual and automatic control.) And anyway I didn't > hear anyone volunteering to do it. So the realistic alternatives are > #1, #2, or "do nothing"; and out of those I like #2. I took a look at this and it seems that #3 can be implemented with essentially no additional code (the handful of lines I added were more than balanced out by some simplifications in ruleutils.c). Of course you still don't have to like it. :-) Patch attached. ...Robert
Hi, On Jul 10, 2009, at 17:22, Robert Haas wrote: > I took a look at this and it seems that #3 can be implemented with > essentially no additional code (the handful of lines I added where > more than balanced out by some simplifications in ruleutils.c). Of > course you still don't have to like it. :-) I see you're using the following syntax: ! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id); The only place I've seen that before is MySQL's straight_join feature: http://dev.mysql.com/doc/refman/5.0/en/join.html My first thought at the time was "what a lame optimiser if I'm to tell it where not to reorder joins", but perhaps this was because the option there, IIRC, could impact the results... I guess I'm not going to like it, but nonetheless, if we're going to support the feature, what about calling it the same as MySQL, unless the standard has something to say? -- dim
Robert Haas <robertmhaas@gmail.com> writes: > I took a look at this and it seems that #3 can be implemented with > essentially no additional code (the handful of lines I added where > more than balanced out by some simplifications in ruleutils.c). Of > course you still don't have to like it. :-) You're right, I don't. Even if I thought this were a good idea, which I most definitely do not, the need to add a nonstandard fully-reserved word is sufficient reason to reject it. (The patch tries to pretend it's not going to reserve the word, but that only works because you have carefully changed only one of the five JOIN productions, leading to bizarrely non-orthogonal syntax.) regards, tom lane
On Jul 10, 2009, at 11:48 AM, Dimitri Fontaine <dfontaine@hi- media.com> wrote: > Hi, > > Le 10 juil. 09 à 17:22, Robert Haas a écrit : >> I took a look at this and it seems that #3 can be implemented with >> essentially no additional code (the handful of lines I added where >> more than balanced out by some simplifications in ruleutils.c). Of >> course you still don't have to like it. :-) > > I see you're using the following syntax: > ! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = > c.id)) ON (a.id = b.id); > > The only place I've seen that before is MySQL straight_join feature: > http://dev.mysql.com/doc/refman/5.0/en/join.html > > My first though at the time was "what a lame optimiser if I'm to > tell it where not to reorder joins", but perhaps this was because > the option there, IIRC, could impact the results... > > I guess I'm not going to like it, but nonetheless, if we're going to > support the feature, what about calling it the same as MySQL, unless > the standard has something to say? Well, I think we would be well-advised to get past the threshold issue of whether or not the overall design sucks before quibbling over details like my particular choice of keyword. That having been said, if the MySQL feature to which you linked actually does the same thing as what I implemented here, then it's amazingly poorly documented. We certainly don't guarantee anything about the order in which the input tables are read; I'd ask what that even means except I don't care. We're only making a promise about where that join will be implemented in the plan tree as compared with *other joins*. It was reported upthread that Oracle uses ORDERED for this but I don't know whether either the syntax or the semantics match what I did here. At any rate the particular choice of keyword seems rather insignificant; I picked it because it was already a keyword and seemed vaguely appropriate, but it could easily be changed. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > At any rate the particular choice of keyword seems rather > insignificant; I picked it because it was already a keyword and > seemed vaguely appropriate, but it could easily be changed. Actually, if we were going to add fine-grained optimizer hints for this (which I'm not at all convinced is a good idea), I'd be tempted to go with what I saw a few years ago in SAP-DB (later rebranded as MySQL Max-DB): treat parentheses around JOIN operations as optimizer hints. I would only consider this as remotely sane if there was an enable_* GUC to disable the normal reordering of joins. It introduces no new keywords. The queries are still standard compliant. We would just have a configuration option which treats an optional part of the standard syntax as a hint. In other words: select * from (t1 join t2 on <whatever>) join t3 on <whatever>; would limit optimizer choice from those available with: select * from t1 join t2 on <whatever> join t3 on <whatever>; -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Actually, if we were going to add fine-grained optimizer hints for > this (which I'm not at all convinced is a good idea), I'd be tempted > to go with what I saw a few years ago in SAP-DB (later rebranded as > MySQL Max-DB): treat parentheses around JOIN operations as optimizer > hints. That's a *truly* horrid idea, as sometimes you need them simply to get the precedence correct. Such awful design from SAP doesn't surprise me, and MySQL copying a bad idea surprises me even less, but let's not go there. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> treat parentheses around JOIN operations as optimizer hints. > > That's a *truly* horrid idea, as sometimes you need them simply to > get the precedence correct. You do, but it's been pretty rare in my experience, and we're considering alternatives which give a lot less flexibility than this. The *truly* awful thing about the SAP-DB implementation is that it wasn't optional -- parentheses in this part of a query always limited optimizer options; I sure wouldn't want to go there again. I thought we were talking about options for what to do when an enable_* setting was off for diagnostic purposes.... -Kevin
On Jul 10, 2009, at 12:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I took a look at this and it seems that #3 can be implemented with >> essentially no additional code (the handful of lines I added where >> more than balanced out by some simplifications in ruleutils.c). Of >> course you still don't have to like it. :-) > > You're right, I don't. Even if I thought this were a good idea, which > I most definitely do not, the need to add a nonstandard fully-reserved > word is sufficient reason to reject it. (The patch tries to pretend > it's not going to reserve the word, but that only works because you > have > carefully changed only one of the five JOIN productions, leading to > bizarrely non-orthogonal syntax.) Well, it's pretty obvious that only one of those productions is actually a problem, and that is the one which produces an undecorated JOIN. The others could all be changed easily enough, but they add no expressive power, so I didn't think it very worthwhile to add MORE non-standard syntax. In any event, the current non-orthogonality is exponentially more bizarre: you can constrain the join order by setting join_collapse_limit to 1, but then you must write the joins you want constrained using JOIN and the others as FROM items, which of course doesn't work at all for LEFT or RIGHT joins and will have random and unpredictable effects on subqueries pulled in via VIEWs. Needing to write INNER to be able to use FORCE pales by comparison. That having been said, I'm not excited about pushing water up a hill. The important thing here is to remove the collapse limits; providing a tool to control the join order that won't make you want to beat your head against a brick wall is just something that can be trivially done with no extra code, not my primary goal. ...Robert
On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas<robertmhaas@gmail.com> wrote: > I took a look at this and it seems that #3 can be implemented with > essentially no additional code (the handful of lines I added where > more than balanced out by some simplifications in ruleutils.c). Of > course you still don't have to like it. :-) > > Patch attached. > ! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id); What's next? FORCE INDEX? Once we implement one hint like this, the complaints about the others will lose force. -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > You do, but it's been pretty rare in my experience, and we're > considering alternatives which give a lot less flexibility that this. I dunno about "considering". We've already wasted vastly more time on this than it's worth. AFAIR there has never been one single user request for the ability to partially constrain join order. I think we should do an enable_join_ordering boolean and quit wasting brainpower on the issue. regards, tom lane
Tom Lane wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> You do, but it's been pretty rare in my experience, and we're >> considering alternatives which give a lot less flexibility that this. > > I dunno about "considering". We've already wasted vastly more time on > this than it's worth. AFAIR there has never been one single user > request for the ability to partially constrain join order. I think we > should do an enable_join_ordering boolean and quit wasting brainpower on > the issue. I think I've found it useful in the past[1], but I also think we already have a way to give postgres such hints using subselects and "offset 0". Instead of SAP-DB's > select * from (t1 join t2 on <whatever>) join t3 on <whatever>; ISTM we can already do > select * from (select * from t1 join t2 on <whatever> offset 0) as a join t3 on <whatever>; which seems like a reasonable way of hinting which parentheses can be reordered and which can't. Would these new proposals (GUCs or syntax hacks) give anything that I can't already do? [1] http://archives.postgresql.org/pgsql-performance/2007-12/msg00088.php
On Fri, Jul 10, 2009 at 2:44 PM, Jaime Casanova<jcasanov@systemguards.com.ec> wrote: > On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas<robertmhaas@gmail.com> wrote: >> I took a look at this and it seems that #3 can be implemented with >> essentially no additional code (the handful of lines I added where >> more than balanced out by some simplifications in ruleutils.c). Of >> course you still don't have to like it. :-) >> >> Patch attached. >> > > ! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = > c.id)) ON (a.id = b.id); > > what's next? FORCE INDEX? > once we implement one hint like this the complaints about the others > will lose force That would be pretty useless, because while it might work for trivial cases, in a complex plan, it's almost certainly not going to do what you want unless you specify every detail of the entire plan along with it, which kind of defeats the purpose of using a database with a sophisticated planner. Actually, it seems to me that if we were to add hints, the place to start would be with selectivity, since in my experience bad selectivity estimates (often by 3, 4, 5, or 6 orders of magnitude) are often the reason for a bad plan, and if you can fix that, the rest takes care of itself - but fixing it is often very difficult. Of course, improving the selectivity estimator would be even better, because time spent applying hints to queries is time that could be better spent doing something else, and in fact one of the reasons why I started using PostgreSQL is because complex queries Just Work. Constraining the join order does seem a lot less useful, because (for example) it won't compel the planner to use a hash join instead of a nest join, or to make a sensible decision about which relations should be on the inner side of the join. But I still think that it's worth considering, because: 1. We basically already support the functionality - it's just broken. Today, you can control the join order by setting join_collapse_limit = 1; then, the joins you want to order you specify using JOIN syntax; the ones you don't want to order, you write as FROM items. But all of your outer joins will necessarily have the order forced, because there's no way to write those as FROM items. And if you want to use a view that was written without this convention in mind, you're hosed. So I think we either ought to remove it or fix it. 2. It's actually LESS code to support it completely than it is to leave it the way it is now while removing from_collapse_limit and replacing join_collapse_limit with enable_join_ordering. Compare the patch I sent earlier with the one that I'm about to send. 3. The point of this particular type of hint, at least as I see it, is not so much to force the planner into the correct query plan as it is to constrain the planning time. There are very few tools available for this today: join_collapse_limit and from_collapse_limit do it by unexpectedly producing terrible plans, and the only other option is GEQO. In a case like the one posted upthread where you have ten or so joins that can be reordered almost arbitrarily, you could shove a single FORCE right into the middle and split it into two problems that can be planned consecutively. 
This isn't that different from what join_collapse_limit does today, but it doesn't force a uniform threshold on you across the board; you can apply it where, when, and to the extent necessary to control planning time for a particular query, and you have to explicitly ask for it, so you can't complain you were ambushed if it doesn't work out. But I can see I'm outvoted, so I give up! ...Robert
On Fri, Jul 10, 2009 at 2:48 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> You do, but it's been pretty rare in my experience, and we're >> considering alternatives which give a lot less flexibility that this. > > I dunno about "considering". We've already wasted vastly more time on > this than it's worth. AFAIR there has never been one single user > request for the ability to partially constrain join order. I think we > should do an enable_join_ordering boolean and quit wasting brainpower on > the issue. Patch attached. ...Robert
On Wednesday 08 July 2009 23:46:02 Tom Lane wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > > For a moment it seemed logical to suggest a session GUC for the seed, > > so if you got a bad plan you could keep rolling the dice until you got > > one you liked; but my right-brain kept sending shivers down my spine > > to suggest just how uncomfortable it was with that idea.... > > If memory serves, we actually had exactly that at some point. But I > think the reason it got taken out was that it interfered with the > behavior of the random() function for everything else. We'd have to > give GEQO its own private random number generator. All of GEQO's usage of random() seems to be concentrated in geqo_random.h - so it would be a small change. I will happily tackle that, if only to narrow down in which cases geqo fails to plan - a behaviour we have seen at times with somewhat crazier queries. The only question I have is whether random_r or similar is available on enough platforms... Has anybody an idea about this? On most unixoid systems one could just wrap erand48() if random_r is not available. Windows? Andres
Andres Freund <andres@anarazel.de> writes: > The only question I have is, whether random_r or similar is available on > enough platforms... Has anybody an idea about this? > On most unixoid system one could just wrap erand48() if random_r is not > available. > Windows? random_r() isn't in the Single Unix Spec AFAICS, and I also don't find it on HPUX 10.20, so I'd vote against depending on it. What I do see in SUS is initstate() and setstate() which could be used to control the random() function: http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html It would also work to leave random() for use by the core code and have GEQO depend on something from the drand48() family: http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html Probably drand48() is less random than random(), but for the limited purposes of GEQO I doubt we care very much. So far as I can find in a quick google search, neither of these families of functions exist on Windows :-(. So I think maybe the best approach is the second one --- we could implement a port/ module that provides a version of whichever drand48 function we need. On reflection I think the best user API is probably a "geqo_seed" GUC in the range 0 to 1, and have GEQO always reset its seed to that value at start of a planning cycle. This ensures plan stability, and if you need to experiment with alternative plans you can change to different seed values. The "no reset" behavior doesn't seem to have much real-world usefulness, because even if you chance to get a good plan, you have no way to reproduce it... regards, tom lane
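As a concrete illustration of that second approach, here is a minimal, self-contained sketch (plain C99, nothing PostgreSQL-specific) of an erand48()-style generator with caller-visible state. The names GeqoRandState, geqo_srand48() and geqo_erand48() are invented for this sketch and are not from any actual patch; the point is only that the 48-bit state is explicit, so GEQO could keep its own stream without disturbing the backend's random(), and that seeding from a value in [0,1) lines up with the proposed geqo_seed GUC range.

#include <stdint.h>

/*
 * Sketch of an erand48()-style generator with explicit state:
 * X(n+1) = (a * X(n) + c) mod 2^48, using the traditional constants
 * a = 0x5DEECE66D and c = 0xB; results are scaled into [0,1).
 * Illustrative names only, not taken from any actual patch.
 */
typedef struct GeqoRandState
{
    uint16_t    xseed[3];       /* 48 bits of generator state */
} GeqoRandState;

static void
geqo_srand48(GeqoRandState *state, double seed)
{
    /* map a seed in [0,1), e.g. the proposed geqo_seed GUC, onto 48 bits */
    uint64_t    x = (uint64_t) (seed * 281474976710656.0);  /* seed * 2^48 */

    state->xseed[0] = (uint16_t) x;
    state->xseed[1] = (uint16_t) (x >> 16);
    state->xseed[2] = (uint16_t) (x >> 32);
}

static double
geqo_erand48(GeqoRandState *state)
{
    uint64_t    x = ((uint64_t) state->xseed[2] << 32) |
                    ((uint64_t) state->xseed[1] << 16) |
                    (uint64_t) state->xseed[0];

    x = (x * UINT64_C(0x5DEECE66D) + 0xB) & UINT64_C(0xFFFFFFFFFFFF);

    state->xseed[0] = (uint16_t) x;
    state->xseed[1] = (uint16_t) (x >> 16);
    state->xseed[2] = (uint16_t) (x >> 32);

    return (double) x / 281474976710656.0;      /* scale back to [0,1) */
}

How the state gets allocated and where it lives is deliberately left out here; it is small enough to stash wherever the planner wants it.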
Hi, On Saturday 11 July 2009 18:23:59 Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > The only question I have is, whether random_r or similar is available on > > enough platforms... Has anybody an idea about this? > > On most unixoid system one could just wrap erand48() if random_r is not > > available. > > Windows? > > random_r() isn't in the Single Unix Spec AFAICS, and I also don't find > it on HPUX 10.20, so I'd vote against depending on it. What I do see > in SUS is initstate() and setstate() which could be used to control > the random() function: > http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html Using setstate() has a bit too many possible implications for my taste - especially as there is no way to portably get/save the current random state. (Running a known query to reset random's seed, or some such.) > It would also work to leave random() for use by the core code and have > GEQO depend on something from the drand48() family: > http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html > Probably drand48() is less random than random(), but for the limited > purposes of GEQO I doubt we care very much. Yes. > So far as I can find in a quick google search, neither of these families > of functions exist on Windows :-(. So I think maybe the best approach > is the second one --- we could implement a port/ module that provides a > version of whichever drand48 function we need. Okay. It would be possible to implement random_r the same way if we are going to write something for Windows anyway - is it possible that it might be useful somewhere else? > On reflection I think the best user API is probably a "geqo_seed" GUC in > the range 0 to 1, and have GEQO always reset its seed to that value at > start of a planning cycle. This ensures plan stability, and if you need > to experiment with alternative plans you can change to different seed > values. The "no reset" behavior doesn't seem to have much real-world > usefulness, because even if you chance to get a good plan, you have no > way to reproduce it... That was my thinking as well. Should geqo_seed be documented from the start or not? Andres
Andres Freund <andres@anarazel.de> writes: > On Saturday 11 July 2009 18:23:59 Tom Lane wrote: >> So far as I can find in a quick google search, neither of these families >> of functions exist on Windows :-(. So I think maybe the best approach >> is the second one --- we could implement a port/ module that provides a >> version of whichever drand48 function we need. > It would be possible to implement random_r the same way if we are going to > write something for windows anyway - Is it possible that it might be usefull > somewhere else? I think a decent version of random_r would be more work than it's worth. (In practice we'd probably just lift the module from one of the BSDen anyway, so maybe "work" is the wrong measure here, but code size is still relevant.) regards, tom lane
On Sat, Jul 11, 2009 at 12:23:59PM -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > The only question I have is, whether random_r or similar is available on > > enough platforms... Has anybody an idea about this? > > On most unixoid system one could just wrap erand48() if random_r is not > > available. > > Windows? > > random_r() isn't in the Single Unix Spec AFAICS, and I also don't find > it on HPUX 10.20, so I'd vote against depending on it. What I do see > in SUS is initstate() and setstate() which could be used to control > the random() function: > http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html > It would also work to leave random() for use by the core code and have > GEQO depend on something from the drand48() family: > http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html > Probably drand48() is less random than random(), but for the limited > purposes of GEQO I doubt we care very much. > Ugh, tracking down problems caused by a poor random number generator is difficult. Poor randomness often causes weird results from algorithms that were designed around the assumption of a "random" number. > So far as I can find in a quick google search, neither of these families > of functions exist on Windows :-(. So I think maybe the best approach > is the second one --- we could implement a port/ module that provides a > version of whichever drand48 function we need. > I think that having a port/module for a random number generator is a good idea. There are a number of good, fast random number generators to choose from. Cheers, Ken > On reflection I think the best user API is probably a "geqo_seed" GUC in > the range 0 to 1, and have GEQO always reset its seed to that value at > start of a planning cycle. This ensures plan stability, and if you need > to experiment with alternative plans you can change to different seed > values. The "no reset" behavior doesn't seem to have much real-world > usefulness, because even if you chance to get a good plan, you have no > way to reproduce it... > > regards, tom lane
On Saturday 11 July 2009 18:23:59 Tom Lane wrote: > On reflection I think the best user API is probably a "geqo_seed" GUC in > the range 0 to 1, and have GEQO always reset its seed to that value at > start of a planning cycle. This ensures plan stability, and if you need > to experiment with alternative plans you can change to different seed > values. The "no reset" behavior doesn't seem to have much real-world > usefulness, because even if you chance to get a good plan, you have no > way to reproduce it... I just realized doing it in a naive way (in geqo()) causes the state to be reset multiple times during one query- at every invocation of make_rel_from_joinlist. Does anybody see a problem with that? Andres
Andres Freund <andres@anarazel.de> writes: > I just realized doing it in a naive way (in geqo()) causes the state to be > reset multiple times during one query- at every invocation of > make_rel_from_joinlist. > Does anybody see a problem with that? I think that's probably what we want. Otherwise, you'd have a situation where two identical subproblems might get planned differently. This ties into something I was thinking about earlier: the planner is potentially recursive (eg, it might call a user-defined function that contains a plannable query), and it'd probably be a good idea if the behavior of GEQO wasn't sensitive to that. So the RNG's state needs to be kept in PlannerInfo or some associated structure, not be global. regards, tom lane
Hi Tom, On Saturday 11 July 2009 20:33:14 Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > I just realized doing it in a naive way (in geqo()) causes the state to > > be reset multiple times during one query- at every invocation of > > make_rel_from_joinlist. > > > > Does anybody see a problem with that? > > I think that's probably what we want. Otherwise, you'd have a situation > where two identical subproblems might get planned differently. > > This ties into something I was thinking about earlier: the planner is > potentially recursive (eg, it might call a user-defined function that > contains a plannable query), and it'd probably be a good idea if the > behavior of GEQO wasn't sensitive to that. So the RNG's state needs to > be kept in PlannerInfo or some associated structure, not be global. Hm. I looked a bit into this and I don't see a real problem with a global random state if that one gets reinitialized on every geqo() invocation. If I understood the code correctly - which is not certain at all - while make_rel_from_joinlist is itself recursive, the actual join search code is not. Correct? Thus it would be enough to reset the seed on every geqo() invocation... Any counterarguments? Andres
Andres Freund <andres@anarazel.de> writes: > On Saturday 11 July 2009 20:33:14 Tom Lane wrote: >> This ties into something I was thinking about earlier: the planner is >> potentially recursive (eg, it might call a user-defined function that >> contains a plannable query), and it'd probably be a good idea if the >> behavior of GEQO wasn't sensitive to that. So the RNG's state needs to >> be kept in PlannerInfo or some associated structure, not be global. > Hm. I looked a bit into this and I dont see a real problem with a global > random state if that one gets reinitialized on every geqo() invocation. If I > understood the code correctly - which is not sure at all - while > make_rel_from_joinlist is itself recursively the actual join search code is > not recursive. Correct? I wouldn't count on that. GEQO is not recursive with respect to a particular query, but there's still the risk of the planner deciding to call out to some user-defined code while it does selectivity estimates. So the planner as a whole has to be re-entrant. Now you could probably argue that the impact of extra RNG resets on the overall behavior is small enough that it doesn't matter. But if you really want to promise consistent GEQO behavior then I think the RNG state has to be local to a particular planner instantiation. regards, tom lane
On Sunday 12 July 2009 16:44:59 Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On Saturday 11 July 2009 20:33:14 Tom Lane wrote: > >> This ties into something I was thinking about earlier: the planner is > >> potentially recursive (eg, it might call a user-defined function that > >> contains a plannable query), and it'd probably be a good idea if the > >> behavior of GEQO wasn't sensitive to that. So the RNG's state needs to > >> be kept in PlannerInfo or some associated structure, not be global. > > > > Hm. I looked a bit into this and I dont see a real problem with a global > > random state if that one gets reinitialized on every geqo() invocation. > > If I understood the code correctly - which is not sure at all - while > > make_rel_from_joinlist is itself recursively the actual join search code > > is not recursive. Correct? > I wouldn't count on that. GEQO is not recursive with respect to a > particular query, but there's still the risk of the planner deciding > to call out to some user-defined code while it does selectivity > estimates. So the planner as a whole has to be re-entrant. > Now you could probably argue that the impact of extra RNG resets on > the overall behavior is small enough that it doesn't matter. But if > you really want to promise consistent GEQO behavior then I think the > RNG state has to be local to a particular planner instantiation. I just did not see that it could call selectivity estimation functions. This will mean adding an additional parameter (PlannerInfo) to most of the geqo functions, but I don't see a problem there. Has anybody tried GEQO without ERX in recent times? Andres
Andres Freund <andres@anarazel.de> writes: > Has anybody tried Geqo without ERX in recent time? No, I don't think so. Anything that's ifdef'd out at the moment has been that way for quite a few years, and so is likely broken anyhow :-( regards, tom lane
On Saturday 11 July 2009 17:19:18 Andres Freund wrote: > On Wednesday 08 July 2009 23:46:02 Tom Lane wrote: > > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > > > For a moment it seemed logical to suggest a session GUC for the seed, > > > so if you got a bad plan you could keep rolling the dice until you got > > > one you liked; but my right-brain kept sending shivers down my spine > > > to suggest just how uncomfortable it was with that idea.... > > > > If memory serves, we actually had exactly that at some point. But I > > think the reason it got taken out was that it interfered with the > > behavior of the random() function for everything else. We'd have to > > give GEQO its own private random number generator. > > All of GEQOs usage of random() seems to be concentrated to geqo_random.h - > so it would be a small change. > I will happily tackle that. If only to narrow down in which cases geqo > fails to plan - a behaviour we have seen at times at a bit more crazy > queries. > > The only question I have is, whether random_r or similar is available on > enough platforms... > On most unixoid system one could just wrap erand48() if random_r is not > available. Sorry, I had to put the issue aside for a bit; here is a preliminary patch. - added a "void *join_search_private" to hold the random number generator, which would also be usable by other join search implementations - PlannerInfo *root is now passed to all non-static geqo-related functions. It seems cleaner to do this generally than on a function-by-function basis - replaced the sparsely used GeqoPrivateData by GeqoEvalData which is passed via join_search_private - if geqo_seed = 0 a global/per-backend state is used What's missing: - Windows PRNG support - Perhaps: configure check for erand48 - Documentation for geqo_seed I did not find any Windows API for a random number generator with visible/changeable state, so I think there is not much alternative to pulling in random_r or similar from *BSD if this seems worth pursuing. Mostly sensible? Andres
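For readers following along, a rough sketch of the shape such a patch could take is below. It is not the patch itself: the names GeqoPrivate, geqo_init_random() and geqo_rand() are made up for illustration (the mail above uses GeqoEvalData), it reuses the hypothetical geqo_srand48()/geqo_erand48() sketch from earlier in the thread, it assumes the usual planner headers for PlannerInfo, and it omits allocating the struct, attaching it to root, and the geqo_seed = 0 fallback to per-backend state described above.

/*
 * Sketch only, with illustrative names.  The generator state lives in a
 * per-planner-run struct reachable through root->join_search_private, so
 * a recursive planner invocation (say, planning a query inside a
 * user-defined function called during selectivity estimation) cannot
 * disturb an outer one.
 */
typedef struct GeqoPrivate
{
    GeqoRandState random_state;     /* see the erand48-style sketch above */
    /* ... other per-run GEQO bookkeeping could live here as well ... */
} GeqoPrivate;

/* called once at the top of each geqo() invocation */
static void
geqo_init_random(PlannerInfo *root, double geqo_seed)
{
    GeqoPrivate *priv = (GeqoPrivate *) root->join_search_private;

    /*
     * Reseed from the GUC so that identical subproblems within one query
     * are planned identically and plans are reproducible across runs.
     */
    geqo_srand48(&priv->random_state, geqo_seed);
}

/* replacement for GEQO's former direct calls to random() */
static double
geqo_rand(PlannerInfo *root)
{
    GeqoPrivate *priv = (GeqoPrivate *) root->join_search_private;

    return geqo_erand48(&priv->random_state);
}

The essential property is just that every former random() call inside GEQO reaches its state through the PlannerInfo for the current planning cycle rather than through a global.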
On Thu, Jul 9, 2009 at 5:38 AM, Noah Misch<noah@leadboat.com> wrote: > Describing in those terms illuminates much. While the concepts do suggest 2^N > worst-case planning cost, my artificial test case showed a rigid 4^N pattern; > what could explain that? > Isn't that just so that the planner has to examine O(2^N) subsets of relations, and do O(2^N) work for each of them? To create a level-N join the planner chooses pairs of level k and level N-k joins. The count of level k joins is O(2^k), the count of level N-k ones is O(2^(N-k)). Together it is O(N) * O(2^N) * O(2^k) * O(2^(N-k)), which is O(N * 4^N). This is for the worst case. If we could make a better estimate of the required planning time (I believe that the input data for a good heuristic is a matrix which says which relation is constrained to which relation), we could make better decisions about when to flatten subqueries, collapse joins, launch geqo... Greetings Marcin
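To put a rough number on that worst case, the following stand-alone counting sketch (not planner code) enumerates, for a fully connected N-way join, every subset of base relations and every way of splitting it into two non-empty inputs, i.e. the O(2^N) subsets times up to O(2^N) work apiece described above. It only counts candidates; real planning also builds and costs paths for each pair it examines, so measured times can grow faster than this count alone.

#include <stdio.h>
#include <stdint.h>

/*
 * Stand-alone illustration, not planner code: count how many
 * (outer input, inner input) pairs an exhaustive dynamic-programming
 * join search would have to look at in the fully connected worst case,
 * where every relation has a join clause to every other relation.
 */
int
main(void)
{
    for (int n = 2; n <= 14; n++)
    {
        uint64_t    pairs = 0;

        /* every subset of the n base relations is a potential joinrel */
        for (uint64_t set = 1; set < ((uint64_t) 1 << n); set++)
        {
            int         size = 0;

            for (uint64_t bits = set; bits != 0; bits &= bits - 1)
                size++;         /* popcount: number of relations in the set */

            /*
             * A joinrel over m base relations can be formed from 2^m - 2
             * ordered (outer, inner) splits into two non-empty inputs.
             */
            if (size >= 2)
                pairs += ((uint64_t) 1 << size) - 2;
        }

        printf("N = %2d   candidate input pairs = %llu\n",
               n, (unsigned long long) pairs);
    }
    return 0;
}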