Re: *_collapse_limit, geqo_threshold - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: *_collapse_limit, geqo_threshold |
Date | |
Msg-id | 603c8f070907102119n1428bca9j755b6c27502fc52d@mail.gmail.com Whole thread Raw |
In response to | Re: *_collapse_limit, geqo_threshold (Jaime Casanova <jcasanov@systemguards.com.ec>) |
List | pgsql-hackers |
On Fri, Jul 10, 2009 at 2:44 PM, Jaime Casanova<jcasanov@systemguards.com.ec> wrote: > On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas<robertmhaas@gmail.com> wrote: >> I took a look at this and it seems that #3 can be implemented with >> essentially no additional code (the handful of lines I added where >> more than balanced out by some simplifications in ruleutils.c). Of >> course you still don't have to like it. :-) >> >> Patch attached. >> > > ! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = > c.id)) ON (a.id = b.id); > > what's next? FORCE INDEX? > once we implement one hint like this the complaints about the others > will lose force That would be pretty useless, because while it might work for trivial cases, in a complex plan, it's almost certainly not going to do what you want unless you specify every detail of the entire plan along with it, which kind of defeats the purpose of using a database with a sophisticated planner. Actually, it seems to me that if we were to add hints, the place to start would be with selectivity, since in my experience bad selectivity estimates (often by 3, 4, 5, or 6 orders of magnitude) are often the reason for a bad plan, and if you can fix that, the rest takes care of itself - but fixing it is often very difficult. Of course, improving the selectivity estimator would be even better, because time spent applying hints to queries is time that could be better spent doing something else, and in fact one of the reasons why I started using PostgreSQL is because complex queries Just Work. Constraining the join order does seem a lot less useful, because (for example) it won't compel the planner to use a hash join instead of a nest join, or to make a sensible decision about which relations should be on the inner side of the join. But I still think that it's worth considering, because: 1. We basically already support the functionality - it's just broken. Today, you can control the join order by setting join_collapse_limit = 1; then, the joins you want to order you specify using JOIN syntax; the ones you don't want to order, you write as FROM items. But all of your outer joins will necessarily have the order forced, because there's no way to write those as FROM items. And if you want to use a view that was written without this convention in mind, you're hosed. So I think we either ought to remove it or fix it. 2. It's actually LESS code to support it completely than it is to leave it the way it is now while removing from_collapse_limit and replacing join_collapse_limit with enable_join_ordering. Compare the patch I sent earlier with the one that I'm about to send. 3. The point of this particular type of hint, at least as I see it, is not so much to force the planner into the correct query plan as it is to constrain the planning time. There are very few tools available for this today: join_collapse_limit and from_collapse_limit do it by unexpectedly producing terrible plans, and the only other option is GEQO. In a case like the one posted upthread where you have ten or so joins that can be reordered almost arbitrarily, you could shove a single FORCE right into the middle and split it into two problems that can be planned consecutively. This isn't that different from what join_collapse_limit does today, but it doesn't force a uniform threshold on you across the board; you can apply where, when, and to the extent necessary to control planning time for a particular query, and you have to explicitly ask for it so you can't complain you were ambushed if it doesn't work out. But I can see I'm outvoted, so I give up! ...Robert
pgsql-hackers by date: