Re: generalizing the planner knobs - Mailing list pgsql-hackers
From | Greg Stark |
---|---|
Subject | Re: generalizing the planner knobs |
Date | |
Msg-id | 8764q8h71p.fsf@stark.xeocode.com Whole thread Raw |
In response to | Re: generalizing the planner knobs ("Jonah H. Harris" <jonah.harris@gmail.com>) |
Responses |
Re: generalizing the planner knobs
|
List | pgsql-hackers |
"Jonah H. Harris" <jonah.harris@gmail.com> writes: > Tom, > > Don't get me wrong, I agree with you completely. I would rather put effort > into enhancing the planner than in developing work-arounds. In 99% of all > cases the planner works correctly, but I know people who actually have to > disable planning options (mergejoin) in production applications because they > get bad plans. The "bad" plans are not really bad in terms of what the > planner knows about the query, just in areas where the planner doesn't look > at other things. I would like to draw a distinction between two sorts of hints. Currently you're talking about one sort of hint, namely hints that tell the planner to alter its cost model and choose a different plan than the inputs it has would dictate. Using these require the user to have a fairly in depth understanding of the planner and what options it has available. On the other hand the type I would prefer to see are hints that feed directly into filling in information the planner lacks. This only requires that the user understand his own data and still lets the planner pick the best plan based on the provided information. So for example I would love to see a hint that allowed you to specify the selectivity of a where clause. And one that let you specify the "density" of a grouping clause. Most of the time the planner makes a mistake it's because of a bad miscalculation in estimating these givens. If it had the correct values for the inputs then it would make the right decision about the plan. Making the planner very good at making the right decisions given accurate inputs is an attainable goal. Computers are pretty deterministic and it's possible to come up with very accurate cost models. Despite some known problems with Postgres's current models they're remarkably good already. And there's no particular reason to think they can't be made nearly perfect. Making the planner very good at producing accurate estimates is a much harder goal. No matter how accurate it gets there will always be more complex expressions that are harder to predict and there will always be cases the planner can't estimate well. The user however knows his own data and may well know the answer. In the extreme consider user-defined operators, which will always be dependent on the user to provide estimator functions. If it's a rarely used operator the user may find it easier to simply tell the planner the selectivity of each expression rather than come up with a general solution. (I also think things like joins and group by clauses will never be able to be guaranteed accurate results in general. Not without a lot more costs up front including giving up on calculating statistics based on only a sample.) -- greg
pgsql-hackers by date: