On Tue, Mar 18, 2014 at 12:46 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Mar 17, 2014 at 1:45 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I don't believe a SELECTIVITY hint can work well either. Slow queries are
> usually related to strange spots in the data. I think a good approach should
> be based on the validity of estimates. Some plans are based on totally wrong
> estimates but can still be fast because they are less sensitive to estimation
> errors. So a good approach would be to penalize risky plans - or to use brute
> force, as a column store engine does; its plans are usually simple and
> tolerant of bad estimates.
Disagree. There is a special class of slow queries where the problem is not with the data but with the expression over the data; something in the query defeats the sampled selectivity. Common culprits are expressions computed over columns (function calls, casts, CASE constructs, and the like) for which the planner has no statistics. When using those expressions, you often end up with default selectivity assumptions and if they are way off -- watch out.
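To illustrate (hypothetical table; this assumes there is no expression index on lower(email), since that would give the planner real statistics for the expression):

    -- per-column statistics exist for "email", so this estimate
    -- tracks the actual data distribution:
    EXPLAIN SELECT * FROM users WHERE email = 'a@example.com';

    -- wrapping the column in an expression discards those statistics;
    -- the planner falls back to a default selectivity constant:
    EXPLAIN SELECT * FROM users WHERE lower(email) = 'a@example.com';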
Plan risk analysis solves a different problem: small changes in the data mean big changes in the execution runtime. It probably wouldn't even help cases where the server thinks there is one row and you actually have thousands or millions, unless you want to implement a selectivity range with perhaps a risk coefficient. That was also suggested some time back and was also met with some skepticism (but it'd be interesting to see!).
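The shape of that failure, roughly (a made-up EXPLAIN ANALYZE fragment, numbers purely illustrative):

    Nested Loop  (cost=0.43..17.02 rows=1 width=64)
                 (actual time=0.041..5123.548 rows=412345 loops=1)

    -- the planner bet on a single outer row and chose a nested loop;
    -- with ~400k actual rows the inner side is re-executed ~400k times,
    -- which is the kind of cliff a risk coefficient could price in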
Another case is with prepared statements, when things like array size are not known to the planner and the planner makes a hard-coded estimate for them, leading to the selection of a generic plan that is worse than the custom plan.
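A sketch of that case (hypothetical table; if memory serves, the planner assumes something like ten elements for an array parameter whose value it cannot see):

    PREPARE fetch_items(int[]) AS
      SELECT * FROM items WHERE id = ANY ($1);

    -- at generic-plan time the planner cannot see how many elements $1
    -- holds, so it plugs in a fixed guess; an EXECUTE passing a
    -- 100000-element array may then run under a plan sized for ~10 rows:
    EXECUTE fetch_items(ARRAY[1, 2, 3]);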
This would be even more useful for prepared statements, since they need some support from the user in terms of selectivity. The user should be allowed to tell us more about the data, since by choosing prepared statements he has already given us some indication of the kind of query plans he requires.