On Mon, Mar 17, 2014 at 01:20:47PM -0500, Merlin Moncure wrote:
> A query plan is a complicated thing that is the result of detail
> analysis of the data. I bet there are less than 100 users on the
> planet with the architectural knowledge of the planner to submit a
> 'plan'. What users do have is knowledge of the data that the database
> can't effectively gather for some reason. Looking at my query above,
> what it would need (assuming the planner could not be made to look
> through length()) would be something like:
>
> SELECT * FROM foo WHERE
> length(bar) <= 1000 WITH SELECTIVITY 0.999
> AND length(bar) >= 2 WITH SELECTIVITY 0.999;
A small issue with selectivity is that the selectivity is probably not
what the users are expecting anyway, since many will related to
conditional selectivities. PostgreSQL is pretty good at single column
statistics, it just sometimes screws up on cross-column correlations.
This ties in with alerting about a bad plan: if the EXPLAIN output
could list for each condition what the actual selectivity was it might
give user a way of understanding the problem.
So the example given might lead to output like:
clause selectivity estimated
length(bar)>2 0.50 0.50
length(bar)<1000 | length(bar)>2 0.50 0.25
The execution engine can only output conditional selectivities because
of the order of execution. But this would at least give users a handle
on the problem.
Note that a first cut of the problem might simply be something like
likely()/unlikely() as in gcc.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts. -- Arthur Schopenhauer