On 02 Dec 2005 15:49:02 -0500, Greg Stark <gsstark@mit.edu> wrote:
> Rod Taylor <pg@rbt.ca> writes:
> > The missing capability in this case is to be able to provide or generate
> > (self learning?) statistics for a function that describe a typical result
> > and the cost of getting that result.
>
> Ok, try "WHERE radius_authenticate(user, (select ...), ?)"
>
> The point is that you can improve the estimates the planner gets. But you can
> never make them omniscient. There will always be cases where the user knows
> his data more than the planner. And those hints are still valid when a new
> optimizer has new plans available.
Actually... If a statistics engine stores the entire query as well
and used that as a key I don't see why it couldn't figure this out.
I.e. in queries that look like Z operation X has historically had
selectivity Y.
The the instruction to the user is simple: 'make sure that queries
with different results look different' . This is often naturally the
case.
The challenge becomes how do you group together queries which are
mostly the same so that you get enough data, but not falsely cluster
queries with different statistics.
The simplest way check the statistics list for the most similar query
match, and use that information. If the result is similar to what is
expected, use it to update the statistics record. If the measured
selectivity is too different make a new record which will then attract
similar queries.
Sounds like a good research project for someone.