Re: Hints proposal - Mailing list pgsql-performance
From | Christopher Browne
Subject | Re: Hints proposal
Date |
Msg-id | 87r6xd9b91.fsf@wolfe.cbbrowne.com
In response to | Re: Hints proposal (Arjen van der Meijden <acmmailing@tweakers.net>)
List | pgsql-performance
Quoth rabroersma@yahoo.com (Richard Broersma Jr):
>> By the way, wouldn't it be possible if the planner learned from a
>> query execution, so it would know if a choice for a specific plan or
>> estimate was actually correct or not for future reference? Or is
>> that in the line of DB2's complexity and a very hard problem and/or
>> would it add too much overhead?
>
> Just thinking out loud here...
>
> Wow, a learning cost-based planner sounds a lot like a problem for
> control and dynamical systems theory.

Alas, dynamic control theory, home of considerable numbers of
Hamiltonian equations, as well as Pontryagin's Minimum Principle, is
replete with:

a) Gory multivariate calculus

b) The need for all kinds of continuity requirements (e.g. continuous,
   smooth functions with no discontinuities or other "nastiness"),
   without which the math gets *really* nasty

We don't have anything even resembling "continuous" here, because our
measures are all discrete (e.g. the base values are all integers).

> As I understand it, much of the advice given for setting PostgreSQL's
> tunable parameters comes from "rules of thumb." I am sure that the
> effect of all of the parameters on server performance could be
> modeled, and an adaptive feedback controller could be designed to
> tune these parameters as demand on the server changes.

Optimal control theory loves the "bang-bang" control, where you go to
one extreme or the other; that requires all those continuity conditions
I mentioned, and it is almost certainly not the right answer here.

> Although I suppose a controller like this would have limited success,
> since some of the most effective parameters are not run-time tunable.
>
> In regards to query planning, I wonder if there is a way to model a
> controller that could adjust/alter query plans based on a comparison
> of expected and actual query execution times.

I think there would be something awesomely useful about recording
expected and actual statistics along with some of the plans.

The case that is easiest to argue for is where Actual >>> Expected
(i.e. the actual cost "was a whole lot larger than" expected); in such
cases, you have already spent a LONG time on the query, which means
that spending a millisecond recording the moral equivalent of EXPLAIN
ANALYZE output should be an immaterial cost.

If we could record a whole lot of these cases, and possibly, with some
anonymization/permissioning, feed the data to a central place, then
some analysis could be done to see whether there is merit to particular
modifications of the query plan cost model.

Part of the *really* fundamental query optimization problem is that
there is some evidence that the cost model does not perfectly reflect
the actual costs of queries. Improving the quality of the cost model is
one of the factors that would improve the performance of the query
optimizer; that would represent a fundamental improvement.

--
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/languages.html
"If I can see farther it is because I am surrounded by dwarves."
-- Murray Gell-Mann
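[Editor's sketch: the "record a plan when Actual >>> Expected" idea
described above was later realized, in essence, by the auto_explain
contrib module (added to PostgreSQL in 8.4, after this thread). The
settings below are real auto_explain parameters; the 5s threshold is an
arbitrary illustrative value, not anything proposed in the thread.]

    # postgresql.conf -- load the module and log plans of slow queries
    shared_preload_libraries = 'auto_explain'
    auto_explain.log_min_duration = '5s'   # only record queries slower than 5s
    auto_explain.log_analyze = on          # include actual row counts and times

With log_analyze enabled, each logged plan carries both the planner's
estimated rows/cost and the actual rows/time for every node, which is
exactly the expected-versus-actual pair one would want to mine for
cost-model errors.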