Thread: Optimizing prepared statements
I've rigged up a simple simulator for the scheme I described for detecting pseudo-constant parameters to prepared statements. It withstands simple tests, and it neatly picks up cases where some parameters are pseudo-constants and others aren't--even if some of them are more "pseudo" while others are more "constant." What I need now is some realistic test data. Does anyone have realistic SQL logs that use prepared statements? For now, I'll summarize some results I got from randomized input data. I used very simple traces, with 11 prepared statements, each taking a different number of parameters (0 through 10, inclusive). All calls were uniformly randomized. I used LRU replacement of cached plans, with up to 4 retained plans per statement. Confidence counters ran from 0 to 3 inclusive, with the confidence threshold lying between 1 and 2. Per simulation run, 20,000 statement invocations were processed. Runs of 20,000 took about 3.5 seconds of wall-clock time each, or 0.175 milliseconds per statement, on a lightly-loaded 1.8 GHz 32-bit Athlon XP. That's for simulation in Python 2.4, with no effort to optimize and no precompilation, and several lines of information composed and printed to /dev/null for every invocation. So I think the overhead of the matching and comparing that the algorithm does is not a performance worry. In my first test, parameters were uniformly-distributed integers in the range [0, 999]. Over this test, 104 plans were generated for the 11 plans, for an average 192 calls per generated plan. Only 133 calls out of 20,000 used optimized plans, in this case optimizing out only one pseudo-constant each. When parameters were made to follow the normal distribution with mean 500 and standard deviation 100 (rounded to integers), the number of generated plans went up to 305 as more patterns were recognized, and of course the number of calls per generated plan dropped to 65. Of the 20,000 invocations here, 770 used plans with one parameter optimized away, and 2 used plans with two. These results don't look very good, but bear in mind this is for randomized data. Can't expect to exploit many patterns in random inputs! Real-life use is probably going to be much more favourable. If we want to guard against fluke patterns in highly-variable parameters, we can always increase the range of the confidence counters. That would make the predictor more skeptical when it comes to accepting reptitions as patterns. Just how we tune the counters would be a tradeoff between the cost of additional planning and the benefits of optimizing out more parameters. So once again, does anyone know of any realistic logs that I can use for more useful simulations? Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > For now, I'll summarize some results I got from randomized input data. I > used very simple traces, with 11 prepared statements, each taking a > different number of parameters (0 through 10, inclusive). All calls were > uniformly randomized. I used LRU replacement of cached plans, with up to > 4 retained plans per statement. Confidence counters ran from 0 to 3 > inclusive, with the confidence threshold lying between 1 and 2. I'm confused, what exactly are you trying to predict? Whether each parameter will be some cached value? Or whether the cached plan was correct? > So once again, does anyone know of any realistic logs that I can use for > more useful simulations? You might look at the DBT test suite, iirc the TPCC spec it implements intentionally mixes random queries with predictable queries. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Sun, September 3, 2006 18:41, Gregory Stark wrote: > I'm confused, what exactly are you trying to predict? Whether each > parameter > will be some cached value? Or whether the cached plan was correct? That's described in more detail in a separate thread ("prepared statements considered harmful"). In a nutshell, the algorithm detects pseudoconstant parameters to prepared statements, and keeps a small set of different plans optimized for recurring combinations of constants. >> So once again, does anyone know of any realistic logs that I can use for >> more useful simulations? > > You might look at the DBT test suite, iirc the TPCC spec it implements > intentionally mixes random queries with predictable queries. I considered the TPC benchmarks, but they're still benchmarks. When seen from one angle they try to look like real applications, but I'm worried that in testing this algorithm, I may be looking at them from a very different angle. I'd still need actual application logs to validate them! Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > On Sun, September 3, 2006 18:41, Gregory Stark wrote: > >> I'm confused, what exactly are you trying to predict? Whether each >> parameter >> will be some cached value? Or whether the cached plan was correct? > > That's described in more detail in a separate thread ("prepared statements > considered harmful"). In a nutshell, the algorithm detects pseudoconstant > parameters to prepared statements, and keeps a small set of different > plans optimized for recurring combinations of constants. I read that but apparently I misunderstood it since it would not have been doable the way I understood it. I thought you wanted the predictor bits to correspond to particular plans. If a plan was "wrong" then you marked it as a bad guess. I don't think that can be made to work though for the reasons I stated then. But if you have something working clearly that's not what you're doing. So what are you doing? Storing up a list of arguments seen for each parameter when executed and use the predictor bits to determine if any of those arguments are constants? Storing up a list of selectivity estimates? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Sun, September 3, 2006 21:52, Gregory Stark wrote: > I read that but apparently I misunderstood it since it would not have been > doable the way I understood it. I thought you wanted the predictor bits to > correspond to particular plans. If a plan was "wrong" then you marked it > as a > bad guess. I don't think that can be made to work though for the reasons I > stated then. Oh, sorry--I guess I haven't been too systematic about it. In the algorithm's current incarnation, the confidence counters don't mark anything as bad ideas, as such. Instead, whenever a new invocation has a set of parameters that are (1) correctly and (2) confidently predicted by the counters, the predictor decides that it's probably a good idea to plan for calls with that particular set of parameters built in as constants. Assuming we didn't already have a plan for the particular combination of values at hand, the algorithm generates a new one. The new plan is optimized on the assumption that those predicted parameters are constants. We keep a small cache of recently-used plans, possibly including the original plan where all parameters are truly variable. Every plan also remembers a list of the predicted parameter values, so on any next call, we can check whether a particular cached plan actually applies to the call. If it doesn't (because of a mismatch between the incoming parameters and the plan's assumed pseudo-constants), we just pick another plan. If multiple cached plans can be applied to a given call, we prefer the one that optimizes away the most parameters. Age is used as a tiebreaker, on the assumption that more recent planning information is likely to be more accurate. The tricky part is deciding when to generate a new, more specialized plan when we already have a matching one that may not be optimal. Without this step we'd never get beyond that first, completely generic plan--it applies to every call. The way I've approached it is this: when the predictor's current state correctly and confidently predicts more of the invocation's parameter values than any of the cached plans did, then it's time to generate a new plan. So let's say your statement has a parameter x that's always the same value, say x==0, and another parameter y that's a randomly alternating Boolean value, and another one z that varies randomly between lots of values. What order they're in doesn't matter, and they needn't be the only parameters. You're probably going to see four plans generated for this example: 1. Either on the first call or during definition, you get the generic plan. This is the same plan you'd get in the existing backend, with placeholders for variable x, y, and z. 2. Pretty soon, the algorithm is going to detect that x is always zero. It will generate a new plan, substituting the constant value 0 for x, and hopefully getting better optimization because of it. 3. Sooner or later (probably fairly soon) you'll see a run of consecutive calls where y happens to be "true." A new plan is generated with the assumption that y==true. The new plan will also still assume that x==0. 4. The same is going to happen for y==false. Yet another specialized plan is generated. If we keep up to 3 plans per statement, say, then this new plan overflows the cache. The least recently used plan is flushed to make room for the new one--in this case, the generic one because we haven't seen any cases where x!=0 recently. More complex scenarios will also happen, of course, such as "if y==true then x will usually be 0, but otherwise x will be highly variable" or "if y==true then x is pseudo-constant and z is highly variable, but if y==false then it's the other way around" or "if y==false then z is usually the empty string," and so on. The predictor as I've simulated it is "too dumb to be intimidated" by the complexity. It should work reasonably well for all those scenarios, assuming of course that its cache is large enough to remember the most frequent patterns. Right now I'm using the plans' time since last use as the only eviction criterion when the cache overflows. There may be a better policy; the one Achilles heel of LRU is the "big loop" where every cache entry is used once, then evicted shortly before it is needed again. (If the loop is so big that entries are flushed long before they're needed again, well, then it's just a big job and you stop blaming LRU :-) > But if you have something working clearly that's not what you're doing. So > what are you doing? Storing up a list of arguments seen for each parameter > when executed and use the predictor bits to determine if any of those > arguments are constants? Storing up a list of selectivity estimates? The former. I'm keeping a single predictor with a single "more or less last-seen value" per parameter; plus a checklist of pseudoconstants for every cached plan. It's pretty simple, really, with no cost functions or spanning trees or other intelligent logic--and certainly nothing original.Which is what makes me cautiously optimistic: it'snot so hard to come up with good or original ideas, but ones that are good *and* original are exceedingly rare. :) This particular algorithm is based entirely on standard processor architecture tricks. One (to me) surprising lesson I learned in that field was that simple, dynamic schemes with obvious flaws are often more effective than what the combination of a smart programmer, an expert user, and an aggressive compiler can come up with beforehand. Another one was that performance breakthroughs often come from applying these standard tricks to problems you'd think they'd already been applied to. Jeroen
On Sun, September 3, 2006 23:28, Jeroen T. Vermeulen wrote: > On Sun, September 3, 2006 21:52, Gregory Stark wrote: >> I read that but apparently I misunderstood it since it would not have >> been >> doable the way I understood it. I thought you wanted the predictor bits >> to >> correspond to particular plans. If a plan was "wrong" then you marked it >> as a >> bad guess. I don't think that can be made to work though for the reasons >> I >> stated then. Come to think of it, I still haven't answered your question very clearly... I keep one predictor for any given statement. The predictor keeps two pieces of state *per statement parameter*: 1. The predictor value (usually the last-seen value for that parameter, though it does take a few mismatches to make it drop a value that was repeated a lot). 2. A saturating counter measuring confidence in the current predictor value. As long as this counter is below the confidence threshold, the predictor value has no other meaning than "let's see if this value comes back." No information flows between these pieces of state. All logic in the predictor itself is entirely local to individual parameters. But any new plan is generated based on a "snapshot" of all predictor values that (i) match their respective parameter values in the ongoing call, and (ii) have their counters above the confidence threshold. It is that combination of parameters, for that combination of values, that is taken as a set of pseudo-constants. The cache can hold any number of plans optimized for the same combinations of parameters but with different values; or for different combinations of parameters but the same values; subsets and supersets of parameters with either the same or different values; completely disjoint sets; anything. The cache neither knows nor cares. The only thing that *won't* happen is two plans in the same cache covering the same set of parameters with the same set of values--because there is never any need to generate that second plan while the first is still in cache. Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > If multiple cached plans can be applied to a given call, we prefer the one > that optimizes away the most parameters. What exactly do you mean by "optimize away a parameter"? The way you described the mechanism, there are no parameters that are "optimized away", you've merely adjusted selectivity predictions using some assumed values. Actually converting a parameter to a constant is a whole 'nother matter --- it allows constant-folding for example. But then you *cannot* use the plan unless there's an exact match to the assumed value. These two approaches provide very different tradeoffs of plan quality vs plan specificity, so it makes me uncomfortable that you're failing to clarify what you mean. regards, tom lane
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > Oh, sorry--I guess I haven't been too systematic about it. In the > algorithm's current incarnation, ... Thanks, that cleared things up enormously. I'm trying to figure how it would react to some of the queries I've written in the past. In particular I'm thinking of queries like WHERE (? OR category = ?) AND (? OR cost < ?) AND (? OR description like ?||'%') Where I then pass flags in from the application to indicate which search constraints to apply. If it notices that most searches are for a particular set of constraints it would be able to cache plans with the unused constraints removed. It would not however be able to notice that the last parameter never contains a % and therefore can use an index scan. I'm also wondering how this interacts with plan stability. Obviously the direct effect is to throw out any chance of it. But in the long run they may be two complementary sides of the same thing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Sun, September 3, 2006 23:52, Tom Lane wrote: > What exactly do you mean by "optimize away a parameter"? The way you > described the mechanism, there are no parameters that are "optimized > away", you've merely adjusted selectivity predictions using some assumed > values. I'm using "optimized away" as shorthand for "replaced with a literal constant in the statement definition used to generate the plan." So if a parameter $n is found to be, say, always equal to the string 'foo', then we might want to generate a specialized plan as if the statement's definition contained the literal string 'foo' wherever it really says $n. I've been calling that "optimized for $n=='foo'" or "$n is optimized away in this plan." > Actually converting a parameter to a constant is a whole > 'nother matter --- it allows constant-folding for example. But then you > *cannot* use the plan unless there's an exact match to the assumed > value. These two approaches provide very different tradeoffs of plan > quality vs plan specificity, so it makes me uncomfortable that you're > failing to clarify what you mean. Right. When I said "optimized for" a certain parameter value, I meant actual substitution the whole time. I'm sorry if I didn't make that clear; it seemed so basic that I must have forgotten to mention it. I guess the principle would also work otherwise, but it's intended to allow constant folding. So for any given statement, there would be a cache of frequently-needed plans for different sets of constant substitutions. As you point out, a call could only use a plan if the plan's substitutions were consistent with the call's parameter values (but within that constraint, the more substitutions the merrier). That's why I talked so much about comparing and matching: that part is for correctness, not optimization. As I've said before, all this falls down if there is a significant cost to keeping one or two extra plans per prepared statement. You mentioned something about "tracking" plans. I don't know what that means, but it sounded like it might impose a runtime cost on keeping plans around. Could you elaborate? Jeroen
On Mon, September 4, 2006 03:56, Gregory Stark wrote: > Thanks, that cleared things up enormously. I'm trying to figure how it > would > react to some of the queries I've written in the past. In particular I'm > thinking of queries like > > WHERE (? OR category = ?) > AND (? OR cost < ?) > AND (? OR description like ?||'%') > > Where I then pass flags in from the application to indicate which search > constraints to apply. If it notices that most searches are for a > particular > set of constraints it would be able to cache plans with the unused > constraints > removed. Right. That's pretty much the problem as Peter originally described it, I think. > It would not however be able to notice that the last parameter never > contains a % and therefore can use an index scan. If I understand you correctly, then no. If the algorithm sees highly variable values for that last parameter, it will never decide to assume that that parameter will never contain '%'--and I'm not sure how that could be done safely. I do see two glimmers of hope, however: 1. If that last parameter is usually some specific value, then you'll probably end up using specialized plans with that specific value in the parameter's place. If that value is a string without wildcards, you can use your index on description (assuming you have one). If it's '%' or null, the optimizer can decide to ignore the "like" clause. It's only when the scheme finds that it cannot predict what the parameter's value is going to be that you get the generic, poorly-performing code. 2. Once we have a predictor, and assuming it works, it could be tied in with the planner a bit more. As I believe Tom said, the planner can't afford to chase down lots of scenarios just in case they ever happen. But when a parameter is used only for simple matches or inserts on non-indexed columns, for example, the planner might find in the course of its normal activities that there's nothing useful it can do with that parameter and deliver this information with its plan, so that the predictor can ignore the parameter. > I'm also wondering how this interacts with plan stability. Obviously the > direct effect is to throw out any chance of it. But in the long run they > may be two complementary sides of the same thing. Well, it'll cause some plans to be re-generated, surely. But the impression I've gotten from the discussion so far is some that plans were getting too old anyway. Jeroen
On Mon, Sep 04, 2006 at 11:12:13AM +0700, Jeroen T. Vermeulen wrote: > As I've said before, all this falls down if there is a significant cost to > keeping one or two extra plans per prepared statement. You mentioned > something about "tracking" plans. I don't know what that means, but it > sounded like it might impose a runtime cost on keeping plans around. I think what he meant is tracking plans during the planning process. Currently at the end of each step you weed out all the plans that arn't the best for each path-key. To track multiple results at that stage would be expensive. However, just running the planner over the same query multiple times with different estimates shouldn't be too expensive to store. However, you're discussing the process of replanning based on changes in variables. At the moment we really need to work on replanning generally, it isn't done at all currently... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > On Sun, September 3, 2006 23:52, Tom Lane wrote: >> What exactly do you mean by "optimize away a parameter"? The way you >> described the mechanism, there are no parameters that are "optimized >> away", you've merely adjusted selectivity predictions using some assumed >> values. > I'm using "optimized away" as shorthand for "replaced with a literal > constant in the statement definition used to generate the plan." Ah. I think you're confusing the spectators by using "predict" when you should say "match". You're looking for previously generated plans that have assumed parameter values matching the current query --- saying that the plan "predicts" a parameter value is just a really poor choice of wording. regards, tom lane
On Mon, September 4, 2006 23:03, Tom Lane wrote: > Ah. I think you're confusing the spectators by using "predict" when you > should say "match". You're looking for previously generated plans that > have assumed parameter values matching the current query --- saying that > the plan "predicts" a parameter value is just a really poor choice of > wording. I guess so... It's been over a decade since I last busied myself with database optimization, so I don't know any of the jargon. Had to use processor architecture jargon instead. So with that out of the way, can anyone think of some good real-life examples of prepared statement usage that I can test against? Suggestions I've had include TPC, DBT2 (based on TPC-C), and pgbench, but what I'm really looking for is traces of invocations by real applications. If we don't have a body of application traces available, can anyone think of a convenient, nonintrusive way I could extract them out of applications I'm running? If there is, I could write a filter to eliminate irrelevant information. The filter would ignore everything other than prepared statement invocations. It could randomize statement names, strip out their definitions, hide the ordering between different statements, and replace parameter values with meaningless numbers; so it would be relatively safe for others to volunteer traces of their own applications. None of those transformations would affect my simulation results. Jeroen
Jeroen, > So with that out of the way, can anyone think of some good real-life > examples of prepared statement usage that I can test against? Suggestions > I've had include TPC, DBT2 (based on TPC-C), and pgbench, but what I'm > really looking for is traces of invocations by real applications. I've requested that the Sun SpecJApp benchmark engineer harvest a log for us. -- Josh Berkus PostgreSQL @ Sun San Francisco