Thread: isSingleXXX in AbstractJdbc1Statement
G'morning (or as appropriate for your time zone),

I'm working on refactoring some code from AbstractJdbc1Statement, and
there are comments and design all over the place indicating or assuming
that the isSingleStatement, isSingleDML, and isSingleSelect methods are
terribly expensive. As far as I can see, though, they aren't!
isSingleStatement is implicitly calculated when parsing the query, and
the other two involve a single call to each of String.trim,
String.toLowerCase, and String.startsWith. I suppose the
String.toLowerCase might be an issue for absolutely gigantic queries,
but it looks like we could deal with that by simply doing a substring to
grab the first seven letters before doing the compare.

Anyone disagree?

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
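[Editor's note] The substring idea above can be sketched roughly as follows. This is not the driver's actual code: the class name StatementPrefix, its method names, and the PREFIX_LEN constant are hypothetical, and the keyword list (select / insert / update / delete) is an assumption about what the real checks look for. Leading whitespace is skipped with the same "<= ' '" rule that String.trim uses, so only a seven-character prefix is ever lower-cased, regardless of query length.

```java
import java.util.Locale;

// Hypothetical helper; not part of the PostgreSQL JDBC driver.
final class StatementPrefix {
    private StatementPrefix() {}

    // "Select", "insert", "update" and "delete" are six letters each;
    // seven characters matches the figure mentioned in the message.
    private static final int PREFIX_LEN = 7;

    // Returns at most PREFIX_LEN characters from the first non-blank
    // position, lower-cased. The rest of the query is never copied.
    static String leadingKeyword(String sql) {
        int len = sql.length();
        int start = 0;
        while (start < len && sql.charAt(start) <= ' ') {
            start++; // same notion of whitespace as String.trim
        }
        int end = Math.min(len, start + PREFIX_LEN);
        return sql.substring(start, end).toLowerCase(Locale.ROOT);
    }

    static boolean isSelect(String sql) {
        return leadingKeyword(sql).startsWith("select");
    }

    // Assumed DML keyword set, for illustration only.
    static boolean isDML(String sql) {
        String head = leadingKeyword(sql);
        return head.startsWith("insert")
            || head.startsWith("update")
            || head.startsWith("delete");
    }
}
```

Like the trim/toLowerCase/startsWith version it replaces, this is a prefix test only; it does not check for a word boundary after the keyword.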
Chris Smith wrote:
> G'morning (or as appropriate for your time zone),
>
> I'm working on refactoring some code from AbstractJdbc1Statement, and there
> are comments and design all over the place indicating or assuming that the
> isSingleStatement, isSingleDML, and isSingleSelect methods are terribly
> expensive. As far as I can see, though, they aren't! isSingleStatement is
> implicitly calculated when parsing the query, and the other two involve a
> single call to each of String.trim, String.toLowerCase, and
> String.startsWith. I suppose the String.toLowerCase might be an issue for
> absolutely gigantic queries, but it looks like we could deal with that by
> simply doing a substring to grab the first seven letters before doing the
> compare.
>
> Anyone disagree?

When I put that code in I was trying to keep the cost of queries that
didn't need to be transformed (due to non-query reasons, e.g. state of
autocommit) to a minimum -- the current approach is certainly cheaper
and probably generates less garbage than the more obvious approach, but
I don't know by how much. At the time it was a fairly big change and I
was trying to preempt objections :)

Also, please don't reparse on each execution (which is the other thing
that isSingleSelect and friends try to avoid). My use case for this is:

   prepare statement once
   execute statement 100k times with different parameters
   keep statement around for later reuse

or the equivalent via batch updates. This currently chews a *lot* of CPU
on the Java side while executing -- on the order of a 50/50 split
between Java and the backend. I've been trying to reduce the JDBC
overhead via things like isSingleSelect.

However, this use case seems to be common elsewhere:

   prepare statement
   execute statement with some parameters
   discard statement

and we need to support both.

-O
Oliver Jowett wrote:
> When I put that code in I was trying to keep the cost of queries that
> didn't need to be transformed (due to non-query reasons e.g. state of
> autocommit) to a minimum -- the current approach is certainly cheaper
> and probably generates less garbage than the more obvious approach,
> but I don't know by how much. At the time it was a fairly big change
> and I was trying to preempt objections :)

Okay. As far as I can see, the calculation of all of these things
requires the following:

1. Two comparisons per query character.
2. A constant 7 assignments.
3. A constant 28 comparisons.
4. One extremely short-lived String object (that shares its data via
   substring).

Frankly, that should be negligible cost on anything faster than a 6052
processor.

> Also please don't reparse on each execution (which is the other thing
> that isSingleSelect and friends try to avoid)

Definitely. I don't plan to repeat the calculation; only to avoid
deferring it. I think the deferred calculations introduce unnecessary
work on every execution, just to avoid about 100 processor cycles or so
when first parsing the query.

> This currently chews a *lot* of
> CPU on the Java side while executing -- on the order of a 50/50 split
> between Java and the backend.

I'm interested in your ideas on why this is the case. I suspect a lot of
it may have something to do with building the text form of the
parameters to inject them into the query (in which case I'm working on
solving exactly that problem by making these changes).
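[Editor's note] The "calculate once at parse time, don't defer" approach argued for above might look roughly like the sketch below. Everything here is an assumption made for illustration: the class ParsedQuery, its field names, and the deliberately simplified statement counting (split on ';' outside single-quoted literals) are hypothetical and do not reproduce the driver's real parser. The point is only that the three flags become final fields set in one pass, so each execution reads a boolean instead of checking whether a lazy value has been computed yet.

```java
import java.util.Locale;

// Hypothetical illustration; not the driver's actual parser.
final class ParsedQuery {
    final String sql;
    final boolean singleStatement;
    final boolean singleSelect;
    final boolean singleDML;

    ParsedQuery(String sql) {
        this.sql = sql;
        // All three flags are computed once, here, and never again.
        this.singleStatement = countStatements(sql) == 1;
        String head = head(sql);
        this.singleSelect = singleStatement && head.startsWith("select");
        this.singleDML = singleStatement
            && (head.startsWith("insert")
                || head.startsWith("update")
                || head.startsWith("delete"));
    }

    // Simplified assumption: statements are separated by ';' outside
    // single-quoted literals; empty statements are not counted.
    private static int countStatements(String sql) {
        boolean inQuote = false;
        boolean pending = false; // non-blank text seen since last ';'
        int count = 0;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {
                inQuote = !inQuote;
                pending = true;
            } else if (c == ';' && !inQuote) {
                if (pending) {
                    count++;
                }
                pending = false;
            } else if (c > ' ') {
                pending = true;
            }
        }
        return pending ? count + 1 : count;
    }

    // First seven non-blank characters, lower-cased.
    private static String head(String sql) {
        int start = 0;
        while (start < sql.length() && sql.charAt(start) <= ' ') {
            start++;
        }
        int end = Math.min(sql.length(), start + 7);
        return sql.substring(start, end).toLowerCase(Locale.ROOT);
    }
}
```

A statement object that holds one ParsedQuery for its whole lifetime pays this cost once, whether it is executed a single time or 100k times.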
Chris Smith wrote:
>>Also please don't reparse on each execution (which is the other thing
>>that isSingleSelect and friends try to avoid)
>
> Definitely. I don't plan to repeat the calculation; only to avoid deferring
> it. I think the deferred calculations introduce unnecessary work on every
> execution, just to avoid about 100 processor cycles or so when first parsing
> the query.

Not having benchmarked either approach I can hardly disagree :)

Object creation is a real bugbear in our app so I tend to reflexively
avoid it where I can, even when the direct CPU benefit is dubious.

>>This currently chews a *lot* of
>>CPU on the Java side while executing -- on the order of a 50/50 split
>>between Java and the backend.
>
> I'm interested in your ideas on why this is the case. I suspect a lot of it
> may have something to do with building the text form of the parameters to
> inject them into the query (in which case I'm working on solving exactly that
> problem by making these changes).

I can only speculate, since the Java profiling tools are so bad -- I'm
yet to get a good CPU profile out of this bit of code. It all *seems*
fairly cheap so I can only guess that it's an accumulation of many small
operations along the way. Object creation doesn't seem to be the root of
the problem, as even with heap settings that avoid GCing frequently it's
chewing CPU the whole time, not just during GCs.

I should really do some more profiling of this area; I've just been
avoiding it because it's so painful to do.

-O
Oliver Jowett wrote:
> I can only speculate, since the java profiling tools are so bad -- I'm
> yet to get a good CPU profile out of this bit of code. It all *seems*
> fairly cheap so I can only guess that it's an accumulation of many
> small operations along the way. Object creation doesn't seem to be
> the root of the problem, as even with heap settings that avoid GCing
> frequently it's chewing CPU the whole time, not just during GCs.
>
> I should really do some more profiling of this area, I've just been
> avoiding it because it's so painful to do..

Fair enough. When I finish this patch, I'll certainly do some
comparative testing and look forward to your own experiences. Finishing
is starting to look not *so* far away any longer -- I just need to get
batch updates and updatable result sets working, do the prepared
statement and cursor stuff for v3 extended query, and then test and
clean up.