Re: [PERFORM] Query much slower when run from postgres function - Mailing list pgsql-jdbc
From | Guillaume Cottenceau |
---|---|
Subject | Re: [PERFORM] Query much slower when run from postgres function |
Date | |
Msg-id | 87wsaxk94w.fsf@meuh.mnc.lan Whole thread Raw |
In response to | Re: [PERFORM] Query much slower when run from postgres function (Oliver Jowett <oliver@opencloud.com>) |
Responses |
Re: [PERFORM] Query much slower when run from postgres
function
|
List | pgsql-jdbc |
Oliver Jowett <oliver 'at' opencloud.com> writes: > Guillaume Cottenceau wrote: >> Oliver Jowett <oliver 'at' opencloud.com> writes: >> >>> The idea behind the threshold is that if a PreparedStatement object is >>> reused, that's a fairly good indication that the application wants to >>> run the same query many times with different parameters (since it's >>> going to the trouble of preserving the statement object for reuse). But >> >> Or it may just need the safeness of driver/database parameter >> "interpolation", to get a "free" efficient safeguard against SQL >> injection. > > In which case, the application usually throws the PreparedStatement > object away after executing it once, and the threshold is never reached. > As I said, the application has to do extra work to preserve exactly the > same PreparedStatement object for reuse before the threshold applies, at > which point it's reasonable to assume that it could be a > performance-sensitive query that would benefit from preserving the query > plan and avoiding parse/plan costs on every execution. Thanks for the clarification! That may just be me, but I see two issues here: first, parsing and planning are tied together, but parsing should be always done first time only as I see no point in reparsing in subsequent uses of the PreparedStatement?; second, it's still questionable that a "performance-sensitive" query should mean benefiting from preserving the query plan: I have seen dramatic use cases where the preserved query plan opted for a seqscan and then the query was orders of magnitude slower than it should because the actual then used values would have qualified for an indexscan. > It's just a heuristic because there *is* a tradeoff and many/most > applications are not going to be customized specifically to know about > that tradeoff. And it's configurable because the tradeoff is not the > same in every case. Yes, and it's well documented, actually. I obviously didn't read it carefully enough last time :/ I guess my approach of using the protocol version 2 should be replaced by unsetting the prepared threshold.. I think I came up with that workaround after that post from Kris: http://archives.postgresql.org/pgsql-jdbc/2008-03/msg00070.php because strangely, you and I intervened in that thread, but the prepared threshold issue was not raised, so I followed the protocolVersion=2 path. Did I miss something - e.g. is the topic today different from the topic back then, for some reason? Am I wrong in assuming that your "please replan this statement every time you get new parameters" suggestion is nearly-achievable with unsetting the prepared threshold ("nearly" being the difference between replanning always, and replanning only when parameters are new)? Anyway, documentation-wise, I've tried to think of how the documentation could be a little more aggressive with the warning: http://zarb.org/~gc/t/jdbc-more-cautious-preparedstatements.diff That said, there's something more: when the documentation says: There are a number of ways to enable server side prepared statements depending on your application's needs. The general method is to set a threshold for a PreparedStatement. I assume that by default server side prepared statements are *not* enabled, although it seems to be the case, with a threshold of 5 as a simple test shows when using driver 8.3-604.jdbc3 (on PG 8.3.6). I think that either they should not be enabled by default (really, it could be better with, but it could be so much worse that is it really a good idea to make a "dropin" use of the driver use it?), or the documentation should clearly state they are, and add even more warnings about potential drawbacks. WDYT? http://zarb.org/~gc/t/jdbc-more-cautious-preparedstatements2.diff Btw, how can the doc be built? "ant doc" failed on missing docbook.stylesheet but I was unable to find how to set that value. > Do you have a suggestion for a better way to decide when to use a named > statement? Oh, I feel I don't have the qualifications to answer that question, sorry! The only thing I could think of, was what I talked about in a previous mail, e.g. save all plans of the first xx queries before reaching the threshold, and then when the threshold is reached, compare the global cost estimates of the saved plans, and do not activate server side prepare if they are too different, as caching the plan for that query would probably yield too slow results sometimes. Ideally, I guess a new PG-specific method should be added to activate that feature (and set the value for "are the plans too different?"). But bear in mind that it may be a stupid idea :) -- Guillaume Cottenceau
pgsql-jdbc by date: