Re: JDBC, prepared queries, and partitioning - Mailing list pgsql-jdbc
From | Oliver Jowett |
---|---|
Subject | Re: JDBC, prepared queries, and partitioning |
Date | |
Msg-id | 47B37549.3@opencloud.com Whole thread Raw |
In response to | Re: JDBC, prepared queries, and partitioning (Josh Berkus <josh@agliodbs.com>) |
List | pgsql-jdbc |
Josh Berkus wrote: >>> If the wild variations don't affect the cost estimate, they won't >>> affect the plan? >> Cost estimates are made at planning time. If you don't replan, no, they >> won't. > > So, just to confirm: there is no way with v3 to make it not use a cached > plan, either on 8.2 or 8.3, correct? I think there's a lot of confusion going on here.. "Cached plan" is pretty vague :( I'll try to summarize here: The JDBC driver, when using the v3 protocol, will use either a named or unnamed statement with parameter placeholders, and send the parameter values out of line from the query. If the unnamed statement is used: * Planning occurs at bind time, not parse time, and uses actual bound parameter values for cost estimation. * However the planner still produces a plan that is correct for all possible parameter values, as the protocol allows for the unnamed statement to be reused with different parameter values. So some optimizations are not possible. * The JDBC driver never actually reuses the unnamed statement with different values in this way. The unnamed statement is discarded after one use. If a named statement is used: * Planning occurs at parse time. No actual parameter values are available for cost estimation. A general plan is generated. * The JDBC driver may reuse the statement (and associated plan) for future query execution with different parameter values. The JDBC driver selects use of a named or unnamed statement based on the prepareThreshold connection parameter. For a particular PreparedStatement, a use count is maintained. If the use count is below the prepare threshold, an unnamed statement is used. If the use count exceeds the threshold, a named statement is used (and reused for subsequent execution of that same PreparedStatement object). This is done because many applications don't actually reuse their PreparedStatements; we want to make sure they're actually being reused before we pay the extra cost of using a named statement. Does that clear things up a bit ? To try to answer your specific question: You can disable use of a "cached plan" by forcing the driver to always use unnamed statements. In that case, every new query execution re-parses and re-plans. To do this, either don't reuse your PreparedStatement objects, or set prepareThreshold=0 (a special value meaning "never use named statements") However even when plans are not cached, you will still run into the issue that an unnamed statement using out-of-line parameter values may generate a less efficient plan than an unnamed statement using inline parameter values, because there is no way to tell the planner at the protocol level "I am really only ever using this query once, please give me a specific plan for these values and don't worry about generating a plan that is correct for other values too". The problem is that the current extended query protocol does two different things - out-of-line parameter passing and statement/plan caching - but provides no way to independently manage them. The unnamed statement behaviour is a compromise I suggested for 8.0 that gets rid of the worst problems for JDBC without an incompatible protocol change, but it isn't perfect. To get it 100% right it seems that a protocol change is needed. ... Using the version 2 protocol means that parameter values are sent inline with the query, which avoids the planning issue (but raises a number of other problems since the v2 protocol is much less flexible than v3) > Can I use v2 against 8.3? I think so. You obviously lose any behaviour that depends on v3, e.g. parameter metadata and smarter bytea streaming. -O
pgsql-jdbc by date: