Re: v3proto Parse/Bind and the query planner - Mailing list pgsql-jdbc
From | Oliver Jowett |
---|---|
Subject | Re: v3proto Parse/Bind and the query planner |
Date | |
Msg-id | 40AAA131.3010602@opencloud.com Whole thread Raw |
In response to | Re: v3proto Parse/Bind and the query planner (Kris Jurka <books@ejurka.com>) |
Responses |
Re: v3proto Parse/Bind and the query planner
|
List | pgsql-jdbc |
Kris Jurka wrote: > I seem to recall Tom Lane speculating about delaying the planning of a > prepared statement until it's first execution so it would have the bound > values and then using that plan with subsequent parameters. In the common > case multiple executions of a prepared query would use values of similar > statistical likelihood, but this wouldn't solve the example you gave if > you executed the plan twice for values that aren't similar. I think this > would be a reasonable compromise position. I don't see why the first execution is special in the general case.. you will suddenly get queries where the performance depends on the parameter values of *previous* queries (well beyond cache effects) which is not a very nice property to have. Planning on the first Bind would be particularly bad for the case I presented earlier (highly uneven index distribution) -- if the first execution happens to use a highly selective value and chooses an index scan, subsequent executions with non-selective values will perform very badly (considerably worse than the seqscan currently chosen for a parameterized query). The cure seems worse than the disease in this case. Instead, how about something like: - For named statements, plan at Parse time always. - For unnamed statements, plan at Bind time always. The assumption here is that if the client is using the unnamed statement, it's unlikely that it will be repeatedly reusing that statement with different parameter values so there is little benefit to preserving the query plan at the cost of being unable to plan for specific parameter values. If the client is using named statements, there's no change in behaviour from the current approach, so presumably the client knows what it's doing! :) A client that's aware of this behaviour can get some control over when the planning is done without needing a protocol change. e.g. in the JDBC driver we already have setUseServerPrepare(); that could be used to control whether named statements are used when executing a PreparedStatement or not. The application can enable named statements where it knows the cost of re-planning the query repeatedly is greater than the benefit of having actual parameter values to plan from -- e.g. if the indexes have an even distribution of values. (is this better discussed on -hackers?) >>The existing strategy of doing parameter replacement on the driver side >>works, but we lose the benefits of passing parameters via Bind. > > > We also have the possibility of doing selective replacement and binding > other values. This would allow using binary bytea transfers (which are > unlikely to be used in selectivity estimates) while doing parameter > replacement for other values. This is a possibility but it's going to lead to pretty ugly code to track the JDBC parameter index vs. the backend's idea of the parameter index. We also go back to playing games with the parser to make it interpret our parameters as the type we specified at the JDBC level. It seems a bit counterproductive to jump through several layers of parsing and re-substitution just to specify parameter types that JDBC already knew, when there's already a protocol mechanism to specify type information directly to the backend. We'd also end up with two code paths again if we wanted to benefit from named statements for queries that aren't affected by this case and would benefit from all of their parameters being real parameters (this is one of the reasons I'm making these changes in the first place, so..) -O
pgsql-jdbc by date: