Thread: Bind message
Hi All,
From my understanding, when I create a prepare statement I can execute it multiple times and to increase the efficiency. But from what I seen from the logging, Postgres always send the same message sequences over the wire.
parse,
bind,
describe,
execute
....
parse,
bind,
describe
execute
It should execute parse only one time right?
If this is not the right forum then please let me know or direct me to the right forum
thanks
Leon Do
Hi,
The default is not to re-use statements, you have add prepareThreshold=n to the url where n is greater than 0. What this means is that after n times of use the prepared statement will be re-used..
see http://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters for more details
Dave
On 25-Sep-06, at 1:27 PM, Do, Leon wrote:
Hi All,From my understanding, when I create a prepare statement I can execute it multiple times and to increase the efficiency. But from what I seen from the logging, Postgres always send the same message sequences over the wire.parse,bind,describe,execute....parse,bind,describeexecuteIt should execute parse only one time right?If this is not the right forum then please let me know or direct me to the right forumthanksLeon Do
Dave Cramer wrote: > Hi, > > > The default is not to re-use statements, you have add > prepareThreshold=n to the url where n is greater than 0. What this > means is that after n times of use the prepared statement will be > re-used.. That's peculiar. The whole point of prepared statements is to obtain the optimization of statement reuse. If PostgreSQL is going to prevent that, I'd expect the prevention to be the exception rather than the rule. That is, programmers should reasonably expect that the default behavior is statement reuse. > > see > http://jdbc.postgresql.org/documentation/head/connect.html#connection-pa rameters > for more details -- Guy Rouillier
On Mon, 25 Sep 2006, Guy Rouillier wrote: > Dave Cramer wrote: >> >> The default is not to re-use statements, you have add >> prepareThreshold=n to the url where n is greater than 0. What this >> means is that after n times of use the prepared statement will be >> re-used.. > > That's peculiar. The whole point of prepared statements is to obtain > the optimization of statement reuse. Sometimes it is, sometimes the purpose of using a PreparedStatement is to avoid escaping data yourself for a plain Statement execution. There are potential performance penalties for using a "reusable" server side prepared statement because it plans them with generic parameters, not those supplied in your query. For this reason we want to be sure that people will actually be reusing these statements before switching to this mode. I believe the default prepareThreshold is set to 5 despite what the documentation says. Kris Jurka
On Mon, 2006-09-25 at 14:00 -0500, Guy Rouillier wrote: > Dave Cramer wrote: > > Hi, > > > > > > The default is not to re-use statements, you have add > > prepareThreshold=n to the url where n is greater than 0. What this > > means is that after n times of use the prepared statement will be > > re-used.. > > That's peculiar. The whole point of prepared statements is to obtain > the optimization of statement reuse. If PostgreSQL is going to prevent > that, I'd expect the prevention to be the exception rather than the > rule. That is, programmers should reasonably expect that the default > behavior is statement reuse. Well, yes and no. Yes, the original intent of PreparedStatements was to increase efficiency by reuse. But in practice, since there's no easy way of specifying parameters programatically except by using the PreparedStatement interface, most programs which construct one-time-only SQL on the fly also use PreparedStatements. There can be a significant benefit to NOT using generic prepared statements in the one-shot (or few-shot) case; the planner can see real parameter values instead of placeholders, and can thus make smarter plan decisions. -- Mark
Hi, Kris, Kris Jurka wrote: > Sometimes it is, sometimes the purpose of using a PreparedStatement is > to avoid escaping data yourself for a plain Statement execution. There > are potential performance penalties for using a "reusable" server side > prepared statement because it plans them with generic parameters, not > those supplied in your query. For this reason we want to be sure that > people will actually be reusing these statements before switching to > this mode. I believe the default prepareThreshold is set to 5 despite > what the documentation says. I remember having read something about delaying the actual planning until the first parameterset arrives. Am I mislead by my weak memory? Thanks Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Kris Jurka wrote: > On Mon, 25 Sep 2006, Guy Rouillier wrote: > >> Dave Cramer wrote: >>> >>> The default is not to re-use statements, you have add >>> prepareThreshold=n to the url where n is greater than 0. What this >>> means is that after n times of use the prepared statement will be >>> re-used.. >> >> That's peculiar. The whole point of prepared statements is to obtain >> the optimization of statement reuse. > > we want to be sure that people will actually be reusing these > statements before switching to this mode. I believe the default > prepareThreshold is set to 5 despite what the documentation says. If true, that's a good compromise. It addresses the use of prepared statements for purposes other than optimization that both you and Mark mention, but doesn't unwittingly bite the programmer who has a statement he intends to execute 1000 times and tries to do the right thing by using a prepared statement. -- Guy Rouillier
Markus Schaber wrote: > Hi, Kris, > > Kris Jurka wrote: > >>Sometimes it is, sometimes the purpose of using a PreparedStatement is >>to avoid escaping data yourself for a plain Statement execution. There >>are potential performance penalties for using a "reusable" server side >>prepared statement because it plans them with generic parameters, not >>those supplied in your query. For this reason we want to be sure that >>people will actually be reusing these statements before switching to >>this mode. I believe the default prepareThreshold is set to 5 despite >>what the documentation says. > > > I remember having read something about delaying the actual planning > until the first parameterset arrives. The server only does this when you use the unnamed statement (i.e. before the driver's prepareThreshold is reached), because the resulting plan is only really useful for that particular set of parameter values, and using a named statement implies that you want to reuse the parse/plan results for other parameter values. -O
On Mon, 25 Sep 2006, Kris Jurka wrote: > I believe the default prepareThreshold is set to 5 despite what the > documentation says. > Documentation updated. Kris Jurka
Hi, Oliver, Oliver Jowett wrote: >> I remember having read something about delaying the actual planning >> until the first parameterset arrives. > The server only does this when you use the unnamed statement (i.e. > before the driver's prepareThreshold is reached), because the resulting > plan is only really useful for that particular set of parameter values, > and using a named statement implies that you want to reuse the > parse/plan results for other parameter values. Ah. But usually, the other parameter values for the same prepared statement produce similar query plans, because they're the same use case on similar data. So, at least for most of the apps we're running here, the first set of parameters makes more sense than planning without any parameters. Maybe this behaviour should be configurable via some session configuration variable? And for applications that really have two very different usecases resulting in identical prepared statements, it's affordable to either switch the delayed planning of, or to prepare the statement twice, with different names and different first parameter sets. Maybe we should crosspost this to pg_hackers? Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Markus Schaber wrote: > Ah. But usually, the other parameter values for the same prepared > statement produce similar query plans, because they're the same use case > on similar data. For the "wrong" values though the plan might be pathologically bad. > Maybe this behaviour should be configurable via some session > configuration variable? Isn't this just a very roundabout way of overriding the planner? i.e. what you are really doing is saying "ignore your statistics, I know more about the parameter values I'm going to give you than I'm actually telling you".. -O
Kris Jurka <books@ejurka.com> writes: >> That's peculiar. The whole point of prepared statements is to obtain >> the optimization of statement reuse. > > Sometimes it is, sometimes the purpose of using a PreparedStatement is > to avoid escaping data yourself for a plain Statement execution. By the way this confusion has been introduced by JDBC. In ODBC the functions SQLBindParameter() and SQLPrepare() were not related. <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcbinding_parameter_markers.asp> Both ODBC and JDBC docs warn the user that some drivers may not fully implement PREPARE. Exercise: define "fully" in a portable way. Granted, you can still complain if PREPARE does nothing at all. I remember a number of past discussions on this topic. Among others this one: "Subject: Limit vs setMaxRows issue" <http://archives.postgresql.org/pgsql-jdbc/2006-07/threads.php#00020>
Hi, Oliver, Oliver Jowett wrote: >> Ah. But usually, the other parameter values for the same prepared >> statement produce similar query plans, because they're the same use case >> on similar data. > > For the "wrong" values though the plan might be pathologically bad. Yes, I know, and that's why I'd like to retain the possibilitiy of preparing a new statement with a different first parameter set for different use cases, and introduce the option of switching back to the current behaviour. >> Maybe this behaviour should be configurable via some session >> configuration variable? > > Isn't this just a very roundabout way of overriding the planner? i.e. > what you are really doing is saying "ignore your statistics, I know more > about the parameter values I'm going to give you than I'm actually > telling you".. No, it's telling the planner "You can assume that all those queries for my statement are similar enough that the same query plan will fit them, so you can safe the overhead of both parsing and planning". Don't forget that, in the current implementation, the query planner has no choice but planning the query without any actual parameter values, which is likely to give equal or worse results in most cases. The use case I have in mind are geometric data lookup of mostly static data (PostGIS objects) that are nicely distributed, and the queries all fetch a small bounding box hitting a fraction of at most 1/1000th, more likely 1/1000000 of the data. For them, the GIST index scan is the best query plan. Another option would be that the planner uses some self-learning algorithm that analyzes the parameter sets for a given prepared statement, and sees whether it's better to replan always, keep the same plan, or even efficiently switch between a small set of fixed plans. But that's out of scope for now, I'm afraid. Thanks, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Markus Schaber wrote: > Hi, Oliver, > > Oliver Jowett wrote: >>Isn't this just a very roundabout way of overriding the planner? i.e. >>what you are really doing is saying "ignore your statistics, I know more >>about the parameter values I'm going to give you than I'm actually >>telling you".. > > > No, it's telling the planner "You can assume that all those queries for > my statement are similar enough that the same query plan will fit them, > so you can safe the overhead of both parsing and planning". What I mean is, if you are prepared to say that, what you are really wanting is to second-guess the planner and tell the server "use *THIS* plan" explicitly .. In both approaches, you (the app developer) need detailed knowledge about what sort of plans work for the parameters & query you are using (for your particular server version and dataset etc). Doing that at the Parse/Bind level doesn't seem right and in fact seems even more errorprone than explicitly specifying a plan, since you are relying on the planner picking a particular plan for a particular set of parameter values that you happened to use first which may or may not always be true depending on things like the current state of DB statistics.. -O
Markus Schaber <schabi@logix-tt.com> writes: > Don't forget that, in the current implementation, the query planner has > no choice but planning the query without any actual parameter values, > which is likely to give equal or worse results in most cases. No, I think you are oversimplifying. What the planner is supposed to do when given a parameterized query (and no parameter value info) is to generate a plan that won't be too terribly awful regardless of the parameter values. With a set of sample values, it may generate a plan that is great for those values and utterly unusable for anything else. Here's a simple example: select ... from a join b on (a.key = b.key) where a.val = ? Given a parameter value that looks like it will match only one A row, you will likely get a plan like this: Nest Loop Index Scan on A Index Cond: val = ? Index Scan on B Index Cond: b.key = a.key which is about as fast as you can get if indeed there's only one match. However if the parameter value matches very many A rows, this plan is horrid. With no value for the parameter, the planner should pick a compromise plan (perhaps a mergejoin) that may not be as fast for the single-match case, but will finish before doomsday in the other case. regards, tom lane
Hi, Tom, Tom Lane wrote: > Markus Schaber <schabi@logix-tt.com> writes: >> Don't forget that, in the current implementation, the query planner has >> no choice but planning the query without any actual parameter values, >> which is likely to give equal or worse results in most cases. > > No, I think you are oversimplifying. What the planner is supposed to do > when given a parameterized query (and no parameter value info) is to > generate a plan that won't be too terribly awful regardless of the > parameter values. With a set of sample values, it may generate a plan > that is great for those values and utterly unusable for anything else. That's why I want to have it configurable, and maybe even disabled by default. When I know that the cases are similar enough, it can be enabled, and possibly benchmarked whether it's really faster (better or equal plan and less planning overhead) or slower (horrible plan due to failure of premise that the cases are similar enough). Thanks, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Oliver Jowett wrote: > Isn't this just a very roundabout way of overriding the planner? i.e. > what you are really doing is saying "ignore your statistics, I know > more about the parameter values I'm going to give you than I'm > actually telling you".. Is there a non-roundabout way of overriding the planner? Oracle has hints. Sometimes the query analyzer benefits from a helping hand. -- Guy Rouillier
On Tue, 2006-09-26 at 11:06 -0500, Guy Rouillier wrote: > Oliver Jowett wrote: > > Isn't this just a very roundabout way of overriding the planner? i.e. > > what you are really doing is saying "ignore your statistics, I know > > more about the parameter values I'm going to give you than I'm > > actually telling you".. > > Is there a non-roundabout way of overriding the planner? Oracle has > hints. Sometimes the query analyzer benefits from a helping hand. The PG core doesn't support hints. I believe the general feeling among the PG developers is that their time is better spent making the planner smarter than in supporting a hints system which would introduce its own maintenance headaches. -- Mark