Thread: Prepared statement performance...
Hi, everybody. I am running into huge performance problems, due to JDBC not being able to cache query plans :-( My java program runs the (set of about 5) identical statements with different parameters for about 30 million times... What I am talking about below has to do with JDBC, but is not really specific to it - it seems to me, that the backend itself could be improved to better handle this kind of scenario, when the same statement is executed many times in the same session. It takes about 3 weeks(!) to run, and it looks like about *half* of that time is spent by the query planner, creating and recreating the query plans every time I ran the damn statement.... I am looking into implementing some kind of a solution, that would let me work around that problem... So far, I only see two possibilities: - a general solution, that would involve extending postgres SQL gramma to include a 'prepare' statement - or am ugly work around, that would involve moving all my sql statements into stored procedures, and have those cache the query plans inside... The second solution is not only ugly (because it requires the application code to be changed and to have a specialized stored procedure for every query), but also requires some additional hacks (to overcome the hard limit on the number of function arguments and the inability for functions to return tuples) - the only way I imagine this can be made to work is to glue all the arguments together into a text string, and have the stored procedure parse it back, execute the query, then glue the resulting tuple(s) into another text string, return it, and have the application (or, perhaps, JDBC layer) parseit back into columns... I was wonderring if anybody has any better ideas how this can be made to work (I am looking for a solution that would minimize changes to the existing JDBC applications that use PreparedStatements)? If the maintainers of the involved code are interested, I would be willing to implement and contribute the solution we comeup with ... (I figure, nobody would really be interested in getting that second solution I mentioned into the mainstream :-), but, ifwe are able to come up with something more general and less ugly, perhaps, I am not the only one who would be able to contributefrom it)... For example, I believe, it should not be too complicated to implement that first possibility I described above... The way I see it would involve adding two statements to postgres SQL syntax: prepare <name> as <sql statement> and execute <name> with (<parameter list>) For example: prepare mystatement as select * from mytable where id = $1 and name like $2; and then execute mystatement with (1, 'Dima'); execute mystatement with (2, 'John'); etc.... The JDBC driver would then send the 'prepare' command to the backend in Connection.prepareStatement (), and use the 'execute' in PreparedStatement.execute (); One potential problem with implementation I see here is that the query planner wants to know the argument types ahead oftime... I guess, I could get around that by making all the arguments 'text', and having them casted into the right types when the statement is actually executed. There is, probably a need to also have some kind of a 'close' command to throw away the prepared query plans... or we couldjust make them last forever until, say, the end of transaction (or, perhaps, the end of the session?)... If there is anyone interested in discussing various possibilities, and getting this implemented one way or another, I would like to hear from you! Thanks! Dima.
On 25 Sep 2002 at 11:14, Dmitry Tkach wrote: > Hi, everybody. > > I am running into huge performance problems, due to JDBC not being able to cache query plans :-( > My java program runs the (set of about 5) identical statements with different parameters for about 30 million times... > What I am talking about below has to do with JDBC, but is not really specific to it - it seems to me, that the backenditself > could be improved to better handle this kind of scenario, when the same statement is executed many times in the same session. > - a general solution, that would involve extending postgres SQL gramma to include a 'prepare' statement Added in upcoming 7.3.. So youmight want to test the betas or CVS.. Just wondering.. Would you see any performance difference in case you put the statements in pl/pgsql functions? Just a thought.. Bye Shridhar -- The Sixth Commandment of Frisbee: The greatest single aid to distance is for the disc to be going in a direction you did not want. (Goes the wrong way = Goes a long way.) -- Dan Roddick
Dmitry Tkach <dmitry@openratings.com> writes: > - a general solution, that would involve extending postgres SQL gramma > to include a 'prepare' statement As someone else mentioned, this has been implemented for 7.3. I implemented PREPARE/EXECUTE/DEALLOCATE on the backend side, Barry Lind (I believe) added support for using backend prepared statements to the JDBC driver. > The second solution is not only ugly (because it requires the > application code to be changed and to have a specialized stored > procedure for every query), but also requires some additional hacks > (to overcome the hard limit on the number of function arguments and > the inability for functions to return tuples) Note that in 7.3, functions can return sets of tuples. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Dmitry Tkach <dmitry@openratings.com> writes: > Any chance to get a patch that could be applied over a 7.2 source? I'm not planning to -- but since the PREPARE/EXECUTE stuff is almost entirely separated from the rest of the system, it would probably be pretty easy to do. > >Note that in 7.3, functions can return sets of tuples. > Yeah... I know that... But, same comment as above applies - the > migration is what scares the hell out of me :-( FYI, backpatching this to 7.2 would not be feasible. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Dimitry, I have some good news for you. All of this functionality is in 7.3 of the server and the current development build of the jdbc driver. Of course 7.3 isn't quite production yet, but beta 2 should be out this week. Your first option is what has been implemented. In 7.3 the server now supports 'prepare <name> as <sql>' and 'execute <name>' (and variants for passing bind variables). The jdbc driver also now has a method on org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the use of server side prepared statements (so you can cast your Statement object to an org.postgresql.PGStatement object and enable the functionality for a specific statement). This hasn't yet made its way into the jdbc documentation for 7.3, but this will be documented before 7.3 goes production. This implementation does still require that you make some changes to your app in order to enable it but that is by design for the following reasons: 1) In order to be useful you need to reuse a statement object multiple times. Most code generally will just create a statement and use it only once, and since more work is being done to prepare the statement that would have a negative performance impact for the majority of statements being executed in normal code. 2) Prepare functionality will not work for all types of sql that you can execute through the jdbc driver. For example you can quite easily do the following: conn.prepareStatement("update foo set a = ?; insert into bar values (?,?,?);"); By issuing both sql statements in one call only one network roundtrip is needed to execute both statements providing what is likely a more performant application. However this doesn't work with prepare because: prepare foo as update foo set a = ?; insert into bar values(?,?,?); - is one prepared statement and one regular statement which will cause very strange things to happen. It is for the reasons above that the developer needs to explicitly turn on the use of server side prepared statements when it makes sense to do so. thanks, --Barry PS. I would really appreciate it if you could test this new functionality. As far as I know I am the only one who has used it. But for one of my frequently used and complex sql statements the time to execute dropped by 60% using server side prepared statements. Dmitry Tkach wrote: > Hi, everybody. > > I am running into huge performance problems, due to JDBC not being able > to cache query plans :-( > My java program runs the (set of about 5) identical statements with > different parameters for about 30 million times... > What I am talking about below has to do with JDBC, but is not really > specific to it - it seems to me, that the backend itself > could be improved to better handle this kind of scenario, when the same > statement is executed many times in the same session. > > It takes about 3 weeks(!) to run, and it looks like about *half* of that > time is spent by the query planner, > creating and recreating the query plans every time I ran the damn > statement.... > > I am looking into implementing some kind of a solution, that would let > me work around that problem... > So far, I only see two possibilities: > > - a general solution, that would involve extending postgres SQL gramma > to include a 'prepare' statement > - or am ugly work around, that would involve moving all my sql > statements into stored procedures, and have > those cache the query plans inside... > > The second solution is not only ugly (because it requires the > application code to be changed and to have a specialized > stored procedure for every query), but also requires some additional > hacks (to overcome the hard limit on the number of > function arguments and the inability for functions to return tuples) - > the only way I imagine this can be made to work is > to glue all the arguments together into a text string, and have the > stored procedure parse it back, execute the query, then > glue the resulting tuple(s) into another text string, return it, and > have the application (or, perhaps, JDBC layer) parse it back > into columns... > > I was wonderring if anybody has any better ideas how this can be made to > work (I am looking for a solution that would > minimize changes to the existing JDBC applications that use > PreparedStatements)? > > If the maintainers of the involved code are interested, I would be > willing to implement and contribute the solution we come up with ... > (I figure, nobody would really be interested in getting that second > solution I mentioned into the mainstream :-), but, if we are > able to come up with something more general and less ugly, perhaps, I am > not the only one who would be able to contribute from > it)... > > For example, I believe, it should not be too complicated to implement > that first possibility I described above... > The way I see it would involve adding two statements to postgres SQL > syntax: > > prepare <name> as <sql statement> > and > execute <name> with (<parameter list>) > > For example: > > prepare mystatement as select * from mytable where id = $1 and name like > $2; > > and then > > execute mystatement with (1, 'Dima'); > execute mystatement with (2, 'John'); > > etc.... > > The JDBC driver would then send the 'prepare' command to the backend in > Connection.prepareStatement (), and > use the 'execute' in PreparedStatement.execute (); > > One potential problem with implementation I see here is that the query > planner wants to know the argument types ahead of time... > I guess, I could get around that by making all the arguments 'text', and > having them casted into the right types when the > statement is actually executed. > > There is, probably a need to also have some kind of a 'close' command to > throw away the prepared query plans... or we could just > make them last forever until, say, the end of transaction (or, perhaps, > the end of the session?)... > > If there is anyone interested in discussing various possibilities, and > getting this implemented one way or another, > I would like to hear from you! > > Thanks! > > Dima. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
Peter, I have to agree with Barry here. Since it is possible to send two (or more ) statements to the backend, the programmer needs some mechanism for enabling/disabling this feature. Unless of course we were to write a sql parser in the jdbc driver so that we could predetermine which statements should use the prepare mechanism or not. I often use prepared statements instead of simple statements even when only doing one select. The difference being that I can use setXXX to set the parameters in the statment. Which is particularly important for setDate, setTime, setTimestamp And you would be amazed at the unique (irregular) coding practices which exist out there, so having the default do something sane is a GOOD THING. Dave On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote: > Barry, > > It's great to have this functionality. > > However, I am surprised about the need/requirement to explicitely turning on > this feature in the JDBC driver. As you explain, the reason for this > behaviour is to support the use of multiple statements with the > prepareStatement method. But this is supporting incorrect us of the JDBC > interface, since the JDBC API says that the parameter of the > prepareStatement method is one(!) SQL statetment. On the other hand, it is > clear that the intended purpose of the prepareStatement method is to use > prepared statment -- exclusively and not by default. For statements executed > only once, the designers of the API made available the method > "createStatment". > > So the buttom line is that you have an irregular behaviour in support of an > irregular coding practice. I understand that there is "legacy" code that > needs to use the prepareStatement method with multiple SQL statements. But > since such practice was already incorrect at the time of its inception, the > right way would be to allow a transition from bad to good by requiring the > "sinners" to explicitly use some kind of > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their > code, and let new coders use standard code. > > Of course, you're running the show, so you will do it as you deem right, but > IMHO the implementation as you described it is clearly not reasonable. > > Cheers, > > Peter > > ----- Original Message ----- > From: "Barry Lind" <barry@xythos.com> > To: "Dmitry Tkach" <dmitry@openratings.com> > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> > Sent: Thursday, September 26, 2002 10:56 PM > Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > > > > Dimitry, > > > > I have some good news for you. All of this functionality is in 7.3 of > > the server and the current development build of the jdbc driver. Of > > course 7.3 isn't quite production yet, but beta 2 should be out this week. > > > > Your first option is what has been implemented. In 7.3 the server now > > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants > > for passing bind variables). The jdbc driver also now has a method on > > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the > > use of server side prepared statements (so you can cast your Statement > > object to an org.postgresql.PGStatement object and enable the > > functionality for a specific statement). > > > > This hasn't yet made its way into the jdbc documentation for 7.3, but > > this will be documented before 7.3 goes production. > > > > This implementation does still require that you make some changes to > > your app in order to enable it but that is by design for the following > > reasons: > > 1) In order to be useful you need to reuse a statement object > > multiple times. Most code generally will just create a statement and > > use it only once, and since more work is being done to prepare the > > statement that would have a negative performance impact for the majority > > of statements being executed in normal code. > > 2) Prepare functionality will not work for all types of sql that you > > can execute through the jdbc driver. For example you can quite easily > > do the following: conn.prepareStatement("update foo set a = ?; insert > > into bar values (?,?,?);"); > > By issuing both sql statements in one call only one network roundtrip is > > needed to execute both statements providing what is likely a more > > performant application. However this doesn't work with prepare because: > > prepare foo as update foo set a = ?; insert into bar values(?,?,?); - > > is one prepared statement and one regular statement which will cause > > very strange things to happen. > > > > It is for the reasons above that the developer needs to explicitly turn > > on the use of server side prepared statements when it makes sense to do > so. > > > > thanks, > > --Barry > > > > PS. I would really appreciate it if you could test this new > > functionality. As far as I know I am the only one who has used it. But > > for one of my frequently used and complex sql statements the time to > > execute dropped by 60% using server side prepared statements. > > > > > > > > Dmitry Tkach wrote: > > > Hi, everybody. > > > > > > I am running into huge performance problems, due to JDBC not being able > > > to cache query plans :-( > > > My java program runs the (set of about 5) identical statements with > > > different parameters for about 30 million times... > > > What I am talking about below has to do with JDBC, but is not really > > > specific to it - it seems to me, that the backend itself > > > could be improved to better handle this kind of scenario, when the same > > > statement is executed many times in the same session. > > > > > > It takes about 3 weeks(!) to run, and it looks like about *half* of > that > > > time is spent by the query planner, > > > creating and recreating the query plans every time I ran the damn > > > statement.... > > > > > > I am looking into implementing some kind of a solution, that would let > > > me work around that problem... > > > So far, I only see two possibilities: > > > > > > - a general solution, that would involve extending postgres SQL gramma > > > to include a 'prepare' statement > > > - or am ugly work around, that would involve moving all my sql > > > statements into stored procedures, and have > > > those cache the query plans inside... > > > > > > The second solution is not only ugly (because it requires the > > > application code to be changed and to have a specialized > > > stored procedure for every query), but also requires some additional > > > hacks (to overcome the hard limit on the number of > > > function arguments and the inability for functions to return tuples) - > > > the only way I imagine this can be made to work is > > > to glue all the arguments together into a text string, and have the > > > stored procedure parse it back, execute the query, then > > > glue the resulting tuple(s) into another text string, return it, and > > > have the application (or, perhaps, JDBC layer) parse it back > > > into columns... > > > > > > I was wonderring if anybody has any better ideas how this can be made > to > > > work (I am looking for a solution that would > > > minimize changes to the existing JDBC applications that use > > > PreparedStatements)? > > > > > > If the maintainers of the involved code are interested, I would be > > > willing to implement and contribute the solution we come up with ... > > > (I figure, nobody would really be interested in getting that second > > > solution I mentioned into the mainstream :-), but, if we are > > > able to come up with something more general and less ugly, perhaps, I > am > > > not the only one who would be able to contribute from > > > it)... > > > > > > For example, I believe, it should not be too complicated to implement > > > that first possibility I described above... > > > The way I see it would involve adding two statements to postgres SQL > > > syntax: > > > > > > prepare <name> as <sql statement> > > > and > > > execute <name> with (<parameter list>) > > > > > > For example: > > > > > > prepare mystatement as select * from mytable where id = $1 and name > like > > > $2; > > > > > > and then > > > > > > execute mystatement with (1, 'Dima'); > > > execute mystatement with (2, 'John'); > > > > > > etc.... > > > > > > The JDBC driver would then send the 'prepare' command to the backend in > > > Connection.prepareStatement (), and > > > use the 'execute' in PreparedStatement.execute (); > > > > > > One potential problem with implementation I see here is that the query > > > planner wants to know the argument types ahead of time... > > > I guess, I could get around that by making all the arguments 'text', > and > > > having them casted into the right types when the > > > statement is actually executed. > > > > > > There is, probably a need to also have some kind of a 'close' command > to > > > throw away the prepared query plans... or we could just > > > make them last forever until, say, the end of transaction (or, perhaps, > > > the end of the session?)... > > > > > > If there is anyone interested in discussing various possibilities, and > > > getting this implemented one way or another, > > > I would like to hear from you! > > > > > > Thanks! > > > > > > Dima. > > > > > > > > > > > > ---------------------------(end of > broadcast)--------------------------- > > > TIP 2: you can get off all lists at once with the unregister command > > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > >
Barry Lind wrote: > > Your first option is what has been implemented. In 7.3 the server now > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants > for passing bind variables). The jdbc driver also now has a method on > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the > use of server side prepared statements (so you can cast your Statement > object to an org.postgresql.PGStatement object and enable the > functionality for a specific statement). [..] > It is for the reasons above that the developer needs to explicitly turn > on the use of server side prepared statements when it makes sense to do so. How would the developer do that? With the front-end I am working with (ColdFusion MX) I have very little control over the actual calls to the driver. I can make CF MX use createStatement() or prepareStatement() but that is all the control I have. And the only way to send parameters to the driver would be through a JDBC URL. Would that enable me to use prepared statements for real or not? Jochem
Cool an interesting thread on the jdbc list. First off I dissagree with your reading of the spec that the statement or prepared statement object are only intended for one sql statement. If that were true why would you have Statement.getMoreResults()? The API clearly supports a single statement execute returning multiple result sets which implies to me that the single execute can be used for multiple individual sql statements chained together. The uses of the Statement object and PreparedStatement object do not IMHO have anything to do with you many times you use them. They have to do with how you supply values in the sql statement. If you use just the Statement object you are required to know the postgres date format in order to include a date value, where it is much easier (and more portable across databases) to just use a PreparedStatement, a ?, and a setDate() call. thanks, --Barry Peter Kovacs wrote: > Barry, > > It's great to have this functionality. > > However, I am surprised about the need/requirement to explicitely turning on > this feature in the JDBC driver. As you explain, the reason for this > behaviour is to support the use of multiple statements with the > prepareStatement method. But this is supporting incorrect us of the JDBC > interface, since the JDBC API says that the parameter of the > prepareStatement method is one(!) SQL statetment. On the other hand, it is > clear that the intended purpose of the prepareStatement method is to use > prepared statment -- exclusively and not by default. For statements executed > only once, the designers of the API made available the method > "createStatment". > > So the buttom line is that you have an irregular behaviour in support of an > irregular coding practice. I understand that there is "legacy" code that > needs to use the prepareStatement method with multiple SQL statements. But > since such practice was already incorrect at the time of its inception, the > right way would be to allow a transition from bad to good by requiring the > "sinners" to explicitly use some kind of > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their > code, and let new coders use standard code. > > Of course, you're running the show, so you will do it as you deem right, but > IMHO the implementation as you described it is clearly not reasonable. > > Cheers, > > Peter > > ----- Original Message ----- > From: "Barry Lind" <barry@xythos.com> > To: "Dmitry Tkach" <dmitry@openratings.com> > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> > Sent: Thursday, September 26, 2002 10:56 PM > Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > > > >>Dimitry, >> >>I have some good news for you. All of this functionality is in 7.3 of >>the server and the current development build of the jdbc driver. Of >>course 7.3 isn't quite production yet, but beta 2 should be out this week. >> >>Your first option is what has been implemented. In 7.3 the server now >>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants >>for passing bind variables). The jdbc driver also now has a method on >>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the >>use of server side prepared statements (so you can cast your Statement >>object to an org.postgresql.PGStatement object and enable the >>functionality for a specific statement). >> >>This hasn't yet made its way into the jdbc documentation for 7.3, but >>this will be documented before 7.3 goes production. >> >>This implementation does still require that you make some changes to >>your app in order to enable it but that is by design for the following >>reasons: >> 1) In order to be useful you need to reuse a statement object >>multiple times. Most code generally will just create a statement and >>use it only once, and since more work is being done to prepare the >>statement that would have a negative performance impact for the majority >>of statements being executed in normal code. >> 2) Prepare functionality will not work for all types of sql that you >>can execute through the jdbc driver. For example you can quite easily >>do the following: conn.prepareStatement("update foo set a = ?; insert >>into bar values (?,?,?);"); >>By issuing both sql statements in one call only one network roundtrip is >>needed to execute both statements providing what is likely a more >>performant application. However this doesn't work with prepare because: >>prepare foo as update foo set a = ?; insert into bar values(?,?,?); - >>is one prepared statement and one regular statement which will cause >>very strange things to happen. >> >>It is for the reasons above that the developer needs to explicitly turn >>on the use of server side prepared statements when it makes sense to do > > so. > >>thanks, >>--Barry >> >>PS. I would really appreciate it if you could test this new >>functionality. As far as I know I am the only one who has used it. But >>for one of my frequently used and complex sql statements the time to >>execute dropped by 60% using server side prepared statements. >> >> >> >>Dmitry Tkach wrote: >> > Hi, everybody. >> > >> > I am running into huge performance problems, due to JDBC not being able >> > to cache query plans :-( >> > My java program runs the (set of about 5) identical statements with >> > different parameters for about 30 million times... >> > What I am talking about below has to do with JDBC, but is not really >> > specific to it - it seems to me, that the backend itself >> > could be improved to better handle this kind of scenario, when the same >> > statement is executed many times in the same session. >> > >> > It takes about 3 weeks(!) to run, and it looks like about *half* of > > that > >> > time is spent by the query planner, >> > creating and recreating the query plans every time I ran the damn >> > statement.... >> > >> > I am looking into implementing some kind of a solution, that would let >> > me work around that problem... >> > So far, I only see two possibilities: >> > >> > - a general solution, that would involve extending postgres SQL gramma >> > to include a 'prepare' statement >> > - or am ugly work around, that would involve moving all my sql >> > statements into stored procedures, and have >> > those cache the query plans inside... >> > >> > The second solution is not only ugly (because it requires the >> > application code to be changed and to have a specialized >> > stored procedure for every query), but also requires some additional >> > hacks (to overcome the hard limit on the number of >> > function arguments and the inability for functions to return tuples) - >> > the only way I imagine this can be made to work is >> > to glue all the arguments together into a text string, and have the >> > stored procedure parse it back, execute the query, then >> > glue the resulting tuple(s) into another text string, return it, and >> > have the application (or, perhaps, JDBC layer) parse it back >> > into columns... >> > >> > I was wonderring if anybody has any better ideas how this can be made > > to > >> > work (I am looking for a solution that would >> > minimize changes to the existing JDBC applications that use >> > PreparedStatements)? >> > >> > If the maintainers of the involved code are interested, I would be >> > willing to implement and contribute the solution we come up with ... >> > (I figure, nobody would really be interested in getting that second >> > solution I mentioned into the mainstream :-), but, if we are >> > able to come up with something more general and less ugly, perhaps, I > > am > >> > not the only one who would be able to contribute from >> > it)... >> > >> > For example, I believe, it should not be too complicated to implement >> > that first possibility I described above... >> > The way I see it would involve adding two statements to postgres SQL >> > syntax: >> > >> > prepare <name> as <sql statement> >> > and >> > execute <name> with (<parameter list>) >> > >> > For example: >> > >> > prepare mystatement as select * from mytable where id = $1 and name > > like > >> > $2; >> > >> > and then >> > >> > execute mystatement with (1, 'Dima'); >> > execute mystatement with (2, 'John'); >> > >> > etc.... >> > >> > The JDBC driver would then send the 'prepare' command to the backend in >> > Connection.prepareStatement (), and >> > use the 'execute' in PreparedStatement.execute (); >> > >> > One potential problem with implementation I see here is that the query >> > planner wants to know the argument types ahead of time... >> > I guess, I could get around that by making all the arguments 'text', > > and > >> > having them casted into the right types when the >> > statement is actually executed. >> > >> > There is, probably a need to also have some kind of a 'close' command > > to > >> > throw away the prepared query plans... or we could just >> > make them last forever until, say, the end of transaction (or, perhaps, >> > the end of the session?)... >> > >> > If there is anyone interested in discussing various possibilities, and >> > getting this implemented one way or another, >> > I would like to hear from you! >> > >> > Thanks! >> > >> > Dima. >> > >> > >> > >> > ---------------------------(end of > > broadcast)--------------------------- > >> > TIP 2: you can get off all lists at once with the unregister command >> > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >> > >> >> >> >>---------------------------(end of broadcast)--------------------------- >>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >> > > > >
Jochem, Currently you probably can't. But given the information I have mentioned in my previous mails in this thread, I am not sure that you would really want to unless you somehow have the ability to do it statement by statement since in most cases I think you don't want to use server side prepared statements. If someone can think of a way through the standard jdbc api to enable/disable something like this at the statement level I am all ears. I haven't been able to think of a mechanism other that the one that has been implemented. thanks, --Barry Jochem van Dieten wrote: > Barry Lind wrote: > >> >> Your first option is what has been implemented. In 7.3 the server now >> supports 'prepare <name> as <sql>' and 'execute <name>' (and variants >> for passing bind variables). The jdbc driver also now has a method on >> org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the >> use of server side prepared statements (so you can cast your Statement >> object to an org.postgresql.PGStatement object and enable the >> functionality for a specific statement). > > [..] > >> It is for the reasons above that the developer needs to explicitly turn >> on the use of server side prepared statements when it makes sense to >> do so. > > > How would the developer do that? With the front-end I am working with > (ColdFusion MX) I have very little control over the actual calls to the > driver. I can make CF MX use createStatement() or prepareStatement() but > that is all the control I have. And the only way to send parameters to > the driver would be through a JDBC URL. > Would that enable me to use prepared statements for real or not? > > Jochem > >
Dimtry, Dmitry Tkach wrote: > Not realy... You should not be required to know about postgres date > format, as long as it matches Timestamp.toString () output (and it > currently does), and I > see no reason why it would not (you just need to make sure that the > client and the server are in the same locale, but that's a different > discussion, because whatever > JDBC does in setTimestamp () would locale-dependent anyway). This certainly isn't true for all databases. Oracle for example where the database format for dates is DD-MON-YY which is very different from the Timestamp.toString() method. And actually the postgres format is different and incompatible with javas format when you start having to deal with timezone information. thanks, --Barry
Dmitry Tkach wrote: > > BTW, here is a quote from JDBC javadoc: > > * An object that represents a precompiled SQL statement. > * <P>A SQL statement is precompiled and stored in a > * <code>PreparedStatement</code> object. This object can then be used to > * efficiently execute this statement multiple times. > > I read 'precompiled' as something more than 'parsed for question marks' :-) > And also note the phrase about being able to 'efficiently execute this > statement multiple times'... > I don't read more into this than simply parsing question marks. It is all going to depend on the capabilities of the server. To the extent the server has additional features that can be leveraged to improve the performance those should be used. Up until 7.3 of postgres no such features existed. Now they do and we are exploring how to best take advantage of them. But it doesn't make sense to use a specific server feature if it doesn't improve performance in the general case. The testing I have done so far indicates that using the current implementation of server side prepared statements doesn't help performance in the general case and may hinder it. I am reluctant to make a non-backwardly compatible change in the driver until more data points come in that there are significant performance improvements possible without decreasing performance in the general case. > It seems quite clear to me that it states the intent behind having this > interface to provide the ability for the users to > parse a query once and be able to execute it again and again... > As for the set*() functions, PreparedStatement without them would be of > very limited use (how often would we need to execute > *the same exact query* multiple times???), and that's the reason it has > them, *not* to abstract from the database data representation format... > I have many statements that don't take any arguments and that do get executed multiple times. It all depends on the nature of the application. thanks, --Barry
Dmitry Tkach wrote: > True... But how does PreparedStatement.setTimestamp () help here? > All it does (at least in 7.2) is Timestamp.toString () :-) > Huh? In 7.3 setTimestamp() is much more than Timestamp.toString() (and in 7.2 it was as well, I think you need to go back to 7.1 for it to be as simple as you describe). public void setTimestamp(int parameterIndex, Timestamp x) throws SQLException { if (null == x) { setNull(parameterIndex, Types.OTHER); } else { // Use the shared StringBuffer synchronized (sbuf) { sbuf.setLength(0); sbuf.ensureCapacity(32); sbuf.append("'"); //format the timestamp //we do our own formating so that we can get a format //that works with both timestamp with time zone and //timestamp without time zone datatypes. //The format is '2002-01-01 23:59:59.123456-0130' //we need to include the local time and timezone offset //so that timestamp without time zone works correctly int l_year = x.getYear() + 1900; sbuf.append(l_year); sbuf.append('-'); int l_month = x.getMonth() + 1; if (l_month < 10) sbuf.append('0'); sbuf.append(l_month); sbuf.append('-'); int l_day = x.getDate(); if (l_day < 10) sbuf.append('0'); sbuf.append(l_day); sbuf.append(' '); int l_hours = x.getHours(); if (l_hours < 10) sbuf.append('0'); sbuf.append(l_hours); sbuf.append(':'); int l_minutes = x.getMinutes(); if (l_minutes < 10) sbuf.append('0'); sbuf.append(l_minutes); sbuf.append(':'); int l_seconds = x.getSeconds(); if (l_seconds < 10) sbuf.append('0'); sbuf.append(l_seconds); // Make decimal from nanos. char[] l_decimal = {'0', '0', '0', '0', '0', '0', '0', '0', '0'}; char[] l_nanos = Integer.toString(x.getNanos()).toCharArray(); System.arraycopy(l_nanos, 0, l_decimal, l_decimal.length - l_nanos.length, l_nanos.length); sbuf.append('.'); if (connection.haveMinimumServerVersion("7.2")) { sbuf.append(l_decimal, 0, 6); } else { // Because 7.1 include bug that "hh:mm:59.999" becomes "hh:mm:60.00". sbuf.append(l_decimal, 0, 2); } //add timezone offset int l_offset = -(x.getTimezoneOffset()); int l_houros = l_offset / 60; if (l_houros >= 0) { sbuf.append('+'); } else { sbuf.append('-'); } if (l_houros > -10 && l_houros < 10) sbuf.append('0'); if (l_houros >= 0) { sbuf.append(l_houros); } else { sbuf.append( -l_houros); } int l_minos = l_offset - (l_houros * 60); if (l_minos != 0) { if (l_minos < 10) sbuf.append('0'); sbuf.append(l_minos); } sbuf.append("'"); bind(parameterIndex, sbuf.toString(), PG_TIMESTAMPTZ); } } }
Dmitry Tkach wrote: > Barry Lind wrote: > >> >> >> Dmitry Tkach wrote: >> >> >>> True... But how does PreparedStatement.setTimestamp () help here? >>> All it does (at least in 7.2) is Timestamp.toString () :-) >>> >> >> Huh? In 7.3 setTimestamp() is much more than Timestamp.toString() >> (and in 7.2 it was as well, I think you need to go back to 7.1 for it >> to be as simple as you describe). > > > Not really... It *seems* to be doing more in 7.2, but then here is what > it says right after it's done: > > // The above works, but so does the following. > I'm leaving the above in, but this seems > // to be identical. Pays to read the docs ;-) > //set(parameterIndex,"'"+x.toString()+"'"); > > And this is exactly right :-) Those were old comments that were incorrect, thus they are removed in 7.3. > > I have indeed seen that timezones did not use to work correctly with 7.2 > JDBC, but it seems that all that needed to be done to fix that was to > remove the df.setTimezone() call (it shifts the time to GMT, and sends > it to the server that way, but the backend assumes it is in the local > timezone, so the timestamp ends up having wrong time), or, even better, > just get rid of the whole formatting stuff alltogether and stick to the > toString () call as that comment suggets :-) That doesn't work when the client and server are running in different timezones. > Everything would work, except for that 7.1 thing of course, that > converts 59.999 to 60.00 :-) > > Still all of the above would only work with your local timezone, and, if > the timestamp has some other offset, it would still be wrong. The > complexity of the 7.3 code > seems to be inteded to fix that, which is a good thing, although, the > way it is done seems to be way overcomplicated to me - it seems that if > you just did x.toGMTString () and sent it to the backend as text, it > would do the same conversion automatically... This is wrong if the datatype is timestamp with out timezone, but would do the correct thing (I think) for timestamp with timezone. > BTW, the same approach could be used if you wanted to use Statement to set > a timestamptz properly, without calling setTimestamp (): > > c.createStatement ().executeQuery ("select * from foo where > timestamptzfield = '" + mytimestamp.toGMTString () +"'); > > It's true that not every database would understand GMT format (for > example, informix doesn't), but, > on the other hand, not every database has the notion of timezones either > (informix doesn't either) - i.e., if you are using anything other than > the locale's default timezone in your application (which would just work > if you did Timestamp.toString () and sent it in as text), then you are > already aware that you are working with postgres, and, thus the > 'database abstraction' argument for using 'set*()' stuff doesn't shoot :-) > > and actually, I don't even see any reason why something like: > > c.createStatement ().executeQuery ("select * from foo where > timestamptzfiled = '" + mytimestamp + "'") > > ... would not do the same thing as the example above... > > To cut it short, the bottomline is - if JDBC implementation just ignored > the timezone stuff alltogether, and left it all to the backend, > everything would just work :-) > > I may be wrong here, but if I am, I would be very curious to see an > example where this would not work. > See responses above. If you have suggested improvements try them out and if you can get them to pass the TimestampTest.java regression I would like to look at them. thanks, --Barry
... > This certainly isn't true for all databases. Oracle for example where > the database format for dates is DD-MON-YY which is very different from > the Timestamp.toString() method. And actually the postgres format is > different and incompatible with javas format when you start having to > deal with timezone information. Oracle have nls_format_date global parameter and default format for date can be any. regrads haris peco
"Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes: > ... but the SQL 92 spec should be, of course, the > ultimate authority in this regard. The SQL spec has never heard of JDBC. I assume there is a spec somewhere for the JDBC API, but it's not SQL92 nor SQL99. regards, tom lane
Barry, It's great to have this functionality. However, I am surprised about the need/requirement to explicitely turning on this feature in the JDBC driver. As you explain, the reason for this behaviour is to support the use of multiple statements with the prepareStatement method. But this is supporting incorrect us of the JDBC interface, since the JDBC API says that the parameter of the prepareStatement method is one(!) SQL statetment. On the other hand, it is clear that the intended purpose of the prepareStatement method is to use prepared statment -- exclusively and not by default. For statements executed only once, the designers of the API made available the method "createStatment". So the buttom line is that you have an irregular behaviour in support of an irregular coding practice. I understand that there is "legacy" code that needs to use the prepareStatement method with multiple SQL statements. But since such practice was already incorrect at the time of its inception, the right way would be to allow a transition from bad to good by requiring the "sinners" to explicitly use some kind of "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their code, and let new coders use standard code. Of course, you're running the show, so you will do it as you deem right, but IMHO the implementation as you described it is clearly not reasonable. Cheers, Peter ----- Original Message ----- From: "Barry Lind" <barry@xythos.com> To: "Dmitry Tkach" <dmitry@openratings.com> Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> Sent: Thursday, September 26, 2002 10:56 PM Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > Dimitry, > > I have some good news for you. All of this functionality is in 7.3 of > the server and the current development build of the jdbc driver. Of > course 7.3 isn't quite production yet, but beta 2 should be out this week. > > Your first option is what has been implemented. In 7.3 the server now > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants > for passing bind variables). The jdbc driver also now has a method on > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the > use of server side prepared statements (so you can cast your Statement > object to an org.postgresql.PGStatement object and enable the > functionality for a specific statement). > > This hasn't yet made its way into the jdbc documentation for 7.3, but > this will be documented before 7.3 goes production. > > This implementation does still require that you make some changes to > your app in order to enable it but that is by design for the following > reasons: > 1) In order to be useful you need to reuse a statement object > multiple times. Most code generally will just create a statement and > use it only once, and since more work is being done to prepare the > statement that would have a negative performance impact for the majority > of statements being executed in normal code. > 2) Prepare functionality will not work for all types of sql that you > can execute through the jdbc driver. For example you can quite easily > do the following: conn.prepareStatement("update foo set a = ?; insert > into bar values (?,?,?);"); > By issuing both sql statements in one call only one network roundtrip is > needed to execute both statements providing what is likely a more > performant application. However this doesn't work with prepare because: > prepare foo as update foo set a = ?; insert into bar values(?,?,?); - > is one prepared statement and one regular statement which will cause > very strange things to happen. > > It is for the reasons above that the developer needs to explicitly turn > on the use of server side prepared statements when it makes sense to do so. > > thanks, > --Barry > > PS. I would really appreciate it if you could test this new > functionality. As far as I know I am the only one who has used it. But > for one of my frequently used and complex sql statements the time to > execute dropped by 60% using server side prepared statements. > > > > Dmitry Tkach wrote: > > Hi, everybody. > > > > I am running into huge performance problems, due to JDBC not being able > > to cache query plans :-( > > My java program runs the (set of about 5) identical statements with > > different parameters for about 30 million times... > > What I am talking about below has to do with JDBC, but is not really > > specific to it - it seems to me, that the backend itself > > could be improved to better handle this kind of scenario, when the same > > statement is executed many times in the same session. > > > > It takes about 3 weeks(!) to run, and it looks like about *half* of that > > time is spent by the query planner, > > creating and recreating the query plans every time I ran the damn > > statement.... > > > > I am looking into implementing some kind of a solution, that would let > > me work around that problem... > > So far, I only see two possibilities: > > > > - a general solution, that would involve extending postgres SQL gramma > > to include a 'prepare' statement > > - or am ugly work around, that would involve moving all my sql > > statements into stored procedures, and have > > those cache the query plans inside... > > > > The second solution is not only ugly (because it requires the > > application code to be changed and to have a specialized > > stored procedure for every query), but also requires some additional > > hacks (to overcome the hard limit on the number of > > function arguments and the inability for functions to return tuples) - > > the only way I imagine this can be made to work is > > to glue all the arguments together into a text string, and have the > > stored procedure parse it back, execute the query, then > > glue the resulting tuple(s) into another text string, return it, and > > have the application (or, perhaps, JDBC layer) parse it back > > into columns... > > > > I was wonderring if anybody has any better ideas how this can be made to > > work (I am looking for a solution that would > > minimize changes to the existing JDBC applications that use > > PreparedStatements)? > > > > If the maintainers of the involved code are interested, I would be > > willing to implement and contribute the solution we come up with ... > > (I figure, nobody would really be interested in getting that second > > solution I mentioned into the mainstream :-), but, if we are > > able to come up with something more general and less ugly, perhaps, I am > > not the only one who would be able to contribute from > > it)... > > > > For example, I believe, it should not be too complicated to implement > > that first possibility I described above... > > The way I see it would involve adding two statements to postgres SQL > > syntax: > > > > prepare <name> as <sql statement> > > and > > execute <name> with (<parameter list>) > > > > For example: > > > > prepare mystatement as select * from mytable where id = $1 and name like > > $2; > > > > and then > > > > execute mystatement with (1, 'Dima'); > > execute mystatement with (2, 'John'); > > > > etc.... > > > > The JDBC driver would then send the 'prepare' command to the backend in > > Connection.prepareStatement (), and > > use the 'execute' in PreparedStatement.execute (); > > > > One potential problem with implementation I see here is that the query > > planner wants to know the argument types ahead of time... > > I guess, I could get around that by making all the arguments 'text', and > > having them casted into the right types when the > > statement is actually executed. > > > > There is, probably a need to also have some kind of a 'close' command to > > throw away the prepared query plans... or we could just > > make them last forever until, say, the end of transaction (or, perhaps, > > the end of the session?)... > > > > If there is anyone interested in discussing various possibilities, and > > getting this implemented one way or another, > > I would like to hear from you! > > > > Thanks! > > > > Dima. > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 2: you can get off all lists at once with the unregister command > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
Dave, Barry, et al.: > Unless of course we were to write a > sql parser in the jdbc driver so that we could predetermine which > statements should use the prepare mechanism or not. That's a bit of an exaggeration -- all you need to check in the JDBC driver is whether there's an unescaped ; to see if there's more than one statement, right? It wouldn't be unreasonable to default to enabling the feature unless there's a ; in the statement (or unless there's a ; with anything but whitespace afterward, if you like) -- it wouldn't even be a disaster to forget escaping and just decide based on a ; in the statement period -- few enough statements have an escaped ; that the suboptimal performance in that case won't matter. I'm a little concerned about how this feature interacts with connection pooling as currently implemented -- it probably needs to be reset to a known state every time a connection is returned to the pool. Which is OK for the PostgreSQL implementation (which you'll note is not supposed to be used in an app server environment), but impossible for all the app servers out there which won't take driver-specific steps when connections are returned to their generic pools. Furthermore, with the default behavior set to "disabled", you're reducing performance (you quoted 60%?) in the application server environment, when there will never be multiple statements executed in the same call, but there will frequently be multiple identical statements executed on the same PreparedStatement, but there's no available way to reset the flag, short of writing a wrapper driver to stand between the app server and the real driver. On the other hand, if you set the default to a global "enabled", it breaks DBVisualizer and other development tools which currently allow multiple statements by default. IMHO, it would be better to have the feature enabled for queries where it's appropriate, and disabled where not, and let the programmer specify a global override where they absolutely must. That way DB-specific actions are only required in truly exceptional cases, and both the app servers and the tools behave properly by default. If you absolutely object, I think we should add a property to the Driver and the ConnectionPoolDataSource so that in an app server environment you can set the property to enabled at configuration time. But again, I would prefer smarter default behavior. Aaron On 27 Sep 2002, Dave Cramer wrote: > Peter, > > I have to agree with Barry here. Since it is possible to send two (or > more ) statements to the backend, the programmer needs some mechanism > for enabling/disabling this feature. Unless of course we were to write a > sql parser in the jdbc driver so that we could predetermine which > statements should use the prepare mechanism or not. > > I often use prepared statements instead of simple statements even when > only doing one select. The difference being that I can use setXXX to set > the parameters in the statment. Which is particularly important for > setDate, setTime, setTimestamp > > And you would be amazed at the unique (irregular) coding practices which > exist out there, so having the default do something sane is a GOOD > THING. > > Dave > > > On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote: > > Barry, > > > > It's great to have this functionality. > > > > However, I am surprised about the need/requirement to explicitely turning on > > this feature in the JDBC driver. As you explain, the reason for this > > behaviour is to support the use of multiple statements with the > > prepareStatement method. But this is supporting incorrect us of the JDBC > > interface, since the JDBC API says that the parameter of the > > prepareStatement method is one(!) SQL statetment. On the other hand, it is > > clear that the intended purpose of the prepareStatement method is to use > > prepared statment -- exclusively and not by default. For statements executed > > only once, the designers of the API made available the method > > "createStatment". > > > > So the buttom line is that you have an irregular behaviour in support of an > > irregular coding practice. I understand that there is "legacy" code that > > needs to use the prepareStatement method with multiple SQL statements. But > > since such practice was already incorrect at the time of its inception, the > > right way would be to allow a transition from bad to good by requiring the > > "sinners" to explicitly use some kind of > > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their > > code, and let new coders use standard code. > > > > Of course, you're running the show, so you will do it as you deem right, but > > IMHO the implementation as you described it is clearly not reasonable. > > > > Cheers, > > > > Peter > > > > ----- Original Message ----- > > From: "Barry Lind" <barry@xythos.com> > > To: "Dmitry Tkach" <dmitry@openratings.com> > > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> > > Sent: Thursday, September 26, 2002 10:56 PM > > Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > > > > > > > Dimitry, > > > > > > I have some good news for you. All of this functionality is in 7.3 of > > > the server and the current development build of the jdbc driver. Of > > > course 7.3 isn't quite production yet, but beta 2 should be out this week. > > > > > > Your first option is what has been implemented. In 7.3 the server now > > > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants > > > for passing bind variables). The jdbc driver also now has a method on > > > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the > > > use of server side prepared statements (so you can cast your Statement > > > object to an org.postgresql.PGStatement object and enable the > > > functionality for a specific statement). > > > > > > This hasn't yet made its way into the jdbc documentation for 7.3, but > > > this will be documented before 7.3 goes production. > > > > > > This implementation does still require that you make some changes to > > > your app in order to enable it but that is by design for the following > > > reasons: > > > 1) In order to be useful you need to reuse a statement object > > > multiple times. Most code generally will just create a statement and > > > use it only once, and since more work is being done to prepare the > > > statement that would have a negative performance impact for the majority > > > of statements being executed in normal code. > > > 2) Prepare functionality will not work for all types of sql that you > > > can execute through the jdbc driver. For example you can quite easily > > > do the following: conn.prepareStatement("update foo set a = ?; insert > > > into bar values (?,?,?);"); > > > By issuing both sql statements in one call only one network roundtrip is > > > needed to execute both statements providing what is likely a more > > > performant application. However this doesn't work with prepare because: > > > prepare foo as update foo set a = ?; insert into bar values(?,?,?); - > > > is one prepared statement and one regular statement which will cause > > > very strange things to happen. > > > > > > It is for the reasons above that the developer needs to explicitly turn > > > on the use of server side prepared statements when it makes sense to do > > so. > > > > > > thanks, > > > --Barry > > > > > > PS. I would really appreciate it if you could test this new > > > functionality. As far as I know I am the only one who has used it. But > > > for one of my frequently used and complex sql statements the time to > > > execute dropped by 60% using server side prepared statements. > > > > > > > > > > > > Dmitry Tkach wrote: > > > > Hi, everybody. > > > > > > > > I am running into huge performance problems, due to JDBC not being able > > > > to cache query plans :-( > > > > My java program runs the (set of about 5) identical statements with > > > > different parameters for about 30 million times... > > > > What I am talking about below has to do with JDBC, but is not really > > > > specific to it - it seems to me, that the backend itself > > > > could be improved to better handle this kind of scenario, when the same > > > > statement is executed many times in the same session. > > > > > > > > It takes about 3 weeks(!) to run, and it looks like about *half* of > > that > > > > time is spent by the query planner, > > > > creating and recreating the query plans every time I ran the damn > > > > statement.... > > > > > > > > I am looking into implementing some kind of a solution, that would let > > > > me work around that problem... > > > > So far, I only see two possibilities: > > > > > > > > - a general solution, that would involve extending postgres SQL gramma > > > > to include a 'prepare' statement > > > > - or am ugly work around, that would involve moving all my sql > > > > statements into stored procedures, and have > > > > those cache the query plans inside... > > > > > > > > The second solution is not only ugly (because it requires the > > > > application code to be changed and to have a specialized > > > > stored procedure for every query), but also requires some additional > > > > hacks (to overcome the hard limit on the number of > > > > function arguments and the inability for functions to return tuples) - > > > > the only way I imagine this can be made to work is > > > > to glue all the arguments together into a text string, and have the > > > > stored procedure parse it back, execute the query, then > > > > glue the resulting tuple(s) into another text string, return it, and > > > > have the application (or, perhaps, JDBC layer) parse it back > > > > into columns... > > > > > > > > I was wonderring if anybody has any better ideas how this can be made > > to > > > > work (I am looking for a solution that would > > > > minimize changes to the existing JDBC applications that use > > > > PreparedStatements)? > > > > > > > > If the maintainers of the involved code are interested, I would be > > > > willing to implement and contribute the solution we come up with ... > > > > (I figure, nobody would really be interested in getting that second > > > > solution I mentioned into the mainstream :-), but, if we are > > > > able to come up with something more general and less ugly, perhaps, I > > am > > > > not the only one who would be able to contribute from > > > > it)... > > > > > > > > For example, I believe, it should not be too complicated to implement > > > > that first possibility I described above... > > > > The way I see it would involve adding two statements to postgres SQL > > > > syntax: > > > > > > > > prepare <name> as <sql statement> > > > > and > > > > execute <name> with (<parameter list>) > > > > > > > > For example: > > > > > > > > prepare mystatement as select * from mytable where id = $1 and name > > like > > > > $2; > > > > > > > > and then > > > > > > > > execute mystatement with (1, 'Dima'); > > > > execute mystatement with (2, 'John'); > > > > > > > > etc.... > > > > > > > > The JDBC driver would then send the 'prepare' command to the backend in > > > > Connection.prepareStatement (), and > > > > use the 'execute' in PreparedStatement.execute (); > > > > > > > > One potential problem with implementation I see here is that the query > > > > planner wants to know the argument types ahead of time... > > > > I guess, I could get around that by making all the arguments 'text', > > and > > > > having them casted into the right types when the > > > > statement is actually executed. > > > > > > > > There is, probably a need to also have some kind of a 'close' command > > to > > > > throw away the prepared query plans... or we could just > > > > make them last forever until, say, the end of transaction (or, perhaps, > > > > the end of the session?)... > > > > > > > > If there is anyone interested in discussing various possibilities, and > > > > getting this implemented one way or another, > > > > I would like to hear from you! > > > > > > > > Thanks! > > > > > > > > Dima. > > > > > > > > > > > > > > > > ---------------------------(end of > > broadcast)--------------------------- > > > > TIP 2: you can get off all lists at once with the unregister command > > > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > > > > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
Barry, The API spec says for PreparedStatement: "An object that represents a precompiled SQL statement. A SQL statement is precompiled and stored in a PreparedStatement object. This object can then be used to efficiently execute this statement multiple times." This tells me that the main purpose of this object type is to allow you to multiple execution with possibly different parameters. IMHO, the setters are there just in order to provide you a way to specify different parameters. This also tells me that there is no alternative behaviour (i.e. the statement represented by this kind of object IS PRECOMPILED). If the PostgreSQL jdbc driver allows an alternative behaviour, this should not be default and should be made available I have to admit that linking the above question (behaviour of Prepared Statement) with the question of how many queries a SQL statement is allowed to contain may have been a bit arbitrary. (Actually, *I* did not make the linkage, I was just probing the way along a course of thoughts already offerred. :)) I wanted to tackle this question, because the practice of putting multiple statements in one Statement object (separated with semicolons) has already come up in connection with other --earlier not implemented-- jdbc feature(s) and there the problem to handle this practice also turned out to be a pain in the ass. Unfortunately, I do not have the SQL 92 spec. I could only look into the Postgres Reference documentation. Based on how it uses the term "statement", I would conclude that "statement" is just an entry in the reference doc. As to the getMoreResults method: issuing several queries is not the only way to obtain multiple results sets. E.g: your query may involve a function which returns multiple result sets (or just one result set and an update count). My personal opinion is that putting multiple statements in one Statement object is just abusing the fact that the backend allows to do, but the SQL 92 spec should be, of course, the ultimate authority in this regard. Cheers, Peter ----- Original Message ----- From: "Barry Lind" <barry@xythos.com> To: "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> Cc: "Dmitry Tkach" <dmitry@openratings.com>; <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> Sent: Friday, September 27, 2002 8:24 PM Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > Cool an interesting thread on the jdbc list. > > First off I dissagree with your reading of the spec that the statement > or prepared statement object are only intended for one sql statement. > If that were true why would you have Statement.getMoreResults()? The > API clearly supports a single statement execute returning multiple > result sets which implies to me that the single execute can be used for > multiple individual sql statements chained together. > > The uses of the Statement object and PreparedStatement object do not > IMHO have anything to do with you many times you use them. They have to > do with how you supply values in the sql statement. If you use just the > Statement object you are required to know the postgres date format in > order to include a date value, where it is much easier (and more > portable across databases) to just use a PreparedStatement, a ?, and a > setDate() call. > > thanks, > --Barry > > Peter Kovacs wrote: > > Barry, > > > > It's great to have this functionality. > > > > However, I am surprised about the need/requirement to explicitely > turning on > > this feature in the JDBC driver. As you explain, the reason for this > > behaviour is to support the use of multiple statements with the > > prepareStatement method. But this is supporting incorrect us of the JDBC > > interface, since the JDBC API says that the parameter of the > > prepareStatement method is one(!) SQL statetment. On the other hand, > it is > > clear that the intended purpose of the prepareStatement method is to use > > prepared statment -- exclusively and not by default. For statements > executed > > only once, the designers of the API made available the method > > "createStatment". > > > > So the buttom line is that you have an irregular behaviour in support > of an > > irregular coding practice. I understand that there is "legacy" code that > > needs to use the prepareStatement method with multiple SQL > statements. But > > since such practice was already incorrect at the time of its > inception, the > > right way would be to allow a transition from bad to good by > requiring the > > "sinners" to explicitly use some kind of > > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their > > code, and let new coders use standard code. > > > > Of course, you're running the show, so you will do it as you deem > right, but > > IMHO the implementation as you described it is clearly not reasonable. > > > > Cheers, > > > > Peter > > > > ----- Original Message ----- > > From: "Barry Lind" <barry@xythos.com> > > To: "Dmitry Tkach" <dmitry@openratings.com> > > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> > > Sent: Thursday, September 26, 2002 10:56 PM > > Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > > > > > > > >>Dimitry, > >> > >>I have some good news for you. All of this functionality is in 7.3 of > >>the server and the current development build of the jdbc driver. Of > >>course 7.3 isn't quite production yet, but beta 2 should be out this > week. > >> > >>Your first option is what has been implemented. In 7.3 the server now > >>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants > >>for passing bind variables). The jdbc driver also now has a method on > >>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the > >>use of server side prepared statements (so you can cast your Statement > >>object to an org.postgresql.PGStatement object and enable the > >>functionality for a specific statement). > >> > >>This hasn't yet made its way into the jdbc documentation for 7.3, but > >>this will be documented before 7.3 goes production. > >> > >>This implementation does still require that you make some changes to > >>your app in order to enable it but that is by design for the following > >>reasons: > >> 1) In order to be useful you need to reuse a statement object > >>multiple times. Most code generally will just create a statement and > >>use it only once, and since more work is being done to prepare the > >>statement that would have a negative performance impact for the majority > >>of statements being executed in normal code. > >> 2) Prepare functionality will not work for all types of sql that you > >>can execute through the jdbc driver. For example you can quite easily > >>do the following: conn.prepareStatement("update foo set a = ?; insert > >>into bar values (?,?,?);"); > >>By issuing both sql statements in one call only one network roundtrip is > >>needed to execute both statements providing what is likely a more > >>performant application. However this doesn't work with prepare because: > >>prepare foo as update foo set a = ?; insert into bar values(?,?,?); - > >>is one prepared statement and one regular statement which will cause > >>very strange things to happen. > >> > >>It is for the reasons above that the developer needs to explicitly turn > >>on the use of server side prepared statements when it makes sense to do > > > > so. > > > >>thanks, > >>--Barry > >> > >>PS. I would really appreciate it if you could test this new > >>functionality. As far as I know I am the only one who has used it. But > >>for one of my frequently used and complex sql statements the time to > >>execute dropped by 60% using server side prepared statements. > >> > >> > >> > >>Dmitry Tkach wrote: > >> > Hi, everybody. > >> > > >> > I am running into huge performance problems, due to JDBC not being > able > >> > to cache query plans :-( > >> > My java program runs the (set of about 5) identical statements with > >> > different parameters for about 30 million times... > >> > What I am talking about below has to do with JDBC, but is not really > >> > specific to it - it seems to me, that the backend itself > >> > could be improved to better handle this kind of scenario, when the > same > >> > statement is executed many times in the same session. > >> > > >> > It takes about 3 weeks(!) to run, and it looks like about *half* of > > > > that > > > >> > time is spent by the query planner, > >> > creating and recreating the query plans every time I ran the damn > >> > statement.... > >> > > >> > I am looking into implementing some kind of a solution, that would let > >> > me work around that problem... > >> > So far, I only see two possibilities: > >> > > >> > - a general solution, that would involve extending postgres SQL gramma > >> > to include a 'prepare' statement > >> > - or am ugly work around, that would involve moving all my sql > >> > statements into stored procedures, and have > >> > those cache the query plans inside... > >> > > >> > The second solution is not only ugly (because it requires the > >> > application code to be changed and to have a specialized > >> > stored procedure for every query), but also requires some additional > >> > hacks (to overcome the hard limit on the number of > >> > function arguments and the inability for functions to return tuples) - > >> > the only way I imagine this can be made to work is > >> > to glue all the arguments together into a text string, and have the > >> > stored procedure parse it back, execute the query, then > >> > glue the resulting tuple(s) into another text string, return it, and > >> > have the application (or, perhaps, JDBC layer) parse it back > >> > into columns... > >> > > >> > I was wonderring if anybody has any better ideas how this can be made > > > > to > > > >> > work (I am looking for a solution that would > >> > minimize changes to the existing JDBC applications that use > >> > PreparedStatements)? > >> > > >> > If the maintainers of the involved code are interested, I would be > >> > willing to implement and contribute the solution we come up with ... > >> > (I figure, nobody would really be interested in getting that second > >> > solution I mentioned into the mainstream :-), but, if we are > >> > able to come up with something more general and less ugly, perhaps, I > > > > am > > > >> > not the only one who would be able to contribute from > >> > it)... > >> > > >> > For example, I believe, it should not be too complicated to implement > >> > that first possibility I described above... > >> > The way I see it would involve adding two statements to postgres SQL > >> > syntax: > >> > > >> > prepare <name> as <sql statement> > >> > and > >> > execute <name> with (<parameter list>) > >> > > >> > For example: > >> > > >> > prepare mystatement as select * from mytable where id = $1 and name > > > > like > > > >> > $2; > >> > > >> > and then > >> > > >> > execute mystatement with (1, 'Dima'); > >> > execute mystatement with (2, 'John'); > >> > > >> > etc.... > >> > > >> > The JDBC driver would then send the 'prepare' command to the > backend in > >> > Connection.prepareStatement (), and > >> > use the 'execute' in PreparedStatement.execute (); > >> > > >> > One potential problem with implementation I see here is that the query > >> > planner wants to know the argument types ahead of time... > >> > I guess, I could get around that by making all the arguments 'text', > > > > and > > > >> > having them casted into the right types when the > >> > statement is actually executed. > >> > > >> > There is, probably a need to also have some kind of a 'close' command > > > > to > > > >> > throw away the prepared query plans... or we could just > >> > make them last forever until, say, the end of transaction (or, > perhaps, > >> > the end of the session?)... > >> > > >> > If there is anyone interested in discussing various possibilities, and > >> > getting this implemented one way or another, > >> > I would like to hear from you! > >> > > >> > Thanks! > >> > > >> > Dima. > >> > > >> > > >> > > >> > ---------------------------(end of > > > > broadcast)--------------------------- > > > >> > TIP 2: you can get off all lists at once with the unregister command > >> > (send "unregister YourEmailAddressHere" to > majordomo@postgresql.org) > >> > > >> > >> > >> > >>---------------------------(end of broadcast)--------------------------- > >>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > >> > > > > > > > > > >
Thank you for your remark. But we're talking about SQL statements (and not JDBC statements), aren't we? I have not found anything as yet which would indicate that the JDBC API significantly redefined the meaning of the term "statement" compared to what had been known as SQL statement well before JDBC appeared. And in this regard (i.e. to clarify what has/had been meant by "SQL statement"), the SQL spec seemed to me a good starting point, although anecdotical evidence can, of course, also be used, if you can provide some. Your remark is correct, if you mean that the JDBC API does not explicitely restricts you from putting in your SQL statement anything you wish. But (provided I understand the big picture) JDBC has been designed for a very specific kind of resource managers: database servers communicating with their clients via SQL statements. There exists a specification for a very generic JAVA resource adapter, which explicitely allows you to put over the wire anything you wish (actually, anything the resource manager it is intended for understands). But this generic resource adapter is called Connector, and not JDBC driver. So if you want to interpret liberally the API between the server and the client, the postgres java client classes (in what is now called JDBC driver) should implement the "connector" interfaces rather than the "sql" interfaces. And as to the whereabouts of the JDBC API spec, a good place to look for it is www.java.sun.com . Cheers, Peter ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> Cc: "Barry Lind" <barry@xythos.com>; "Dmitry Tkach" <dmitry@openratings.com>; <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org> Sent: Saturday, September 28, 2002 6:25 PM Subject: Re: [JDBC] [GENERAL] Prepared statement performance... > "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes: > > ... but the SQL 92 spec should be, of course, the > > ultimate authority in this regard. > > The SQL spec has never heard of JDBC. I assume there is a spec > somewhere for the JDBC API, but it's not SQL92 nor SQL99. > > regards, tom lane >
Barry Lind wrote: > > Currently you probably can't. But given the information I have > mentioned in my previous mails in this thread, I am not sure that you > would really want to unless you somehow have the ability to do it > statement by statement since in most cases I think you don't want to use > server side prepared statements. OK, so I understand you see a difference between a JDBC PrepareStatement and a server side prepared statement, and the first does not necessarily imply the second. I don't see that difference per se :) If I don't want to use a server side prepared statement I will make use of CreateStatement. But I can imagine that for some applications that are a bit more advanced as mine a more granular control is needed. Would it be possible to have a parameter in the JDBC URL that would switch the behaviour with regard to server side prepared statements? Something like ?serverPrepare={always|statement|never} which defaults to statement (i.e. the setUseServerPrepare() you are suggesting). Jochem
On Fri, 27 Sep 2002, Barry Lind wrote: > First off I dissagree with your reading of the spec that the statement > or prepared statement object are only intended for one sql statement. > If that were true why would you have Statement.getMoreResults()? Because a single statement can return multiple result sets. Consider a stored procedure that does several SELECTs, for example. > The API clearly supports a single statement execute returning multiple > result sets which implies to me that the single execute can be used for > multiple individual sql statements chained together. I'd disagree. Otherwise why would we have a facility for statement batches? At any rate, I can guarantee you that statement chaning is not portable. > The uses of the Statement object and PreparedStatement object do not > IMHO have anything to do with you many times you use them. They have to > do with how you supply values in the sql statement. If you use just the > Statement object you are required to know the postgres date format in > order to include a date value, where it is much easier (and more > portable across databases) to just use a PreparedStatement, a ?, and a > setDate() call. I agree entirely. Even more importantly, it's much, much more difficult to write a secure application (one not subject to SQL injection attacks) if you are forced to quote data yourself. The driver already knows how to deal with any string you pass to it; why would you want to re-write that code, possibily introducing errors into it? (For those not familiar with SQL injection attacks, I'm referring to the situation where someone types, say, "foo'; DROP TABLE bar;" into the "Name" field on your web form, or whatever.) cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
> > The uses of the Statement object and PreparedStatement object do not > > IMHO have anything to do with you many times you use them. They have to > > do with how you supply values in the sql statement. If you use just the > > Statement object you are required to know the postgres date format in > > order to include a date value, where it is much easier (and more > > portable across databases) to just use a PreparedStatement, a ?, and a > > setDate() call. > > I agree entirely. Even more importantly, it's much, much more difficult > to write a secure application (one not subject to SQL injection attacks) > if you are forced to quote data yourself. The driver already knows how > to deal with any string you pass to it; why would you want to re-write > that code, possibily introducing errors into it? (For those not familiar > with SQL injection attacks, I'm referring to the situation where someone > types, say, "foo'; DROP TABLE bar;" into the "Name" field on your web > form, or whatever.) > I do not clearly understand what the problem is with someone typing in "foo'; DROP TABLE bar;" into the "Name" field on your web form.