Thread: Re: [GENERAL] Prepared statement performance...

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

26 September 2002, 16:54:53

Dimitry,

I have some good news for you.  All of this functionality is in 7.3 of
the server and the current development build of the jdbc driver.  Of
course 7.3 isn't quite production yet, but beta 2 should be out this week.

Your first option is what has been implemented.  In 7.3 the server now
supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
for passing bind variables).  The jdbc driver also now has a method on
org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
use of server side prepared statements (so you can cast your Statement
object to an org.postgresql.PGStatement object and enable the
functionality for a specific statement).

This hasn't yet made its way into the jdbc documentation for 7.3, but
this will be documented before 7.3 goes production.

This implementation does still require that you make some changes to
your app in order to enable it but that is by design for the following
reasons:
    1) In order to be useful you need to reuse a statement object
multiple times.  Most code generally will just create a statement and
use it only once, and since more work is being done to prepare the
statement that would have a negative performance impact for the majority
of statements being executed in normal code.
    2) Prepare functionality will not work for all types of sql that you
can execute through the jdbc driver.  For example you can quite easily
do the following:  conn.prepareStatement("update foo set a = ?; insert
into bar values (?,?,?);");
By issuing both sql statements in one call only one network roundtrip is
needed to execute both statements providing what is likely a more
performant application.  However this doesn't work with prepare because:
prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
is one prepared statement and one regular statement which will cause
very strange things to happen.

It is for the reasons above that the developer needs to explicitly turn
on the use of server side prepared statements when it makes sense to do so.

thanks,
--Barry

PS.  I would really appreciate it if you could test this new
functionality.  As far as I know I am the only one who has used it.  But
for one of my frequently used and complex sql statements the time to
execute dropped by 60% using server side prepared statements.



Dmitry Tkach wrote:
 > Hi, everybody.
 >
 > I am running into huge performance problems, due to JDBC not being able
 > to cache query plans :-(
 > My java program runs the (set of about 5) identical statements with
 > different parameters for about 30 million times...
 > What I am talking about below has to do with JDBC, but is not really
 > specific to it - it seems to me, that the backend itself
 > could be improved to better handle this kind of scenario, when the same
 > statement is executed many times in the same session.
 >
 > It takes about 3 weeks(!) to run, and it looks like about *half* of that
 > time is spent by the query planner,
 > creating and recreating the query plans every time I ran the damn
 > statement....
 >
 > I am looking into implementing some kind of a solution, that would let
 > me work around that problem...
 > So far, I only see two possibilities:
 >
 > - a general solution, that would involve extending postgres SQL gramma
 > to include a 'prepare' statement
 > - or am ugly work around, that would involve moving all my sql
 > statements into stored procedures, and have
 >   those cache the query plans inside...
 >
 > The second solution is not only ugly (because it requires the
 > application code to be changed and to have a specialized
 > stored procedure for every query), but also requires some additional
 > hacks (to overcome the hard limit on the number of
 > function arguments and the inability for functions to return tuples) -
 > the only way I imagine this can be made to work is
 > to glue all the arguments together into a text string, and have the
 > stored procedure parse it back, execute the query, then
 > glue the resulting tuple(s) into another text string, return it, and
 > have the application (or, perhaps, JDBC layer) parse it back
 > into columns...
 >
 > I was wonderring if anybody has any better ideas how this can be made to
 > work (I am looking for a solution that would
 > minimize changes to the existing JDBC applications that use
 > PreparedStatements)?
 >
 > If the maintainers of the involved code are interested, I would be
 > willing to implement and contribute the solution we come up with ...
 > (I figure, nobody would really be interested in getting that second
 > solution I mentioned into the mainstream :-), but, if we are
 > able to come up with something more general and less ugly, perhaps, I am
 > not the only one who would be able to contribute from
 > it)...
 >
 > For example, I believe, it should not be too complicated to implement
 > that first possibility I described above...
 > The way I see it would involve adding two statements to postgres SQL
 > syntax:
 >
 > prepare <name> as <sql statement>
 > and
 > execute <name> with (<parameter list>)
 >
 > For example:
 >
 > prepare mystatement as select * from mytable where id = $1 and name like
 > $2;
 >
 > and then
 >
 > execute mystatement with (1, 'Dima');
 > execute mystatement with (2, 'John');
 >
 > etc....
 >
 > The JDBC driver would then send the 'prepare' command to the backend in
 > Connection.prepareStatement (), and
 > use the 'execute' in PreparedStatement.execute ();
 >
 > One potential problem with implementation I see here is that the query
 > planner wants to know the argument types ahead of time...
 > I guess, I could get around that by making all the arguments 'text', and
 > having them casted into the right types when the
 > statement is actually executed.
 >
 > There is, probably a need to also have some kind of a 'close' command to
 > throw away the prepared query plans... or we could just
 > make them last forever until, say, the end of transaction (or, perhaps,
 > the end of the session?)...
 >
 > If there is anyone interested in discussing various possibilities, and
 > getting this implemented one way or another,
 >  I would like to hear from you!
 >
 > Thanks!
 >
 > Dima.
 >
 >
 >
 > ---------------------------(end of broadcast)---------------------------
 > TIP 2: you can get off all lists at once with the unregister command
 >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 >

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

27 September 2002, 05:48:09

Barry,

It's great to have this functionality.

However, I am surprised about the need/requirement to explicitely turning on
this feature in the JDBC driver. As you explain, the reason for this
behaviour is to support the use of multiple statements with the
prepareStatement method. But this is supporting incorrect us of the JDBC
interface, since the JDBC API says that the parameter of the
prepareStatement method is one(!) SQL statetment. On the other hand, it is
clear that the intended purpose of the prepareStatement method is to use
prepared statment -- exclusively and not by default. For statements executed
only once, the designers of the API made available the method
"createStatment".

So the buttom line is that you have an irregular behaviour in support of an
irregular coding practice. I understand that there is "legacy" code that
needs to use the prepareStatement method with multiple SQL statements. But
since such practice was already incorrect at the time of its inception, the
right way would be to allow a transition from bad to good by requiring the
"sinners" to explicitly use some kind of
"org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
code, and let new coders use standard code.

Of course, you're running the show, so you will do it as you deem right, but
IMHO the implementation as you described it is clearly not reasonable.

Cheers,

Peter

----- Original Message -----
From: "Barry Lind" <barry@xythos.com>
To: "Dmitry Tkach" <dmitry@openratings.com>
Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
Sent: Thursday, September 26, 2002 10:56 PM
Subject: Re: [JDBC] [GENERAL] Prepared statement performance...


> Dimitry,
>
> I have some good news for you.  All of this functionality is in 7.3 of
> the server and the current development build of the jdbc driver.  Of
> course 7.3 isn't quite production yet, but beta 2 should be out this week.
>
> Your first option is what has been implemented.  In 7.3 the server now
> supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
> for passing bind variables).  The jdbc driver also now has a method on
> org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
> use of server side prepared statements (so you can cast your Statement
> object to an org.postgresql.PGStatement object and enable the
> functionality for a specific statement).
>
> This hasn't yet made its way into the jdbc documentation for 7.3, but
> this will be documented before 7.3 goes production.
>
> This implementation does still require that you make some changes to
> your app in order to enable it but that is by design for the following
> reasons:
>     1) In order to be useful you need to reuse a statement object
> multiple times.  Most code generally will just create a statement and
> use it only once, and since more work is being done to prepare the
> statement that would have a negative performance impact for the majority
> of statements being executed in normal code.
>     2) Prepare functionality will not work for all types of sql that you
> can execute through the jdbc driver.  For example you can quite easily
> do the following:  conn.prepareStatement("update foo set a = ?; insert
> into bar values (?,?,?);");
> By issuing both sql statements in one call only one network roundtrip is
> needed to execute both statements providing what is likely a more
> performant application.  However this doesn't work with prepare because:
> prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
> is one prepared statement and one regular statement which will cause
> very strange things to happen.
>
> It is for the reasons above that the developer needs to explicitly turn
> on the use of server side prepared statements when it makes sense to do
so.
>
> thanks,
> --Barry
>
> PS.  I would really appreciate it if you could test this new
> functionality.  As far as I know I am the only one who has used it.  But
> for one of my frequently used and complex sql statements the time to
> execute dropped by 60% using server side prepared statements.
>
>
>
> Dmitry Tkach wrote:
>  > Hi, everybody.
>  >
>  > I am running into huge performance problems, due to JDBC not being able
>  > to cache query plans :-(
>  > My java program runs the (set of about 5) identical statements with
>  > different parameters for about 30 million times...
>  > What I am talking about below has to do with JDBC, but is not really
>  > specific to it - it seems to me, that the backend itself
>  > could be improved to better handle this kind of scenario, when the same
>  > statement is executed many times in the same session.
>  >
>  > It takes about 3 weeks(!) to run, and it looks like about *half* of
that
>  > time is spent by the query planner,
>  > creating and recreating the query plans every time I ran the damn
>  > statement....
>  >
>  > I am looking into implementing some kind of a solution, that would let
>  > me work around that problem...
>  > So far, I only see two possibilities:
>  >
>  > - a general solution, that would involve extending postgres SQL gramma
>  > to include a 'prepare' statement
>  > - or am ugly work around, that would involve moving all my sql
>  > statements into stored procedures, and have
>  >   those cache the query plans inside...
>  >
>  > The second solution is not only ugly (because it requires the
>  > application code to be changed and to have a specialized
>  > stored procedure for every query), but also requires some additional
>  > hacks (to overcome the hard limit on the number of
>  > function arguments and the inability for functions to return tuples) -
>  > the only way I imagine this can be made to work is
>  > to glue all the arguments together into a text string, and have the
>  > stored procedure parse it back, execute the query, then
>  > glue the resulting tuple(s) into another text string, return it, and
>  > have the application (or, perhaps, JDBC layer) parse it back
>  > into columns...
>  >
>  > I was wonderring if anybody has any better ideas how this can be made
to
>  > work (I am looking for a solution that would
>  > minimize changes to the existing JDBC applications that use
>  > PreparedStatements)?
>  >
>  > If the maintainers of the involved code are interested, I would be
>  > willing to implement and contribute the solution we come up with ...
>  > (I figure, nobody would really be interested in getting that second
>  > solution I mentioned into the mainstream :-), but, if we are
>  > able to come up with something more general and less ugly, perhaps, I
am
>  > not the only one who would be able to contribute from
>  > it)...
>  >
>  > For example, I believe, it should not be too complicated to implement
>  > that first possibility I described above...
>  > The way I see it would involve adding two statements to postgres SQL
>  > syntax:
>  >
>  > prepare <name> as <sql statement>
>  > and
>  > execute <name> with (<parameter list>)
>  >
>  > For example:
>  >
>  > prepare mystatement as select * from mytable where id = $1 and name
like
>  > $2;
>  >
>  > and then
>  >
>  > execute mystatement with (1, 'Dima');
>  > execute mystatement with (2, 'John');
>  >
>  > etc....
>  >
>  > The JDBC driver would then send the 'prepare' command to the backend in
>  > Connection.prepareStatement (), and
>  > use the 'execute' in PreparedStatement.execute ();
>  >
>  > One potential problem with implementation I see here is that the query
>  > planner wants to know the argument types ahead of time...
>  > I guess, I could get around that by making all the arguments 'text',
and
>  > having them casted into the right types when the
>  > statement is actually executed.
>  >
>  > There is, probably a need to also have some kind of a 'close' command
to
>  > throw away the prepared query plans... or we could just
>  > make them last forever until, say, the end of transaction (or, perhaps,
>  > the end of the session?)...
>  >
>  > If there is anyone interested in discussing various possibilities, and
>  > getting this implemented one way or another,
>  >  I would like to hear from you!
>  >
>  > Thanks!
>  >
>  > Dima.
>  >
>  >
>  >
>  > ---------------------------(end of
broadcast)---------------------------
>  > TIP 2: you can get off all lists at once with the unregister command
>  >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>  >
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

Re: [GENERAL] Prepared statement performance...

From

Dave Cramer

Date:

27 September 2002, 07:34:28

Peter,

I have to agree with Barry here. Since it is possible to send two (or
more ) statements to the backend, the programmer needs some mechanism
for enabling/disabling this feature. Unless of course we were to write a
sql parser in the jdbc driver so that we could predetermine which
statements should use the prepare mechanism or not.

I often use prepared statements instead of simple statements even when
only doing one select. The difference being that I can use setXXX to set
the parameters in the statment. Which is particularly important for
setDate, setTime, setTimestamp

And you would be amazed at the unique (irregular) coding practices which
exist out there, so having the default do something sane is a GOOD
THING.

Dave


On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
> Barry,
>
> It's great to have this functionality.
>
> However, I am surprised about the need/requirement to explicitely turning on
> this feature in the JDBC driver. As you explain, the reason for this
> behaviour is to support the use of multiple statements with the
> prepareStatement method. But this is supporting incorrect us of the JDBC
> interface, since the JDBC API says that the parameter of the
> prepareStatement method is one(!) SQL statetment. On the other hand, it is
> clear that the intended purpose of the prepareStatement method is to use
> prepared statment -- exclusively and not by default. For statements executed
> only once, the designers of the API made available the method
> "createStatment".
>
> So the buttom line is that you have an irregular behaviour in support of an
> irregular coding practice. I understand that there is "legacy" code that
> needs to use the prepareStatement method with multiple SQL statements. But
> since such practice was already incorrect at the time of its inception, the
> right way would be to allow a transition from bad to good by requiring the
> "sinners" to explicitly use some kind of
> "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
> code, and let new coders use standard code.
>
> Of course, you're running the show, so you will do it as you deem right, but
> IMHO the implementation as you described it is clearly not reasonable.
>
> Cheers,
>
> Peter
>
> ----- Original Message -----
> From: "Barry Lind" <barry@xythos.com>
> To: "Dmitry Tkach" <dmitry@openratings.com>
> Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
> Sent: Thursday, September 26, 2002 10:56 PM
> Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>
>
> > Dimitry,
> >
> > I have some good news for you.  All of this functionality is in 7.3 of
> > the server and the current development build of the jdbc driver.  Of
> > course 7.3 isn't quite production yet, but beta 2 should be out this week.
> >
> > Your first option is what has been implemented.  In 7.3 the server now
> > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
> > for passing bind variables).  The jdbc driver also now has a method on
> > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
> > use of server side prepared statements (so you can cast your Statement
> > object to an org.postgresql.PGStatement object and enable the
> > functionality for a specific statement).
> >
> > This hasn't yet made its way into the jdbc documentation for 7.3, but
> > this will be documented before 7.3 goes production.
> >
> > This implementation does still require that you make some changes to
> > your app in order to enable it but that is by design for the following
> > reasons:
> >     1) In order to be useful you need to reuse a statement object
> > multiple times.  Most code generally will just create a statement and
> > use it only once, and since more work is being done to prepare the
> > statement that would have a negative performance impact for the majority
> > of statements being executed in normal code.
> >     2) Prepare functionality will not work for all types of sql that you
> > can execute through the jdbc driver.  For example you can quite easily
> > do the following:  conn.prepareStatement("update foo set a = ?; insert
> > into bar values (?,?,?);");
> > By issuing both sql statements in one call only one network roundtrip is
> > needed to execute both statements providing what is likely a more
> > performant application.  However this doesn't work with prepare because:
> > prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
> > is one prepared statement and one regular statement which will cause
> > very strange things to happen.
> >
> > It is for the reasons above that the developer needs to explicitly turn
> > on the use of server side prepared statements when it makes sense to do
> so.
> >
> > thanks,
> > --Barry
> >
> > PS.  I would really appreciate it if you could test this new
> > functionality.  As far as I know I am the only one who has used it.  But
> > for one of my frequently used and complex sql statements the time to
> > execute dropped by 60% using server side prepared statements.
> >
> >
> >
> > Dmitry Tkach wrote:
> >  > Hi, everybody.
> >  >
> >  > I am running into huge performance problems, due to JDBC not being able
> >  > to cache query plans :-(
> >  > My java program runs the (set of about 5) identical statements with
> >  > different parameters for about 30 million times...
> >  > What I am talking about below has to do with JDBC, but is not really
> >  > specific to it - it seems to me, that the backend itself
> >  > could be improved to better handle this kind of scenario, when the same
> >  > statement is executed many times in the same session.
> >  >
> >  > It takes about 3 weeks(!) to run, and it looks like about *half* of
> that
> >  > time is spent by the query planner,
> >  > creating and recreating the query plans every time I ran the damn
> >  > statement....
> >  >
> >  > I am looking into implementing some kind of a solution, that would let
> >  > me work around that problem...
> >  > So far, I only see two possibilities:
> >  >
> >  > - a general solution, that would involve extending postgres SQL gramma
> >  > to include a 'prepare' statement
> >  > - or am ugly work around, that would involve moving all my sql
> >  > statements into stored procedures, and have
> >  >   those cache the query plans inside...
> >  >
> >  > The second solution is not only ugly (because it requires the
> >  > application code to be changed and to have a specialized
> >  > stored procedure for every query), but also requires some additional
> >  > hacks (to overcome the hard limit on the number of
> >  > function arguments and the inability for functions to return tuples) -
> >  > the only way I imagine this can be made to work is
> >  > to glue all the arguments together into a text string, and have the
> >  > stored procedure parse it back, execute the query, then
> >  > glue the resulting tuple(s) into another text string, return it, and
> >  > have the application (or, perhaps, JDBC layer) parse it back
> >  > into columns...
> >  >
> >  > I was wonderring if anybody has any better ideas how this can be made
> to
> >  > work (I am looking for a solution that would
> >  > minimize changes to the existing JDBC applications that use
> >  > PreparedStatements)?
> >  >
> >  > If the maintainers of the involved code are interested, I would be
> >  > willing to implement and contribute the solution we come up with ...
> >  > (I figure, nobody would really be interested in getting that second
> >  > solution I mentioned into the mainstream :-), but, if we are
> >  > able to come up with something more general and less ugly, perhaps, I
> am
> >  > not the only one who would be able to contribute from
> >  > it)...
> >  >
> >  > For example, I believe, it should not be too complicated to implement
> >  > that first possibility I described above...
> >  > The way I see it would involve adding two statements to postgres SQL
> >  > syntax:
> >  >
> >  > prepare <name> as <sql statement>
> >  > and
> >  > execute <name> with (<parameter list>)
> >  >
> >  > For example:
> >  >
> >  > prepare mystatement as select * from mytable where id = $1 and name
> like
> >  > $2;
> >  >
> >  > and then
> >  >
> >  > execute mystatement with (1, 'Dima');
> >  > execute mystatement with (2, 'John');
> >  >
> >  > etc....
> >  >
> >  > The JDBC driver would then send the 'prepare' command to the backend in
> >  > Connection.prepareStatement (), and
> >  > use the 'execute' in PreparedStatement.execute ();
> >  >
> >  > One potential problem with implementation I see here is that the query
> >  > planner wants to know the argument types ahead of time...
> >  > I guess, I could get around that by making all the arguments 'text',
> and
> >  > having them casted into the right types when the
> >  > statement is actually executed.
> >  >
> >  > There is, probably a need to also have some kind of a 'close' command
> to
> >  > throw away the prepared query plans... or we could just
> >  > make them last forever until, say, the end of transaction (or, perhaps,
> >  > the end of the session?)...
> >  >
> >  > If there is anyone interested in discussing various possibilities, and
> >  > getting this implemented one way or another,
> >  >  I would like to hear from you!
> >  >
> >  > Thanks!
> >  >
> >  > Dima.
> >  >
> >  >
> >  >
> >  > ---------------------------(end of
> broadcast)---------------------------
> >  > TIP 2: you can get off all lists at once with the unregister command
> >  >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> >  >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>

Re: [GENERAL] Prepared statement performance...

From

Jochem van Dieten

Date:

27 September 2002, 07:50:23

Barry Lind wrote:
>
> Your first option is what has been implemented.  In 7.3 the server now
> supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
> for passing bind variables).  The jdbc driver also now has a method on
> org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
> use of server side prepared statements (so you can cast your Statement
> object to an org.postgresql.PGStatement object and enable the
> functionality for a specific statement).
[..]
> It is for the reasons above that the developer needs to explicitly turn
> on the use of server side prepared statements when it makes sense to do so.

How would the developer do that? With the front-end I am working with
(ColdFusion MX) I have very little control over the actual calls to the
driver. I can make CF MX use createStatement() or prepareStatement() but
that is all the control I have. And the only way to send parameters to
the driver would be through a JDBC URL.
Would that enable me to use prepared statements for real or not?

Jochem

Re: [GENERAL] Prepared statement performance...

From

Aaron Mulder

Date:

27 September 2002, 08:52:09

Dave, Barry, et al.:

> Unless of course we were to write a
> sql parser in the jdbc driver so that we could predetermine which
> statements should use the prepare mechanism or not.

    That's a bit of an exaggeration -- all you need to check in the
JDBC driver is whether there's an unescaped ; to see if there's more than
one statement, right?  It wouldn't be unreasonable to default to enabling
the feature unless there's a ; in the statement (or unless there's a ;
with anything but whitespace afterward, if you like) -- it wouldn't even
be a disaster to forget escaping and just decide based on a ; in the
statement period -- few enough statements have an escaped ; that the
suboptimal performance in that case won't matter.
    I'm a little concerned about how this feature interacts with
connection pooling as currently implemented -- it probably needs to be
reset to a known state every time a connection is returned to the pool.
Which is OK for the PostgreSQL implementation (which you'll note is not
supposed to be used in an app server environment), but impossible for all
the app servers out there which won't take driver-specific steps when
connections are returned to their generic pools.  Furthermore, with the
default behavior set to "disabled", you're reducing performance (you
quoted 60%?) in the application server environment, when there will never
be multiple statements executed in the same call, but there will
frequently be multiple identical statements executed on the same
PreparedStatement, but there's no available way to reset the flag, short
of writing a wrapper driver to stand between the app server and the real
driver.  On the other hand, if you set the default to a global "enabled",
it breaks DBVisualizer and other development tools which currently allow
multiple statements by default.
    IMHO, it would be better to have the feature enabled for queries
where it's appropriate, and disabled where not, and let the programmer
specify a global override where they absolutely must.  That way
DB-specific actions are only required in truly exceptional cases, and both
the app servers and the tools behave properly by default.
    If you absolutely object, I think we should add a property to the
Driver and the ConnectionPoolDataSource so that in an app server
environment you can set the property to enabled at configuration time.
But again, I would prefer smarter default behavior.

Aaron

On 27 Sep 2002, Dave Cramer wrote:
> Peter,
>
> I have to agree with Barry here. Since it is possible to send two (or
> more ) statements to the backend, the programmer needs some mechanism
> for enabling/disabling this feature. Unless of course we were to write a
> sql parser in the jdbc driver so that we could predetermine which
> statements should use the prepare mechanism or not.
>
> I often use prepared statements instead of simple statements even when
> only doing one select. The difference being that I can use setXXX to set
> the parameters in the statment. Which is particularly important for
> setDate, setTime, setTimestamp
>
> And you would be amazed at the unique (irregular) coding practices which
> exist out there, so having the default do something sane is a GOOD
> THING.
>
> Dave
>
>
> On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
> > Barry,
> >
> > It's great to have this functionality.
> >
> > However, I am surprised about the need/requirement to explicitely turning on
> > this feature in the JDBC driver. As you explain, the reason for this
> > behaviour is to support the use of multiple statements with the
> > prepareStatement method. But this is supporting incorrect us of the JDBC
> > interface, since the JDBC API says that the parameter of the
> > prepareStatement method is one(!) SQL statetment. On the other hand, it is
> > clear that the intended purpose of the prepareStatement method is to use
> > prepared statment -- exclusively and not by default. For statements executed
> > only once, the designers of the API made available the method
> > "createStatment".
> >
> > So the buttom line is that you have an irregular behaviour in support of an
> > irregular coding practice. I understand that there is "legacy" code that
> > needs to use the prepareStatement method with multiple SQL statements. But
> > since such practice was already incorrect at the time of its inception, the
> > right way would be to allow a transition from bad to good by requiring the
> > "sinners" to explicitly use some kind of
> > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
> > code, and let new coders use standard code.
> >
> > Of course, you're running the show, so you will do it as you deem right, but
> > IMHO the implementation as you described it is clearly not reasonable.
> >
> > Cheers,
> >
> > Peter
> >
> > ----- Original Message -----
> > From: "Barry Lind" <barry@xythos.com>
> > To: "Dmitry Tkach" <dmitry@openratings.com>
> > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
> > Sent: Thursday, September 26, 2002 10:56 PM
> > Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
> >
> >
> > > Dimitry,
> > >
> > > I have some good news for you.  All of this functionality is in 7.3 of
> > > the server and the current development build of the jdbc driver.  Of
> > > course 7.3 isn't quite production yet, but beta 2 should be out this week.
> > >
> > > Your first option is what has been implemented.  In 7.3 the server now
> > > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
> > > for passing bind variables).  The jdbc driver also now has a method on
> > > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
> > > use of server side prepared statements (so you can cast your Statement
> > > object to an org.postgresql.PGStatement object and enable the
> > > functionality for a specific statement).
> > >
> > > This hasn't yet made its way into the jdbc documentation for 7.3, but
> > > this will be documented before 7.3 goes production.
> > >
> > > This implementation does still require that you make some changes to
> > > your app in order to enable it but that is by design for the following
> > > reasons:
> > >     1) In order to be useful you need to reuse a statement object
> > > multiple times.  Most code generally will just create a statement and
> > > use it only once, and since more work is being done to prepare the
> > > statement that would have a negative performance impact for the majority
> > > of statements being executed in normal code.
> > >     2) Prepare functionality will not work for all types of sql that you
> > > can execute through the jdbc driver.  For example you can quite easily
> > > do the following:  conn.prepareStatement("update foo set a = ?; insert
> > > into bar values (?,?,?);");
> > > By issuing both sql statements in one call only one network roundtrip is
> > > needed to execute both statements providing what is likely a more
> > > performant application.  However this doesn't work with prepare because:
> > > prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
> > > is one prepared statement and one regular statement which will cause
> > > very strange things to happen.
> > >
> > > It is for the reasons above that the developer needs to explicitly turn
> > > on the use of server side prepared statements when it makes sense to do
> > so.
> > >
> > > thanks,
> > > --Barry
> > >
> > > PS.  I would really appreciate it if you could test this new
> > > functionality.  As far as I know I am the only one who has used it.  But
> > > for one of my frequently used and complex sql statements the time to
> > > execute dropped by 60% using server side prepared statements.
> > >
> > >
> > >
> > > Dmitry Tkach wrote:
> > >  > Hi, everybody.
> > >  >
> > >  > I am running into huge performance problems, due to JDBC not being able
> > >  > to cache query plans :-(
> > >  > My java program runs the (set of about 5) identical statements with
> > >  > different parameters for about 30 million times...
> > >  > What I am talking about below has to do with JDBC, but is not really
> > >  > specific to it - it seems to me, that the backend itself
> > >  > could be improved to better handle this kind of scenario, when the same
> > >  > statement is executed many times in the same session.
> > >  >
> > >  > It takes about 3 weeks(!) to run, and it looks like about *half* of
> > that
> > >  > time is spent by the query planner,
> > >  > creating and recreating the query plans every time I ran the damn
> > >  > statement....
> > >  >
> > >  > I am looking into implementing some kind of a solution, that would let
> > >  > me work around that problem...
> > >  > So far, I only see two possibilities:
> > >  >
> > >  > - a general solution, that would involve extending postgres SQL gramma
> > >  > to include a 'prepare' statement
> > >  > - or am ugly work around, that would involve moving all my sql
> > >  > statements into stored procedures, and have
> > >  >   those cache the query plans inside...
> > >  >
> > >  > The second solution is not only ugly (because it requires the
> > >  > application code to be changed and to have a specialized
> > >  > stored procedure for every query), but also requires some additional
> > >  > hacks (to overcome the hard limit on the number of
> > >  > function arguments and the inability for functions to return tuples) -
> > >  > the only way I imagine this can be made to work is
> > >  > to glue all the arguments together into a text string, and have the
> > >  > stored procedure parse it back, execute the query, then
> > >  > glue the resulting tuple(s) into another text string, return it, and
> > >  > have the application (or, perhaps, JDBC layer) parse it back
> > >  > into columns...
> > >  >
> > >  > I was wonderring if anybody has any better ideas how this can be made
> > to
> > >  > work (I am looking for a solution that would
> > >  > minimize changes to the existing JDBC applications that use
> > >  > PreparedStatements)?
> > >  >
> > >  > If the maintainers of the involved code are interested, I would be
> > >  > willing to implement and contribute the solution we come up with ...
> > >  > (I figure, nobody would really be interested in getting that second
> > >  > solution I mentioned into the mainstream :-), but, if we are
> > >  > able to come up with something more general and less ugly, perhaps, I
> > am
> > >  > not the only one who would be able to contribute from
> > >  > it)...
> > >  >
> > >  > For example, I believe, it should not be too complicated to implement
> > >  > that first possibility I described above...
> > >  > The way I see it would involve adding two statements to postgres SQL
> > >  > syntax:
> > >  >
> > >  > prepare <name> as <sql statement>
> > >  > and
> > >  > execute <name> with (<parameter list>)
> > >  >
> > >  > For example:
> > >  >
> > >  > prepare mystatement as select * from mytable where id = $1 and name
> > like
> > >  > $2;
> > >  >
> > >  > and then
> > >  >
> > >  > execute mystatement with (1, 'Dima');
> > >  > execute mystatement with (2, 'John');
> > >  >
> > >  > etc....
> > >  >
> > >  > The JDBC driver would then send the 'prepare' command to the backend in
> > >  > Connection.prepareStatement (), and
> > >  > use the 'execute' in PreparedStatement.execute ();
> > >  >
> > >  > One potential problem with implementation I see here is that the query
> > >  > planner wants to know the argument types ahead of time...
> > >  > I guess, I could get around that by making all the arguments 'text',
> > and
> > >  > having them casted into the right types when the
> > >  > statement is actually executed.
> > >  >
> > >  > There is, probably a need to also have some kind of a 'close' command
> > to
> > >  > throw away the prepared query plans... or we could just
> > >  > make them last forever until, say, the end of transaction (or, perhaps,
> > >  > the end of the session?)...
> > >  >
> > >  > If there is anyone interested in discussing various possibilities, and
> > >  > getting this implemented one way or another,
> > >  >  I would like to hear from you!
> > >  >
> > >  > Thanks!
> > >  >
> > >  > Dima.
> > >  >
> > >  >
> > >  >
> > >  > ---------------------------(end of
> > broadcast)---------------------------
> > >  > TIP 2: you can get off all lists at once with the unregister command
> > >  >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > >  >
> > >
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> > >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
> >
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>

Re: [GENERAL] Prepared statement performance...

From

Dave Cramer

Date:

27 September 2002, 09:08:11

Ok, I agree with having a configuration parameter which sets the
default, that way anyone with the where withall to use this properly can
do so without changing their code.

AFAICS the parameter is statement specific,not connection specific.


Dave
On Fri, 2002-09-27 at 08:43, Aaron Mulder wrote:
> Dave, Barry, et al.:
>
> > Unless of course we were to write a
> > sql parser in the jdbc driver so that we could predetermine which
> > statements should use the prepare mechanism or not.
>
>     That's a bit of an exaggeration -- all you need to check in the
> JDBC driver is whether there's an unescaped ; to see if there's more than
> one statement, right?  It wouldn't be unreasonable to default to enabling
> the feature unless there's a ; in the statement (or unless there's a ;
> with anything but whitespace afterward, if you like) -- it wouldn't even
> be a disaster to forget escaping and just decide based on a ; in the
> statement period -- few enough statements have an escaped ; that the
> suboptimal performance in that case won't matter.
>     I'm a little concerned about how this feature interacts with
> connection pooling as currently implemented -- it probably needs to be
> reset to a known state every time a connection is returned to the pool.
> Which is OK for the PostgreSQL implementation (which you'll note is not
> supposed to be used in an app server environment), but impossible for all
> the app servers out there which won't take driver-specific steps when
> connections are returned to their generic pools.  Furthermore, with the
> default behavior set to "disabled", you're reducing performance (you
> quoted 60%?) in the application server environment, when there will never
> be multiple statements executed in the same call, but there will
> frequently be multiple identical statements executed on the same
> PreparedStatement, but there's no available way to reset the flag, short
> of writing a wrapper driver to stand between the app server and the real
> driver.  On the other hand, if you set the default to a global "enabled",
> it breaks DBVisualizer and other development tools which currently allow
> multiple statements by default.
>     IMHO, it would be better to have the feature enabled for queries
> where it's appropriate, and disabled where not, and let the programmer
> specify a global override where they absolutely must.  That way
> DB-specific actions are only required in truly exceptional cases, and both
> the app servers and the tools behave properly by default.
>     If you absolutely object, I think we should add a property to the
> Driver and the ConnectionPoolDataSource so that in an app server
> environment you can set the property to enabled at configuration time.
> But again, I would prefer smarter default behavior.
>
> Aaron
>
> On 27 Sep 2002, Dave Cramer wrote:
> > Peter,
> >
> > I have to agree with Barry here. Since it is possible to send two (or
> > more ) statements to the backend, the programmer needs some mechanism
> > for enabling/disabling this feature. Unless of course we were to write a
> > sql parser in the jdbc driver so that we could predetermine which
> > statements should use the prepare mechanism or not.
> >
> > I often use prepared statements instead of simple statements even when
> > only doing one select. The difference being that I can use setXXX to set
> > the parameters in the statment. Which is particularly important for
> > setDate, setTime, setTimestamp
> >
> > And you would be amazed at the unique (irregular) coding practices which
> > exist out there, so having the default do something sane is a GOOD
> > THING.
> >
> > Dave
> >
> >
> > On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
> > > Barry,
> > >
> > > It's great to have this functionality.
> > >
> > > However, I am surprised about the need/requirement to explicitely turning on
> > > this feature in the JDBC driver. As you explain, the reason for this
> > > behaviour is to support the use of multiple statements with the
> > > prepareStatement method. But this is supporting incorrect us of the JDBC
> > > interface, since the JDBC API says that the parameter of the
> > > prepareStatement method is one(!) SQL statetment. On the other hand, it is
> > > clear that the intended purpose of the prepareStatement method is to use
> > > prepared statment -- exclusively and not by default. For statements executed
> > > only once, the designers of the API made available the method
> > > "createStatment".
> > >
> > > So the buttom line is that you have an irregular behaviour in support of an
> > > irregular coding practice. I understand that there is "legacy" code that
> > > needs to use the prepareStatement method with multiple SQL statements. But
> > > since such practice was already incorrect at the time of its inception, the
> > > right way would be to allow a transition from bad to good by requiring the
> > > "sinners" to explicitly use some kind of
> > > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
> > > code, and let new coders use standard code.
> > >
> > > Of course, you're running the show, so you will do it as you deem right, but
> > > IMHO the implementation as you described it is clearly not reasonable.
> > >
> > > Cheers,
> > >
> > > Peter
> > >
> > > ----- Original Message -----
> > > From: "Barry Lind" <barry@xythos.com>
> > > To: "Dmitry Tkach" <dmitry@openratings.com>
> > > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
> > > Sent: Thursday, September 26, 2002 10:56 PM
> > > Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
> > >
> > >
> > > > Dimitry,
> > > >
> > > > I have some good news for you.  All of this functionality is in 7.3 of
> > > > the server and the current development build of the jdbc driver.  Of
> > > > course 7.3 isn't quite production yet, but beta 2 should be out this week.
> > > >
> > > > Your first option is what has been implemented.  In 7.3 the server now
> > > > supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
> > > > for passing bind variables).  The jdbc driver also now has a method on
> > > > org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
> > > > use of server side prepared statements (so you can cast your Statement
> > > > object to an org.postgresql.PGStatement object and enable the
> > > > functionality for a specific statement).
> > > >
> > > > This hasn't yet made its way into the jdbc documentation for 7.3, but
> > > > this will be documented before 7.3 goes production.
> > > >
> > > > This implementation does still require that you make some changes to
> > > > your app in order to enable it but that is by design for the following
> > > > reasons:
> > > >     1) In order to be useful you need to reuse a statement object
> > > > multiple times.  Most code generally will just create a statement and
> > > > use it only once, and since more work is being done to prepare the
> > > > statement that would have a negative performance impact for the majority
> > > > of statements being executed in normal code.
> > > >     2) Prepare functionality will not work for all types of sql that you
> > > > can execute through the jdbc driver.  For example you can quite easily
> > > > do the following:  conn.prepareStatement("update foo set a = ?; insert
> > > > into bar values (?,?,?);");
> > > > By issuing both sql statements in one call only one network roundtrip is
> > > > needed to execute both statements providing what is likely a more
> > > > performant application.  However this doesn't work with prepare because:
> > > > prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
> > > > is one prepared statement and one regular statement which will cause
> > > > very strange things to happen.
> > > >
> > > > It is for the reasons above that the developer needs to explicitly turn
> > > > on the use of server side prepared statements when it makes sense to do
> > > so.
> > > >
> > > > thanks,
> > > > --Barry
> > > >
> > > > PS.  I would really appreciate it if you could test this new
> > > > functionality.  As far as I know I am the only one who has used it.  But
> > > > for one of my frequently used and complex sql statements the time to
> > > > execute dropped by 60% using server side prepared statements.
> > > >
> > > >
> > > >
> > > > Dmitry Tkach wrote:
> > > >  > Hi, everybody.
> > > >  >
> > > >  > I am running into huge performance problems, due to JDBC not being able
> > > >  > to cache query plans :-(
> > > >  > My java program runs the (set of about 5) identical statements with
> > > >  > different parameters for about 30 million times...
> > > >  > What I am talking about below has to do with JDBC, but is not really
> > > >  > specific to it - it seems to me, that the backend itself
> > > >  > could be improved to better handle this kind of scenario, when the same
> > > >  > statement is executed many times in the same session.
> > > >  >
> > > >  > It takes about 3 weeks(!) to run, and it looks like about *half* of
> > > that
> > > >  > time is spent by the query planner,
> > > >  > creating and recreating the query plans every time I ran the damn
> > > >  > statement....
> > > >  >
> > > >  > I am looking into implementing some kind of a solution, that would let
> > > >  > me work around that problem...
> > > >  > So far, I only see two possibilities:
> > > >  >
> > > >  > - a general solution, that would involve extending postgres SQL gramma
> > > >  > to include a 'prepare' statement
> > > >  > - or am ugly work around, that would involve moving all my sql
> > > >  > statements into stored procedures, and have
> > > >  >   those cache the query plans inside...
> > > >  >
> > > >  > The second solution is not only ugly (because it requires the
> > > >  > application code to be changed and to have a specialized
> > > >  > stored procedure for every query), but also requires some additional
> > > >  > hacks (to overcome the hard limit on the number of
> > > >  > function arguments and the inability for functions to return tuples) -
> > > >  > the only way I imagine this can be made to work is
> > > >  > to glue all the arguments together into a text string, and have the
> > > >  > stored procedure parse it back, execute the query, then
> > > >  > glue the resulting tuple(s) into another text string, return it, and
> > > >  > have the application (or, perhaps, JDBC layer) parse it back
> > > >  > into columns...
> > > >  >
> > > >  > I was wonderring if anybody has any better ideas how this can be made
> > > to
> > > >  > work (I am looking for a solution that would
> > > >  > minimize changes to the existing JDBC applications that use
> > > >  > PreparedStatements)?
> > > >  >
> > > >  > If the maintainers of the involved code are interested, I would be
> > > >  > willing to implement and contribute the solution we come up with ...
> > > >  > (I figure, nobody would really be interested in getting that second
> > > >  > solution I mentioned into the mainstream :-), but, if we are
> > > >  > able to come up with something more general and less ugly, perhaps, I
> > > am
> > > >  > not the only one who would be able to contribute from
> > > >  > it)...
> > > >  >
> > > >  > For example, I believe, it should not be too complicated to implement
> > > >  > that first possibility I described above...
> > > >  > The way I see it would involve adding two statements to postgres SQL
> > > >  > syntax:
> > > >  >
> > > >  > prepare <name> as <sql statement>
> > > >  > and
> > > >  > execute <name> with (<parameter list>)
> > > >  >
> > > >  > For example:
> > > >  >
> > > >  > prepare mystatement as select * from mytable where id = $1 and name
> > > like
> > > >  > $2;
> > > >  >
> > > >  > and then
> > > >  >
> > > >  > execute mystatement with (1, 'Dima');
> > > >  > execute mystatement with (2, 'John');
> > > >  >
> > > >  > etc....
> > > >  >
> > > >  > The JDBC driver would then send the 'prepare' command to the backend in
> > > >  > Connection.prepareStatement (), and
> > > >  > use the 'execute' in PreparedStatement.execute ();
> > > >  >
> > > >  > One potential problem with implementation I see here is that the query
> > > >  > planner wants to know the argument types ahead of time...
> > > >  > I guess, I could get around that by making all the arguments 'text',
> > > and
> > > >  > having them casted into the right types when the
> > > >  > statement is actually executed.
> > > >  >
> > > >  > There is, probably a need to also have some kind of a 'close' command
> > > to
> > > >  > throw away the prepared query plans... or we could just
> > > >  > make them last forever until, say, the end of transaction (or, perhaps,
> > > >  > the end of the session?)...
> > > >  >
> > > >  > If there is anyone interested in discussing various possibilities, and
> > > >  > getting this implemented one way or another,
> > > >  >  I would like to hear from you!
> > > >  >
> > > >  > Thanks!
> > > >  >
> > > >  > Dima.
> > > >  >
> > > >  >
> > > >  >
> > > >  > ---------------------------(end of
> > > broadcast)---------------------------
> > > >  > TIP 2: you can get off all lists at once with the unregister command
> > > >  >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > > >  >
> > > >
> > > >
> > > >
> > > > ---------------------------(end of broadcast)---------------------------
> > > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> > > >
> > >
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> > >
> > >
> >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
> >
>
>

Re: [GENERAL] Prepared statement performance...

From

Bruce Momjian

Date:

27 September 2002, 12:42:10

Aaron Mulder wrote:
> Dave, Barry, et al.:
>
> > Unless of course we were to write a
> > sql parser in the jdbc driver so that we could predetermine which
> > statements should use the prepare mechanism or not.
>
>     That's a bit of an exaggeration -- all you need to check in the
> JDBC driver is whether there's an unescaped ; to see if there's more than
> one statement, right?  It wouldn't be unreasonable to default to enabling
> the feature unless there's a ; in the statement (or unless there's a ;
> with anything but whitespace afterward, if you like) -- it wouldn't even
> be a disaster to forget escaping and just decide based on a ; in the
> statement period -- few enough statements have an escaped ; that the
> suboptimal performance in that case won't matter.

psql does do this by tracking quotes and parens and looking for ';'.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 14:23:08

Cool an interesting thread on the jdbc list.

First off I dissagree with your reading of the spec that the statement
or prepared statement object are only intended for one sql statement.
If that were true why would you have Statement.getMoreResults()?  The
API clearly supports a single statement execute returning multiple
result sets which implies to me that the single execute can be used for
multiple individual sql statements chained together.

The uses of the Statement object and PreparedStatement object do not
IMHO have anything to do with you many times you use them.  They have to
do with how you supply values in the sql statement.  If you use just the
Statement object you are required to know the postgres date format in
order to include a date value, where it is much easier (and more
portable across databases) to just use a PreparedStatement, a ?, and a
setDate() call.

thanks,
--Barry

Peter Kovacs wrote:
 > Barry,
 >
 > It's great to have this functionality.
 >
 > However, I am surprised about the need/requirement to explicitely
turning on
 > this feature in the JDBC driver. As you explain, the reason for this
 > behaviour is to support the use of multiple statements with the
 > prepareStatement method. But this is supporting incorrect us of the JDBC
 > interface, since the JDBC API says that the parameter of the
 > prepareStatement method is one(!) SQL statetment. On the other hand,
it is
 > clear that the intended purpose of the prepareStatement method is to use
 > prepared statment -- exclusively and not by default. For statements
executed
 > only once, the designers of the API made available the method
 > "createStatment".
 >
 > So the buttom line is that you have an irregular behaviour in support
of an
 > irregular coding practice. I understand that there is "legacy" code that
 > needs to use the prepareStatement method with multiple SQL
statements. But
 > since such practice was already incorrect at the time of its
inception, the
 > right way would be to allow a transition from bad to good by
requiring the
 > "sinners" to explicitly use some kind of
 > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
 > code, and let new coders use standard code.
 >
 > Of course, you're running the show, so you will do it as you deem
right, but
 > IMHO the implementation as you described it is clearly not reasonable.
 >
 > Cheers,
 >
 > Peter
 >
 > ----- Original Message -----
 > From: "Barry Lind" <barry@xythos.com>
 > To: "Dmitry Tkach" <dmitry@openratings.com>
 > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
 > Sent: Thursday, September 26, 2002 10:56 PM
 > Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
 >
 >
 >
 >>Dimitry,
 >>
 >>I have some good news for you.  All of this functionality is in 7.3 of
 >>the server and the current development build of the jdbc driver.  Of
 >>course 7.3 isn't quite production yet, but beta 2 should be out this
week.
 >>
 >>Your first option is what has been implemented.  In 7.3 the server now
 >>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
 >>for passing bind variables).  The jdbc driver also now has a method on
 >>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
 >>use of server side prepared statements (so you can cast your Statement
 >>object to an org.postgresql.PGStatement object and enable the
 >>functionality for a specific statement).
 >>
 >>This hasn't yet made its way into the jdbc documentation for 7.3, but
 >>this will be documented before 7.3 goes production.
 >>
 >>This implementation does still require that you make some changes to
 >>your app in order to enable it but that is by design for the following
 >>reasons:
 >>    1) In order to be useful you need to reuse a statement object
 >>multiple times.  Most code generally will just create a statement and
 >>use it only once, and since more work is being done to prepare the
 >>statement that would have a negative performance impact for the majority
 >>of statements being executed in normal code.
 >>    2) Prepare functionality will not work for all types of sql that you
 >>can execute through the jdbc driver.  For example you can quite easily
 >>do the following:  conn.prepareStatement("update foo set a = ?; insert
 >>into bar values (?,?,?);");
 >>By issuing both sql statements in one call only one network roundtrip is
 >>needed to execute both statements providing what is likely a more
 >>performant application.  However this doesn't work with prepare because:
 >>prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
 >>is one prepared statement and one regular statement which will cause
 >>very strange things to happen.
 >>
 >>It is for the reasons above that the developer needs to explicitly turn
 >>on the use of server side prepared statements when it makes sense to do
 >
 > so.
 >
 >>thanks,
 >>--Barry
 >>
 >>PS.  I would really appreciate it if you could test this new
 >>functionality.  As far as I know I am the only one who has used it.  But
 >>for one of my frequently used and complex sql statements the time to
 >>execute dropped by 60% using server side prepared statements.
 >>
 >>
 >>
 >>Dmitry Tkach wrote:
 >> > Hi, everybody.
 >> >
 >> > I am running into huge performance problems, due to JDBC not being
able
 >> > to cache query plans :-(
 >> > My java program runs the (set of about 5) identical statements with
 >> > different parameters for about 30 million times...
 >> > What I am talking about below has to do with JDBC, but is not really
 >> > specific to it - it seems to me, that the backend itself
 >> > could be improved to better handle this kind of scenario, when the
same
 >> > statement is executed many times in the same session.
 >> >
 >> > It takes about 3 weeks(!) to run, and it looks like about *half* of
 >
 > that
 >
 >> > time is spent by the query planner,
 >> > creating and recreating the query plans every time I ran the damn
 >> > statement....
 >> >
 >> > I am looking into implementing some kind of a solution, that would let
 >> > me work around that problem...
 >> > So far, I only see two possibilities:
 >> >
 >> > - a general solution, that would involve extending postgres SQL gramma
 >> > to include a 'prepare' statement
 >> > - or am ugly work around, that would involve moving all my sql
 >> > statements into stored procedures, and have
 >> >   those cache the query plans inside...
 >> >
 >> > The second solution is not only ugly (because it requires the
 >> > application code to be changed and to have a specialized
 >> > stored procedure for every query), but also requires some additional
 >> > hacks (to overcome the hard limit on the number of
 >> > function arguments and the inability for functions to return tuples) -
 >> > the only way I imagine this can be made to work is
 >> > to glue all the arguments together into a text string, and have the
 >> > stored procedure parse it back, execute the query, then
 >> > glue the resulting tuple(s) into another text string, return it, and
 >> > have the application (or, perhaps, JDBC layer) parse it back
 >> > into columns...
 >> >
 >> > I was wonderring if anybody has any better ideas how this can be made
 >
 > to
 >
 >> > work (I am looking for a solution that would
 >> > minimize changes to the existing JDBC applications that use
 >> > PreparedStatements)?
 >> >
 >> > If the maintainers of the involved code are interested, I would be
 >> > willing to implement and contribute the solution we come up with ...
 >> > (I figure, nobody would really be interested in getting that second
 >> > solution I mentioned into the mainstream :-), but, if we are
 >> > able to come up with something more general and less ugly, perhaps, I
 >
 > am
 >
 >> > not the only one who would be able to contribute from
 >> > it)...
 >> >
 >> > For example, I believe, it should not be too complicated to implement
 >> > that first possibility I described above...
 >> > The way I see it would involve adding two statements to postgres SQL
 >> > syntax:
 >> >
 >> > prepare <name> as <sql statement>
 >> > and
 >> > execute <name> with (<parameter list>)
 >> >
 >> > For example:
 >> >
 >> > prepare mystatement as select * from mytable where id = $1 and name
 >
 > like
 >
 >> > $2;
 >> >
 >> > and then
 >> >
 >> > execute mystatement with (1, 'Dima');
 >> > execute mystatement with (2, 'John');
 >> >
 >> > etc....
 >> >
 >> > The JDBC driver would then send the 'prepare' command to the
backend in
 >> > Connection.prepareStatement (), and
 >> > use the 'execute' in PreparedStatement.execute ();
 >> >
 >> > One potential problem with implementation I see here is that the query
 >> > planner wants to know the argument types ahead of time...
 >> > I guess, I could get around that by making all the arguments 'text',
 >
 > and
 >
 >> > having them casted into the right types when the
 >> > statement is actually executed.
 >> >
 >> > There is, probably a need to also have some kind of a 'close' command
 >
 > to
 >
 >> > throw away the prepared query plans... or we could just
 >> > make them last forever until, say, the end of transaction (or,
perhaps,
 >> > the end of the session?)...
 >> >
 >> > If there is anyone interested in discussing various possibilities, and
 >> > getting this implemented one way or another,
 >> >  I would like to hear from you!
 >> >
 >> > Thanks!
 >> >
 >> > Dima.
 >> >
 >> >
 >> >
 >> > ---------------------------(end of
 >
 > broadcast)---------------------------
 >
 >> > TIP 2: you can get off all lists at once with the unregister command
 >> >    (send "unregister YourEmailAddressHere" to
majordomo@postgresql.org)
 >> >
 >>
 >>
 >>
 >>---------------------------(end of broadcast)---------------------------
 >>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 >>
 >
 >
 >
 >

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 14:43:01

Aaron,

It is a bit more complex than just looking for a ; since a ; is a valid
character in a quoted string.  But as Bruce has mentioned in a followup,
psql does it so it is doable.

I think the real question here is when does it make sense to use server
side prepared statements.  In the little bit of testing I have done, I
would say the answer is rarely.  You need many factors to come into
place for it to make sense to use server side prepared statements:

1) The statement needs to be big and complex such that there is
significant overhead in the parsing and planning stages of execution.
The query that I test with is about 3K in size and joins about 10
different tables and includes a union.  In this case there is a
significant overhead involved in both parsing and planning.   However
for a simple query that affects only one or two tables there is little
overhead.

2) The statement object needs to be reused multiple times.  Using a
server prepared statement requires at a minimum three sql statements to
be executed to do the work of one original statement:
select foo from bar;
becomes
prepare <name> as select foo from bar;
execute <name>;
deallocate <name>;

We can do the first two together in one roundtrip to the server, but the
last one requires a separate roundtrip.  So if you are only using a the
statement object/query only once then using server side prepared
statements will make performance worse than not.  The vast majority of
statement objects are created, executed once and then closed.  I don't
think it makes sence to turn on a feature that will make the overall
performance for most users worse.  There is very little jdbc code that I
have seen that creates a statement, call execute multiple times on that
same statement object and then finally closes it.

This is the first version of the feature.  Improvements will come with
implementation feedback and I welcome all feedback.  But there are
reasons it is implemented the way it is.  In the long term what I would
really like to see is this functionality done at the BE/FE protocol
level instead of at the sql level as I have seen other databases do.

Someone has suggestted a jdbc url arguement to enable the functionality
by default and that is probably a good idea if we learn from real use
that it makes sense to have all statements use this.  But I am not
convinced (given my reasons above) that this really makes sense.

thanks,
--Barry

Aaron Mulder wrote:
> Dave, Barry, et al.:
>
>
>>Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>
>
>     That's a bit of an exaggeration -- all you need to check in the
> JDBC driver is whether there's an unescaped ; to see if there's more than
> one statement, right?  It wouldn't be unreasonable to default to enabling
> the feature unless there's a ; in the statement (or unless there's a ;
> with anything but whitespace afterward, if you like) -- it wouldn't even
> be a disaster to forget escaping and just decide based on a ; in the
> statement period -- few enough statements have an escaped ; that the
> suboptimal performance in that case won't matter.
>     I'm a little concerned about how this feature interacts with
> connection pooling as currently implemented -- it probably needs to be
> reset to a known state every time a connection is returned to the pool.
> Which is OK for the PostgreSQL implementation (which you'll note is not
> supposed to be used in an app server environment), but impossible for all
> the app servers out there which won't take driver-specific steps when
> connections are returned to their generic pools.  Furthermore, with the
> default behavior set to "disabled", you're reducing performance (you
> quoted 60%?) in the application server environment, when there will never
> be multiple statements executed in the same call, but there will
> frequently be multiple identical statements executed on the same
> PreparedStatement, but there's no available way to reset the flag, short
> of writing a wrapper driver to stand between the app server and the real
> driver.  On the other hand, if you set the default to a global "enabled",
> it breaks DBVisualizer and other development tools which currently allow
> multiple statements by default.
>     IMHO, it would be better to have the feature enabled for queries
> where it's appropriate, and disabled where not, and let the programmer
> specify a global override where they absolutely must.  That way
> DB-specific actions are only required in truly exceptional cases, and both
> the app servers and the tools behave properly by default.
>     If you absolutely object, I think we should add a property to the
> Driver and the ConnectionPoolDataSource so that in an app server
> environment you can set the property to enabled at configuration time.
> But again, I would prefer smarter default behavior.
>
> Aaron
>
> On 27 Sep 2002, Dave Cramer wrote:
>
>>Peter,
>>
>>I have to agree with Barry here. Since it is possible to send two (or
>>more ) statements to the backend, the programmer needs some mechanism
>>for enabling/disabling this feature. Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>>
>>I often use prepared statements instead of simple statements even when
>>only doing one select. The difference being that I can use setXXX to set
>>the parameters in the statment. Which is particularly important for
>>setDate, setTime, setTimestamp
>>
>>And you would be amazed at the unique (irregular) coding practices which
>>exist out there, so having the default do something sane is a GOOD
>>THING.
>>
>>Dave
>>
>>
>>On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
>>
>>>Barry,
>>>
>>>It's great to have this functionality.
>>>
>>>However, I am surprised about the need/requirement to explicitely turning on
>>>this feature in the JDBC driver. As you explain, the reason for this
>>>behaviour is to support the use of multiple statements with the
>>>prepareStatement method. But this is supporting incorrect us of the JDBC
>>>interface, since the JDBC API says that the parameter of the
>>>prepareStatement method is one(!) SQL statetment. On the other hand, it is
>>>clear that the intended purpose of the prepareStatement method is to use
>>>prepared statment -- exclusively and not by default. For statements executed
>>>only once, the designers of the API made available the method
>>>"createStatment".
>>>
>>>So the buttom line is that you have an irregular behaviour in support of an
>>>irregular coding practice. I understand that there is "legacy" code that
>>>needs to use the prepareStatement method with multiple SQL statements. But
>>>since such practice was already incorrect at the time of its inception, the
>>>right way would be to allow a transition from bad to good by requiring the
>>>"sinners" to explicitly use some kind of
>>>"org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
>>>code, and let new coders use standard code.
>>>
>>>Of course, you're running the show, so you will do it as you deem right, but
>>>IMHO the implementation as you described it is clearly not reasonable.
>>>
>>>Cheers,
>>>
>>>Peter
>>>
>>>----- Original Message -----
>>>From: "Barry Lind" <barry@xythos.com>
>>>To: "Dmitry Tkach" <dmitry@openratings.com>
>>>Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
>>>Sent: Thursday, September 26, 2002 10:56 PM
>>>Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>>>
>>>
>>>
>>>>Dimitry,
>>>>
>>>>I have some good news for you.  All of this functionality is in 7.3 of
>>>>the server and the current development build of the jdbc driver.  Of
>>>>course 7.3 isn't quite production yet, but beta 2 should be out this week.
>>>>
>>>>Your first option is what has been implemented.  In 7.3 the server now
>>>>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>>>>for passing bind variables).  The jdbc driver also now has a method on
>>>>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
>>>>use of server side prepared statements (so you can cast your Statement
>>>>object to an org.postgresql.PGStatement object and enable the
>>>>functionality for a specific statement).
>>>>
>>>>This hasn't yet made its way into the jdbc documentation for 7.3, but
>>>>this will be documented before 7.3 goes production.
>>>>
>>>>This implementation does still require that you make some changes to
>>>>your app in order to enable it but that is by design for the following
>>>>reasons:
>>>>    1) In order to be useful you need to reuse a statement object
>>>>multiple times.  Most code generally will just create a statement and
>>>>use it only once, and since more work is being done to prepare the
>>>>statement that would have a negative performance impact for the majority
>>>>of statements being executed in normal code.
>>>>    2) Prepare functionality will not work for all types of sql that you
>>>>can execute through the jdbc driver.  For example you can quite easily
>>>>do the following:  conn.prepareStatement("update foo set a = ?; insert
>>>>into bar values (?,?,?);");
>>>>By issuing both sql statements in one call only one network roundtrip is
>>>>needed to execute both statements providing what is likely a more
>>>>performant application.  However this doesn't work with prepare because:
>>>>prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
>>>>is one prepared statement and one regular statement which will cause
>>>>very strange things to happen.
>>>>
>>>>It is for the reasons above that the developer needs to explicitly turn
>>>>on the use of server side prepared statements when it makes sense to do
>>>
>>>so.
>>>
>>>>thanks,
>>>>--Barry
>>>>
>>>>PS.  I would really appreciate it if you could test this new
>>>>functionality.  As far as I know I am the only one who has used it.  But
>>>>for one of my frequently used and complex sql statements the time to
>>>>execute dropped by 60% using server side prepared statements.
>>>>
>>>>
>>>>
>>>>Dmitry Tkach wrote:
>>>> > Hi, everybody.
>>>> >
>>>> > I am running into huge performance problems, due to JDBC not being able
>>>> > to cache query plans :-(
>>>> > My java program runs the (set of about 5) identical statements with
>>>> > different parameters for about 30 million times...
>>>> > What I am talking about below has to do with JDBC, but is not really
>>>> > specific to it - it seems to me, that the backend itself
>>>> > could be improved to better handle this kind of scenario, when the same
>>>> > statement is executed many times in the same session.
>>>> >
>>>> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>>>
>>>that
>>>
>>>> > time is spent by the query planner,
>>>> > creating and recreating the query plans every time I ran the damn
>>>> > statement....
>>>> >
>>>> > I am looking into implementing some kind of a solution, that would let
>>>> > me work around that problem...
>>>> > So far, I only see two possibilities:
>>>> >
>>>> > - a general solution, that would involve extending postgres SQL gramma
>>>> > to include a 'prepare' statement
>>>> > - or am ugly work around, that would involve moving all my sql
>>>> > statements into stored procedures, and have
>>>> >   those cache the query plans inside...
>>>> >
>>>> > The second solution is not only ugly (because it requires the
>>>> > application code to be changed and to have a specialized
>>>> > stored procedure for every query), but also requires some additional
>>>> > hacks (to overcome the hard limit on the number of
>>>> > function arguments and the inability for functions to return tuples) -
>>>> > the only way I imagine this can be made to work is
>>>> > to glue all the arguments together into a text string, and have the
>>>> > stored procedure parse it back, execute the query, then
>>>> > glue the resulting tuple(s) into another text string, return it, and
>>>> > have the application (or, perhaps, JDBC layer) parse it back
>>>> > into columns...
>>>> >
>>>> > I was wonderring if anybody has any better ideas how this can be made
>>>
>>>to
>>>
>>>> > work (I am looking for a solution that would
>>>> > minimize changes to the existing JDBC applications that use
>>>> > PreparedStatements)?
>>>> >
>>>> > If the maintainers of the involved code are interested, I would be
>>>> > willing to implement and contribute the solution we come up with ...
>>>> > (I figure, nobody would really be interested in getting that second
>>>> > solution I mentioned into the mainstream :-), but, if we are
>>>> > able to come up with something more general and less ugly, perhaps, I
>>>
>>>am
>>>
>>>> > not the only one who would be able to contribute from
>>>> > it)...
>>>> >
>>>> > For example, I believe, it should not be too complicated to implement
>>>> > that first possibility I described above...
>>>> > The way I see it would involve adding two statements to postgres SQL
>>>> > syntax:
>>>> >
>>>> > prepare <name> as <sql statement>
>>>> > and
>>>> > execute <name> with (<parameter list>)
>>>> >
>>>> > For example:
>>>> >
>>>> > prepare mystatement as select * from mytable where id = $1 and name
>>>
>>>like
>>>
>>>> > $2;
>>>> >
>>>> > and then
>>>> >
>>>> > execute mystatement with (1, 'Dima');
>>>> > execute mystatement with (2, 'John');
>>>> >
>>>> > etc....
>>>> >
>>>> > The JDBC driver would then send the 'prepare' command to the backend in
>>>> > Connection.prepareStatement (), and
>>>> > use the 'execute' in PreparedStatement.execute ();
>>>> >
>>>> > One potential problem with implementation I see here is that the query
>>>> > planner wants to know the argument types ahead of time...
>>>> > I guess, I could get around that by making all the arguments 'text',
>>>
>>>and
>>>
>>>> > having them casted into the right types when the
>>>> > statement is actually executed.
>>>> >
>>>> > There is, probably a need to also have some kind of a 'close' command
>>>
>>>to
>>>
>>>> > throw away the prepared query plans... or we could just
>>>> > make them last forever until, say, the end of transaction (or, perhaps,
>>>> > the end of the session?)...
>>>> >
>>>> > If there is anyone interested in discussing various possibilities, and
>>>> > getting this implemented one way or another,
>>>> >  I would like to hear from you!
>>>> >
>>>> > Thanks!
>>>> >
>>>> > Dima.
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------(end of
>>>
>>>broadcast)---------------------------
>>>
>>>> > TIP 2: you can get off all lists at once with the unregister command
>>>> >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>>> >
>>>>
>>>>
>>>>
>>>>---------------------------(end of broadcast)---------------------------
>>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>>
>>>
>>>
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>
>>>
>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>http://www.postgresql.org/users-lounge/docs/faq.html
>>
>
>
>

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 14:49:22

Aaron,

Currently there is no issue with connection pools since the setting is
at the statement level and as far as I know no connection pool keeps
open statements across checkin/out of a connection.

If however this does become a connection level default setting then your
issues do come into play.

thanks,
--Barry



Aaron Mulder wrote:
> Dave, Barry, et al.:
>
>
>>Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>
>
>     That's a bit of an exaggeration -- all you need to check in the
> JDBC driver is whether there's an unescaped ; to see if there's more than
> one statement, right?  It wouldn't be unreasonable to default to enabling
> the feature unless there's a ; in the statement (or unless there's a ;
> with anything but whitespace afterward, if you like) -- it wouldn't even
> be a disaster to forget escaping and just decide based on a ; in the
> statement period -- few enough statements have an escaped ; that the
> suboptimal performance in that case won't matter.
>     I'm a little concerned about how this feature interacts with
> connection pooling as currently implemented -- it probably needs to be
> reset to a known state every time a connection is returned to the pool.
> Which is OK for the PostgreSQL implementation (which you'll note is not
> supposed to be used in an app server environment), but impossible for all
> the app servers out there which won't take driver-specific steps when
> connections are returned to their generic pools.  Furthermore, with the
> default behavior set to "disabled", you're reducing performance (you
> quoted 60%?) in the application server environment, when there will never
> be multiple statements executed in the same call, but there will
> frequently be multiple identical statements executed on the same
> PreparedStatement, but there's no available way to reset the flag, short
> of writing a wrapper driver to stand between the app server and the real
> driver.  On the other hand, if you set the default to a global "enabled",
> it breaks DBVisualizer and other development tools which currently allow
> multiple statements by default.
>     IMHO, it would be better to have the feature enabled for queries
> where it's appropriate, and disabled where not, and let the programmer
> specify a global override where they absolutely must.  That way
> DB-specific actions are only required in truly exceptional cases, and both
> the app servers and the tools behave properly by default.
>     If you absolutely object, I think we should add a property to the
> Driver and the ConnectionPoolDataSource so that in an app server
> environment you can set the property to enabled at configuration time.
> But again, I would prefer smarter default behavior.
>
> Aaron
>
> On 27 Sep 2002, Dave Cramer wrote:
>
>>Peter,
>>
>>I have to agree with Barry here. Since it is possible to send two (or
>>more ) statements to the backend, the programmer needs some mechanism
>>for enabling/disabling this feature. Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>>
>>I often use prepared statements instead of simple statements even when
>>only doing one select. The difference being that I can use setXXX to set
>>the parameters in the statment. Which is particularly important for
>>setDate, setTime, setTimestamp
>>
>>And you would be amazed at the unique (irregular) coding practices which
>>exist out there, so having the default do something sane is a GOOD
>>THING.
>>
>>Dave
>>
>>
>>On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
>>
>>>Barry,
>>>
>>>It's great to have this functionality.
>>>
>>>However, I am surprised about the need/requirement to explicitely turning on
>>>this feature in the JDBC driver. As you explain, the reason for this
>>>behaviour is to support the use of multiple statements with the
>>>prepareStatement method. But this is supporting incorrect us of the JDBC
>>>interface, since the JDBC API says that the parameter of the
>>>prepareStatement method is one(!) SQL statetment. On the other hand, it is
>>>clear that the intended purpose of the prepareStatement method is to use
>>>prepared statment -- exclusively and not by default. For statements executed
>>>only once, the designers of the API made available the method
>>>"createStatment".
>>>
>>>So the buttom line is that you have an irregular behaviour in support of an
>>>irregular coding practice. I understand that there is "legacy" code that
>>>needs to use the prepareStatement method with multiple SQL statements. But
>>>since such practice was already incorrect at the time of its inception, the
>>>right way would be to allow a transition from bad to good by requiring the
>>>"sinners" to explicitly use some kind of
>>>"org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
>>>code, and let new coders use standard code.
>>>
>>>Of course, you're running the show, so you will do it as you deem right, but
>>>IMHO the implementation as you described it is clearly not reasonable.
>>>
>>>Cheers,
>>>
>>>Peter
>>>
>>>----- Original Message -----
>>>From: "Barry Lind" <barry@xythos.com>
>>>To: "Dmitry Tkach" <dmitry@openratings.com>
>>>Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
>>>Sent: Thursday, September 26, 2002 10:56 PM
>>>Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>>>
>>>
>>>
>>>>Dimitry,
>>>>
>>>>I have some good news for you.  All of this functionality is in 7.3 of
>>>>the server and the current development build of the jdbc driver.  Of
>>>>course 7.3 isn't quite production yet, but beta 2 should be out this week.
>>>>
>>>>Your first option is what has been implemented.  In 7.3 the server now
>>>>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>>>>for passing bind variables).  The jdbc driver also now has a method on
>>>>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
>>>>use of server side prepared statements (so you can cast your Statement
>>>>object to an org.postgresql.PGStatement object and enable the
>>>>functionality for a specific statement).
>>>>
>>>>This hasn't yet made its way into the jdbc documentation for 7.3, but
>>>>this will be documented before 7.3 goes production.
>>>>
>>>>This implementation does still require that you make some changes to
>>>>your app in order to enable it but that is by design for the following
>>>>reasons:
>>>>    1) In order to be useful you need to reuse a statement object
>>>>multiple times.  Most code generally will just create a statement and
>>>>use it only once, and since more work is being done to prepare the
>>>>statement that would have a negative performance impact for the majority
>>>>of statements being executed in normal code.
>>>>    2) Prepare functionality will not work for all types of sql that you
>>>>can execute through the jdbc driver.  For example you can quite easily
>>>>do the following:  conn.prepareStatement("update foo set a = ?; insert
>>>>into bar values (?,?,?);");
>>>>By issuing both sql statements in one call only one network roundtrip is
>>>>needed to execute both statements providing what is likely a more
>>>>performant application.  However this doesn't work with prepare because:
>>>>prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
>>>>is one prepared statement and one regular statement which will cause
>>>>very strange things to happen.
>>>>
>>>>It is for the reasons above that the developer needs to explicitly turn
>>>>on the use of server side prepared statements when it makes sense to do
>>>
>>>so.
>>>
>>>>thanks,
>>>>--Barry
>>>>
>>>>PS.  I would really appreciate it if you could test this new
>>>>functionality.  As far as I know I am the only one who has used it.  But
>>>>for one of my frequently used and complex sql statements the time to
>>>>execute dropped by 60% using server side prepared statements.
>>>>
>>>>
>>>>
>>>>Dmitry Tkach wrote:
>>>> > Hi, everybody.
>>>> >
>>>> > I am running into huge performance problems, due to JDBC not being able
>>>> > to cache query plans :-(
>>>> > My java program runs the (set of about 5) identical statements with
>>>> > different parameters for about 30 million times...
>>>> > What I am talking about below has to do with JDBC, but is not really
>>>> > specific to it - it seems to me, that the backend itself
>>>> > could be improved to better handle this kind of scenario, when the same
>>>> > statement is executed many times in the same session.
>>>> >
>>>> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>>>
>>>that
>>>
>>>> > time is spent by the query planner,
>>>> > creating and recreating the query plans every time I ran the damn
>>>> > statement....
>>>> >
>>>> > I am looking into implementing some kind of a solution, that would let
>>>> > me work around that problem...
>>>> > So far, I only see two possibilities:
>>>> >
>>>> > - a general solution, that would involve extending postgres SQL gramma
>>>> > to include a 'prepare' statement
>>>> > - or am ugly work around, that would involve moving all my sql
>>>> > statements into stored procedures, and have
>>>> >   those cache the query plans inside...
>>>> >
>>>> > The second solution is not only ugly (because it requires the
>>>> > application code to be changed and to have a specialized
>>>> > stored procedure for every query), but also requires some additional
>>>> > hacks (to overcome the hard limit on the number of
>>>> > function arguments and the inability for functions to return tuples) -
>>>> > the only way I imagine this can be made to work is
>>>> > to glue all the arguments together into a text string, and have the
>>>> > stored procedure parse it back, execute the query, then
>>>> > glue the resulting tuple(s) into another text string, return it, and
>>>> > have the application (or, perhaps, JDBC layer) parse it back
>>>> > into columns...
>>>> >
>>>> > I was wonderring if anybody has any better ideas how this can be made
>>>
>>>to
>>>
>>>> > work (I am looking for a solution that would
>>>> > minimize changes to the existing JDBC applications that use
>>>> > PreparedStatements)?
>>>> >
>>>> > If the maintainers of the involved code are interested, I would be
>>>> > willing to implement and contribute the solution we come up with ...
>>>> > (I figure, nobody would really be interested in getting that second
>>>> > solution I mentioned into the mainstream :-), but, if we are
>>>> > able to come up with something more general and less ugly, perhaps, I
>>>
>>>am
>>>
>>>> > not the only one who would be able to contribute from
>>>> > it)...
>>>> >
>>>> > For example, I believe, it should not be too complicated to implement
>>>> > that first possibility I described above...
>>>> > The way I see it would involve adding two statements to postgres SQL
>>>> > syntax:
>>>> >
>>>> > prepare <name> as <sql statement>
>>>> > and
>>>> > execute <name> with (<parameter list>)
>>>> >
>>>> > For example:
>>>> >
>>>> > prepare mystatement as select * from mytable where id = $1 and name
>>>
>>>like
>>>
>>>> > $2;
>>>> >
>>>> > and then
>>>> >
>>>> > execute mystatement with (1, 'Dima');
>>>> > execute mystatement with (2, 'John');
>>>> >
>>>> > etc....
>>>> >
>>>> > The JDBC driver would then send the 'prepare' command to the backend in
>>>> > Connection.prepareStatement (), and
>>>> > use the 'execute' in PreparedStatement.execute ();
>>>> >
>>>> > One potential problem with implementation I see here is that the query
>>>> > planner wants to know the argument types ahead of time...
>>>> > I guess, I could get around that by making all the arguments 'text',
>>>
>>>and
>>>
>>>> > having them casted into the right types when the
>>>> > statement is actually executed.
>>>> >
>>>> > There is, probably a need to also have some kind of a 'close' command
>>>
>>>to
>>>
>>>> > throw away the prepared query plans... or we could just
>>>> > make them last forever until, say, the end of transaction (or, perhaps,
>>>> > the end of the session?)...
>>>> >
>>>> > If there is anyone interested in discussing various possibilities, and
>>>> > getting this implemented one way or another,
>>>> >  I would like to hear from you!
>>>> >
>>>> > Thanks!
>>>> >
>>>> > Dima.
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------(end of
>>>
>>>broadcast)---------------------------
>>>
>>>> > TIP 2: you can get off all lists at once with the unregister command
>>>> >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>>> >
>>>>
>>>>
>>>>
>>>>---------------------------(end of broadcast)---------------------------
>>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>>
>>>
>>>
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>
>>>
>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>http://www.postgresql.org/users-lounge/docs/faq.html
>>
>
>
>

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 14:51:15

Aaron,

You quote my 60% number.  To clarify that was on a 3K sized statment
involving a union and joins to about 10 tables.  With more regular sql
statements it reduces performance.

thanks,
--Barry


Aaron Mulder wrote:
> Dave, Barry, et al.:
>
>
>>Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>
>
>     That's a bit of an exaggeration -- all you need to check in the
> JDBC driver is whether there's an unescaped ; to see if there's more than
> one statement, right?  It wouldn't be unreasonable to default to enabling
> the feature unless there's a ; in the statement (or unless there's a ;
> with anything but whitespace afterward, if you like) -- it wouldn't even
> be a disaster to forget escaping and just decide based on a ; in the
> statement period -- few enough statements have an escaped ; that the
> suboptimal performance in that case won't matter.
>     I'm a little concerned about how this feature interacts with
> connection pooling as currently implemented -- it probably needs to be
> reset to a known state every time a connection is returned to the pool.
> Which is OK for the PostgreSQL implementation (which you'll note is not
> supposed to be used in an app server environment), but impossible for all
> the app servers out there which won't take driver-specific steps when
> connections are returned to their generic pools.  Furthermore, with the
> default behavior set to "disabled", you're reducing performance (you
> quoted 60%?) in the application server environment, when there will never
> be multiple statements executed in the same call, but there will
> frequently be multiple identical statements executed on the same
> PreparedStatement, but there's no available way to reset the flag, short
> of writing a wrapper driver to stand between the app server and the real
> driver.  On the other hand, if you set the default to a global "enabled",
> it breaks DBVisualizer and other development tools which currently allow
> multiple statements by default.
>     IMHO, it would be better to have the feature enabled for queries
> where it's appropriate, and disabled where not, and let the programmer
> specify a global override where they absolutely must.  That way
> DB-specific actions are only required in truly exceptional cases, and both
> the app servers and the tools behave properly by default.
>     If you absolutely object, I think we should add a property to the
> Driver and the ConnectionPoolDataSource so that in an app server
> environment you can set the property to enabled at configuration time.
> But again, I would prefer smarter default behavior.
>
> Aaron
>
> On 27 Sep 2002, Dave Cramer wrote:
>
>>Peter,
>>
>>I have to agree with Barry here. Since it is possible to send two (or
>>more ) statements to the backend, the programmer needs some mechanism
>>for enabling/disabling this feature. Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>>
>>I often use prepared statements instead of simple statements even when
>>only doing one select. The difference being that I can use setXXX to set
>>the parameters in the statment. Which is particularly important for
>>setDate, setTime, setTimestamp
>>
>>And you would be amazed at the unique (irregular) coding practices which
>>exist out there, so having the default do something sane is a GOOD
>>THING.
>>
>>Dave
>>
>>
>>On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
>>
>>>Barry,
>>>
>>>It's great to have this functionality.
>>>
>>>However, I am surprised about the need/requirement to explicitely turning on
>>>this feature in the JDBC driver. As you explain, the reason for this
>>>behaviour is to support the use of multiple statements with the
>>>prepareStatement method. But this is supporting incorrect us of the JDBC
>>>interface, since the JDBC API says that the parameter of the
>>>prepareStatement method is one(!) SQL statetment. On the other hand, it is
>>>clear that the intended purpose of the prepareStatement method is to use
>>>prepared statment -- exclusively and not by default. For statements executed
>>>only once, the designers of the API made available the method
>>>"createStatment".
>>>
>>>So the buttom line is that you have an irregular behaviour in support of an
>>>irregular coding practice. I understand that there is "legacy" code that
>>>needs to use the prepareStatement method with multiple SQL statements. But
>>>since such practice was already incorrect at the time of its inception, the
>>>right way would be to allow a transition from bad to good by requiring the
>>>"sinners" to explicitly use some kind of
>>>"org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
>>>code, and let new coders use standard code.
>>>
>>>Of course, you're running the show, so you will do it as you deem right, but
>>>IMHO the implementation as you described it is clearly not reasonable.
>>>
>>>Cheers,
>>>
>>>Peter
>>>
>>>----- Original Message -----
>>>From: "Barry Lind" <barry@xythos.com>
>>>To: "Dmitry Tkach" <dmitry@openratings.com>
>>>Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
>>>Sent: Thursday, September 26, 2002 10:56 PM
>>>Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>>>
>>>
>>>
>>>>Dimitry,
>>>>
>>>>I have some good news for you.  All of this functionality is in 7.3 of
>>>>the server and the current development build of the jdbc driver.  Of
>>>>course 7.3 isn't quite production yet, but beta 2 should be out this week.
>>>>
>>>>Your first option is what has been implemented.  In 7.3 the server now
>>>>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>>>>for passing bind variables).  The jdbc driver also now has a method on
>>>>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
>>>>use of server side prepared statements (so you can cast your Statement
>>>>object to an org.postgresql.PGStatement object and enable the
>>>>functionality for a specific statement).
>>>>
>>>>This hasn't yet made its way into the jdbc documentation for 7.3, but
>>>>this will be documented before 7.3 goes production.
>>>>
>>>>This implementation does still require that you make some changes to
>>>>your app in order to enable it but that is by design for the following
>>>>reasons:
>>>>    1) In order to be useful you need to reuse a statement object
>>>>multiple times.  Most code generally will just create a statement and
>>>>use it only once, and since more work is being done to prepare the
>>>>statement that would have a negative performance impact for the majority
>>>>of statements being executed in normal code.
>>>>    2) Prepare functionality will not work for all types of sql that you
>>>>can execute through the jdbc driver.  For example you can quite easily
>>>>do the following:  conn.prepareStatement("update foo set a = ?; insert
>>>>into bar values (?,?,?);");
>>>>By issuing both sql statements in one call only one network roundtrip is
>>>>needed to execute both statements providing what is likely a more
>>>>performant application.  However this doesn't work with prepare because:
>>>>prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
>>>>is one prepared statement and one regular statement which will cause
>>>>very strange things to happen.
>>>>
>>>>It is for the reasons above that the developer needs to explicitly turn
>>>>on the use of server side prepared statements when it makes sense to do
>>>
>>>so.
>>>
>>>>thanks,
>>>>--Barry
>>>>
>>>>PS.  I would really appreciate it if you could test this new
>>>>functionality.  As far as I know I am the only one who has used it.  But
>>>>for one of my frequently used and complex sql statements the time to
>>>>execute dropped by 60% using server side prepared statements.
>>>>
>>>>
>>>>
>>>>Dmitry Tkach wrote:
>>>> > Hi, everybody.
>>>> >
>>>> > I am running into huge performance problems, due to JDBC not being able
>>>> > to cache query plans :-(
>>>> > My java program runs the (set of about 5) identical statements with
>>>> > different parameters for about 30 million times...
>>>> > What I am talking about below has to do with JDBC, but is not really
>>>> > specific to it - it seems to me, that the backend itself
>>>> > could be improved to better handle this kind of scenario, when the same
>>>> > statement is executed many times in the same session.
>>>> >
>>>> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>>>
>>>that
>>>
>>>> > time is spent by the query planner,
>>>> > creating and recreating the query plans every time I ran the damn
>>>> > statement....
>>>> >
>>>> > I am looking into implementing some kind of a solution, that would let
>>>> > me work around that problem...
>>>> > So far, I only see two possibilities:
>>>> >
>>>> > - a general solution, that would involve extending postgres SQL gramma
>>>> > to include a 'prepare' statement
>>>> > - or am ugly work around, that would involve moving all my sql
>>>> > statements into stored procedures, and have
>>>> >   those cache the query plans inside...
>>>> >
>>>> > The second solution is not only ugly (because it requires the
>>>> > application code to be changed and to have a specialized
>>>> > stored procedure for every query), but also requires some additional
>>>> > hacks (to overcome the hard limit on the number of
>>>> > function arguments and the inability for functions to return tuples) -
>>>> > the only way I imagine this can be made to work is
>>>> > to glue all the arguments together into a text string, and have the
>>>> > stored procedure parse it back, execute the query, then
>>>> > glue the resulting tuple(s) into another text string, return it, and
>>>> > have the application (or, perhaps, JDBC layer) parse it back
>>>> > into columns...
>>>> >
>>>> > I was wonderring if anybody has any better ideas how this can be made
>>>
>>>to
>>>
>>>> > work (I am looking for a solution that would
>>>> > minimize changes to the existing JDBC applications that use
>>>> > PreparedStatements)?
>>>> >
>>>> > If the maintainers of the involved code are interested, I would be
>>>> > willing to implement and contribute the solution we come up with ...
>>>> > (I figure, nobody would really be interested in getting that second
>>>> > solution I mentioned into the mainstream :-), but, if we are
>>>> > able to come up with something more general and less ugly, perhaps, I
>>>
>>>am
>>>
>>>> > not the only one who would be able to contribute from
>>>> > it)...
>>>> >
>>>> > For example, I believe, it should not be too complicated to implement
>>>> > that first possibility I described above...
>>>> > The way I see it would involve adding two statements to postgres SQL
>>>> > syntax:
>>>> >
>>>> > prepare <name> as <sql statement>
>>>> > and
>>>> > execute <name> with (<parameter list>)
>>>> >
>>>> > For example:
>>>> >
>>>> > prepare mystatement as select * from mytable where id = $1 and name
>>>
>>>like
>>>
>>>> > $2;
>>>> >
>>>> > and then
>>>> >
>>>> > execute mystatement with (1, 'Dima');
>>>> > execute mystatement with (2, 'John');
>>>> >
>>>> > etc....
>>>> >
>>>> > The JDBC driver would then send the 'prepare' command to the backend in
>>>> > Connection.prepareStatement (), and
>>>> > use the 'execute' in PreparedStatement.execute ();
>>>> >
>>>> > One potential problem with implementation I see here is that the query
>>>> > planner wants to know the argument types ahead of time...
>>>> > I guess, I could get around that by making all the arguments 'text',
>>>
>>>and
>>>
>>>> > having them casted into the right types when the
>>>> > statement is actually executed.
>>>> >
>>>> > There is, probably a need to also have some kind of a 'close' command
>>>
>>>to
>>>
>>>> > throw away the prepared query plans... or we could just
>>>> > make them last forever until, say, the end of transaction (or, perhaps,
>>>> > the end of the session?)...
>>>> >
>>>> > If there is anyone interested in discussing various possibilities, and
>>>> > getting this implemented one way or another,
>>>> >  I would like to hear from you!
>>>> >
>>>> > Thanks!
>>>> >
>>>> > Dima.
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------(end of
>>>
>>>broadcast)---------------------------
>>>
>>>> > TIP 2: you can get off all lists at once with the unregister command
>>>> >    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>>> >
>>>>
>>>>
>>>>
>>>>---------------------------(end of broadcast)---------------------------
>>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>>
>>>
>>>
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>
>>>
>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>http://www.postgresql.org/users-lounge/docs/faq.html
>>
>
>
>

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 14:56:56

Dimtry,

Server side prepare does not map to jdbc concept of PreparedStatement
and it is important to understand how they are not the same.

Server side prepare means that you can parse and plan the statement once
and reexecute it multiple times so:
select foo from bar;
becomes
prepare <name> as select foo from bar;
execute <name>;
deallocate <name>;

This applies to all sql statements.  So server side prepared statements
can equally be used for regular JDBC Statement objects as well as JDBC
PreparedStatements.

JDBC PreparedStatements provide an interface to bind values into a sql
statement.

Server side prepare provides the ability to reduce the overhead of
parse/plan across muliple executions of a sql statement that may or may
not have bind values.

They are different even though they both have the word 'prepare' in
their names.

thanks,
--Barry



Dmitry Tkach wrote:
> Actually, I still don't see what's wrong with enabling that behaviour by
> default....
> People who do not want to use it,  can still execute queries using
> Statement instead of PreparedStatement, right?
> As far as I know, people, who use JDBC are usually fully aware about the
> differences between the two, and
> the overhead involved in preparing the query plans - if you don't want
> it prepared, why would you ask foir it?
>
> Dima
>
> Dave Cramer wrote:
>
>> Ok, I agree with having a configuration parameter which sets the
>> default, that way anyone with the where withall to use this properly can
>> do so without changing their code.
>>
>> AFAICS the parameter is statement specific,not connection specific.
>>
>>
>> Dave
>> On Fri, 2002-09-27 at 08:43, Aaron Mulder wrote:
>>
>>> Dave, Barry, et al.:
>>>
>>>> Unless of course we were to write a
>>>> sql parser in the jdbc driver so that we could predetermine which
>>>> statements should use the prepare mechanism or not.
>>>>
>>>     That's a bit of an exaggeration -- all you need to check in the
>>> JDBC driver is whether there's an unescaped ; to see if there's more
>>> than one statement, right?  It wouldn't be unreasonable to default to
>>> enabling the feature unless there's a ; in the statement (or unless
>>> there's a ; with anything but whitespace afterward, if you like) --
>>> it wouldn't even be a disaster to forget escaping and just decide
>>> based on a ; in the statement period -- few enough statements have an
>>> escaped ; that the suboptimal performance in that case won't matter.
>>>     I'm a little concerned about how this feature interacts with
>>> connection pooling as currently implemented -- it probably needs to be
>>> reset to a known state every time a connection is returned to the
>>> pool.  Which is OK for the PostgreSQL implementation (which you'll
>>> note is not
>>> supposed to be used in an app server environment), but impossible for
>>> all
>>> the app servers out there which won't take driver-specific steps when
>>> connections are returned to their generic pools.  Furthermore, with the
>>> default behavior set to "disabled", you're reducing performance (you
>>> quoted 60%?) in the application server environment, when there will
>>> never
>>> be multiple statements executed in the same call, but there will
>>> frequently be multiple identical statements executed on the same
>>> PreparedStatement, but there's no available way to reset the flag, short
>>> of writing a wrapper driver to stand between the app server and the real
>>> driver.  On the other hand, if you set the default to a global
>>> "enabled",
>>> it breaks DBVisualizer and other development tools which currently allow
>>> multiple statements by default.
>>>     IMHO, it would be better to have the feature enabled for queries
>>> where it's appropriate, and disabled where not, and let the
>>> programmer specify a global override where they absolutely must.
>>> That way DB-specific actions are only required in truly exceptional
>>> cases, and both the app servers and the tools behave properly by
>>> default.
>>>     If you absolutely object, I think we should add a property to the
>>> Driver and the ConnectionPoolDataSource so that in an app server
>>> environment you can set the property to enabled at configuration
>>> time.  But again, I would prefer smarter default behavior.
>>>
>>> Aaron
>>>
>>> On 27 Sep 2002, Dave Cramer wrote:
>>>
>>>> Peter,
>>>>
>>>> I have to agree with Barry here. Since it is possible to send two (or
>>>> more ) statements to the backend, the programmer needs some mechanism
>>>> for enabling/disabling this feature. Unless of course we were to
>>>> write a
>>>> sql parser in the jdbc driver so that we could predetermine which
>>>> statements should use the prepare mechanism or not.
>>>>
>>>> I often use prepared statements instead of simple statements even when
>>>> only doing one select. The difference being that I can use setXXX to
>>>> set
>>>> the parameters in the statment. Which is particularly important for
>>>> setDate, setTime, setTimestamp
>>>>
>>>> And you would be amazed at the unique (irregular) coding practices
>>>> which
>>>> exist out there, so having the default do something sane is a GOOD
>>>> THING.
>>>>
>>>> Dave
>>>>
>>>>
>>>> On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
>>>>
>>>>> Barry,
>>>>>
>>>>> It's great to have this functionality.
>>>>>
>>>>> However, I am surprised about the need/requirement to explicitely
>>>>> turning on
>>>>> this feature in the JDBC driver. As you explain, the reason for this
>>>>> behaviour is to support the use of multiple statements with the
>>>>> prepareStatement method. But this is supporting incorrect us of the
>>>>> JDBC
>>>>> interface, since the JDBC API says that the parameter of the
>>>>> prepareStatement method is one(!) SQL statetment. On the other
>>>>> hand, it is
>>>>> clear that the intended purpose of the prepareStatement method is
>>>>> to use
>>>>> prepared statment -- exclusively and not by default. For statements
>>>>> executed
>>>>> only once, the designers of the API made available the method
>>>>> "createStatment".
>>>>>
>>>>> So the buttom line is that you have an irregular behaviour in
>>>>> support of an
>>>>> irregular coding practice. I understand that there is "legacy" code
>>>>> that
>>>>> needs to use the prepareStatement method with multiple SQL
>>>>> statements. But
>>>>> since such practice was already incorrect at the time of its
>>>>> inception, the
>>>>> right way would be to allow a transition from bad to good by
>>>>> requiring the
>>>>> "sinners" to explicitly use some kind of
>>>>> "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in
>>>>> their
>>>>> code, and let new coders use standard code.
>>>>>
>>>>> Of course, you're running the show, so you will do it as you deem
>>>>> right, but
>>>>> IMHO the implementation as you described it is clearly not reasonable.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Peter
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Barry Lind" <barry@xythos.com>
>>>>> To: "Dmitry Tkach" <dmitry@openratings.com>
>>>>> Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
>>>>> Sent: Thursday, September 26, 2002 10:56 PM
>>>>> Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>>>>>
>>>>>
>>>>>> Dimitry,
>>>>>>
>>>>>> I have some good news for you.  All of this functionality is in
>>>>>> 7.3 of
>>>>>> the server and the current development build of the jdbc driver.  Of
>>>>>> course 7.3 isn't quite production yet, but beta 2 should be out
>>>>>> this week.
>>>>>>
>>>>>> Your first option is what has been implemented.  In 7.3 the server
>>>>>> now
>>>>>> supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>>>>>> for passing bind variables).  The jdbc driver also now has a
>>>>>> method on
>>>>>> org.postgresql.PGStatement.setUseServerPrepare(true) to enable the
>>>>>> the
>>>>>> use of server side prepared statements (so you can cast your
>>>>>> Statement
>>>>>> object to an org.postgresql.PGStatement object and enable the
>>>>>> functionality for a specific statement).
>>>>>>
>>>>>> This hasn't yet made its way into the jdbc documentation for 7.3, but
>>>>>> this will be documented before 7.3 goes production.
>>>>>>
>>>>>> This implementation does still require that you make some changes to
>>>>>> your app in order to enable it but that is by design for the
>>>>>> following
>>>>>> reasons:
>>>>>>    1) In order to be useful you need to reuse a statement object
>>>>>> multiple times.  Most code generally will just create a statement and
>>>>>> use it only once, and since more work is being done to prepare the
>>>>>> statement that would have a negative performance impact for the
>>>>>> majority
>>>>>> of statements being executed in normal code.
>>>>>>    2) Prepare functionality will not work for all types of sql
>>>>>> that you
>>>>>> can execute through the jdbc driver.  For example you can quite
>>>>>> easily
>>>>>> do the following:  conn.prepareStatement("update foo set a = ?;
>>>>>> insert
>>>>>> into bar values (?,?,?);");
>>>>>> By issuing both sql statements in one call only one network
>>>>>> roundtrip is
>>>>>> needed to execute both statements providing what is likely a more
>>>>>> performant application.  However this doesn't work with prepare
>>>>>> because:
>>>>>> prepare foo as update foo set a = ?; insert into bar
>>>>>> values(?,?,?);   -
>>>>>> is one prepared statement and one regular statement which will cause
>>>>>> very strange things to happen.
>>>>>>
>>>>>> It is for the reasons above that the developer needs to explicitly
>>>>>> turn
>>>>>> on the use of server side prepared statements when it makes sense
>>>>>> to do
>>>>>>
>>>>> so.
>>>>>
>>>>>> thanks,
>>>>>> --Barry
>>>>>>
>>>>>> PS.  I would really appreciate it if you could test this new
>>>>>> functionality.  As far as I know I am the only one who has used
>>>>>> it.  But
>>>>>> for one of my frequently used and complex sql statements the time to
>>>>>> execute dropped by 60% using server side prepared statements.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Dmitry Tkach wrote:
>>>>>> > Hi, everybody.
>>>>>> >
>>>>>> > I am running into huge performance problems, due to JDBC not
>>>>>> being able
>>>>>> > to cache query plans :-(
>>>>>> > My java program runs the (set of about 5) identical statements with
>>>>>> > different parameters for about 30 million times...
>>>>>> > What I am talking about below has to do with JDBC, but is not
>>>>>> really
>>>>>> > specific to it - it seems to me, that the backend itself
>>>>>> > could be improved to better handle this kind of scenario, when
>>>>>> the same
>>>>>> > statement is executed many times in the same session.
>>>>>> >
>>>>>> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>>>>>>
>>>>> that
>>>>>
>>>>>> > time is spent by the query planner,
>>>>>> > creating and recreating the query plans every time I ran the damn
>>>>>> > statement....
>>>>>> >
>>>>>> > I am looking into implementing some kind of a solution, that
>>>>>> would let
>>>>>> > me work around that problem...
>>>>>> > So far, I only see two possibilities:
>>>>>> >
>>>>>> > - a general solution, that would involve extending postgres SQL
>>>>>> gramma
>>>>>> > to include a 'prepare' statement
>>>>>> > - or am ugly work around, that would involve moving all my sql
>>>>>> > statements into stored procedures, and have
>>>>>> >   those cache the query plans inside...
>>>>>> >
>>>>>> > The second solution is not only ugly (because it requires the
>>>>>> > application code to be changed and to have a specialized
>>>>>> > stored procedure for every query), but also requires some
>>>>>> additional
>>>>>> > hacks (to overcome the hard limit on the number of
>>>>>> > function arguments and the inability for functions to return
>>>>>> tuples) -
>>>>>> > the only way I imagine this can be made to work is
>>>>>> > to glue all the arguments together into a text string, and have the
>>>>>> > stored procedure parse it back, execute the query, then
>>>>>> > glue the resulting tuple(s) into another text string, return it,
>>>>>> and
>>>>>> > have the application (or, perhaps, JDBC layer) parse it back
>>>>>> > into columns...
>>>>>> >
>>>>>> > I was wonderring if anybody has any better ideas how this can be
>>>>>> made
>>>>>>
>>>>> to
>>>>>
>>>>>> > work (I am looking for a solution that would
>>>>>> > minimize changes to the existing JDBC applications that use
>>>>>> > PreparedStatements)?
>>>>>> >
>>>>>> > If the maintainers of the involved code are interested, I would be
>>>>>> > willing to implement and contribute the solution we come up with
>>>>>> ...
>>>>>> > (I figure, nobody would really be interested in getting that second
>>>>>> > solution I mentioned into the mainstream :-), but, if we are
>>>>>> > able to come up with something more general and less ugly,
>>>>>> perhaps, I
>>>>>>
>>>>> am
>>>>>
>>>>>> > not the only one who would be able to contribute from
>>>>>> > it)...
>>>>>> >
>>>>>> > For example, I believe, it should not be too complicated to
>>>>>> implement
>>>>>> > that first possibility I described above...
>>>>>> > The way I see it would involve adding two statements to postgres
>>>>>> SQL
>>>>>> > syntax:
>>>>>> >
>>>>>> > prepare <name> as <sql statement>
>>>>>> > and
>>>>>> > execute <name> with (<parameter list>)
>>>>>> >
>>>>>> > For example:
>>>>>> >
>>>>>> > prepare mystatement as select * from mytable where id = $1 and name
>>>>>>
>>>>> like
>>>>>
>>>>>> > $2;
>>>>>> >
>>>>>> > and then
>>>>>> >
>>>>>> > execute mystatement with (1, 'Dima');
>>>>>> > execute mystatement with (2, 'John');
>>>>>> >
>>>>>> > etc....
>>>>>> >
>>>>>> > The JDBC driver would then send the 'prepare' command to the
>>>>>> backend in
>>>>>> > Connection.prepareStatement (), and
>>>>>> > use the 'execute' in PreparedStatement.execute ();
>>>>>> >
>>>>>> > One potential problem with implementation I see here is that the
>>>>>> query
>>>>>> > planner wants to know the argument types ahead of time...
>>>>>> > I guess, I could get around that by making all the arguments
>>>>>> 'text',
>>>>>>
>>>>> and
>>>>>
>>>>>> > having them casted into the right types when the
>>>>>> > statement is actually executed.
>>>>>> >
>>>>>> > There is, probably a need to also have some kind of a 'close'
>>>>>> command
>>>>>>
>>>>> to
>>>>>
>>>>>> > throw away the prepared query plans... or we could just
>>>>>> > make them last forever until, say, the end of transaction (or,
>>>>>> perhaps,
>>>>>> > the end of the session?)...
>>>>>> >
>>>>>> > If there is anyone interested in discussing various
>>>>>> possibilities, and
>>>>>> > getting this implemented one way or another,
>>>>>> >  I would like to hear from you!
>>>>>> >
>>>>>> > Thanks!
>>>>>> >
>>>>>> > Dima.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > ---------------------------(end of
>>>>>>
>>>>> broadcast)---------------------------
>>>>>
>>>>>> > TIP 2: you can get off all lists at once with the unregister
>>>>>> command
>>>>>> >    (send "unregister YourEmailAddressHere" to
>>>>>> majordomo@postgresql.org)
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------(end of
>>>>>> broadcast)---------------------------
>>>>>> TIP 1: subscribe and unsubscribe commands go to
>>>>>> majordomo@postgresql.org
>>>>>>
>>>>>
>>>>>
>>>>> ---------------------------(end of
>>>>> broadcast)---------------------------
>>>>> TIP 1: subscribe and unsubscribe commands go to
>>>>> majordomo@postgresql.org
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------(end of
>>>> broadcast)---------------------------
>>>> TIP 5: Have you checked our extensive FAQ?
>>>>
>>>> http://www.postgresql.org/users-lounge/docs/faq.html
>>>>
>>>
>>
>
>
>
>

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 15:00:31

Jochem,

Currently you probably can't.  But given the information I have
mentioned in my previous mails in this thread, I am not sure that you
would really want to unless you somehow have the ability to do it
statement by statement since in most cases I think you don't want to use
server side prepared statements.

If someone can think of a way through the standard jdbc api to
enable/disable something like this at the statement level I am all ears.
  I haven't been able to think of a mechanism other that the one that
has been implemented.

thanks,
--Barry

Jochem van Dieten wrote:
> Barry Lind wrote:
>
>>
>> Your first option is what has been implemented.  In 7.3 the server now
>> supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>> for passing bind variables).  The jdbc driver also now has a method on
>> org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
>> use of server side prepared statements (so you can cast your Statement
>> object to an org.postgresql.PGStatement object and enable the
>> functionality for a specific statement).
>
> [..]
>
>> It is for the reasons above that the developer needs to explicitly turn
>> on the use of server side prepared statements when it makes sense to
>> do so.
>
>
> How would the developer do that? With the front-end I am working with
> (ColdFusion MX) I have very little control over the actual calls to the
> driver. I can make CF MX use createStatement() or prepareStatement() but
> that is all the control I have. And the only way to send parameters to
> the driver would be through a JDBC URL.
> Would that enable me to use prepared statements for real or not?
>
> Jochem
>
>

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 15:12:24

Dimtry,

Dmitry Tkach wrote:

> Not realy... You should not be required to know about postgres date
> format, as long as it matches Timestamp.toString () output (and it
> currently does), and I
> see no reason why it would not (you just need to make sure that the
> client and the server are in the same locale, but that's a different
> discussion, because whatever
> JDBC does in setTimestamp () would locale-dependent anyway).

This certainly isn't true for all databases.  Oracle for example where
the database format for dates is DD-MON-YY which is very different from
the Timestamp.toString() method.  And actually the postgres format is
different and incompatible with javas format when you start having to
deal with timezone information.

thanks,
--Barry

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 15:15:27


Dmitry Tkach wrote:

> What is this BE/FE thing?

It is the protocol a client like jdbc uses to talk to the postgres server.

BE/FE = Back End / Front End protocol

--Barry

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 15:27:44

Dmitry Tkach wrote:
>
> I have to disagree with that too, based on my experience...
> In that application I mentioned in the topic that initiated this
> discussion, the statement, that I have to execute about 30 million times
> looks like
>
> "select * from foo where primary_key=bar"
>
> What could possibly be simpler than that? :-)
> I was surprised to find out however, that in the end the backend turns
> out to have *more* time parsing and planning that query then actually
> executing it :-(
>

I don't disagree that this is a simple query.  But remember that since
the implementation of server side prepare is through the sql interface
with each call to execute this statement you still need to issue the
following sql statement:  execute <name>(bar);  which will still need to
invoke the parser.  Using server side prepared statements doesn't
eliminate all parses.  It isn't true that you can just take the original
  execute time of a statement and assume that with server side prepared
statements the total time will be the same as that, it will be more.

Executing something 30million times will certainly make even a little
improvement for a simple statement like this noticable, but 30million
executes of the same statement certainly isn't the norm.  And I would
suggest it is the type of environment where serverside prepared
statements are the most useful.  I hope you can try the new
functionality and provide us all with real numbers as to how much it helps.

thanks,
--Barry

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 16:02:59

Dmitry Tkach wrote:
>
> BTW, here is a quote from JDBC javadoc:
>
> * An object that represents a precompiled SQL statement.
> * <P>A SQL statement is precompiled and stored in a
> * <code>PreparedStatement</code> object. This object can then be used to
> * efficiently execute this statement multiple times.
>
> I read 'precompiled' as something more than 'parsed for question marks' :-)
> And also note the phrase about being able to 'efficiently execute this
> statement multiple times'...
>

I don't read more into this than simply parsing question marks.  It is
all going to depend on the capabilities of the server.  To the extent
the server has additional features that can be leveraged to improve the
performance those should be used.  Up until 7.3 of postgres no such
features existed.  Now they do and we are exploring how to best take
advantage of them.  But it doesn't make sense to use a specific server
feature if it doesn't improve performance in the general case.  The
testing I have done so far indicates that using the current
implementation of server side prepared statements doesn't help
performance in the general case and may hinder it.  I am reluctant to
make a non-backwardly compatible change in the driver until more data
points come in that there are significant performance improvements
possible without decreasing performance in the general case.

> It seems quite clear to me that it states the intent behind having this
> interface to provide the ability for the users to
> parse a query once and be able to execute it again and again...
> As for the set*() functions,  PreparedStatement without them would be of
> very limited use (how often would we need to execute
> *the same exact query* multiple times???), and that's the reason it has
> them, *not* to abstract from the database data representation format...
>

I have many statements that don't take any arguments and that do get
executed multiple times.  It all depends on the nature of the application.

thanks,
--Barry

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 16:15:12


Dmitry Tkach wrote:


> True... But how does PreparedStatement.setTimestamp () help here?
> All it does (at least in 7.2) is Timestamp.toString () :-)
>

Huh?  In 7.3 setTimestamp() is much more than Timestamp.toString() (and
in 7.2 it was as well, I think you need to go back to 7.1 for it to be
as simple as you describe).

    public void setTimestamp(int parameterIndex, Timestamp x) throws
SQLException
    {
        if (null == x)
        {
            setNull(parameterIndex, Types.OTHER);
        }
        else
        {
            // Use the shared StringBuffer
            synchronized (sbuf)
            {
                sbuf.setLength(0);
                sbuf.ensureCapacity(32);
                sbuf.append("'");
                //format the timestamp
                //we do our own formating so that we can get a format
                //that works with both timestamp with time zone and
                //timestamp without time zone datatypes.
                //The format is '2002-01-01 23:59:59.123456-0130'
                //we need to include the local time and timezone offset
                //so that timestamp without time zone works correctly
                int l_year = x.getYear() + 1900;
                sbuf.append(l_year);
                sbuf.append('-');
                int l_month = x.getMonth() + 1;
                if (l_month < 10)
                    sbuf.append('0');
                sbuf.append(l_month);
                sbuf.append('-');
                int l_day = x.getDate();
                if (l_day < 10)
                    sbuf.append('0');
                sbuf.append(l_day);
                sbuf.append(' ');
                int l_hours = x.getHours();
                if (l_hours < 10)
                    sbuf.append('0');
                sbuf.append(l_hours);
                sbuf.append(':');
                int l_minutes = x.getMinutes();
                if (l_minutes < 10)
                    sbuf.append('0');
                sbuf.append(l_minutes);
                sbuf.append(':');
                int l_seconds = x.getSeconds();
                if (l_seconds < 10)
                    sbuf.append('0');
                sbuf.append(l_seconds);
                // Make decimal from nanos.
                char[] l_decimal = {'0', '0', '0', '0', '0', '0', '0', '0', '0'};
                char[] l_nanos = Integer.toString(x.getNanos()).toCharArray();
                System.arraycopy(l_nanos, 0, l_decimal, l_decimal.length -
l_nanos.length, l_nanos.length);
                sbuf.append('.');
                if (connection.haveMinimumServerVersion("7.2"))
                {
                    sbuf.append(l_decimal, 0, 6);
                }
                else
                {
                    // Because 7.1 include bug that "hh:mm:59.999" becomes "hh:mm:60.00".
                    sbuf.append(l_decimal, 0, 2);
                }
                //add timezone offset
                int l_offset = -(x.getTimezoneOffset());
                int l_houros = l_offset / 60;
                if (l_houros >= 0)
                {
                    sbuf.append('+');
                }
                else
                {
                    sbuf.append('-');
                }
                if (l_houros > -10 && l_houros < 10)
                    sbuf.append('0');
                if (l_houros >= 0)
                {
                    sbuf.append(l_houros);
                }
                else
                {
                    sbuf.append( -l_houros);
                }
                int l_minos = l_offset - (l_houros * 60);
                if (l_minos != 0)
                {
                    if (l_minos < 10)
                        sbuf.append('0');
                    sbuf.append(l_minos);
                }
                sbuf.append("'");
                bind(parameterIndex, sbuf.toString(), PG_TIMESTAMPTZ);
            }

        }
    }

Re: [GENERAL] Prepared statement performance...

From

Barry Lind

Date:

27 September 2002, 17:47:53


Dmitry Tkach wrote:
> Barry Lind wrote:
>
>>
>>
>> Dmitry Tkach wrote:
>>
>>
>>> True... But how does PreparedStatement.setTimestamp () help here?
>>> All it does (at least in 7.2) is Timestamp.toString () :-)
>>>
>>
>> Huh?  In 7.3 setTimestamp() is much more than Timestamp.toString()
>> (and in 7.2 it was as well, I think you need to go back to 7.1 for it
>> to be as simple as you describe).
>
>
> Not really... It *seems* to be doing  more in 7.2, but then here is what
> it says right after it's done:
>
>                        // The above works, but so does the following.
> I'm leaving the above in, but this seems
>                        // to be identical. Pays to read the docs ;-)
>                        //set(parameterIndex,"'"+x.toString()+"'");
>
> And this is exactly right :-)

Those were old comments that were incorrect, thus they are removed in 7.3.

>
> I have indeed seen that timezones did not use to work correctly with 7.2
> JDBC, but it seems that all that needed to be done to fix that was to
> remove the df.setTimezone() call (it shifts the time to GMT, and sends
> it to the server that way, but the backend assumes it is in the local
> timezone, so the timestamp ends up having wrong time), or, even better,
> just get rid of the whole formatting stuff alltogether and stick to the
> toString () call as that comment suggets :-)

That doesn't work when the client and server are running in different
timezones.

> Everything would work, except for that 7.1 thing of course, that
> converts 59.999 to 60.00 :-)
>
> Still all of the above would only work with your local timezone, and, if
> the timestamp has some other offset, it would still be wrong. The
> complexity of the 7.3 code
> seems to be inteded to fix that, which is a good thing, although,  the
> way it is done seems to be way overcomplicated to me - it seems that if
> you just did x.toGMTString () and sent it to the backend as text, it
> would do the same conversion automatically...

This is wrong if the datatype is timestamp with out timezone, but would
do the correct thing (I think) for timestamp with timezone.

> BTW, the same approach could be used if you wanted to use Statement to set
> a timestamptz properly, without calling setTimestamp ():
>
> c.createStatement ().executeQuery ("select * from foo where
> timestamptzfield = '" + mytimestamp.toGMTString () +"');
>
> It's true that not every database would understand GMT format (for
> example, informix doesn't), but,
> on the other hand, not every database has the notion of timezones either
> (informix doesn't either) - i.e., if you are using anything other than
> the locale's default timezone in your application (which would just work
> if you did Timestamp.toString () and sent it in as text), then you are
> already aware that you are working with postgres, and, thus the
> 'database abstraction' argument for using 'set*()' stuff doesn't shoot :-)
>
> and actually, I don't even see any reason why something like:
>
> c.createStatement ().executeQuery ("select * from foo where
> timestamptzfiled = '" + mytimestamp + "'")
>
> ... would not do the same thing as the example above...
>
> To cut it short, the bottomline is - if JDBC implementation just ignored
> the timezone stuff alltogether, and left it all to the backend,
> everything would just work :-)
>
> I may be wrong here, but if I am, I would be very curious to see an
> example where this would not work.
>

See responses above.  If you have suggested improvements try them out
and if you can get them to pass the TimestampTest.java regression I
would like to look at them.

thanks,
--Barry

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

28 September 2002, 10:21:52

Barry,

The API spec says for PreparedStatement:

"An object that represents a precompiled SQL statement. A SQL statement is
precompiled and stored in a PreparedStatement object. This object can then
be used to efficiently execute this statement multiple times."

This tells me that the main purpose of this object type is to allow you to
multiple execution with possibly different parameters. IMHO, the setters are
there just in order to provide you a way to specify different parameters.
This also tells me that there is no alternative behaviour (i.e. the
statement represented by this kind of object IS PRECOMPILED). If the
PostgreSQL jdbc driver allows an alternative behaviour, this should not be
default and should be made available

I have to admit that linking the above question (behaviour of Prepared
Statement) with the question of how many queries a SQL statement is allowed
to contain may have been a bit arbitrary. (Actually, *I* did not make the
linkage, I was just probing the way along a course of thoughts already
offerred. :)) I wanted to tackle this question, because the practice of
putting multiple statements in one Statement object (separated with
semicolons) has already come up in connection with other --earlier not
implemented-- jdbc feature(s) and there the problem to handle this practice
also turned out to be a pain in the ass. Unfortunately, I do not have the
SQL 92 spec. I could only look into the Postgres Reference documentation.
Based on how it uses the term "statement", I would conclude that "statement"
is just an entry in the reference doc. As to the getMoreResults method:
issuing several queries is not the only way to obtain multiple results sets.
E.g: your query may involve a function which returns multiple result sets
(or just one result set and an update count). My personal opinion is that
putting multiple statements in one Statement object is just abusing the fact
that the backend allows to do, but the SQL 92 spec should be, of course, the
ultimate authority in this regard.

Cheers,

Peter

----- Original Message -----
From: "Barry Lind" <barry@xythos.com>
To: "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu>
Cc: "Dmitry Tkach" <dmitry@openratings.com>; <pgsql-general@postgresql.org>;
<pgsql-jdbc@postgresql.org>
Sent: Friday, September 27, 2002 8:24 PM
Subject: Re: [JDBC] [GENERAL] Prepared statement performance...


> Cool an interesting thread on the jdbc list.
>
> First off I dissagree with your reading of the spec that the statement
> or prepared statement object are only intended for one sql statement.
> If that were true why would you have Statement.getMoreResults()?  The
> API clearly supports a single statement execute returning multiple
> result sets which implies to me that the single execute can be used for
> multiple individual sql statements chained together.
>
> The uses of the Statement object and PreparedStatement object do not
> IMHO have anything to do with you many times you use them.  They have to
> do with how you supply values in the sql statement.  If you use just the
> Statement object you are required to know the postgres date format in
> order to include a date value, where it is much easier (and more
> portable across databases) to just use a PreparedStatement, a ?, and a
> setDate() call.
>
> thanks,
> --Barry
>
> Peter Kovacs wrote:
>  > Barry,
>  >
>  > It's great to have this functionality.
>  >
>  > However, I am surprised about the need/requirement to explicitely
> turning on
>  > this feature in the JDBC driver. As you explain, the reason for this
>  > behaviour is to support the use of multiple statements with the
>  > prepareStatement method. But this is supporting incorrect us of the
JDBC
>  > interface, since the JDBC API says that the parameter of the
>  > prepareStatement method is one(!) SQL statetment. On the other hand,
> it is
>  > clear that the intended purpose of the prepareStatement method is to
use
>  > prepared statment -- exclusively and not by default. For statements
> executed
>  > only once, the designers of the API made available the method
>  > "createStatment".
>  >
>  > So the buttom line is that you have an irregular behaviour in support
> of an
>  > irregular coding practice. I understand that there is "legacy" code
that
>  > needs to use the prepareStatement method with multiple SQL
> statements. But
>  > since such practice was already incorrect at the time of its
> inception, the
>  > right way would be to allow a transition from bad to good by
> requiring the
>  > "sinners" to explicitly use some kind of
>  > "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in
their
>  > code, and let new coders use standard code.
>  >
>  > Of course, you're running the show, so you will do it as you deem
> right, but
>  > IMHO the implementation as you described it is clearly not reasonable.
>  >
>  > Cheers,
>  >
>  > Peter
>  >
>  > ----- Original Message -----
>  > From: "Barry Lind" <barry@xythos.com>
>  > To: "Dmitry Tkach" <dmitry@openratings.com>
>  > Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
>  > Sent: Thursday, September 26, 2002 10:56 PM
>  > Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>  >
>  >
>  >
>  >>Dimitry,
>  >>
>  >>I have some good news for you.  All of this functionality is in 7.3 of
>  >>the server and the current development build of the jdbc driver.  Of
>  >>course 7.3 isn't quite production yet, but beta 2 should be out this
> week.
>  >>
>  >>Your first option is what has been implemented.  In 7.3 the server now
>  >>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>  >>for passing bind variables).  The jdbc driver also now has a method on
>  >>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
>  >>use of server side prepared statements (so you can cast your Statement
>  >>object to an org.postgresql.PGStatement object and enable the
>  >>functionality for a specific statement).
>  >>
>  >>This hasn't yet made its way into the jdbc documentation for 7.3, but
>  >>this will be documented before 7.3 goes production.
>  >>
>  >>This implementation does still require that you make some changes to
>  >>your app in order to enable it but that is by design for the following
>  >>reasons:
>  >>    1) In order to be useful you need to reuse a statement object
>  >>multiple times.  Most code generally will just create a statement and
>  >>use it only once, and since more work is being done to prepare the
>  >>statement that would have a negative performance impact for the
majority
>  >>of statements being executed in normal code.
>  >>    2) Prepare functionality will not work for all types of sql that
you
>  >>can execute through the jdbc driver.  For example you can quite easily
>  >>do the following:  conn.prepareStatement("update foo set a = ?; insert
>  >>into bar values (?,?,?);");
>  >>By issuing both sql statements in one call only one network roundtrip
is
>  >>needed to execute both statements providing what is likely a more
>  >>performant application.  However this doesn't work with prepare
because:
>  >>prepare foo as update foo set a = ?; insert into bar values(?,?,?);   -
>  >>is one prepared statement and one regular statement which will cause
>  >>very strange things to happen.
>  >>
>  >>It is for the reasons above that the developer needs to explicitly turn
>  >>on the use of server side prepared statements when it makes sense to do
>  >
>  > so.
>  >
>  >>thanks,
>  >>--Barry
>  >>
>  >>PS.  I would really appreciate it if you could test this new
>  >>functionality.  As far as I know I am the only one who has used it.
But
>  >>for one of my frequently used and complex sql statements the time to
>  >>execute dropped by 60% using server side prepared statements.
>  >>
>  >>
>  >>
>  >>Dmitry Tkach wrote:
>  >> > Hi, everybody.
>  >> >
>  >> > I am running into huge performance problems, due to JDBC not being
> able
>  >> > to cache query plans :-(
>  >> > My java program runs the (set of about 5) identical statements with
>  >> > different parameters for about 30 million times...
>  >> > What I am talking about below has to do with JDBC, but is not really
>  >> > specific to it - it seems to me, that the backend itself
>  >> > could be improved to better handle this kind of scenario, when the
> same
>  >> > statement is executed many times in the same session.
>  >> >
>  >> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>  >
>  > that
>  >
>  >> > time is spent by the query planner,
>  >> > creating and recreating the query plans every time I ran the damn
>  >> > statement....
>  >> >
>  >> > I am looking into implementing some kind of a solution, that would
let
>  >> > me work around that problem...
>  >> > So far, I only see two possibilities:
>  >> >
>  >> > - a general solution, that would involve extending postgres SQL
gramma
>  >> > to include a 'prepare' statement
>  >> > - or am ugly work around, that would involve moving all my sql
>  >> > statements into stored procedures, and have
>  >> >   those cache the query plans inside...
>  >> >
>  >> > The second solution is not only ugly (because it requires the
>  >> > application code to be changed and to have a specialized
>  >> > stored procedure for every query), but also requires some additional
>  >> > hacks (to overcome the hard limit on the number of
>  >> > function arguments and the inability for functions to return
tuples) -
>  >> > the only way I imagine this can be made to work is
>  >> > to glue all the arguments together into a text string, and have the
>  >> > stored procedure parse it back, execute the query, then
>  >> > glue the resulting tuple(s) into another text string, return it, and
>  >> > have the application (or, perhaps, JDBC layer) parse it back
>  >> > into columns...
>  >> >
>  >> > I was wonderring if anybody has any better ideas how this can be
made
>  >
>  > to
>  >
>  >> > work (I am looking for a solution that would
>  >> > minimize changes to the existing JDBC applications that use
>  >> > PreparedStatements)?
>  >> >
>  >> > If the maintainers of the involved code are interested, I would be
>  >> > willing to implement and contribute the solution we come up with ...
>  >> > (I figure, nobody would really be interested in getting that second
>  >> > solution I mentioned into the mainstream :-), but, if we are
>  >> > able to come up with something more general and less ugly, perhaps,
I
>  >
>  > am
>  >
>  >> > not the only one who would be able to contribute from
>  >> > it)...
>  >> >
>  >> > For example, I believe, it should not be too complicated to
implement
>  >> > that first possibility I described above...
>  >> > The way I see it would involve adding two statements to postgres SQL
>  >> > syntax:
>  >> >
>  >> > prepare <name> as <sql statement>
>  >> > and
>  >> > execute <name> with (<parameter list>)
>  >> >
>  >> > For example:
>  >> >
>  >> > prepare mystatement as select * from mytable where id = $1 and name
>  >
>  > like
>  >
>  >> > $2;
>  >> >
>  >> > and then
>  >> >
>  >> > execute mystatement with (1, 'Dima');
>  >> > execute mystatement with (2, 'John');
>  >> >
>  >> > etc....
>  >> >
>  >> > The JDBC driver would then send the 'prepare' command to the
> backend in
>  >> > Connection.prepareStatement (), and
>  >> > use the 'execute' in PreparedStatement.execute ();
>  >> >
>  >> > One potential problem with implementation I see here is that the
query
>  >> > planner wants to know the argument types ahead of time...
>  >> > I guess, I could get around that by making all the arguments 'text',
>  >
>  > and
>  >
>  >> > having them casted into the right types when the
>  >> > statement is actually executed.
>  >> >
>  >> > There is, probably a need to also have some kind of a 'close'
command
>  >
>  > to
>  >
>  >> > throw away the prepared query plans... or we could just
>  >> > make them last forever until, say, the end of transaction (or,
> perhaps,
>  >> > the end of the session?)...
>  >> >
>  >> > If there is anyone interested in discussing various possibilities,
and
>  >> > getting this implemented one way or another,
>  >> >  I would like to hear from you!
>  >> >
>  >> > Thanks!
>  >> >
>  >> > Dima.
>  >> >
>  >> >
>  >> >
>  >> > ---------------------------(end of
>  >
>  > broadcast)---------------------------
>  >
>  >> > TIP 2: you can get off all lists at once with the unregister command
>  >> >    (send "unregister YourEmailAddressHere" to
> majordomo@postgresql.org)
>  >> >
>  >>
>  >>
>  >>
>  >>---------------------------(end of
broadcast)---------------------------
>  >>TIP 1: subscribe and unsubscribe commands go to
majordomo@postgresql.org
>  >>
>  >
>  >
>  >
>  >
>
>

Re: [GENERAL] Prepared statement performance...

From

snpe

Date:

28 September 2002, 10:45:05

...
> This certainly isn't true for all databases.  Oracle for example where
> the database format for dates is DD-MON-YY which is very different from
> the Timestamp.toString() method.  And actually the postgres format is
> different and incompatible with javas format when you start having to
> deal with timezone information.

Oracle have nls_format_date global parameter and default format for date can
be any.
regrads
haris peco

Re: [GENERAL] Prepared statement performance...

From

Tom Lane

Date:

28 September 2002, 12:26:28

"Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes:
> ... but the SQL 92 spec should be, of course, the
> ultimate authority in this regard.

The SQL spec has never heard of JDBC.  I assume there is a spec
somewhere for the JDBC API, but it's not SQL92 nor SQL99.

            regards, tom lane

Re: [GENERAL] Prepared statement performance...

From

Neil Conway

Date:

28 September 2002, 13:26:47

Barry Lind <barry@xythos.com> writes:
> It is a bit more complex than just looking for a ; since a ; is a
> valid character in a quoted string.  But as Bruce has mentioned in a
> followup, psql does it so it is doable.
>
> I think the real question here is when does it make sense to use
> server side prepared statements.  In the little bit of testing I have
> done, I would say the answer is rarely.  You need many factors to come
> into place for it to make sense to use server side prepared statements:
>
> 1) The statement needs to be big and complex such that there is
> significant overhead in the parsing and planning stages of
> execution. The query that I test with is about 3K in size and joins
> about 10 different tables and includes a union.  In this case there is
> a significant overhead involved in both parsing and planning.
> However for a simple query that affects only one or two tables there
> is little overhead.
>
> 2) The statement object needs to be reused multiple times.  Using a
> server prepared statement requires at a minimum three sql statements
> to be executed to do the work of one original statement:
> select foo from bar;
> becomes
> prepare <name> as select foo from bar;
> execute <name>;
> deallocate <name>;

Note that DEALLOCATE is not really necessary -- prepared statements
are flushed from memory when the backend exits (I suppose if you're
using a connection pool, however, you should still explicitely
DEALLOCATE prepared statements when you're done with them).

> We can do the first two together in one roundtrip to the server, but
> the last one requires a separate roundtrip.  So if you are only using
> a the statement object/query only once then using server side prepared
> statements will make performance worse than not.  The vast majority of
> statement objects are created, executed once and then closed.  I don't
> think it makes sence to turn on a feature that will make the overall
> performance for most users worse.  There is very little jdbc code that
> I have seen that creates a statement, call execute multiple times on
> that same statement object and then finally closes it.
>
> This is the first version of the feature.  Improvements will come with
> implementation feedback and I welcome all feedback.

As do I (in reference to the backend part of the feature).

> In the long term what I would really like to see is this
> functionality done at the BE/FE protocol level instead of at the sql
> level as I have seen other databases do.

Yeah, that would be good. However, it requires a protocol change,
which wasn't feasible during the 7.3 development cycle. However, it
looks like there will be an FE/BE protocol change for 7.4, so that
might be a good opportunity to add protocol-level support.

>
> Someone has suggestted a jdbc url arguement to enable the
> functionality by default and that is probably a good idea if we learn
> from real use that it makes sense to have all statements use this.
> But I am not convinced (given my reasons above) that this really makes
> sense.

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

28 September 2002, 13:37:24

Thank you for your remark. But we're talking about SQL statements (and not
JDBC statements), aren't we? I have not found anything as yet which would
indicate that the JDBC API significantly redefined the meaning of the term
"statement" compared to what had been known as SQL statement well before
JDBC appeared. And in this regard (i.e. to clarify what has/had been meant
by "SQL statement"), the SQL spec seemed to me a good starting point,
although anecdotical evidence can, of course, also be used, if you can
provide some.

Your remark is correct, if you mean that the JDBC API does not explicitely
restricts you from putting in your SQL statement anything you wish. But
(provided I understand the big picture) JDBC has been designed for a very
specific kind of resource managers: database servers communicating with
their clients via SQL statements. There exists a specification for a very
generic JAVA resource adapter, which explicitely allows you to put over the
wire anything you wish (actually, anything the resource manager it is
intended for understands). But this generic resource adapter is called
Connector, and not JDBC driver. So if you want to interpret liberally the
API between the server and the client, the postgres java client classes (in
what is now called JDBC driver) should implement the "connector" interfaces
rather than the "sql" interfaces.

And as to the whereabouts of the JDBC API spec, a good place to look for it
is www.java.sun.com .

Cheers,

Peter

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu>
Cc: "Barry Lind" <barry@xythos.com>; "Dmitry Tkach"
<dmitry@openratings.com>; <pgsql-general@postgresql.org>;
<pgsql-jdbc@postgresql.org>
Sent: Saturday, September 28, 2002 6:25 PM
Subject: Re: [JDBC] [GENERAL] Prepared statement performance...

> "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes:
> > ... but the SQL 92 spec should be, of course, the
> > ultimate authority in this regard.
>
> The SQL spec has never heard of JDBC.  I assume there is a spec
> somewhere for the JDBC API, but it's not SQL92 nor SQL99.
>
> regards, tom lane
>

Re: [GENERAL] Prepared statement performance...

From

Jochem van Dieten

Date:

30 September 2002, 06:19:45

Barry Lind wrote:
>
> Currently you probably can't.  But given the information I have
> mentioned in my previous mails in this thread, I am not sure that you
> would really want to unless you somehow have the ability to do it
> statement by statement since in most cases I think you don't want to use
> server side prepared statements.

OK, so I understand you see a difference between a JDBC PrepareStatement
and a server side prepared statement, and the first does not necessarily
imply the second. I don't see that difference per se :) If I don't want
to use a server side prepared statement I will make use of
CreateStatement. But I can imagine that for some applications that are a
bit more advanced as mine a more granular control is needed.

Would it be possible to have a parameter in the JDBC URL that would
switch the behaviour with regard to server side prepared statements?
Something like ?serverPrepare={always|statement|never} which defaults to
statement (i.e. the setUseServerPrepare() you are suggesting).

Jochem

Re: [GENERAL] Prepared statement performance...

From

Curt Sampson

Date:

06 October 2002, 21:57:51

On Fri, 27 Sep 2002, Barry Lind wrote:

> First off I dissagree with your reading of the spec that the statement
> or prepared statement object are only intended for one sql statement.
> If that were true why would you have Statement.getMoreResults()?

Because a single statement can return multiple result sets. Consider
a stored procedure that does several SELECTs, for example.

> The API clearly supports a single statement execute returning multiple
> result sets which implies to me that the single execute can be used for
> multiple individual sql statements chained together.

I'd disagree. Otherwise why would we have a facility for statement batches?

At any rate, I can guarantee you that statement chaning is not portable.

> The uses of the Statement object and PreparedStatement object do not
> IMHO have anything to do with you many times you use them.  They have to
> do with how you supply values in the sql statement.  If you use just the
> Statement object you are required to know the postgres date format in
> order to include a date value, where it is much easier (and more
> portable across databases) to just use a PreparedStatement, a ?, and a
> setDate() call.

I agree entirely. Even more importantly, it's much, much more difficult
to write a secure application (one not subject to SQL injection attacks)
if you are forced to quote data yourself. The driver already knows how
to deal with any string you pass to it; why would you want to re-write
that code, possibily introducing errors into it? (For those not familiar
with SQL injection attacks, I'm referring to the situation where someone
types, say, "foo'; DROP TABLE bar;" into the "Name" field on your web
form, or whatever.)

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

14 October 2002, 03:32:09

> > The uses of the Statement object and PreparedStatement object do not
> > IMHO have anything to do with you many times you use them.  They have to
> > do with how you supply values in the sql statement.  If you use just the
> > Statement object you are required to know the postgres date format in
> > order to include a date value, where it is much easier (and more
> > portable across databases) to just use a PreparedStatement, a ?, and a
> > setDate() call.
>
> I agree entirely. Even more importantly, it's much, much more difficult
> to write a secure application (one not subject to SQL injection attacks)
> if you are forced to quote data yourself. The driver already knows how
> to deal with any string you pass to it; why would you want to re-write
> that code, possibily introducing errors into it? (For those not familiar
> with SQL injection attacks, I'm referring to the situation where someone
> types, say, "foo'; DROP TABLE bar;" into the "Name" field on your web
> form, or whatever.)
>

I do not clearly understand what the problem is with someone typing in
"foo'; DROP TABLE bar;" into the "Name" field on your web form.

Re: [GENERAL] Prepared statement performance...

From

Toby

Date:

14 October 2002, 04:00:23

> >I do not clearly understand what the problem is with someone typing in
> >"foo'; DROP TABLE bar;" into the "Name" field on your web form.

many sites take text directly from text boxes in a web form and submit them
to a database without escaping the string, thereby allowing unscrupulous
people to execute SQL directly on your live production database.

for example, if there was a form like below

<form action="blah.jsp" method="post">
   <b>Username:</b> <input type="text" name="username" value="joe'; DROP
TABLE users">
</form>


then on the server you have blah.jsp which will handle the processing for
the page, which might look something like

String username = request.getParameter("username");
Connection conn =
DriverManager.getConnection("jdbc:postgresql:somedatabase", "someuser",
"somepassword");
Statement stmt = conn.createStatement();

// and now the crucial line
stmt.execute("INSERT INTO Users (username) VALUES ('" + username + "')");



from the above, you can see that whatever is entered into the username FORM
item is executed directly on the database which, in this case, can cause
the INSERT statement to actually perform the following

         INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users');

This might cause the users table to be dropped. Not entirely sure if the
above would actually do this, but a little bit of mucking about with what
you type into the FORM field would certainly do this. Of course, you need
to make sure the user you execute the SQL as (when you login to the
database) has DROP permissions, but nonetheless the security concern is clear.

A simple way to prevent this is to use a PreparedStatement

PreparedStatement pstmt = conn.prepareStatement("INSERT INTO Users
(username) VALUES (%)");
pstmt.setString(1, username);
pstmt.execute();

This would escape the username string correctly thus executing the
following SQL

         INSERT INTO Users (username) VALUES ('joe''; DROP TABLE users');

NOTE: the ' after the word "joe" has 2 ' not 1.

In this case, the username would be exactly what the user typed. The
database would not try to execute the DROP TABLE statement.

I use this method but then I also sometimes escape the string myself by
simply replacing all instances of a single apostrophe character (') with 2
('').  (NOTE: do not replace it with the " quote mark, use 2 apostrophe
characters).

I suspect someone will lambast me for escaping the string myself since "why
do it when the driver will do it for you". Well, sometimes i find it
necessary of convenient.

So there you go.
By the way, I've noticed many sites that have this error. It's sloppy and
unforgiveable.

toby

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

14 October 2002, 05:06:06

Thank you for your explanation. But I still do not see how
>          INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users');
will be evaluated so that it drops table 'users'. Actually, this should
evaluate to a syntax error, shouldn't it?

----- Original Message -----
From: "Toby" <toby@paperjet.com>
To: <pgsql-jdbc@postgresql.org>
Sent: Monday, October 14, 2002 9:59 AM
Subject: Re: [JDBC] [GENERAL] Prepared statement performance...


>
> > >I do not clearly understand what the problem is with someone typing in
> > >"foo'; DROP TABLE bar;" into the "Name" field on your web form.
>
> many sites take text directly from text boxes in a web form and submit
them
> to a database without escaping the string, thereby allowing unscrupulous
> people to execute SQL directly on your live production database.
>
> for example, if there was a form like below
>
> <form action="blah.jsp" method="post">
>    <b>Username:</b> <input type="text" name="username" value="joe'; DROP
> TABLE users">
> </form>
>
>
> then on the server you have blah.jsp which will handle the processing for
> the page, which might look something like
>
> String username = request.getParameter("username");
> Connection conn =
> DriverManager.getConnection("jdbc:postgresql:somedatabase", "someuser",
> "somepassword");
> Statement stmt = conn.createStatement();
>
> // and now the crucial line
> stmt.execute("INSERT INTO Users (username) VALUES ('" + username + "')");
>
>
>
> from the above, you can see that whatever is entered into the username
FORM
> item is executed directly on the database which, in this case, can cause
> the INSERT statement to actually perform the following
>
>          INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users');
>
> This might cause the users table to be dropped. Not entirely sure if the
> above would actually do this, but a little bit of mucking about with what
> you type into the FORM field would certainly do this. Of course, you need

> to make sure the user you execute the SQL as (when you login to the
> database) has DROP permissions, but nonetheless the security concern is
clear.
>
> A simple way to prevent this is to use a PreparedStatement
>
> PreparedStatement pstmt = conn.prepareStatement("INSERT INTO Users
> (username) VALUES (%)");
> pstmt.setString(1, username);
> pstmt.execute();
>
> This would escape the username string correctly thus executing the
> following SQL
>
>          INSERT INTO Users (username) VALUES ('joe''; DROP TABLE users');
>
> NOTE: the ' after the word "joe" has 2 ' not 1.
>
> In this case, the username would be exactly what the user typed. The
> database would not try to execute the DROP TABLE statement.
>
> I use this method but then I also sometimes escape the string myself by
> simply replacing all instances of a single apostrophe character (') with 2
> ('').  (NOTE: do not replace it with the " quote mark, use 2 apostrophe
> characters).
>
> I suspect someone will lambast me for escaping the string myself since
"why
> do it when the driver will do it for you". Well, sometimes i find it
> necessary of convenient.
>
> So there you go.
> By the way, I've noticed many sites that have this error. It's sloppy and
> unforgiveable.
>
> toby
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

Re: [GENERAL] Prepared statement performance...

From

nferrier@tapsellferrier.co.uk

Date:

14 October 2002, 05:21:23

"Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes:

> Thank you for your explanation. But I still do not see how
> >          INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users');
> will be evaluated so that it drops table 'users'. Actually, this should
> evaluate to a syntax error, shouldn't it?

That's right. I think toby is mistaking the classic javascript hack
for a SQL hack.

The JS hack is possible because developers rarely use strong
validation for input fields, thus allowing JS statements into the
database. When these are presented on webpages they can get up to all
sorts of tricks and wheezes.

I've never heard of a SQL hack based on input fields, it seems most
unlikely but something could probably be done based on stored procs,
the hacker would have to have intimiate knowledge of the stored procs
and would also have to find one that would do something dangerous.

Nic

Re: [GENERAL] Prepared statement performance...

From

Toby

Date:

14 October 2002, 05:47:41

At 11:05 14/10/2002 +0200, Peter Kovacs wrote:
> >will be evaluated so that it drops table 'users'. Actually, this should
> >evaluate to a syntax error, shouldn't it?

well, as my explanation noted

> >          INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users');
> >
> > This might cause the users table to be dropped. Not entirely sure if the
> > above would actually do this, but a little bit of mucking about with what
> > you type into the FORM field would certainly do this.

the point being that a string containing an SQL command sent to the backend
can contain several statements, like the above. the more i look at it, the
more i agree that the above would throw a syntax error. however, if there
was a ; after the table name, users, then
then the resultant SQL would be

         INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users;');

i suspect this would work.

I don't see how the above is a classic javascript hack, since there's no
javascript. i've seen on production code places where strings taken from
form fields are stored in cookies and session variables and subsequently
written directly to the database, as shown above.

i've been to sites where this is possible and it also does not require
intimate knowledge of the target database. a bit of messing around will
often cause errors to be thrown and, unless the target webserver is
configured appropriately (IIS is good for this), it is often possible to
see the detailed error message...whihc itself can show hints of the backend
structure.

by way of example, try going to

         http://www.westmisnter.ac.uk/

and in one of the search boxes enter the following

         ';select * from msdb..sysjobs;

Now then, if someone spent a few minutes working on this, I'm sure it would
be possible to drop a table or 2 or, at the very least, trash a load of
data. the same will be possible on a postgres backend.

course, what this has to do with performance I don't know.

toby

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

14 October 2002, 06:04:14

Thank you. So I think herewith we made pointless the original arguement Curt
Sampson made in his mail in support Barry's twisted interpretation of
PreparedStatement.

Peter

PS:
When I am travelling in Budapest (capital of Hungary, my country) on the
bus, and some exhausted poor elderly tries to sheepishly argument with some
youngsters occupying recklessly the last seat in the bus leaving the elderly
standing, I always tempted to cynically tell the upset elderly: "This public
transport my lady. If you wish first class service, you should take a taxi."
Of course, I never say this, because it would be a sign of very bad taste.
But I often giving *myself* this kind of rebuff when I am surfing on open
source mailing lists and find people saying this and that without giving a
shit to think about it first or --more importantly-- to think about the
context they're making their remarks: "My dear, this is public transport,
worth the money you pay for it."

The above was meant to be humorous.

----- Original Message -----
From: <nferrier@tapsellferrier.co.uk>
To: "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu>
Cc: <pgsql-jdbc@postgresql.org>; "Toby" <toby@paperjet.com>
Sent: Monday, October 14, 2002 11:20 AM
Subject: Re: [JDBC] [GENERAL] Prepared statement performance...

> "Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes:
>
> > Thank you for your explanation. But I still do not see how
> > >          INSERT INTO Users (username) VALUES ('joe'; DROP TABLE
users');
> > will be evaluated so that it drops table 'users'. Actually, this should
> > evaluate to a syntax error, shouldn't it?
>
> That's right. I think toby is mistaking the classic javascript hack
> for a SQL hack.
>
> The JS hack is possible because developers rarely use strong
> validation for input fields, thus allowing JS statements into the
> database. When these are presented on webpages they can get up to all
> sorts of tricks and wheezes.
>
> I've never heard of a SQL hack based on input fields, it seems most
> unlikely but something could probably be done based on stored procs,
> the hacker would have to have intimiate knowledge of the stored procs
> and would also have to find one that would do something dangerous.
>
>
> Nic
>

Re: [GENERAL] Prepared statement performance...

From

nferrier@tapsellferrier.co.uk

Date:

14 October 2002, 06:04:38

Toby <toby@paperjet.com> writes:

> then the resultant SQL would be
>
>          INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users;');
>
> i suspect this would work.

Try it   /8->

> I don't see how the above is a classic javascript hack, since there's no
> javascript. i've seen on production code places where strings taken from
> form fields are stored in cookies and session variables and subsequently
> written directly to the database, as shown above.

I didn't mean that the code you used was a javascript hack, but the
worry about unvalidated input fields is related to that. Javascript is
often used for hacking in unvalidated column values passed in and out
of databases.

> i've been to sites where this is possible and it also does not require
> intimate knowledge of the target database. a bit of messing around will
> often cause errors to be thrown and, unless the target webserver is
> configured appropriately (IIS is good for this), it is often possible to
> see the detailed error message...whihc itself can show hints of the backend
> structure.
>
> by way of example, try going to
>
>          http://www.westmisnter.ac.uk/
>
> and in one of the search boxes enter the following
>
>          ';select * from msdb..sysjobs;
>
> Now then, if someone spent a few minutes working on this, I'm sure it would
> be possible to drop a table or 2 or, at the very least, trash a load of
> data. the same will be possible on a postgres backend.
>
> course, what this has to do with performance I don't know.

Presumably, The original discourse was about whether you could put
these sorts of strings into PS bind variables. I don't believe you can
since bind variables are part of the syntax tree of a legal SQL
statement.

If you had:

   input=";select * from somedb;"
   PreparedStatement ps
        = con.prepareStatement("insert into users"
                               + " value ( ? );");
   ps.setString(1, input);

You're not going to get anything legal.

However, I agree that SQL hacks are possible when you're combining
unvalidated strings from the frontend. This is because the query
parser has a chance to be terminated by the ";" at the start of the
input expression (when it's embedded in another query).

A select example would be:

  Statement st = con.createStatement();
  ResultSet rs = st.executeQuery("select * from tab "
                                + where id = " + input + ";");

The above insert, when done with combination, would also work:

   input=";select * from somedb;"
   Statement st = con.createStatement();
   st.execute("insert into users value ( ? );");

However, this is different from a PS because a PS has already parsed
the string, therefore a certain amount of lexical protection is
offered.

I don't know if PostgreSQL works this way yet: I suspect that it does
because the PS facilities are being offered by the backend.

Note 1: PS's do not protect you from javascript hacks.

Note 2: one good reason to have PS's is that they offer this lexical
protection, using combination is quick, but unsafe.

Note 3: Oracle does not allow querys to be ended with ";" in JDBC
statements, this presumably stops a lot of the rot. It would be nice
if PostgreSQL JDBC had an Oracle compatibility mode for Oracle querys:
I find the biggest pain in porting apps between the two is this query
ending nonsense.

Nic

Re: [GENERAL] Prepared statement performance...

From

nferrier@tapsellferrier.co.uk

Date:

14 October 2002, 06:08:55

"Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes:

> Thank you. So I think herewith we made pointless the original arguement Curt
> Sampson made in his mail in support Barry's twisted interpretation of
> PreparedStatement.
>
> Peter
>
> PS:
> When I am travelling in Budapest (capital of Hungary, my country) on the
> bus, and some exhausted poor elderly tries to sheepishly argument with some
> youngsters occupying recklessly the last seat in the bus leaving the elderly
> standing, I always tempted to cynically tell the upset elderly: "This public
> transport my lady. If you wish first class service, you should take a taxi."
> Of course, I never say this, because it would be a sign of very bad taste.
> But I often giving *myself* this kind of rebuff when I am surfing on open
> source mailing lists and find people saying this and that without giving a
> shit to think about it first or --more importantly-- to think about the
> context they're making their remarks: "My dear, this is public transport,
> worth the money you pay for it."
>
> The above was meant to be humorous.

It's funny: but I don't believe you're right. I belive the correct
thing to say is: "well, you could always fix it to be the way you
want."

Unfortunately, the same is not true for old ladies on trams. Even if
the old lady has welding skills and the materials necessary to bolt on
extra seats, it's still too difficult to do whilst the tram is in
motion.


Nic

Re: [GENERAL] Prepared statement performance...

From

"Peter Kovacs"

Date:

14 October 2002, 08:04:26

> Unfortunately, the same is not true for old ladies on trams. Even if
> the old lady has welding skills and the materials necessary to bolt on
> extra seats, it's still too difficult to do whilst the tram is in
> motion.
>
>
> Nic

I really like this one. :)))))

Peter

Re: [GENERAL] Prepared statement performance...

From

Tom Lane

Date:

14 October 2002, 10:11:09

"Peter Kovacs" <peter.kovacs@sysdata.siemens.hu> writes:
> Thank you for your explanation. But I still do not see how
>> INSERT INTO Users (username) VALUES ('joe'; DROP TABLE users');
> will be evaluated so that it drops table 'users'. Actually, this should
> evaluate to a syntax error, shouldn't it?

The given example was sloppy, but that doesn't mean that there is no
security risk here.  Assuming that the webscript will execute

    INSERT INTO Users (username) VALUES ('$1');

(where $1 means the raw string supplied by the form user), consider
input like

    '); DROP TABLE users --

This will result in the backend seeing

    INSERT INTO Users (username) VALUES (''); DROP TABLE users --');

which is 100% syntactically okay.

So you really need to double or escape quotes and backslashes in
user-supplied strings, or you have a security problem.  Nic is correct
to note that this is not specific to Javascript; it is a problem for any
database frontend no matter what it's written in.

            regards, tom lane

Re: [GENERAL] Prepared statement performance...

From

Curt Sampson

Date:

16 October 2002, 02:06:37

On Mon, 14 Oct 2002, Peter Kovacs wrote:

> I do not clearly understand what the problem is with someone typing in
> "foo'; DROP TABLE bar;" into the "Name" field on your web form.

If you do just a dumb string concatination, it can insert arbitrary
commands into your conversation with the postgres database. E.g.,
your code does this to construct the query:

    "SELECT id FROM users WHERE name = '" + name + '"'"

and name, taken directly from the input field, is

    foo'; DROP TABLE users; SELECT 'foo

Your query ends up being:

    SELECT id FROM users WHERE name = 'foo'; DROP TABLE users; SELECT 'foo';

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC