Home > mailing lists
Re: [GENERAL] Prepared statement performance... - Mailing list pgsql-jdbc

From	Barry Lind
Subject	Re: [GENERAL] Prepared statement performance...
Date	September 27, 2002 14:56:56
Msg-id	3D94AA32.1050604@xythos.com Whole thread Raw
In response to	Re: [GENERAL] Prepared statement performance... (Aaron Mulder <ammulder@alumni.princeton.edu>)
List	pgsql-jdbc
Tree view
Dimtry,

Server side prepare does not map to jdbc concept of PreparedStatement
and it is important to understand how they are not the same.

Server side prepare means that you can parse and plan the statement once
and reexecute it multiple times so:
select foo from bar;
becomes
prepare <name> as select foo from bar;
execute <name>;
deallocate <name>;

This applies to all sql statements.  So server side prepared statements
can equally be used for regular JDBC Statement objects as well as JDBC
PreparedStatements.

JDBC PreparedStatements provide an interface to bind values into a sql
statement.

Server side prepare provides the ability to reduce the overhead of
parse/plan across muliple executions of a sql statement that may or may
not have bind values.

They are different even though they both have the word 'prepare' in
their names.

thanks,
--Barry



Dmitry Tkach wrote:
> Actually, I still don't see what's wrong with enabling that behaviour by
> default....
> People who do not want to use it,  can still execute queries using
> Statement instead of PreparedStatement, right?
> As far as I know, people, who use JDBC are usually fully aware about the
> differences between the two, and
> the overhead involved in preparing the query plans - if you don't want
> it prepared, why would you ask foir it?
>
> Dima
>
> Dave Cramer wrote:
>
>> Ok, I agree with having a configuration parameter which sets the
>> default, that way anyone with the where withall to use this properly can
>> do so without changing their code.
>>
>> AFAICS the parameter is statement specific,not connection specific.
>>
>>
>> Dave
>> On Fri, 2002-09-27 at 08:43, Aaron Mulder wrote:
>>
>>> Dave, Barry, et al.:
>>>
>>>> Unless of course we were to write a
>>>> sql parser in the jdbc driver so that we could predetermine which
>>>> statements should use the prepare mechanism or not.
>>>>
>>>     That's a bit of an exaggeration -- all you need to check in the
>>> JDBC driver is whether there's an unescaped ; to see if there's more
>>> than one statement, right?  It wouldn't be unreasonable to default to
>>> enabling the feature unless there's a ; in the statement (or unless
>>> there's a ; with anything but whitespace afterward, if you like) --
>>> it wouldn't even be a disaster to forget escaping and just decide
>>> based on a ; in the statement period -- few enough statements have an
>>> escaped ; that the suboptimal performance in that case won't matter.
>>>     I'm a little concerned about how this feature interacts with
>>> connection pooling as currently implemented -- it probably needs to be
>>> reset to a known state every time a connection is returned to the
>>> pool.  Which is OK for the PostgreSQL implementation (which you'll
>>> note is not
>>> supposed to be used in an app server environment), but impossible for
>>> all
>>> the app servers out there which won't take driver-specific steps when
>>> connections are returned to their generic pools.  Furthermore, with the
>>> default behavior set to "disabled", you're reducing performance (you
>>> quoted 60%?) in the application server environment, when there will
>>> never
>>> be multiple statements executed in the same call, but there will
>>> frequently be multiple identical statements executed on the same
>>> PreparedStatement, but there's no available way to reset the flag, short
>>> of writing a wrapper driver to stand between the app server and the real
>>> driver.  On the other hand, if you set the default to a global
>>> "enabled",
>>> it breaks DBVisualizer and other development tools which currently allow
>>> multiple statements by default.
>>>     IMHO, it would be better to have the feature enabled for queries
>>> where it's appropriate, and disabled where not, and let the
>>> programmer specify a global override where they absolutely must.
>>> That way DB-specific actions are only required in truly exceptional
>>> cases, and both the app servers and the tools behave properly by
>>> default.
>>>     If you absolutely object, I think we should add a property to the
>>> Driver and the ConnectionPoolDataSource so that in an app server
>>> environment you can set the property to enabled at configuration
>>> time.  But again, I would prefer smarter default behavior.
>>>
>>> Aaron
>>>
>>> On 27 Sep 2002, Dave Cramer wrote:
>>>
>>>> Peter,
>>>>
>>>> I have to agree with Barry here. Since it is possible to send two (or
>>>> more ) statements to the backend, the programmer needs some mechanism
>>>> for enabling/disabling this feature. Unless of course we were to
>>>> write a
>>>> sql parser in the jdbc driver so that we could predetermine which
>>>> statements should use the prepare mechanism or not.
>>>>
>>>> I often use prepared statements instead of simple statements even when
>>>> only doing one select. The difference being that I can use setXXX to
>>>> set
>>>> the parameters in the statment. Which is particularly important for
>>>> setDate, setTime, setTimestamp
>>>>
>>>> And you would be amazed at the unique (irregular) coding practices
>>>> which
>>>> exist out there, so having the default do something sane is a GOOD
>>>> THING.
>>>>
>>>> Dave
>>>>
>>>>
>>>> On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
>>>>
>>>>> Barry,
>>>>>
>>>>> It's great to have this functionality.
>>>>>
>>>>> However, I am surprised about the need/requirement to explicitely
>>>>> turning on
>>>>> this feature in the JDBC driver. As you explain, the reason for this
>>>>> behaviour is to support the use of multiple statements with the
>>>>> prepareStatement method. But this is supporting incorrect us of the
>>>>> JDBC
>>>>> interface, since the JDBC API says that the parameter of the
>>>>> prepareStatement method is one(!) SQL statetment. On the other
>>>>> hand, it is
>>>>> clear that the intended purpose of the prepareStatement method is
>>>>> to use
>>>>> prepared statment -- exclusively and not by default. For statements
>>>>> executed
>>>>> only once, the designers of the API made available the method
>>>>> "createStatment".
>>>>>
>>>>> So the buttom line is that you have an irregular behaviour in
>>>>> support of an
>>>>> irregular coding practice. I understand that there is "legacy" code
>>>>> that
>>>>> needs to use the prepareStatement method with multiple SQL
>>>>> statements. But
>>>>> since such practice was already incorrect at the time of its
>>>>> inception, the
>>>>> right way would be to allow a transition from bad to good by
>>>>> requiring the
>>>>> "sinners" to explicitly use some kind of
>>>>> "org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in
>>>>> their
>>>>> code, and let new coders use standard code.
>>>>>
>>>>> Of course, you're running the show, so you will do it as you deem
>>>>> right, but
>>>>> IMHO the implementation as you described it is clearly not reasonable.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Peter
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Barry Lind" <barry@xythos.com>
>>>>> To: "Dmitry Tkach" <dmitry@openratings.com>
>>>>> Cc: <pgsql-general@postgresql.org>; <pgsql-jdbc@postgresql.org>
>>>>> Sent: Thursday, September 26, 2002 10:56 PM
>>>>> Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>>>>>
>>>>>
>>>>>> Dimitry,
>>>>>>
>>>>>> I have some good news for you.  All of this functionality is in
>>>>>> 7.3 of
>>>>>> the server and the current development build of the jdbc driver.  Of
>>>>>> course 7.3 isn't quite production yet, but beta 2 should be out
>>>>>> this week.
>>>>>>
>>>>>> Your first option is what has been implemented.  In 7.3 the server
>>>>>> now
>>>>>> supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>>>>>> for passing bind variables).  The jdbc driver also now has a
>>>>>> method on
>>>>>> org.postgresql.PGStatement.setUseServerPrepare(true) to enable the
>>>>>> the
>>>>>> use of server side prepared statements (so you can cast your
>>>>>> Statement
>>>>>> object to an org.postgresql.PGStatement object and enable the
>>>>>> functionality for a specific statement).
>>>>>>
>>>>>> This hasn't yet made its way into the jdbc documentation for 7.3, but
>>>>>> this will be documented before 7.3 goes production.
>>>>>>
>>>>>> This implementation does still require that you make some changes to
>>>>>> your app in order to enable it but that is by design for the
>>>>>> following
>>>>>> reasons:
>>>>>>    1) In order to be useful you need to reuse a statement object
>>>>>> multiple times.  Most code generally will just create a statement and
>>>>>> use it only once, and since more work is being done to prepare the
>>>>>> statement that would have a negative performance impact for the
>>>>>> majority
>>>>>> of statements being executed in normal code.
>>>>>>    2) Prepare functionality will not work for all types of sql
>>>>>> that you
>>>>>> can execute through the jdbc driver.  For example you can quite
>>>>>> easily
>>>>>> do the following:  conn.prepareStatement("update foo set a = ?;
>>>>>> insert
>>>>>> into bar values (?,?,?);");
>>>>>> By issuing both sql statements in one call only one network
>>>>>> roundtrip is
>>>>>> needed to execute both statements providing what is likely a more
>>>>>> performant application.  However this doesn't work with prepare
>>>>>> because:
>>>>>> prepare foo as update foo set a = ?; insert into bar
>>>>>> values(?,?,?);   -
>>>>>> is one prepared statement and one regular statement which will cause
>>>>>> very strange things to happen.
>>>>>>
>>>>>> It is for the reasons above that the developer needs to explicitly
>>>>>> turn
>>>>>> on the use of server side prepared statements when it makes sense
>>>>>> to do
>>>>>>
>>>>> so.
>>>>>
>>>>>> thanks,
>>>>>> --Barry
>>>>>>
>>>>>> PS.  I would really appreciate it if you could test this new
>>>>>> functionality.  As far as I know I am the only one who has used
>>>>>> it.  But
>>>>>> for one of my frequently used and complex sql statements the time to
>>>>>> execute dropped by 60% using server side prepared statements.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Dmitry Tkach wrote:
>>>>>> > Hi, everybody.
>>>>>> >
>>>>>> > I am running into huge performance problems, due to JDBC not
>>>>>> being able
>>>>>> > to cache query plans :-(
>>>>>> > My java program runs the (set of about 5) identical statements with
>>>>>> > different parameters for about 30 million times...
>>>>>> > What I am talking about below has to do with JDBC, but is not
>>>>>> really
>>>>>> > specific to it - it seems to me, that the backend itself
>>>>>> > could be improved to better handle this kind of scenario, when
>>>>>> the same
>>>>>> > statement is executed many times in the same session.
>>>>>> >
>>>>>> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>>>>>>
>>>>> that
>>>>>
>>>>>> > time is spent by the query planner,
>>>>>> > creating and recreating the query plans every time I ran the damn
>>>>>> > statement....
>>>>>> >
>>>>>> > I am looking into implementing some kind of a solution, that
>>>>>> would let
>>>>>> > me work around that problem...
>>>>>> > So far, I only see two possibilities:
>>>>>> >
>>>>>> > - a general solution, that would involve extending postgres SQL
>>>>>> gramma
>>>>>> > to include a 'prepare' statement
>>>>>> > - or am ugly work around, that would involve moving all my sql
>>>>>> > statements into stored procedures, and have
>>>>>> >   those cache the query plans inside...
>>>>>> >
>>>>>> > The second solution is not only ugly (because it requires the
>>>>>> > application code to be changed and to have a specialized
>>>>>> > stored procedure for every query), but also requires some
>>>>>> additional
>>>>>> > hacks (to overcome the hard limit on the number of
>>>>>> > function arguments and the inability for functions to return
>>>>>> tuples) -
>>>>>> > the only way I imagine this can be made to work is
>>>>>> > to glue all the arguments together into a text string, and have the
>>>>>> > stored procedure parse it back, execute the query, then
>>>>>> > glue the resulting tuple(s) into another text string, return it,
>>>>>> and
>>>>>> > have the application (or, perhaps, JDBC layer) parse it back
>>>>>> > into columns...
>>>>>> >
>>>>>> > I was wonderring if anybody has any better ideas how this can be
>>>>>> made
>>>>>>
>>>>> to
>>>>>
>>>>>> > work (I am looking for a solution that would
>>>>>> > minimize changes to the existing JDBC applications that use
>>>>>> > PreparedStatements)?
>>>>>> >
>>>>>> > If the maintainers of the involved code are interested, I would be
>>>>>> > willing to implement and contribute the solution we come up with
>>>>>> ...
>>>>>> > (I figure, nobody would really be interested in getting that second
>>>>>> > solution I mentioned into the mainstream :-), but, if we are
>>>>>> > able to come up with something more general and less ugly,
>>>>>> perhaps, I
>>>>>>
>>>>> am
>>>>>
>>>>>> > not the only one who would be able to contribute from
>>>>>> > it)...
>>>>>> >
>>>>>> > For example, I believe, it should not be too complicated to
>>>>>> implement
>>>>>> > that first possibility I described above...
>>>>>> > The way I see it would involve adding two statements to postgres
>>>>>> SQL
>>>>>> > syntax:
>>>>>> >
>>>>>> > prepare <name> as <sql statement>
>>>>>> > and
>>>>>> > execute <name> with (<parameter list>)
>>>>>> >
>>>>>> > For example:
>>>>>> >
>>>>>> > prepare mystatement as select * from mytable where id = $1 and name
>>>>>>
>>>>> like
>>>>>
>>>>>> > $2;
>>>>>> >
>>>>>> > and then
>>>>>> >
>>>>>> > execute mystatement with (1, 'Dima');
>>>>>> > execute mystatement with (2, 'John');
>>>>>> >
>>>>>> > etc....
>>>>>> >
>>>>>> > The JDBC driver would then send the 'prepare' command to the
>>>>>> backend in
>>>>>> > Connection.prepareStatement (), and
>>>>>> > use the 'execute' in PreparedStatement.execute ();
>>>>>> >
>>>>>> > One potential problem with implementation I see here is that the
>>>>>> query
>>>>>> > planner wants to know the argument types ahead of time...
>>>>>> > I guess, I could get around that by making all the arguments
>>>>>> 'text',
>>>>>>
>>>>> and
>>>>>
>>>>>> > having them casted into the right types when the
>>>>>> > statement is actually executed.
>>>>>> >
>>>>>> > There is, probably a need to also have some kind of a 'close'
>>>>>> command
>>>>>>
>>>>> to
>>>>>
>>>>>> > throw away the prepared query plans... or we could just
>>>>>> > make them last forever until, say, the end of transaction (or,
>>>>>> perhaps,
>>>>>> > the end of the session?)...
>>>>>> >
>>>>>> > If there is anyone interested in discussing various
>>>>>> possibilities, and
>>>>>> > getting this implemented one way or another,
>>>>>> >  I would like to hear from you!
>>>>>> >
>>>>>> > Thanks!
>>>>>> >
>>>>>> > Dima.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > ---------------------------(end of
>>>>>>
>>>>> broadcast)---------------------------
>>>>>
>>>>>> > TIP 2: you can get off all lists at once with the unregister
>>>>>> command
>>>>>> >    (send "unregister YourEmailAddressHere" to
>>>>>> majordomo@postgresql.org)
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------(end of
>>>>>> broadcast)---------------------------
>>>>>> TIP 1: subscribe and unsubscribe commands go to
>>>>>> majordomo@postgresql.org
>>>>>>
>>>>>
>>>>>
>>>>> ---------------------------(end of
>>>>> broadcast)---------------------------
>>>>> TIP 1: subscribe and unsubscribe commands go to
>>>>> majordomo@postgresql.org
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------(end of
>>>> broadcast)---------------------------
>>>> TIP 5: Have you checked our extensive FAQ?
>>>>
>>>> http://www.postgresql.org/users-lounge/docs/faq.html
>>>>
>>>
>>
>
>
>
>
pgsql-jdbc by date:
From: Barry Lind
Date: 27 September 2002, 14:51:15
Subject: Re: [GENERAL] Prepared statement performance...
From: Barry Lind
Date: 27 September 2002, 15:00:31
Subject: Re: [GENERAL] Prepared statement performance...
Re: [GENERAL] Prepared statement performance... - Mailing list pgsql-jdbc

Previous

Next