Thread: Re: Re: JDBC Performance

Re: Re: JDBC Performance

From
"Keith L. Musser"
Date:
I'm thinking caching byte arrays on a per-connection basis is the way to
go.

However, how much difference do you expect this to make?  How many byte
arrays do you allocate and destroy per SQL statement?  And how big are
the arrays?  How much memory will they occupy per open connection?

Will this really make a big difference?

- Keith

-----Original Message-----
From: Gunnar Rønning <gunnar@candleweb.no>
To: Keith L. Musser <kmusser@idisys.com>
Cc: Gunnar Rønning <gunnar@candleweb.no>; PGSQL-General
<pgsql-general@postgresql.org>
Date: Friday, September 29, 2000 12:39 PM
Subject: Re: [GENERAL] Re: JDBC Performance


>[feel stupid replying to myself...]
>
>Gunnar Rønning <gunnar@candleweb.no> writes:
>
>> > at org.postgresql.jdbc2.ResultSet.getObject(ResultSet.java:789)
>>
>> OK, I found the problem. I will post a fixed version later today. The
>> problem was that getObject() executed Field.getSQLType(), which in turn
>> executed Connection.ExecSQL(). I modified ExecSQL to deallocate the
>> cached byte arrays on entry, because I believed it was only called by
>> Statement.execute() methods. I guess I should move the deallocation
>> into the Statement classes instead, as that is where it really belongs.
>>
>> I interpret the JDBC spec to say that only one ResultSet will be open
>> per Statement, but one Connection can have several statements with one
>> result set each.
>>
>
>This does of course imply that arrays should be cached on a
>ResultSet/Statement basis instead of on PGStream, as is done now. Does
>anybody have good suggestions on how to implement this?
>
>Approach 1:
>The cache is now only per Connection; maybe we should use a global pool
>of free byte arrays instead?
>Cons:
>This would probably mean that we need to add more synchronization to
>ensure safe access by concurrent threads, and could therefore lead to
>poorer performance and concurrency.
>
>Pros: Possibly lower memory consumption and higher performance in some
>cases (when you find free byte arrays in the global pool). If your
>application is recreating connections rather than pooling them, it would
>also benefit performance-wise from this approach.
>
>Approach 2:
>Another solution would be have the cache be per connection but
associate a
>pool of used byte arrays to each resultset/statement and deallocate
these
>on resultset.close()/statement.close().
>
>Pros: Faster for the typical web application that uses pooled connections,
>because this approach would require less synchronization.
>Cons: Higher memory consumption.
>
>Either of these two approaches would probably require some reorganization
>of how the driver works.
>
>Any other suggestions or comments ?
>
>
>Regards,
>
> Gunnar
>
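A minimal sketch of what Approach 2 above might look like. All class and method names here are invented for illustration; this is not the actual driver code, and a real change would have to hook into the existing Statement/ResultSet classes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Per-connection cache of reusable byte arrays, keyed by exact size.
class ByteArrayCache {
    private final Map<Integer, ArrayDeque<byte[]>> free = new HashMap<>();

    byte[] allocate(int size) {
        ArrayDeque<byte[]> stack = free.get(size);
        return (stack != null && !stack.isEmpty()) ? stack.pop() : new byte[size];
    }

    void release(byte[] array) {
        free.computeIfAbsent(array.length, k -> new ArrayDeque<>()).push(array);
    }
}

// Approach 2: each statement tracks the arrays it borrowed and returns
// them to the connection's cache only when the statement (or its result
// set) is closed, so a still-open ResultSet never loses its buffers.
class StatementBuffers {
    private final ByteArrayCache connectionCache;
    private final List<byte[]> borrowed = new ArrayList<>();

    StatementBuffers(ByteArrayCache connectionCache) {
        this.connectionCache = connectionCache;
    }

    byte[] allocate(int size) {
        byte[] array = connectionCache.allocate(size);
        borrowed.add(array);
        return array;
    }

    // Called from Statement.close() / ResultSet.close().
    void close() {
        for (byte[] array : borrowed) {
            connectionCache.release(array);
        }
        borrowed.clear();
    }
}
```

Since each StatementBuffers only touches its own list until close(), and the shared cache is per connection, this needs no extra cross-connection synchronization.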



Re: Re: JDBC Performance

From
Gunnar Rønning
Date:
"Keith L. Musser" <kmusser@idisys.com> writes:

> I'm thinking caching byte arrays on a per-connection basis is the way to
> go.
>
> However, how much difference do you expect this to make?  How many byte
> arrays do you allocate and destroy per SQL statement?  And how big are
> the arrays?  How much memory will they occupy per open connection?
>

The current algorithm is greedy and does not free up anything, so how many
arrays are cached depends on the size of the result set. A result set
requires one byte array for every value in every column.
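To put rough numbers on this (the figures below are invented assumptions, not measurements from the driver), with one cached byte[] per value the cache footprint scales with the largest result set seen on the connection:

```java
public class CacheFootprint {
    public static void main(String[] args) {
        // Hypothetical numbers, just to size Keith's question:
        int rows = 1000, columns = 10;  // largest result set on this connection
        int avgFieldBytes = 20;         // assumed average field length
        int arrayOverheadBytes = 16;    // rough JVM per-object overhead (assumption)

        long arrays = (long) rows * columns; // one cached byte[] per value
        long bytes = arrays * (avgFieldBytes + arrayOverheadBytes);
        System.out.println(arrays + " arrays, ~" + (bytes / 1024) + " KB per connection");
        // prints "10000 arrays, ~351 KB per connection"
    }
}
```

With a greedy cache this memory is never given back until the connection is closed, which is the trade-off against avoiding re-allocation.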

> Will this really make a big difference?

My web application improved its throughput/execution speed by 50%. I think
that is quite good, considering that JDBC is not the only bottleneck of my
application. I also saw a complete shift in where the JDBC part of the
application spent its time. Earlier the most significant part was the
allocation of byte arrays; in the new implementation this part is reduced
dramatically, and the new bottlenecks are byte-to-char conversions (done
when you retrieve values from the result set) and reading data from the
database. I don't think the reading can be made much faster; maybe cursored
results could help in situations where you don't actually need the entire
result set. But cursors might also add overhead for other queries, and I
know too little about cursors in Postgres yet to make any qualified
statement on that.
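The byte-to-char conversion cost can be pictured with a simplified stand-in (this is not the driver's code; the real conversion also has to respect the backend's character encoding):

```java
import java.nio.charset.StandardCharsets;

public class FieldConversion {
    // Simplified stand-in for what a getString()-style retrieval has to
    // do for each field: decode the raw backend bytes into a Java String.
    // Done once per value per retrieval, this adds up on large result sets.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = {'4', '2'};
        System.out.println(decode(raw)); // prints "42"
    }
}
```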

Regards,

    Gunnar

Re: Re: JDBC Performance

From
Peter Mount
Date:
On Fri, 29 Sep 2000, Keith L. Musser wrote:

> I'm thinking caching byte arrays on a per-connection basis is the way to
> go.
>
> However, how much difference do you expect this to make?  How many byte
> arrays do you allocate and destroy per SQL statement?  And how big are
> the arrays?  How much memory will they occupy per open connection?
>
> Will this really make a big difference?

It should. Everything that passes between JDBC and the backend is converted
into byte[] arrays, so the allocation does occur, and occurs often.

Peter

[snip]

--
Peter T Mount peter@retep.org.uk http://www.retep.org.uk
PostgreSQL JDBC Driver http://www.retep.org.uk/postgres/
Java PDF Generator http://www.retep.org.uk/pdf/