Re: new String(byte[]) performance - Mailing list pgsql-jdbc

From Teofilis Martisius
Subject Re: new String(byte[]) performance
Date
Msg-id 20021022171924.GA641@teohome.lzua.lt
In response to Re: new String(byte[]) performance  (Barry Lind <blind@xythos.com>)
List pgsql-jdbc
On Mon, Oct 21, 2002 at 07:38:07PM -0700, Barry Lind wrote:
> Teofilis,
>
> I don't think the problem you are seeing is as a result of using java.
> It is more the result of the architecture of the jdbc driver.  I know
> other followups to this email have suggested fixes at the IO level, and
> while I think that may be interesting to look into, I think there is a
> lot that can be done to improve performance within the existing code
> that can work on all jdks (1.1, 1.2, 1.3 and 1.4).

Ok, I took a look at the 1.4 java.nio package, but I don't like tying the
JDBC driver to 1.4 either: I still use 1.3 in production myself, because I
had stability problems with 1.4. And I'm not sure how much java.nio would
help. Heh, I wish Java had #ifdefs. Anyway, I'm not doing 1.4 stuff until 1.4
is more widespread. Or at least 1.4 'features' should be optional. Separate
class files or something.

>
> If you look at what is happening in the driver when you do something as
> simple as 'select 1', you can see many areas of improvement.

Testing simple selects, hmm, it doesn't really matter. I can measure how
much time is spent in the JDBC driver, and I don't think postgres
server/query delays really distort the results I get. Besides, I compare
with 'psql' performance for the same queries.

>
> The first thing that the driver does is allocate byte[] ...

> However the byte[] objects are only the first problem....
>
> Now using object pools can help the allocation of byte[] objects, but
> doesn't help with String objects.  However if the driver started using
> char[] objects internally instead of Strings, these could be pooled as
> well.  But this would probably mean that code like
> Integer.parseInt(String) would need to be reimplemented in the driver
> since there is no corresponding Integer.parseInt(char[]).

Hmm, I know how all this works. I read the JDBC driver code. However, I did
not find a much better solution. First, when transferring data from the
stream, the only logical solution is to put it into byte[]. And as far as I
understand, byte[] arrays are already pooled. I doubt it is
possible/better to read anything other than byte[] from the stream.

About converting char[] -> everything, well, new String(char[]) is
really cheap, but it does COPY the char[] array. String is in fact just
a wrapper for char[]. It uses System.arraycopy() AFAIK. There is a
String constructor taking a char[] that doesn't copy the array, but it is
package private for java.lang. Too bad there isn't Integer.parseInt(char[]).
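
To make the missing Integer.parseInt(char[]) concrete, here is a minimal
sketch of a digit-by-digit parser that never creates a String. The class
name CharArrayInts and the parseInt(char[], int, int) signature are
hypothetical, not part of the JDK or the driver. It assumes plain decimal
text with an optional sign.

```java
final class CharArrayInts {
    private CharArrayInts() {}

    /** Parses a decimal int from buf[offset .. offset+length) without allocating a String. */
    static int parseInt(char[] buf, int offset, int length) {
        if (buf == null || length <= 0 || offset < 0 || offset + length > buf.length) {
            throw new NumberFormatException("empty or out-of-range input");
        }
        int i = offset;
        int end = offset + length;
        boolean negative = false;
        if (buf[i] == '-' || buf[i] == '+') {
            negative = buf[i] == '-';
            i++;
            if (i == end) {
                throw new NumberFormatException("sign without digits");
            }
        }
        // Accumulate as a negative number so that Integer.MIN_VALUE fits.
        int limit = negative ? Integer.MIN_VALUE : -Integer.MAX_VALUE;
        int multMin = limit / 10;
        int result = 0;
        while (i < end) {
            int digit = buf[i++] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a decimal digit");
            }
            if (result < multMin) {
                throw new NumberFormatException("overflow");
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("overflow");
            }
            result -= digit;
        }
        return negative ? result : -result;
    }
}
```

Because it takes an offset and a length, the same char[] could be pooled
and reused across fields, which is the point of avoiding String here.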

There are 2 ways I think performance can be improved. One is to strongly
type the received data by field type. E.g. for integer fields, receive
byte[], convert it to java.lang.Integer at once, and store it in
memory as java.lang.Integer. But this removes quite a lot of
flexibility, i.e. doing resultset.getString() on an integer field, or even
resultset.getLong() on an integer field, would cause a ClassCastException.
So I don't think this is a good solution. Well, more precisely, it would
be quite hard to make this solution flexible enough. It could e.g.
return the default (natively typed) object for getObject() and getInt(),
obj.toString() for getString(), and convert the object to another type via
String when getSomethingElse() is called on the resultset. Hmm, and that
conversion would still be done via String (are there other ways?), so
temporary String allocation would still be a problem... What do you think?
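
A rough sketch of how the strongly typed variant could stay flexible,
under the assumption that each field is held in a small wrapper. The
class TypedValue is hypothetical and not taken from the driver. Numeric
getters go through java.lang.Number where possible, so only
non-numeric values fall back to a temporary String.

```java
final class TypedValue {
    // Natively typed value, e.g. Integer for an integer column; null means SQL NULL.
    private final Object value;

    TypedValue(Object value) {
        this.value = value;
    }

    Object getObject() {
        return value;
    }

    String getString() {
        return value == null ? null : value.toString();
    }

    int getInt() {
        if (value == null) {
            return 0; // ResultSet.getInt() returns 0 for SQL NULL
        }
        if (value instanceof Number) {
            // No String involved; note that a Long outside the int range is truncated here.
            return ((Number) value).intValue();
        }
        return Integer.parseInt(value.toString().trim());
    }

    long getLong() {
        if (value == null) {
            return 0L; // ResultSet.getLong() returns 0 for SQL NULL
        }
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        return Long.parseLong(value.toString().trim());
    }
}
```

With this shape, getLong() on an Integer field is a plain widening call
rather than a ClassCastException, and getString() allocates only when it
is actually requested.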

The second solution is to store received data as Strings. I don't exactly
know how much better it would be. It would make the temporary String
allocation permanent, and one byte[] array for receiving data would be
enough, i.e. no more byte[] allocation bottlenecks. But I don't think it
would make very much difference in the end.

Hmm, so things that can be done:

1. Strong typing after receive, store data as specific objects.
2. Converting into Strings after receive, store data as strings.
3. Maybe receive all the data into a single big byte[] array? Or at
least an entire row into a single big byte[] array? Less trouble for
the garbage collector/pool? Is it possible?
4. Examine the possibility of receiving Strings or char[] arrays directly
from the stream. Maybe using java.nio for that. I read the following message
from Aaron. If the backend sends everything except binary cursors as strings,
then receiving it as strings seems a logical solution :/
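
As an illustration of option 3, one possible layout is a single byte[]
per row plus an int[] of field offsets, with decoding deferred until a
getter is called. The class RowBuffer below is a hypothetical sketch, not
driver code. It assumes UTF-8 text fields, ignores SQL NULL, and uses
java.nio.charset.StandardCharsets, which needs a much newer JDK than the
ones discussed in this thread.

```java
import java.nio.charset.StandardCharsets;

final class RowBuffer {
    private final byte[] data;   // bytes of all fields of one row, back to back
    private final int[] offsets; // offsets[i] = start of field i; last entry = end of row

    RowBuffer(byte[] data, int[] offsets) {
        if (offsets.length == 0 || offsets[offsets.length - 1] > data.length) {
            throw new IllegalArgumentException("offsets do not match data");
        }
        this.data = data;
        this.offsets = offsets;
    }

    int fieldCount() {
        return offsets.length - 1;
    }

    int fieldLength(int field) {
        return offsets[field + 1] - offsets[field];
    }

    /** Decodes one field on demand; nothing is allocated for fields that are never read. */
    String getString(int field) {
        return new String(data, offsets[field], fieldLength(field), StandardCharsets.UTF_8);
    }
}
```

For a row holding the fields "42" and "hello", data would be the seven
bytes of "42hello" and offsets would be {0, 2, 7}: two objects per row
for the collector instead of one byte[] per field.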

Ok, I could look at these things when I have time. Tell me which
solution you prefer, or what I should work on first.

And one more thing: I think SQL queries are a bottleneck in many
applications, so every millisecond saved in the JDBC driver counts.

Teofilis Martisius
