Re: new String(byte[]) performance - Mailing list pgsql-jdbc
From | Teofilis Martisius |
---|---|
Subject | Re: new String(byte[]) performance |
Date | |
Msg-id | 20021022171924.GA641@teohome.lzua.lt |
In response to | Re: new String(byte[]) performance (Barry Lind <blind@xythos.com>) |
List | pgsql-jdbc |
On Mon, Oct 21, 2002 at 07:38:07PM -0700, Barry Lind wrote:
> Teofilis,
>
> I don't think the problem you are seeing is as a result of using java.
> It is more the result of the architecture of the jdbc driver. I know
> other followups to this email have suggested fixes at the IO level, and
> while I think that may be interesting to look into, I think there is a
> lot that can be done to improve performance within the existing code
> that can work on all jdks (1.1, 1.2, 1.3 and 1.4).

Ok, I took a look at the 1.4 java.nio, but I also don't like tying the JDBC driver to 1.4, because I still use 1.3 in production myself; I had stability problems with 1.4. And I'm not sure how much java.nio would help. Heh, I wish Java had #ifdefs. Anyway, I'm not doing 1.4 stuff until 1.4 is more widespread. Or at least the 1.4 'features' should be optional: separate class files or something.

>
> If you look at what is happening in the driver when you do something as
> simple as 'select 1', you can see many areas of improvement.

Testing simple selects, hmm, it doesn't really matter. I can measure how much time is spent in the JDBC driver, and I don't think postgres server/query delays really distort the results I get. Besides, I compare with 'psql' performance for the same queries.

>
> The first thing that the driver does is allocate byte[] ...
> However the byte[] objects are only the first problem....
>
> Now using object pools can help the allocation of byte[] objects, but
> doesn't help with String objects. However if the driver started using
> char[] objects internally instead of Strings, these could be pooled as
> well. But this would probably mean that code like
> Integer.parseInt(String) would need to be reimplemented in the driver
> since there is no corresponding Integer.parseInt(char[]).

Hmm, I know how all this works. I read the JDBC driver code. However, I did not find a much better solution. First, when transferring data from the stream, the only logical solution is to put it into byte[].
And as far as I understand, byte[] arrays are already pooled. I doubt it is possible/better to read anything other than byte[] from the stream.

About converting char[] -> everything: well, new String(char[]) is really cheap, but it does COPY the char[] array. String is in fact just a wrapper for char[]. It uses System.arraycopy() AFAIK. There is a String constructor taking a char[] that doesn't copy the array, but it is package-private to java.lang. Too bad there isn't an Integer.parseInt(char[]).

There are 2 ways I think performance can be improved. One is to strongly type the received data into the field type. For example, for integer fields receive byte[], then convert it to java.lang.Integer at once, and store it in memory as java.lang.Integer. But this removes quite a lot of flexibility, i.e. doing resultset.getString() on an integer field, or even resultset.getLong() on an integer field, would cause a ClassCastException. So I don't think this is a good solution. Well, more precisely, it would be quite hard to make this solution flexible enough. It could, for example, return the default object for getObject() and getInt(), obj.toString() for getString(), and convert the object to another object via String when getSomethingElse() is called on the resultset. Hmm, and the conversion would still be done via String (are there other ways?), so temporary String allocation would still be a problem... What do you think?

The second solution is to store the received data as Strings. I don't exactly know how much better it would be. It would make the temporary String allocation permanent, and one byte[] array for receiving data would be enough, i.e. no more byte[] allocation bottlenecks. But I don't think it would make very much difference in the end.

Hmm, so things that can be done:
1. Strong typing after receive, store data as specific objects.
2. Converting into Strings after receive, store data as Strings.
3. Maybe receive all the data into a single big byte[] array? Or at least an entire row into a single big byte[] array?
Less trouble for the garbage collector/pool? Is it possible?
4. Examine the possibility of receiving Strings or char[] arrays directly from the stream. Maybe using java.nio for that.

I read the following message from Aaron. If the backend sends everything except binary cursors as strings, then receiving it as strings seems a logical solution :/

Ok, I could look at these things when I have time. Tell me which solution you prefer, or what I should work on first. And one more thing: I think SQL queries are a bottleneck in many applications, so every millisecond saved in the JDBC driver counts.

Teofilis Martisius
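
To illustrate the point about the missing Integer.parseInt(char[]) and the idea of keeping a whole row in one byte[] array, here is a minimal sketch of parsing an int straight from a slice of the receive buffer, with no temporary String. It assumes the field arrives as plain ASCII decimal text. The class name ByteParse and its method are hypothetical and not part of the driver.

```java
// Hypothetical helper for illustration only; not part of the PostgreSQL JDBC driver.
final class ByteParse {
    private ByteParse() {}

    /**
     * Parses a signed decimal int from the ASCII bytes in
     * buf[offset .. offset + length), without allocating a String.
     */
    static int parseInt(byte[] buf, int offset, int length) {
        if (buf == null || length <= 0 || offset < 0 || offset > buf.length - length) {
            throw new NumberFormatException("empty or out-of-range slice");
        }
        int i = offset;
        int end = offset + length;
        boolean negative = false;
        if (buf[i] == '-') {
            negative = true;
            i++;
        } else if (buf[i] == '+') {
            i++;
        }
        if (i == end) {
            throw new NumberFormatException("sign without digits");
        }

        // Accumulate as a negative value so that Integer.MIN_VALUE fits.
        int limit = negative ? Integer.MIN_VALUE : -Integer.MAX_VALUE;
        int multMin = limit / 10;
        int result = 0;
        for (; i < end; i++) {
            int digit = buf[i] - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            if (result < multMin) {
                throw new NumberFormatException("int overflow");
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("int overflow");
            }
            result -= digit;
        }
        return negative ? result : -result;
    }
}
```

With a row held in one buffer, a getInt()-style accessor could call ByteParse.parseInt(row, fieldStart, fieldLength) on the field's slice, and getString() could still build a String from the same bytes on demand. Whether this saves anything measurable would have to be benchmarked against the existing code.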