Thread: bytea, jdbc, i/o ...
I am seeing some strange behavior, and I have a guess for what is causing it. Maybe someone will know for sure. Here is the situation. Database A contains mostly data of type integer[], text, and boolean[]. Database B contains mostly data of type bytea. I wrote a program to send a bunch of select queries and time the results. This test program is in Java, and connects using JDBC. The data types in A are retrieved with the ResultSet.getString() method. That includes the arrays, which I then parse myself. The bytea fields in B are retrieved with the ResultSet.getBytes() method, which returns byte[]. I ran this test program on the same machine that has the database engine, and on a different machine on the local network. Here is the strange behavior: For database A, the difference in performance between testing locally and testing over the network is negligible; about 5% slower over the network. About what I would expect. But for database B, the difference is huge; about 35 times slower over the network. Here is my guess for what is going on: Maybe for a local connection the db engine delivers the bytea as is, whereas for a remote connection it converts the bytea to a string representation, and then Java would have to convert it back for the getBytes() method. If the queries are usually answered from the cache, then I would think converting to string and back could take 35 times longer than just pulling the data out of the cache. For the other data types, on the other hand, maybe the engine always converts them to string, whether it is local or not. Or anyway, even if it doesn't, Java would convert to string for the getString() method. Does that sound right? If not, any ideas on what is going on? Thanks. Ken
Kenneth Been <kennethb@telocity.com> writes: > Here is the strange behavior: > For database A, the difference in performance between testing > locally and testing over the network is negligible; about 5% > slower over the network. About what I would expect. But for > database B, the difference is huge; about 35 times slower over > the network. That's odd. > Here is my guess for what is going on: > Maybe for a local connection the db engine delivers the bytea as > is, whereas for a remote connection it converts the bytea to a > string representation, and then Java would have to convert it > back for the getBytes() method. I don't believe that for a millisecond; there simply is not any local/remote discrimination in the backend. I'm not sure what the problem is, but I think you need to find a different tree to bark up. Some other thoughts that come to mind involve network saturation, routing problems, DNS problems, SSL encryption overhead, multibyte encoding conversion, yadda yadda... regards, tom lane
Kenneth, The representation of the data is always converted to a string format in both the local and remote cases. But the string representation for bytea data can result in upto four times expansion (i.e. a 1K bytea value can take up to 4K in it's string form). Thus using alot of bytea data can result in significantly more network traffic for the same amount of data. I don't know if this explains your findings or not. thanks, --Barry Kenneth Been wrote: > I am seeing some strange behavior, and I have a guess for what is > causing it. Maybe someone will know for sure. > > Here is the situation. Database A contains mostly data of type > integer[], text, and boolean[]. Database B contains mostly data of > type bytea. > > I wrote a program to send a bunch of select queries and time the > results. This test program is in Java, and connects using JDBC. The > data types in A are retrieved with the ResultSet.getString() method. > That includes the arrays, which I then parse myself. The bytea fields > in B are retrieved with the ResultSet.getBytes() method, which returns > byte[]. > > I ran this test program on the same machine that has the database > engine, and on a different machine on the local network. > > Here is the strange behavior: > > For database A, the difference in performance between testing locally > and testing over the network is negligible; about 5% slower over the > network. About what I would expect. But for database B, the > difference is huge; about 35 times slower over the network. > > Here is my guess for what is going on: > > Maybe for a local connection the db engine delivers the bytea as is, > whereas for a remote connection it converts the bytea to a string > representation, and then Java would have to convert it back for the > getBytes() method. If the queries are usually answered from the > cache, then I would think converting to string and back could take 35 > times longer than just pulling the data out of the cache. > > For the other data types, on the other hand, maybe the engine always > converts them to string, whether it is local or not. Or anyway, even > if it doesn't, Java would convert to string for the getString() method. > > Does that sound right? If not, any ideas on what is going on? > > Thanks. > > Ken > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >