Thread: bytea, jdbc, i/o ...

bytea, jdbc, i/o ...

From
Kenneth Been
Date:
I am seeing some strange behavior, and I have a guess for what is
causing it.  Maybe someone will know for sure.

Here is the situation.  Database A contains mostly data of type
integer[], text, and boolean[].  Database B contains mostly data
of type bytea.

I wrote a program to send a bunch of select queries and time the
results.  This test program is in Java, and connects using JDBC.
  The data types in A are retrieved with the
ResultSet.getString() method.  That includes the arrays, which I
then parse myself.  The bytea fields in B are retrieved with the
ResultSet.getBytes() method, which returns byte[].

I ran this test program on the same machine that has the database
engine, and on a different machine on the local network.

Here is the strange behavior:

For database A, the difference in performance between testing
locally and testing over the network is negligible; about 5%
slower over the network.  About what I would expect.  But for
database B, the difference is huge; about 35 times slower over
the network.

Here is my guess for what is going on:

Maybe for a local connection the db engine delivers the bytea as
is, whereas for a remote connection it converts the bytea to a
string representation, and then Java would have to convert it
back for the getBytes() method.  If the queries are usually
answered from the cache, then I would think converting to string
and back could take 35 times longer than just pulling the data
out of the cache.

For the other data types, on the other hand, maybe the engine
always converts them to string, whether it is local or not.  Or
anyway, even if it doesn't, Java would convert to string for the
getString() method.

Does that sound right?  If not, any ideas on what is going on?

Thanks.

Ken


Re: bytea, jdbc, i/o ...

From
Tom Lane
Date:
Kenneth Been <kennethb@telocity.com> writes:
> Here is the strange behavior:
> For database A, the difference in performance between testing
> locally and testing over the network is negligible; about 5%
> slower over the network.  About what I would expect.  But for
> database B, the difference is huge; about 35 times slower over
> the network.

That's odd.

> Here is my guess for what is going on:
> Maybe for a local connection the db engine delivers the bytea as
> is, whereas for a remote connection it converts the bytea to a
> string representation, and then Java would have to convert it
> back for the getBytes() method.

I don't believe that for a millisecond; there simply is not any
local/remote discrimination in the backend.

I'm not sure what the problem is, but I think you need to find a
different tree to bark up.

Some other thoughts that come to mind involve network saturation,
routing problems, DNS problems, SSL encryption overhead, multibyte
encoding conversion, yadda yadda...

            regards, tom lane

Re: bytea, jdbc, i/o ...

From
Barry Lind
Date:
Kenneth,

The representation of the data is always converted to a string format in
both the local and remote cases.  But the string representation for
bytea data can result in upto four times expansion (i.e. a 1K bytea
value can take up to 4K in it's string form).  Thus using alot of bytea
data can result in significantly more network traffic for the same
amount of data.  I don't know if this explains your findings or not.

thanks,
--Barry

Kenneth Been wrote:

> I am seeing some strange behavior, and I have a guess for what is
> causing it.  Maybe someone will know for sure.
>
> Here is the situation.  Database A contains mostly data of type
> integer[], text, and boolean[].  Database B contains mostly data of
> type bytea.
>
> I wrote a program to send a bunch of select queries and time the
> results.  This test program is in Java, and connects using JDBC.  The
> data types in A are retrieved with the ResultSet.getString() method.
> That includes the arrays, which I then parse myself.  The bytea fields
> in B are retrieved with the ResultSet.getBytes() method, which returns
> byte[].
>
> I ran this test program on the same machine that has the database
> engine, and on a different machine on the local network.
>
> Here is the strange behavior:
>
> For database A, the difference in performance between testing locally
> and testing over the network is negligible; about 5% slower over the
> network.  About what I would expect.  But for database B, the
> difference is huge; about 35 times slower over the network.
>
> Here is my guess for what is going on:
>
> Maybe for a local connection the db engine delivers the bytea as is,
> whereas for a remote connection it converts the bytea to a string
> representation, and then Java would have to convert it back for the
> getBytes() method.  If the queries are usually answered from the
> cache, then I would think converting to string and back could take 35
> times longer than just pulling the data out of the cache.
>
> For the other data types, on the other hand, maybe the engine always
> converts them to string, whether it is local or not.  Or anyway, even
> if it doesn't, Java would convert to string for the getString() method.
>
> Does that sound right?  If not, any ideas on what is going on?
>
> Thanks.
>
> Ken
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>