Thread: bytea size limit?
Hi. I have Java code that stores binary into the DB. When I use MS SQL Server 2000 (type image) it works fine. But when I use Postgres 7.4 it fails. Is there a size limit setting somewhere I need to set up? Thanks, Michael
Michael Privat wrote:
> Hi. I have Java code that stores binary into the DB. When I use MS SQL
> Server 2000 (type image) it works fine. But when I use Postgres 7.4 it
> fails. Is there a size limit setting somewhere I need to set up?

Can you provide the schema and code that is failing, and details of how it fails (with stack traces of any exceptions you see)? Otherwise it's hard to help.

-O
No, there is no size limit. However, you may have better luck with large objects.

Dave

On Sun, 2004-04-11 at 10:57, Michael Privat wrote:
> Hi. I have Java code that stores binary into the DB. When I use MS SQL
> Server 2000 (type image) it works fine. But when I use Postgres 7.4 it
> fails. Is there a size limit setting somewhere I need to set up?
>
> Thanks,
> Michael
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
Dave Cramer
519 939 0336
ICQ # 14675561
Here's the full code sample for a minimal reproduction of the error. When you run it and increase the size (through the command-line argument), you get an out-of-memory error. On my machine it happens every time once the size gets higher than about 1400000 bytes.

The DB table has two fields:

id: integer
data: bytea

import java.sql.*;
import java.io.*;
import java.util.*;

public class BlobTest {
    public static void main(String[] args) {
        Connection c = null;
        try {
            Class.forName("org.postgresql.Driver");
            String url = "jdbc:postgresql://myserver/mydb";
            c = DriverManager.getConnection(url, "myuser", "mypass");

            String sql = "INSERT INTO blobtest (id, data) VALUES(?,?)";

            int size = Integer.parseInt(args[0]);
            byte[] data = new byte[size];
            int id = Math.abs(new Random().nextInt());

            PreparedStatement stmt = c.prepareStatement(sql);
            stmt.setInt(1, id);
            stmt.setBinaryStream(2, new ByteArrayInputStream(data), data.length);
            stmt.executeUpdate();
            stmt.close();
        }
        catch(Throwable t) {
            t.printStackTrace();
        }
        finally {
            try { c.close(); } catch(Exception e) {}
        }
    }
}

Sunday, April 11, 2004, 9:42:16 PM, you wrote:

DC> No, there is no size limit. However you may have better luck with
DC> largeobjects.
DC> Dave
Michael,

That doesn't surprise me, as the postgresql driver currently buffers that internally, so you end up with two buffers of 1400000 bytes. Have a look through the archives for out-of-memory errors.

Dave

On Sun, 2004-04-11 at 21:48, Michael Privat wrote:
> Here's the full code sample for a minimal reproduction of the error.
> When you run it and increase the size (through the command-line
> argument), you get an out-of-memory error. On my machine it happens
> every time once the size gets higher than about 1400000 bytes.
>
> [code snipped]

--
Dave Cramer
519 939 0336
ICQ # 14675561
Dave Cramer wrote:
> Michael,
>
> that doesn't surprise me, as the postgresql driver currently buffers
> that internally, so you end up with two buffers of 1400000 bytes, have a
> look through the archives for out of memory errors.

It's worse than that, Jim..

From memory, we actually end up with a char[] of size 3*array length on average (assuming 50% of the data needs escaping to a '\\nnn' form), i.e. we need about 6 times the array's size in temporary storage (8.4mb in this case).

>> byte[] data = new byte[size];
>>
>> int id = Math.abs(new Random().nextInt());
>>
>> PreparedStatement stmt = c.prepareStatement(sql);
>> stmt.setInt(1, id);
>> stmt.setBinaryStream(2, new ByteArrayInputStream(data), data.length);

setBinaryStream ends up allocating a new byte array and reading into it, then passing the result to setBytes. So you need a total of almost 10mb of temporary storage to insert a 1.4mb byte array. Yes, this sucks.

You'd be better off in this case (where the data is already in a byte array) to call setBytes() directly. Once we have a streaming implementation, though, calling setBinaryStream() will be better, as setBytes() will have to take an immediate copy of the array to avoid seeing later changes before the statement is executed and the data streamed.

If you can't throw memory at the problem, using LOs as Dave suggested earlier is probably a better idea. The LO manager uses 4k blocks (IIRC) when moving data to/from the backend, so the memory overhead is much lower.

-O
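[Editor's note] The escape arithmetic described above can be made concrete with a small sketch. This is a hedged, hypothetical illustration of the old bytea escape-into-SQL-literal format (printable ASCII passes through, a backslash quadruples, everything else becomes a 5-character '\\nnn' octal sequence); the class and method names are invented for the demo and are not driver code:

```java
// Sketch of the old bytea -> SQL-literal escaping that makes the
// temporary String so large. Hypothetical helper, not driver code.
public class ByteaEscapeDemo {
    // Escape one byte as it appears inside a single-quoted SQL literal:
    // printable ASCII stays as-is, backslash becomes four backslashes,
    // single quote is backslash-escaped, everything else becomes \\nnn.
    static String escape(int b) {
        if (b == '\\') return "\\\\\\\\";       // four characters in the literal
        if (b == '\'') return "\\'";
        if (b >= 32 && b <= 126) return String.valueOf((char) b);
        return String.format("\\\\%03o", b);    // five characters: \\nnn
    }

    static String escapeAll(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) sb.append(escape(b & 0xFF));
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        new java.util.Random(42).nextBytes(data);
        String escaped = escapeAll(data);
        // Roughly half of random bytes need the 5-char form, so the
        // char[] lands near 3x the input length -- about 6x in bytes,
        // since a Java char is two bytes.
        System.out.println("input bytes:   " + data.length);
        System.out.println("escaped chars: " + escaped.length());
    }
}
```

Running it on random input shows the roughly 3x character (6x byte) blow-up claimed above, before counting the extra byte[] copies made along the way.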
Mmmh, well thanks guys. Only problem with changing to LOs is that I already have data in production, so changing the DB schema will be a little complicated. I guess I could also contribute to the driver to provide a streaming implementation. Do you know where that char[] is in the code?

Sunday, April 11, 2004, 10:44:47 PM, you wrote:

OJ> It's worse than that, Jim..
OJ>
OJ> From memory, we actually end up with a char[] of size 3*array length on
OJ> average (assuming 50% of the data needs escaping to a '\\nnn' form),
OJ> i.e. we need about 6 times the array's size in temporary storage (8.4mb
OJ> in this case).
OJ>
OJ> [rest snipped]
Michael Privat wrote:
> Mmmh, well thanks guys. Only problem with changing to LOs is that I
> already have data in production so changing the DB schema will be a
> little complicated. I guess I could also contribute to the driver to
> provide a streaming implementation. Do you know where that char[] is
> in the code?

The char[] arises from the fact that the driver turns each parameter into a string representation at the time setBinaryStream etc. is called. It's this string representation that is big. Take a look at AbstractJdbc1Statement.setBytes() and the call to PGbytea.toPGString().

To fix this the Right Way involves:

- changing setBytes() and setBinaryStream() to store the parameter value for later use, not turn it into a string immediately.

- using the V3 extended query protocol (this requires a number of other driver changes, as at a minimum the driver will need to split up queries that contain multiple statements) to allow use of a separate Bind message.

- using a binary-format parameter in the Bind message to represent the bytea field, and streaming from the byte[]/InputStream to the socket directly when writing the parameter.

This is not a trivial piece of work, unfortunately.

There may be a way to do a temporary fix that provides streaming without the other work. For V2, it may be possible to stream while still using a string representation of the bytea, as V2 queries are null-terminated and we don't need to know the length in advance. I haven't looked into this in detail.

It may be possible to do this for V3 too, as we can in theory predict the length of the stringized parameter (necessary as V3 messages are preceded by a total message length field): the bytea string format is predictable, and the driver is using a predetermined encoding (unicode/utf8). It'd be ugly, though..

-O
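[Editor's note] The first step in the list above — keeping the raw parameter instead of stringizing it at setBytes() time — can be sketched like this. The class and method names are hypothetical, the "socket" is stood in for by any OutputStream, and the escaping shown is a simplified version of the bytea text form, not the driver's actual code path:

```java
import java.io.IOException;
import java.io.OutputStream;

// Sketch of deferred parameter binding: setBytes() stores the raw
// array, and escaping happens only at execute time, streamed directly
// into the connection's output stream. Hypothetical names throughout.
public class DeferredBindSketch {
    private Object[] params = new Object[10];

    public void setBytes(int index, byte[] value) {
        params[index] = value;  // no big String is built here
    }

    // At execute time, write the escaped form straight to the stream
    // instead of materializing a char[] of 3x the input length first.
    public void writeParam(int index, OutputStream out) throws IOException {
        byte[] value = (byte[]) params[index];
        for (byte b : value) {
            int ub = b & 0xFF;
            if (ub >= 32 && ub <= 126 && ub != '\\' && ub != '\'') {
                out.write(ub);  // printable ASCII passes through
            } else {
                // everything awkward becomes a \\nnn octal escape
                out.write(String.format("\\\\%03o", ub).getBytes("UTF-8"));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        DeferredBindSketch stmt = new DeferredBindSketch();
        stmt.setBytes(1, new byte[]{'a', 0, '\\'});
        java.io.ByteArrayOutputStream sink = new java.io.ByteArrayOutputStream();
        stmt.writeParam(1, sink);
        System.out.println(sink.toString("UTF-8"));
    }
}
```

The point of the design is that peak memory is just the original byte[] plus a small per-byte scratch area, rather than byte[] plus a 3x char[] plus the encoded output.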
Oliver Jowett <oliver@opencloud.com> writes:
> ... using the V3 extended query protocol (this requires a number of
> other driver changes, as at a minimum the driver will need to split up
> queries that contain multiple statements)

This is a bit off the thread topic, but: does the JDBC spec actually allow/expect multiple SQL commands in a query? Some people thought that the V3 restriction to one command per query string was a Good Thing because it helps guard against SQL-injection attacks when an application is careless about inserting untrustworthy text into a command string. If you don't have a spec requirement to cope with this, then I'd advise against adding code to support it.

			regards, tom lane
Tom Lane wrote:
> This is a bit off the thread topic, but: does the JDBC spec actually
> allow/expect multiple SQL commands in a query? [...] If you don't have
> a spec requirement to cope with this, then I'd advise against adding
> code to support it.

I don't think there's such a requirement. The JDBC spec is vague, but Sun's JDBC tutorial says:

===
This code does not end with a DBMS statement terminator, which can vary from DBMS to DBMS. For example, Oracle uses a semicolon (;) to indicate the end of a statement, and Sybase uses the word go. The driver you are using will automatically supply the appropriate statement terminator, and you will not need to include it in your JDBC code.
===

That said, executing multiple commands in a query seems quite common, and works ok with the current driver. I'd be reluctant to break existing apps by forbidding it.

Also, the JDBC spec recommends:

===
Drivers should provide access to every feature implemented by the underlying data source, including features that extend the JDBC API. When a feature is not supported, the corresponding methods throw an SQLException. The intent is for applications using the JDBC API to have access to the same feature set as native applications.
===

-O
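[Editor's note] A splitter of the kind the driver would need for the V3 one-command-per-message rule might look like the rough sketch below. It is hypothetical and deliberately naive: it only understands single-quoted literals, and real SQL would also need dollar quoting, comments, and escape handling:

```java
import java.util.ArrayList;
import java.util.List;

// Rough sketch: split a query string into individual statements at
// top-level semicolons, ignoring semicolons inside single-quoted
// literals. Not the driver's actual parser.
public class StatementSplitter {
    public static List<String> split(String sql) {
        List<String> stmts = new ArrayList<String>();
        StringBuilder cur = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < sql.length(); i++) {
            char ch = sql.charAt(i);
            if (ch == '\'') inQuote = !inQuote;  // toggle on quote chars
            if (ch == ';' && !inQuote) {
                if (cur.toString().trim().length() > 0)
                    stmts.add(cur.toString().trim());
                cur.setLength(0);                // start the next statement
            } else {
                cur.append(ch);
            }
        }
        if (cur.toString().trim().length() > 0)
            stmts.add(cur.toString().trim());
        return stmts;
    }

    public static void main(String[] args) {
        System.out.println(split("SELECT 1; INSERT INTO t VALUES ('a;b')"));
    }
}
```

Each resulting statement could then be sent as its own Parse/Bind/Execute sequence, which is exactly the extra work Tom's point about SQL injection argues against taking on without a spec requirement.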
Okay, I fixed the implementation of the BYTEA support in the driver based on your guidelines, and it is very happy now. I want to clean up a few things first, but what is the process for submitting a patch to the dev team?

Sunday, April 11, 2004, 11:14:01 PM, you wrote:

OJ> The char[] arises from the fact that the driver turns each parameter
OJ> into a string representation at the time setBinaryStream etc. is called.
OJ> It's this string representation that is big. Take a look at
OJ> AbstractJdbc1Statement.setBytes() and the call to PGbytea.toPGString().
OJ>
OJ> [rest snipped]
On Mon, 12 Apr 2004, Michael Privat wrote:
> Okay, I fixed the implementation of the BYTEA support in the driver
> based on your guidelines and it is very happy now. I want to clean
> up a few things before I do that, but what is the process for
> submitting a patch to the dev team?

Make sure you are working against the CVS version of the driver, available at http://gborg.postgresql.org/project/pgjdbc/projdisplay.php, and then produce a context diff (-c) and send it to this list.

Kris Jurka
Here you go. Coded against the CVS HEAD. Let me know if you want me to change anything. Obviously, because I have to send the size ahead, there is no way around storing the entire byte[] in memory (can't stream), but with this code I was able to load a blob of 30Mb+ with a heap size of 32Mb, versus about 1.5Mb yesterday with the same heap. So there is some improvement.

http://jupiter.incodea.com/trans/pgsql-jdbc.diff

Michael

Monday, April 12, 2004, 7:20:08 PM, you wrote:

KJ> Make sure you are working against the cvs version of the driver, available
KJ> at http://gborg.postgresql.org/project/pgjdbc/projdisplay.php and then
KJ> produce a context diff ( -c ) and send it to this list.
KJ>
KJ> Kris Jurka
Michael Privat wrote:
> Here you go. Coded against the CVS HEAD. Let me know if you want me to
> change anything. Obviously, because I have to send the size ahead,
> there is no other way but store the entire byte[] in memory (can't
> stream), but with this code, I was able to load a blob of 30Mb+ (with
> a heap size of 32Mb) against 1.5Mb yesterday with the same heap. So
> there is some improvement.
>
> http://jupiter.incodea.com/trans/pgsql-jdbc.diff

Comments:

Does this work for V2 connections? I didn't see a patch to the V2 execution path, but you've changed how setBytes() binds its argument.

The argument to setBytes() is not copied when bound; changes made to the array after the setBytes() call but before query execution will be visible. Personally I don't mind this behaviour but others disagree :) -- see earlier discussion about mutable parameters in the pgsql-jdbc archives.

Individually escaping and encoding each byte (in SendBytea and GetByteaSize) seems overly expensive. We should be able to do this in blocks at a minimum.

Why is EncodedStream necessary? Can't you just unroll that logic directly in SendBytea()?

It's probably possible to work around needing the whole array in memory (i.e. we could stream from an InputStream + length via setBinaryStream). If we escape *every* byte, we need exactly 5 bytes of UTF8 output for every input byte ('\' and '0'-'9' are all single bytes in UTF8, I believe), so we could compute the output size directly from the input length.

====

That said, this patch is a lot simpler than I expected it to be! It looks like a good solution until the driver knows how to speak V3 fluently. If you like, I'll see if I can rework your patch over the next couple of days to address the above points.

-O
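[Editor's note] The "escape every byte" idea above makes the output length trivially predictable. A minimal sketch, assuming every byte is emitted as a 5-character '\\nnn' octal sequence (each character a single byte in UTF-8, as Oliver notes); the class and method names are invented for illustration:

```java
// Sketch of the "escape every byte" trick: emit \\nnn for each input
// byte, so the stringized parameter is exactly 5 * length bytes of
// UTF-8 -- computable before writing the V3 message length field.
public class UniformEscapeSketch {
    static long predictedSize(int inputLength) {
        return 5L * inputLength;  // '\' '\' plus three octal digits, all ASCII
    }

    static String escapeAll(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 5);
        for (byte b : data)
            sb.append(String.format("\\\\%03o", b & 0xFF));
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {65, 0, (byte) 255};
        String s = escapeAll(data);
        System.out.println(s);  // uniform 5-char escapes, even for 'A'
        System.out.println(s.length() == predictedSize(data.length));
    }
}
```

The cost of the trick is output size — 5 bytes on the wire for every input byte, versus roughly 2 on average with selective escaping — traded for being able to announce the message length up front and then stream.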
Frankly, I haven't touched the V2 path, and you are right, I did not apply the patch to it. It's the first time I've taken a look at the driver, so it took me a little adaptation -- bear with me :)

With regards to individually escaping: it's just as expensive as it was before, except that it doesn't make several copies of the array like it used to (hence the memory saving). I don't think there is any performance impact at all. Basically just a memory gain.

As far as the encoding: I think in your original email you mentioned that the driver used UTF-8 (in which case there is an obvious optimization that can be made), but I couldn't find it in the driver. Everything looked like it was inheriting from the encoding scheme set in the connection.

The stream is not necessary, but I didn't like the idea of having bytea logic inside the stream, so I put it into that inner class to make it easier to move if need be (although I couldn't find a good place for it). Plus it makes it easier to help stream directly (once we figure this out).

Good point about over-escaping. This would probably resolve this whole problem of pre-determining the size. I didn't think about it (but then again, it assumes UTF-8 in all cases, and I'm not familiar enough with how the postgresql driver works to make that call).

Feel free to rework the patch as you want. I just wanted to provide a starting point, and I'm happy to help out.

Monday, April 12, 2004, 10:08:44 PM, you wrote:

OJ> Comments:
OJ>
OJ> Does this work for V2 connections? I didn't see a patch to the V2
OJ> execution path, but you've changed how setBytes() binds its argument.
OJ>
OJ> [rest snipped]
On Mon, 12 Apr 2004, Michael Privat wrote:
> As far as the encoding. I think in your original email you had
> mentioned that the driver used UTF-8 (in which case there is an
> obvious optimization that can be made), but I couldn't find it in the
> driver. Everything looked like it was inheriting from the encoding
> scheme set in the connection.

The driver's internal encoding (the client_encoding GUC) is unicode for all 7.3+ databases, but not necessarily for 7.2 and before. The 7.3 release was the first that came with multibyte support compiled by default, and before that time the charSet URL parameter was relied upon to determine the correct database encoding.

Kris Jurka
Michael Privat wrote:
> With regards to individually escaping, it's just as expensive as how
> it was before, except that it doesn't make several copies of the array
> like it used to (hence the memory saving). I don't think there is any
> performance impact at all. Basically just a memory gain.

Doesn't it end up creating an individual String (and backing char[]) for each source byte? While these objects are very short-lived, they are extra garbage and will cause more frequent GCs.

> As far as the encoding. I think in your original email you had
> mentioned that the driver used UTF-8 (in which case there is an
> obvious optimization that can be made), but I couldn't find it in the
> driver. Everything looked like it was inheriting from the encoding
> scheme set in the connection.

Everything does indeed use the encoding object on the connection. The trick is that if you look at the V3 connection setup code, the encoding is always set to UNICODE, which maps to Java's UTF8.

V2 connections can have a variety of encodings. The driver sets the encoding to UNICODE for server versions >= 7.3, but uses the database's encoding for earlier versions. But we don't need to know the encoded length in advance when talking V2 anyway..

-O
I don't think it will create more frequent GCs (the VM GC works by allocated size, not object count). Either way, it's a moot point if we assume UTF-8. In that case, it is possible to forecast the size of that byte array in advance, and we don't need to mess around with encoding just for sizing. Also, the SendBytea() method can be optimized to encode a lot faster and more efficiently, directly at the byte level instead of using the Encoder, pushing the data directly into the stream.

But anyway, I didn't mean to start a whole debate on this. Just thought I'd share the code. The driver does handle a lot bigger binary objects than before, and we all agree that, with the UTF-8 assumption, there are good optimizations that can be made that had not been noticed before.

Monday, April 12, 2004, 10:49:18 PM, you wrote:

OJ> Doesn't it end up creating an individual String (and backing char[]) for
OJ> each source byte? While these objects are very short-lived, they are
OJ> extra garbage and will cause more frequent GCs.
OJ>
OJ> [rest snipped]