Thread: Re: Strings with null characters produce exceptions when selected or inserted. Attempts to select messages with null bytes produces "ERROR: insufficient data left in message". And inserting produces "ERROR: invalid byte sequence for encoding \"UTF8\":

Apologies for top post using a blackberry. The binary mode should be able to cope fine as it passes the length before
sending the data. Should be straightforward to add strings to patch.
 

John
------Original Message------
From: Craig Ringer
Sender: pgsql-jdbc-owner@postgresql.org
To: Kris Jurka
Cc: user2037@ymail.com
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] Strings with null characters produce exceptions when selected or inserted. Attempts to select
messages with null bytes produces "ERROR: insufficient data left in message". And inserting produces "ERROR: invalid byte
sequence for encoding \"UTF8\": 0x00".  Since a null character is a valid UTF code point why is it rejected by the JDBC
driver? The attached test can work with Mysql and their JDBC driver.
 
Sent: 4 Jun 2009 01:34

Kris Jurka wrote:
> 
> 
> On Wed, 3 Jun 2009, user2037@ymail.com wrote:
> 
>> Strings with null characters produce exceptions when selected or
>> inserted. Attempts to select messages with null bytes produces "ERROR:
>> insufficient data left in message". And inserting produces "ERROR:
>> invalid byte sequence for encoding \"UTF8\": 0x00".
>>
>> Since a null character is a valid UTF code point why is it rejected by
>> the JDBC driver?
> 
> Because the server can't handle it.  The server is written in C and
> tracks all textual data as C strings which are null terminated.  It
> cannot handle intermediate null bytes, so the driver is just providing
> that message as early as possible to you.

Note that the `bytea' type _does_ store null bytes fine.

It's interesting that 0x00 is in fact valid UTF-8, since it raises the
question of whether Pg should in fact support null bytes in `text' and
`varchar' strings.

--
Craig Ringer



John Lister wrote:
> Apologies for top post using a blackberry. The binary mode should be able to cope fine as it passes the length before
> sending the data. Should be straightforward to add strings to patch.

It's not that we can't transfer strings with embedded zero bytes to the
server (we can), it's that the server does not handle the resulting
string. From memory it just truncates the string at that point. So we
disallow it in the driver rather than causing strange effects on the
server side.

So adding support for these strings to the binary patch would actually
be a regression.
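
For illustration, here is a minimal Java sketch of the kind of client-side check described above. The class and method names are hypothetical and this is not the driver's actual code; truncateAtNul only mimics how C-style string handling would see such a value.

```java
public class NulCheck {
    // Returns true if the string contains an embedded NUL (U+0000).
    public static boolean containsNul(String s) {
        return s.indexOf('\0') >= 0;
    }

    // Mimics C string semantics: everything from the first NUL onward
    // is invisible to code that treats the data as a C string.
    public static String truncateAtNul(String s) {
        int i = s.indexOf('\0');
        return i < 0 ? s : s.substring(0, i);
    }

    public static void main(String[] args) {
        String value = "abc\0def";
        System.out.println(value.length());       // 7
        System.out.println(containsNul(value));   // true
        System.out.println(truncateAtNul(value)); // abc
    }
}
```

Rejecting the value up front, rather than sending it, avoids the silent loss that truncateAtNul simulates.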

-O

On Thu, 2009-06-04 at 11:58 +0000, John Lister wrote:
> Apologies for top post using a blackberry. The binary mode should be able to cope fine as it passes the length before
> sending the data. Should be straightforward to add strings to patch.

PostgreSQL itself does not consider byte sequences with embedded nulls
to be strings, as a lot of its string handling expects C strings.

Use bytea type for "strings" with embedded nulls.
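
A minimal Java sketch of that workaround on the client side, assuming the column is declared as bytea and the text is encoded as UTF-8 (the class name, method names and parameter position are hypothetical). PreparedStatement.setBytes is the standard JDBC call for binding a byte array.

```java
import java.nio.charset.StandardCharsets;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ByteaText {
    // Encode the string as UTF-8 bytes; an embedded NUL becomes a single 0x00 byte.
    public static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes read back from a bytea column (e.g. via ResultSet.getBytes).
    public static String fromBytes(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Bind the string to a bytea parameter instead of a text parameter.
    public static void bind(PreparedStatement ps, int index, String s) throws SQLException {
        ps.setBytes(index, toBytes(s));
    }

    public static void main(String[] args) {
        String value = "abc\0def";
        byte[] encoded = toBytes(value);
        System.out.println(encoded.length);                   // 7
        System.out.println(fromBytes(encoded).equals(value)); // true
    }
}
```

The trade-off is that the server then treats the value as opaque bytes, so text functions and operators do not apply to it directly.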


--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training