Re: A JDBC bug or problem relating to string length in Java - Mailing list pgsql-jdbc

From joe user
Subject Re: A JDBC bug or problem relating to string length in Java
Date
Msg-id 20030902013024.23669.qmail@web20421.mail.yahoo.com
In response to Re: A JDBC bug or problem relating to string length in Java  (Kris Jurka <books@ejurka.com>)
Responses Re: A JDBC bug or problem relating to string length in Java
Re: A JDBC bug or problem relating to string length in Java
List pgsql-jdbc
--- Kris Jurka <books@ejurka.com> wrote:
> Actually varchar(N) in postgresql means the number
> of characters, not the
> number of bytes, so you should not have to worry
> about how it is actually
> encoded.

I still think there is something else going on here.
I definitely take all of my input through a truncate
method which truncates strings to 100 chars, and I
definitely get a "value too long for type character
varying(100)" error every once in a while.  The logs
show the input string to be some kind of multibyte
string whose encoding I don't know.  This is a log of
the "Referer" header in HTTP requests.  There is no
specification of the encoding of strings in HTTP
headers, so these strings could be anything.  I have
tried taking those strings out of the logs and using
them to trigger the error again, but so far I have not
been able to reproduce it.

Could it be that Java takes this input in binary form
from the net and constructs a UTF-16 String from it
(using some incorrect default encoding guess), so the
String is not a valid encoding of anything?  I take
the length of that String, and then PG transforms the
invalid UTF-16 into its best effort at UTF-8, which
may have a different number of chars, because what
Java thinks is a single UTF-16 char is converted
(incorrectly) into two UTF-8 chars.

I think this is what is happening, and I'm not sure
how to handle this.
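
To make the question concrete, here is a minimal sketch of how
the same input can have several different "lengths" in Java.
The class and method names are made up for illustration, and
the sample strings are not taken from my logs.  Which of these
counts the server compares against varchar(100) depends on the
database encoding, which I have not checked, so treat this as a
way to measure the candidates and not as a diagnosis.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HeaderLengthDemo {
    /** UTF-16 code units Java reports after decoding raw bytes with the given charset. */
    public static int charsWhenDecodedAs(byte[] raw, Charset charset) {
        return new String(raw, charset).length();
    }

    /** Unicode code points in the string; can be smaller than String.length(). */
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Bytes the string occupies once encoded as UTF-8. */
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Truncates to at most max code points without splitting a surrogate pair. */
    public static String truncateCodePoints(String s, int max) {
        if (codePoints(s) <= max) {
            return s;
        }
        return s.substring(0, s.offsetByCodePoints(0, max));
    }

    public static void main(String[] args) {
        // Three e-acute characters sent by a client as UTF-8: six bytes on the wire.
        byte[] raw = "\u00e9\u00e9\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(charsWhenDecodedAs(raw, StandardCharsets.UTF_8));      // 3
        System.out.println(charsWhenDecodedAs(raw, StandardCharsets.ISO_8859_1)); // 6

        // 'a', U+1F600 (a surrogate pair in UTF-16), 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());    // 4 UTF-16 code units
        System.out.println(codePoints(s)); // 3 code points
        System.out.println(utf8Bytes(s));  // 6 bytes (1 + 4 + 1)
        System.out.println(truncateCodePoints(s, 2).length()); // 3: 'a' plus the pair
    }
}
```

So a truncate method based on String.length() limits UTF-16
code units only; the code point count and the UTF-8 byte count
of the same string can both be different numbers.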

The fundamental problem is that the "get request
header" method of the Servlet API (getHeader) returns
a String when it should be returning a byte[], because
there is NO way of knowing which encoding the client
thinks it is using.

Btw, this, and the null-byte problem, could probably
cause various low-bandwidth DoS attacks against any
site that uses PG/JDBC.  Imagine a typical JDBC use
like this:

    try {
        [ .... ]
        preparedStatement.setString(...);
        db.close();
    }
    catch(SQLException sqe) { [log it...] }

If enough requests with these multibyte or null-byte
problems are thrown at the app, each one throws an
exception in the try block before it can get to the
db.close() statement, quickly exhausting connection
resources.  This is in fact happening on our web
application right now.  It seems that it would be
possible to bring down a service with at most a few
hundred requests like this.
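
For comparison, here is a sketch of the same pattern with the
close moved out of the normal path, so that it runs even when
setString throws.  FakeConnection is a made-up stand-in so the
example runs without a database; it is not a JDBC class.  On a
JVM with try-with-resources the same shape should work with a
real java.sql.Connection, which implements AutoCloseable; on
older JVMs a finally block that calls close() does the same job.

```java
import java.sql.SQLException;

public class CloseOnFailureDemo {
    /** Hypothetical stand-in for a JDBC connection; tracks how many are still open. */
    static final class FakeConnection implements AutoCloseable {
        private static int open = 0;

        FakeConnection() { open++; }

        /** Rejects null bytes, mimicking a driver or server error on bad input. */
        void setString(String value) throws SQLException {
            if (value.indexOf('\0') >= 0) {
                throw new SQLException("invalid input");
            }
        }

        @Override
        public void close() { open--; }
    }

    /** Number of fake connections that have been opened but not closed. */
    public static int openConnections() {
        return FakeConnection.open;
    }

    /** Returns true on success; the connection is released on every path. */
    public static boolean insert(String value) {
        try (FakeConnection db = new FakeConnection()) {
            db.setString(value);
            return true;
        } catch (SQLException sqe) {
            // Log it. By the time this runs, db.close() has already been called.
            return false;
        }
    }

    public static void main(String[] args) {
        // A few hundred bad requests, as in the scenario described above.
        for (int i = 0; i < 300; i++) {
            insert("bad\0input");
        }
        System.out.println(openConnections()); // 0
    }
}
```

This only stops the leak; the bad input still produces an
exception per request and still needs to be handled.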

Any ideas?


