Re: A JDBC bug or problem relating to string length in Java - Mailing list pgsql-jdbc
From | joe user |
---|---|
Subject | Re: A JDBC bug or problem relating to string length in Java |
Date | |
Msg-id | 20030902013024.23669.qmail@web20421.mail.yahoo.com |
In response to | Re: A JDBC bug or problem relating to string length in Java (Kris Jurka <books@ejurka.com>) |
Responses |
Re: A JDBC bug or problem relating to string length in Java
(Oliver Jowett <oliver@opencloud.com>)
Re: A JDBC bug or problem relating to string length in Java (Kris Jurka <books@ejurka.com>) |
List | pgsql-jdbc |
--- Kris Jurka <books@ejurka.com> wrote:
> Actually varchar(N) in postgresql means the number of characters, not the
> number of bytes, so you should not have to worry about how it is actually
> encoded.

I still think there is something else going on here. I definitely pass all of my input through a truncate method which truncates strings to 100 chars, and I definitely get a "value too long for type character varying(100)" error every once in a while.

The logs show the input string to be some kind of multibyte string, whose encoding I don't know. This is a log of the "referrer" header in HTTP requests. There is no specification of the encoding of strings in HTTP headers, so these strings could be anything. I have tried taking those strings out of the logs and using them to make the error happen again, but I have not been able to reproduce it.

Could it be that Java is taking this input in binary form from the net and constructing UTF-16 from it (using some incorrect default encoding guess), producing something that is not a valid encoding of anything, and measuring the length of that? PG would then transform that invalid UTF-16 into its best effort at UTF-8, which may have a different number of chars, because what Java thinks is a single UTF-16 char is converted (incorrectly) into two UTF-8 chars. I think this is what is happening, and I'm not sure how to handle it.

The fundamental problem is that the "get request header" method of the Servlet API returns a String when it should be returning a byte[], because there is NO way of knowing which encoding the client thinks it is using.

Btw, this, and the null-byte problem, could probably be used for various low-bandwidth DoS attacks against any site that uses PG/JDBC. Imagine a typical JDBC use like this:

    try {
        [ .... ]
        preparedStatement.setString(...);
        db.close();
    } catch (SQLException sqe) {
        [log it...]
    }

If enough of these multi-byte or null-byte inputs are thrown at the app, each one raises an exception in the try block before execution reaches the db.close() statement, quickly exhausting connection resources. This is in fact happening on our web application right now. It seems that it would be possible to bring down a service with at most a few hundred requests like this.

Any ideas?
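To make the length question above concrete, here is a minimal sketch of how one string can have three different "lengths" in Java, and of a truncate helper that counts Unicode code points rather than UTF-16 chars. The class and method names (TruncateSketch, truncateCodePoints, utf8Length) are hypothetical and not taken from the poster's application. It does not claim to explain the error reported above, since that depends on how the header bytes were decoded and on the database encoding.

```java
import java.nio.charset.StandardCharsets;

public class TruncateSketch {

    // Truncate to at most maxCodePoints Unicode code points.
    // Unlike substring(0, n) on raw char indexes, this never cuts a
    // surrogate pair in half.
    static String truncateCodePoints(String s, int maxCodePoints) {
        if (s == null) {
            return null;
        }
        if (s.codePointCount(0, s.length()) <= maxCodePoints) {
            return s;
        }
        int end = s.offsetByCodePoints(0, maxCodePoints);
        return s.substring(0, end);
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'a', e-acute, and one supplementary character (a surrogate pair).
        String s = "a\u00e9\uD83D\uDE00";
        System.out.println("UTF-16 chars: " + s.length());
        System.out.println("code points:  " + s.codePointCount(0, s.length()));
        System.out.println("UTF-8 bytes:  " + utf8Length(s));
    }
}
```

If the database encoding is one where the server effectively counts bytes, truncating on utf8Length instead would be the conservative choice; which applies here is an open question in this thread.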
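The connection leak described above does not depend on the encoding issue: any exception thrown before db.close() skips the close. A minimal sketch of the usual remedy follows, using a hypothetical FakeConnection stand-in so that it runs without a database. With real JDBC the same pattern applies to java.sql.Connection, which implements AutoCloseable from Java 7 on; on older JVMs the equivalent is calling close() in a finally block.

```java
public class CloseSketch {

    // Counts handles that were opened and not yet closed.
    static int open = 0;

    // Hypothetical stand-in for a database connection.
    static class FakeConnection implements AutoCloseable {
        FakeConnection() {
            open++;
        }

        // Mimics the server rejecting an over-long value.
        void setString(String value) {
            if (value.length() > 100) {
                throw new IllegalStateException(
                        "value too long for type character varying(100)");
            }
        }

        @Override
        public void close() {
            open--;
        }
    }

    // The pattern from the message: close() is skipped when setString throws.
    static void leaky(String value) {
        try {
            FakeConnection db = new FakeConnection();
            db.setString(value);
            db.close();
        } catch (IllegalStateException e) {
            // log it
        }
    }

    // try-with-resources closes the connection on every exit path.
    static void safe(String value) {
        try (FakeConnection db = new FakeConnection()) {
            db.setString(value);
        } catch (IllegalStateException e) {
            // log it
        }
    }

    public static void main(String[] args) {
        char[] chars = new char[101];
        java.util.Arrays.fill(chars, 'x');
        String tooLong = new String(chars);

        leaky(tooLong);
        System.out.println("open after leaky: " + open);
        open = 0;
        safe(tooLong);
        System.out.println("open after safe:  " + open);
    }
}
```

This only stops the leak. The bad input still has to be rejected or sanitized before it reaches setString.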