Thread: JDBC driver patch for non-ASCII users

JDBC driver patch for non-ASCII users

From
sulfinu@gmail.com
Date:
Hello.

A few weeks ago I ran into a problem with JDBC authentication, namely I
wasn't able to connect to a PostgreSQL database whose name used non-ASCII
characters. I discussed the issue on the pgsql-hackers list and came up
with the attached solution, which has been used in an ISO-8859-2 environment
since then.

The purpose of this patch is to allow user names, user passwords and database
names to contain non-ASCII characters. The only restriction was to make sure
that authentications that succeeded before the patch also succeed after it is
applied.
This patch maximizes the chances for the authentication to succeed and, on top
of that, makes it possible to configure the PostgreSQL database so that
authentication always works. Without it, the behaviour of the current
JDBC driver is UNDETERMINED in a non-ASCII environment (for disbelievers:
check the javadoc for String.getBytes() and String.getBytes(String)).

During the authentication phase, the current JDBC driver uses the poor old
ASCII encoding regardless of the environment it is used in. On the other
side, postgres performs no conversion on the byte representation of the user,
password and database strings submitted by the driver. It is only afterwards
that the server takes into account the UTF-8 encoding that the client
announces.

What I propose below is for the driver to use the environment's encoding in
an attempt to match the encoding used on the server side to store the user,
password and database strings, provided that this local encoding is an ASCII
extension. If it is not, the driver uses UTF-8, thus allowing for
a setup where things work regardless of the environment (specifically,
the database cluster should be (re)created with the UTF-8 encoding).
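[Editor's note: the selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself; the method name `isAsciiExtension` mirrors the one mentioned in this thread, `AuthEncodingChooser` and `chooseAuthEncoding` are invented here, and the `java.nio.charset` API is used for brevity even though the driver of that era also targeted pre-1.4 JVMs.]

```java
import java.nio.charset.Charset;

// Minimal sketch of the proposed selection rule (not the actual patch).
public class AuthEncodingChooser {

    // Treat an encoding as an ASCII extension if every 7-bit character
    // encodes to the single identical byte.
    static boolean isAsciiExtension(Charset cs) {
        for (char c = 0; c < 128; c++) {
            byte[] enc = String.valueOf(c).getBytes(cs);
            if (enc.length != 1 || enc[0] != (byte) c) {
                return false;
            }
        }
        return true;
    }

    // Use the platform encoding when it extends ASCII; otherwise fall
    // back to UTF-8, so a UTF-8-encoded cluster always works.
    static Charset chooseAuthEncoding() {
        Charset platform = Charset.defaultCharset();
        return isAsciiExtension(platform) ? platform : Charset.forName("UTF-8");
    }
}
```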

I suspect that there are CJK encodings that are ASCII extensions; if so,
please add them in the Encoding.isAsciiExtension(String) method. I know next
to nothing about the Asian encodings.

Regards.

Attachment

Re: JDBC driver patch for non-ASCII users

From
Oliver Jowett
Date:
sulfinu@gmail.com wrote:

> Without it, the behaviour of the current
> JDBC driver is UNDETERMINED in a non-ASCII environment (for disbelievers:
> check the javadoc for String.getBytes() and String.getBytes(String)).

BTW, your patch does not actually change this, you are still using
String.getBytes(String) so if you pass a username / database name /
password that is not representable in the encoding you're using for
authentication, you lose.

We can't use CharsetEncoder in pre-1.4 environments, so I don't see a
way of getting strictly correct behaviour here.

In practice, the behaviour of getBytes(String) seems fairly predictable:
you just get "?" back for unmappable characters, which of course doesn't
work for authentication but at least it doesn't go and set your cat on
fire ;-)
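[Editor's note: the "?" behaviour Oliver describes is easy to demonstrate. The javadoc leaves the treatment of unmappable characters unspecified, so this shows observed Sun/OpenJDK behaviour, not a guarantee.]

```java
// Unmappable characters are silently replaced with '?' by
// String.getBytes(String) on Sun/OpenJDK, so authentication sends the
// wrong bytes without any error being raised.
public class UnmappableDemo {
    public static void main(String[] args) throws Exception {
        byte[] ascii = "pässword".getBytes("US-ASCII");
        System.out.println(new String(ascii, "US-ASCII")); // p?ssword
    }
}
```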

> What I proposed below is for the driver to use the environment's encoding in
> an attempt to match the encoding used at server side to store the user,
> password and database strings, provided that this local encoding is an ASCII
> extension. If it's not, the driver uses the UTF-8 encoding, thus allowing for
> a setup where things can work regardless of the environment (specifically,
> the database cluster should be (re)created with the UTF-8 encoding).

I don't particularly like this approach because it only solves the
problem for some encodings. Search the JDBC list archives and you'll see
we have had the same problem in the past reported for GBK-encoded
usernames, and I believe that encoding is not a superset of ASCII. If
we're going to mess around with charsets during authentication I'd like
to see all encodings fixed. In that particular case a new URL parameter
was suggested but I don't think it ever got into the main code. Perhaps
you can follow up on that approach?

Also, your patch appears to have a number of unnecessary changes in it
(e.g. why did you change the encoding used for the password salt, or the
results of UnixCrypt? and there's a spurious change to build.xml too..)

I would also be a lot happier if the protocol specification docs were
updated to reflect whatever the current "approved" way of doing
non-ASCII authentication info is before the driver started making
assumptions about it. (There are in fact 3rd party apps out there that
implement the server side of the protocol..)

-O

Re: JDBC driver patch for non-ASCII users

From
Tom Lane
Date:
Oliver Jowett <oliver@opencloud.com> writes:
> I would also be a lot happier if the protocol specification docs were
> updated to reflect whatever the current "approved" way of doing
> non-ASCII authentication info is before the driver started making
> assumptions about it.

That's a bit hard, because the real problem here is that there isn't any
"approved" way of dealing with this.  The short and unpleasant answer is
that if you put any non-7-bit-ASCII characters into shared system
catalogs (not only pg_authid, but pg_database and pg_tablespace),
You Are On Your Own.  When it breaks, you get to keep both pieces.

Given the current design that allows different databases in a cluster
to (claim they) have different encodings, it's real hard to see how
to handle non-ASCII data in shared catalogs sanely.  I don't think
we'll really be able to fix this properly until that mythical day
when we have support for per-column encoding selections.  My guess
is that we'd then legislate that shared catalog columns are always
UTF8; after which we could start to think about what it would take
to do conversion of the connection startup packet's contents from
client-side encoding to UTF8.

Right now it's all pretty broken, and I really question whether it's
sane to put workarounds like this proposed patch into client-side
drivers.  If you aren't consistent about the encoding you use for
non-ASCII usernames, you're going to lose somewhere along the line
anyway.  So why not just recommend that people do that?

            regards, tom lane

Re: JDBC driver patch for non-ASCII users

From
Kris Jurka
Date:

On Fri, 7 Dec 2007, Tom Lane wrote:

> Right now it's all pretty broken, and I really question whether it's
> sane to put workarounds like this proposed patch into client-side
> drivers.  If you aren't consistent about the encoding you use for
> non-ASCII usernames, you're going to lose somewhere along the line
> anyway.  So why not just recommend that people do that?
>

For the record, I'm in favor of changing our use of initial setup encoding
from SQL-ASCII to UTF-8.  While it doesn't solve the root of the problem,
it does allow people to use non-ascii user and database names if they set
them up appropriately and doesn't seem to harm anything.  The original
patch's suggested use of the client's environment encoding seems random to
me.  If we decide on UTF-8, we'll at least be able to tell people how to make
it work when they cannot control the environment their clients operate in.

Kris Jurka

Re: JDBC driver patch for non-ASCII users

From
Kris Jurka
Date:

On Fri, 7 Dec 2007, Kris Jurka wrote:

> For the record, I'm in favor of changing our use of initial setup encoding
> from SQL-ASCII to UTF-8.

Make that US-ASCII.

Kris Jurka


Re: JDBC driver patch for non-ASCII users

From
Tom Lane
Date:
Kris Jurka <books@ejurka.com> writes:
> The original patch's suggested use of the client's environment
> encoding seems random to me.

Yeah, that bothered me too --- it seemed to be introducing a whole new
variable into the behavior, without much to show for it.

            regards, tom lane

Re: JDBC driver patch for non-ASCII users

From
sulfinu@gmail.com
Date:
On Saturday 08 December 2007, Oliver Jowett wrote:
> BTW, your patch does not actually change this, you are still using
> String.getBytes(String) so if you pass a username / database name /
> password that is not representable in the encoding you're using for
> authentication, you lose.
I didn't claim to solve the undetermined behaviour; I only said I maximized
the chances for the authentication to succeed. Read again.

> If
> we're going to mess around with charsets during authentication I'd like
> to see all encodings fixed. In that particular case a new URL parameter
> was suggested but I don't think it ever got into the main code. Perhaps
> you can follow up on that approach?
Well, it CANNOT be fixed for every encoding, simply because of lack of support
in postgres. I thought about introducing a connection parameter, but the
community first has to accept my/a solution.

> Also, your patch appears to have a number of unnecessary changes in it
> (e.g. why did you change the encoding used for the password salt, or the
> results of UnixCrypt? and there's a spurious change to build.xml too..)
There's nothing spurious in my patch.
I haven't touched UnixCrypt, simply because there's nothing to be done about
it (not as a simple patch).
And the change in build.xml is a correction, check the file before posting
assumptions.

> I would also be a lot happier if the protocol specification docs were
> updated to reflect whatever the current "approved" way of doing
> non-ASCII authentication info is before the driver started making
> assumptions about it.
There's no specification simply because nothing special happens and "non-ASCII
authentication" is not even considered.

Re: JDBC driver patch for non-ASCII users

From
Oliver Jowett
Date:
sulfinu@gmail.com wrote:
> On Saturday 08 December 2007, Oliver Jowett wrote:

>> Also, your patch appears to have a number of unnecessary changes in it
>> (e.g. why did you change the encoding used for the password salt, or the
>> results of UnixCrypt? and there's a spurious change to build.xml too..)
> There's nothing spurious in my patch.
> I haven't touched UnixCrypt, simply because there's nothing to be done about
> it (not as a simple patch).

Can you explain what this change does, then?

>                          String result = UnixCrypt.crypt(salt, password);
> -                        byte[] encodedResult = result.getBytes("US-ASCII");
> +                        byte[] encodedResult = result.getBytes(Encoding.AUTHENTICATION_PHASE_ENCODING);

AFAIK UnixCrypt.crypt() should only be returning ASCII anyway, right?

Same thing re. salt decoding, doesn't the server only ever send ASCII?

> And the change in build.xml is a correction, check the file before posting
> assumptions.

If you have fixes to build.xml that are not related to your encoding
changes, please post them as a separate patch with an explanation. If
you post a number of unrelated changes as a single patch, it makes it
much harder to selectively apply them. I have no comment on whether that
correction is necessary or not -- but it's unexplained and not encoding
related, so it shouldn't be there.

-O

Re: JDBC driver patch for non-ASCII users

From
Oliver Jowett
Date:
Kris Jurka wrote:

> For the record, I'm in favor of changing our use of initial setup
> encoding from US-ASCII to UTF-8.  While it doesn't solve the root of
> the problem, it does allow people to use non-ascii user and database
> names if they set them up appropriately and doesn't seem to harm
> anything.

That does seem like the simplest approach and should cover all possible
usernames (if you can't represent it in UTF-8 you can't get it into a
String, anyway), if not all possible encodings.

> The original patch's suggested use of the client's
> environment encoding seems random to me.

Agreed. I don't like the idea of "configure the server to use UTF-8 like
this.. except if you have one of these encodings as your JVM default in
which case you need to do something else like this.. and maybe how your
cluster needs to be configured will change if you change JVM versions or
environment or driver version".

-O

Re: JDBC driver patch for non-ASCII users

From
Kris Jurka
Date:

On Sat, 8 Dec 2007, sulfinu@gmail.com wrote:

> And the change in build.xml is a correction, check the file before posting
> assumptions.
>

For the record, I made the same change to build.xml in CVS a while
back:

http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/build.xml.diff?r1=1.75&r2=1.76

Kris Jurka

Re: JDBC driver patch for non-ASCII users

From
sulfinu@gmail.com
Date:
On Saturday 08 December 2007, Oliver Jowett wrote:

> Can you explain what this change does, then?
>
> >                          String result = UnixCrypt.crypt(salt, password);
> > -                        byte[] encodedResult = result.getBytes("US-ASCII");
> > +                        byte[] encodedResult = result.getBytes(Encoding.AUTHENTICATION_PHASE_ENCODING);
>
> AFAIK UnixCrypt.crypt() should only be returning ASCII anyway, right?
Wrong, it returns a String, in other words a series of Unicode points, not
necessarily ASCII characters.
>
> Same thing re. salt decoding, doesn't the server only ever send ASCII?
The server never bothers about any encoding during the authentication phase.
The server simply treats everything as byte arrays and compares what is
received from the client to the result of its own byte-wise computation,
based on stored user name and password.
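[Editor's note: for the md5 case, the byte-wise computation referred to above can be sketched as follows. The double-md5 scheme is the one documented for the v3 protocol; the class and method names here are purely illustrative.]

```java
import java.security.MessageDigest;

// Illustrative sketch: the server compares the client's answer to
// md5(md5(password || username) || salt), all computed over raw bytes.
// Any client/server disagreement about encodings changes the input
// bytes, and the authentication fails.
public class Md5Auth {
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }

    static String response(byte[] user, byte[] password, byte[] salt) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(password);
            md5.update(user);
            // inner digest rendered as 32 lowercase hex characters
            byte[] inner = hex(md5.digest()).getBytes("US-ASCII");
            md5.update(inner);
            md5.update(salt);
            return "md5" + hex(md5.digest());
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```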

My change is an attempt to increase the chances of a successful authentication
by trying to guess the encoding used at server-side for the byte
representation of user name, password and database name.


> If you have fixes to build.xml that are not related to your encoding
> changes, please post them as a separate patch with an explanation. If
> you post a number of unrelated changes as a single patch, it makes it
> much harder to selectively apply them. I have no comment on whether that
> correction is necessary or not -- but it's unexplained and not encoding
> related, so it shouldn't be there.
You have a point here. I knew I should have added a statement about that very
small correction in build.xml, but the message was already long enough and I
feared it might not be read patiently and in full (as it happened
anyway ;) ).
The Ant file is rather complicated and I thought the correction would be
silently understood and included. Ironically, the correction was already made
in CVS by Kris Jurka.

Re: JDBC driver patch for non-ASCII users

From
Oliver Jowett
Date:
sulfinu@gmail.com wrote:
> On Saturday 08 December 2007, Oliver Jowett wrote:
>
>> Can you explain what this change does, then?
>>
>>>                          String result = UnixCrypt.crypt(salt, password);
>>> -                        byte[] encodedResult = result.getBytes("US-ASCII");
>>> +                        byte[] encodedResult = result.getBytes(Encoding.AUTHENTICATION_PHASE_ENCODING);
>> AFAIK UnixCrypt.crypt() should only be returning ASCII anyway, right?
> Wrong, it returns a String, in other words a series of Unicode points, not
> necessarily ASCII characters.

A general String certainly might contain non-ASCII, but all Strings
returned by UnixCrypt.crypt() contain only ASCII. See the implementation
of UnixCrypt.crypt(), and in particular the UnixCrypt.cov_2char array.

So I still do not understand why the change quoted above is necessary.
Can you explain why it is necessary?

>> Same thing re. salt decoding, doesn't the server only ever send ASCII?
> The server never bothers about any encoding during the authentication phase.
> The server simply treats everything as byte arrays and compares what is
> received from the client to the result of its own byte-wise computation,
> based on stored user name and password.

The server only ever sends a subset of ASCII as crypt salt characters
(specifically, the 62 characters a-z A-Z 0-9), so US-ASCII is just fine
for decoding. See postmaster.c, RandomSalt() / RemapChar().

So I still do not understand why a change to the encoding used to decode
crypt salt characters is necessary. Can you explain why it is necessary?
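[Editor's note: the remapping Oliver refers to can be sketched like so. The real implementation is RemapChar() in postmaster.c; this Java rendering and the class name are illustrative only.]

```java
// Any byte value folds into the 62-character set [a-zA-Z0-9], so the
// salt the client receives always decodes safely as US-ASCII.
public class SaltDemo {
    static char remapChar(int value) {
        int v = ((value % 62) + 62) % 62; // fold into 0..61
        if (v < 26) return (char) ('a' + v);
        if (v < 52) return (char) ('A' + (v - 26));
        return (char) ('0' + (v - 52));
    }
}
```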

-O


Re: JDBC driver patch for non-ASCII users

From
sulfinu@gmail.com
Date:
On Saturday 08 December 2007, Tom Lane wrote:
> Given the current design that allows different databases in a cluster
> to (claim they) have different encodings, it's real hard to see how
> to handle non-ASCII data in shared catalogs sanely.  I don't think
> we'll really be able to fix this properly until that mythical day
> when we have support for per-column encoding selections.  My guess
> is that we'd then legislate that shared catalog columns are always
> UTF8; after which we could start to think about what it would take
> to do conversion of the connection startup packet's contents from
> client-side encoding to UTF8.
First of all, judging from the code I read, you'll have to adjust the wire
protocol so that the encoding is signaled at the very beginning of a
connection! The V3 protocol seems close, but not quite there. Take for example
the way that encoding information is processed by XML reader programs.

Next, there's something I already suggested on the "-hackers" mailing list.
Until the day when PostgreSQL is rewritten in a Unicode-savvy language (where
a "char" is indeed a Unicode point), I believe you should consider enforcing,
for any database cluster, a single encoding chosen from the encodings that
cover the whole Unicode set, like UTF-8, UTF-16 etc. This way, lots of
problems disappear, things get cleaner, and clients need not guess the
encoding used at server side for user name, password, database name, table
names and so on. Collation rules would finally depend solely on locale, just
as they should.

The only downside that I see is a (slight) increase in database size, but
that's not an issue nowadays. Perhaps you could offer administrators a choice
of encoding upon cluster creation that would statistically minimize the size,
depending on the most-used languages.
If you have other reasons against it, bring them to the table, but please do
not post ridiculous statements like "I'm not sure a Java char is a Unicode
point" or "I don't think that Unicode covers all languages", which I didn't
even bother to answer with the classical "RTFM!".

Support for per-column encoding selection is from my point of view a stupid
waste of developing effort and CPU time, not to mention it is a great
opportunity to introduce a myriad of bugs. You're looking at the problem from
the wrong end: it is not the encodings that must be flexibly chosen, it is
the alphabet! No user is ever going to be interested in the internal encoding
of a Postgres database file, nor should he be. But the user will always
appreciate finding in the database the same strings he put in, regardless of
his mother tongue and client program. The logical solution is to support
Unicode and disregard encodings altogether (actually, keep them under the
sheets, since they are a result of historical limitations).

On Saturday 08 December 2007, Kris Jurka wrote:
> For the record, I'm in favor of changing our use of initial setup encoding
> from SQL-ASCII to UTF-8.  While it doesn't solve the root of the problem,
> it does allow people to use non-ascii user and database names if they set
> them up appropriately and doesn't seem to harm anything.
Will you change ALL clients in order to do that? I only needed one client to
actually work, JDBC - very frustrating, since it was supposed to be
Unicode-proof, written in Java. Ironically, psql works because it uses the
platform encoding ;)

> The original
> patch's suggested use of the client's environment encoding seems random to
> me.
It's not random, it is a heuristic approach to guessing the right encoding,
namely the encoding used by the administrator when he created the database
and the user. After all, there cannot be anything random in a computer, can
there?
My solution preserves the currently working configurations - the ASCII-only
setups will continue to work after the patch is applied. Moreover, UTF-8
setups are guaranteed to always work!
In short, my patch solves today(!), with no undesired side-effects, a
limitation of the PostgreSQL authentication procedure in the JDBC driver.
You're free to reject it; I published it for the general benefit (as it
happens, you asked for it yourself).

Good luck.

Re: JDBC driver patch for non-ASCII users

From
sulfinu@gmail.com
Date:
On Tuesday 11 December 2007, Oliver Jowett wrote:
> A general String certainly might contain non-ASCII, but all Strings
> returned by UnixCrypt.crypt() contain only ASCII. See the implementation
> of UnixCrypt.crypt(), and in particular the UnixCrypt.cov_2char array.
>
> So I still do not understand why the change quoted above is necessary.
> Can you explain why it is necessary?
As I already stated, UnixCrypt looks like a lost cause to me. It truncates a
lot of the information in the password (try to imagine what happens to a
Unicode password at line 620), so its result is little more relevant than
nothing.

> The server only ever sends a subset of ASCII as crypt salt characters
> (specifically, the 62 characters a-z A-Z 0-9), so US-ASCII is just fine
> for decoding. See postmaster.c, RandomSalt() / RemapChar().
I didn't know that, I'll take your word for it. Is it officially specified
anywhere? Can it be relied upon in the future?

I used the same Encoding.AUTHENTICATION_PHASE_ENCODING in all places for
consistency and it doesn't harm in any way, since I make sure it is an ASCII
extension.

Re: JDBC driver patch for non-ASCII users

From
Oliver Jowett
Date:
sulfinu@gmail.com wrote:
> On Tuesday 11 December 2007, Oliver Jowett wrote:
>> A general String certainly might contain non-ASCII, but all Strings
>> returned by UnixCrypt.crypt() contain only ASCII. See the implementation
>> of UnixCrypt.crypt(), and in particular the UnixCrypt.cov_2char array.
>>
>> So I still do not understand why the change quoted above is necessary.
>> Can you explain why it is necessary?
> As I already stated, UnixCrypt looks like a lost cause to me. It truncates a
> lot of the information in the password (try to imagine what happens to a
> Unicode password at line 620), so it's result is a little more relevant than
> nothing.

That's not what I'm complaining about. Why do you need to change the
encoding of the RESULT of UnixCrypt?

>> The server only ever sends a subset of ASCII as crypt salt characters
>> (specifically, the 62 characters a-z A-Z 0-9), so US-ASCII is just fine
>> for decoding. See postmaster.c, RandomSalt() / RemapChar().
> I didn't know that, I'll take your word for it. Is it officially specified
> anywhere? Can it be relied upon in the future?

Yes. My crypt(3) manpage here says:

* salt is a two-character string chosen from the set [a-zA-Z0-9./]
* The returned value points to the encrypted password, a series of 13
printable ASCII characters (the first  two  characters  represent the
salt itself).
* CONFORMING TO: SVr4, 4.3BSD, POSIX.1-2001
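[Editor's note: given the manpage guarantees above, a defensive check like the following (illustrative, not driver code) would always pass on a crypt(3) result, so encoding it as US-ASCII is lossless.]

```java
// crypt(3) output is 13 printable ASCII characters, so this check
// always succeeds for a conforming implementation.
public class CryptCheck {
    static boolean isPrintableAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 || c > 0x7e) return false;
        }
        return true;
    }
}
```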

> I used the same Encoding.AUTHENTICATION_PHASE_ENCODING in all places for
> consistency and it doesn't harm in any way, since I make sure it is an ASCII
> extension.

Ok, so this is an unnecessary change that we can discard?

I actually think this patch as a whole is a lost cause (unfortunately,
more because you seem to be unwilling to take feedback on it than any
inherent problem). Would a patch that did only UTF-8 encoding, as
suggested by Kris, handle the cases you need?

-O

Re: JDBC driver patch for non-ASCII users

From
sulfinu@gmail.com
Date:
On Wednesday 12 December 2007, Oliver Jowett wrote:
> > I used the same Encoding.AUTHENTICATION_PHASE_ENCODING in all places for
> > consistency and it doesn't harm in any way, since I make sure it is an
> > ASCII extension.
>
> Ok, so this is an unnecessary change that we can discard?
Oliver, for the third time: what happens in UnixCrypt is completely wrong in a
non-ASCII environment. The crypt() methods try to mimic an OS function that
executes on the server side, based on a password encoded in an encoding that
is HIGHLY probably different from the UTF-16 used by Java.
The whole idea needs to be reconsidered. Make sure you read my quote above
again.
If you're so cautious with your disk space on the CVS server, then yes, by all
means, you can discard the changes made at lines 293 and 302 in
org.postgresql.core.v3.ConnectionFactoryImpl.

> I actually think this patch as a whole is a lost cause (unfortunately,
> more because you seem to be unwilling to take feedback on it than any
> inherent problem).
You might want to rephrase that, it doesn't look like English... (rhetorical)

> Would a patch that did only UTF-8 encoding, as
> suggested by Kris, handle the cases you need?
No, it wouldn't. Nor am I keen on having my patch included in the main source
tree.
On the other hand, it would be very helpful if you didn't waste my time on
sterile discussions over marginal things. Please read the other posts in this
thread and discover what the truly important issues are.
Thanks.

Re: JDBC driver patch for non-ASCII users

From
Oliver Jowett
Date:
sulfinu@gmail.com wrote:

>> Would a patch that did only UTF-8 encoding, as
>> suggested by Kris, handle the cases you need?
> No, it wouldn't. Nor am I keen on having my patch included in the main source
> tree.

I'm not sure what you want here then.

> On the other hand it would be very helpful if you didn't waste my time on
> sterile discussions over marginal things. Please read the other posts in this
> thread and discover what the truely important issues are.

I'm trying to work out exactly what the changes you care about are, but
this sort of response is not helping.

I'd point out that I've already been through the encoding issues in
authentication in some detail once before while doing the v3 protocol
rewrite, so I *do* know what I'm talking about here. I am not raising
these issues to obstruct your patch; I am raising them because the whole
authentication area is relatively fragile and you have not demonstrated
a good grasp of what is going on here. So I'd like to know exactly *why*
you have made some changes before they go near the main driver code. "I
changed this because I could" is not a good reason for changing code in
this area.

If you're not prepared to discuss the reasons behind your changes, well,
I'm not quite sure what you wanted out of posting your patch to the list.

Perhaps you could resend a patch with just the changes you care about
included, and we can discuss that?

-O

JDBC causing the connection to close after executing a COPY command

From
Altaf Malik
Date:
Hi,
 JDBC is causing the connection to close on executing a simple COPY query, which the driver does not support. The code shows that the driver throws an exception; however, the connection is not explicitly closed. Consider the following case.
<code>
Class.forName("org.postgresql.Driver");
Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/postgres", "altaf", "");

Statement stmt = con.createStatement();
stmt.execute("create table abc(a int4)");
try {
    stmt.execute("COPY abc TO STDOUT WITH DELIMITER '|'");
} catch (Exception e) {
    e.printStackTrace();
}
stmt.execute("Select * from abc");
</code>

Can anybody tell what's happening?

Thanks,
Altaf Malik
EnterpriseDB
http://www.enterprisedb.com/



Re: JDBC causing the connection to close after executing a COPY command

From
Dave Cramer
Date:
Altaf,
On 13-Dec-07, at 10:24 AM, Altaf Malik wrote:

Hi,
 JDBC is causing the connection to close on executing a simple COPY query, which the driver does not support. The code shows that the driver throws an exception; however, the connection is not explicitly closed. Consider the following case.
<code>
Class.forName("org.postgresql.Driver");
Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/postgres", "altaf", "");

Statement stmt = con.createStatement();
stmt.execute("create table abc(a int4)");
try {
    stmt.execute("COPY abc TO STDOUT WITH DELIMITER '|'");
} catch (Exception e) {
    e.printStackTrace();
}
stmt.execute("Select * from abc");
</code>

Can anybody tell what's happening?

At what point can you determine that the connection is closed? Can you do con.createStatement after the exception?

Dave
Thanks,
Altaf Malik
EnterpriseDB
http://www.enterprisedb.com/



Re: JDBC causing the connection to close after executing a COPY command

From
Altaf Malik
Date:
Hi,
At what point can you determine the connection is closed ? Can you do con.createStatement after the exception ?
If you see, I have placed the
stmt.execute("COPY abc TO STDOUT WITH DELIMITER '|'");
in a try-catch block, and after the try-catch block I have executed again
stmt.execute("Select * from abc"),
which throws an IOException. Also, if I enable logging in JDBC and only execute the COPY statement, the log shows that the FrontEnd sent a Terminate message.


Thanks,
Altaf Malik
EnterpriseDB
http://www.enterprisedb.com/



Re: JDBC causing the connection to close after executing a COPY command

From
Kris Jurka
Date:

On Thu, 13 Dec 2007, Altaf Malik wrote:

> JDBC is causing the connection to close on executing a simple COPY query
> which the driver does not support. The code shows that driver throws an
> exception however explicitly the connection is not closed. Consider the
> following case.

The JDBC driver doesn't even understand enough of the COPY protocol to
ignore it.  Perhaps we could respond to any copy requests with CopyFail,
but it would make more sense to me to look at integrating the copy patch.

So, I'm not sure what your complaint is.  Do you want the driver to be
able to continue after a copy operation is issued, or do you just want a
better error message?  How would client code be better able to handle a
"Connection has been closed behind the scenes because something illegal
was done" error vs. an IOException?

Kris Jurka

Re: JDBC causing the connection to close after executing a COPY command

From
Kris Jurka
Date:

On Thu, 13 Dec 2007, Altaf Malik wrote:

> Also if i enable logging in JDBC, and only execute the COPY statement,
> log shows that FrontEnd sent a Terminate message.
>

What's happening is that after receiving the server's message that it
wants to do a COPY, the driver doesn't understand the protocol and dies
with a "java.io.IOException: Unexpected packet type: " error.  Once this
happens the driver can't do anything but close the connection.  As part of
the connection closing process it issues a Terminate message to the
server, in case it's still listening and the protocol is in a good enough
state that the server might receive it.  The driver is not explicitly
terminating the connection upon receiving a copy request, that's just what
happens when it gets confused.

Kris Jurka

Re: JDBC causing the connection to close after executing a COPY command

From
Altaf Malik
Date:
I debugged a little more and found that this is not because of incomplete copy support. When a response to any of the copy operations is received, JDBC reads the next few bytes to ignore the response. But unfortunately the driver reads four extra bytes, which belong to the RFQ (ReadyForQuery) message, so the RFQ message gets corrupted and an IOException is thrown for what is taken to be an unknown packet.
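[Editor's note: for reference, v3 backend messages are framed as one type byte plus an int32 length that counts the length field itself but not the type byte, so skipping a message means consuming exactly (length - 4) payload bytes; reading even four bytes more swallows the start of the next message, here ReadyForQuery. A sketch, not the driver's actual code; class and method names are invented.]

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class MessageSkipper {
    // Skip one backend message: type byte, e.g. 'G' (CopyInResponse) or
    // 'H' (CopyOutResponse), then an int32 length that includes itself.
    static void skipMessage(DataInputStream in) {
        try {
            int type = in.read();      // 1-byte message type
            int length = in.readInt(); // length including these 4 bytes
            in.skipBytes(length - 4);  // payload only -- not length + 4
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Convenience for demonstration: skip one message on a captured wire
    // buffer and return the type byte of the next message.
    static int skipThenRead(byte[] wire) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        skipMessage(in);
        try {
            return in.read();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```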

I am attaching a patch that fixes this issue, and the driver works smoothly even after giving the error on the copy response.

Regards,
Altaf Malik
EnterpriseDB
http://www.enterprisedb.com/


Re: JDBC causing the connection to close after executing a COPY command

From
Kris Jurka
Date:

On Thu, 13 Dec 2007, Altaf Malik wrote:

> I debugged little more and found that this is not because of incomplete
> copy support. When a response to any of the copy operations is received,
> JDBC reads next few bytes to ignore the response. But unfortunately
> driver reads extra four bytes which is the RFQ message so the RFQ
> message gets corrupted and IOException is thrown considering it as an
> unknown packet.

Good catch.

> I am attaching a patch that fixes this issue and driver works smoothly
> even after giving the error on copy response.
>

Sort of.  In the case of copying data from the server, if you do have a
large table, you'll generate a SQLException for every row of the copy
result which isn't great, but I guess it's better than failing completely.
Trying to copy out a single column table of 100k integers took about seven
minutes and 150 MB of memory here.

Also, it doesn't work for copying data to the server.  In that case the
server is sitting waiting for data and the driver is waiting for the
server and nothing happens.  That was the case before, so this is no
worse.

Do you have plans to work on this further or shall I apply just this minor
improvement?

Kris Jurka

Re: JDBC causing the connection to close after executing a COPY command

From
Altaf Malik
Date:
Yes, I'll take care of that and send another patch.

Regards,
Altaf Malik
EnterpriseDB
http://www.enterprisedb.com/


Re: JDBC causing the connection to close after executing a COPY command

From
Altaf Malik
Date:
I am attaching the new patch.

For CopyInResponse, the client now sends a CopyFail message, so the server won't wait for client input. For CopyOutResponse, an exception is generated only for the CopyOutResponse itself, not for every CopyData message. I also changed the 'ignoring' mechanism from reading each message to skipping it; I hope this will also improve the performance.
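The CopyFail part of this approach can be sketched as follows (assumed v3 message layout, not the actual patch): on CopyInResponse the client sends a CopyFail message, whose body is a null-terminated error string, so the server stops waiting for CopyData and reports the failure instead of hanging.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class CopyFailSketch {
    // Build a v3 CopyFail message: type byte 'f', Int32 length
    // (counting itself), then a null-terminated error string.
    static byte[] buildCopyFail(String reason) throws IOException {
        byte[] msg = reason.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte('f');                // message type: CopyFail
        out.writeInt(4 + msg.length + 1);  // length field + string + NUL
        out.write(msg);
        out.writeByte(0);                  // null terminator
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = buildCopyFail("COPY not supported");
        System.out.println((char) wire[0]);  // type byte 'f'
        System.out.println(wire.length);     // 1 + 4 + 18 + 1 bytes
    }
}
```

The skip-instead-of-read change for CopyOutResponse follows the same framing rule shown earlier: read the type and length, then skip (length - 4) bytes rather than buffering them.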

Altaf Malik
EnterpriseDB
http://www.enterprisedb.com/




Best Regards
Muhammad Altaf Malik
Sr. Software Engineer,
EnterpriseDB (SMC) (pvt) Ltd.
3rd floor, Software Technology Park,
Islamabad. Ph: +92 (051) 2873815
Cell: 0321-5177961



Re: JDBC causing the connection to close after executing a COPY command

From
Kris Jurka
Date:

On Fri, 14 Dec 2007, Altaf Malik wrote:

> For CopyInResponse, client will send a CopyFail message and server won't
> wait for the client input. For CopyOutResponse, exception is generated
> for only CopyOutResponse and not for every CopyData message. Also
> slightly changed the 'ignoring' mechanism from reading the message to
> ignore it to skipping the message. I hope this will also improve the
> performance.
>

Applied and backpatched to 8.0.

Kris Jurka