Re: Detecting 'socket errors' - closing the Connection object - Mailing list pgsql-jdbc

From David Wall
Subject Re: Detecting 'socket errors' - closing the Connection object
Date
Msg-id 039b01c35073$4e4239d0$3201a8c0@rasta
Whole thread Raw
In response to Fix for receiving empty query errors.  (Kim Ho <kho@redhat.com>)
List pgsql-jdbc
> I still don't understand why do you believe it is Connection's job.
> Why aren't you suggesting that the Socket should reopen itself if it
> gets an error from the network?
> Or why don't you expect the java VM to restart itself in case an app
> exists on some kind of an unexpected error condition? :-)

I don't think that's a fair comparison at all.  For one thing, an I/O error
on a socket is not an unexpected condition.  The underlying socket using in
a JDBC Connection object is an internal variable.  The Connection opens it,
closes it, etc. and never gives it out.  In fact, there is probably nothing
that requires that the Connection use a socket per se, if another
communications channel would work.  It's just that TCP is the common
mechanism, but I'm sure shared memory, UNIX sockets, pipes, or a myriad of
non-TCP protocols (sna, osi, sonet, token rings whatever) could be used just
fine.  The entire purpose of object orientation is to hide such details.

Since I never closed the Connection object, and the JDBC library itself
never closed it, and since the underlying communications channel is broken
and will never recover itself, it would be fine for the Connection object to
note that it's last request was a fatal I/O error, cause the SQLException to
be thrown, reset it's file pointer to indicate no underlying socket is open,
and on the next use of the connection, it could try to re-open it.

Again, I'm okay with closing it myself if there were an obvious way to
detect that a given SQLException was in fact related to an unrecoverable I/O
error.  It's just that simple.

David


> With Oracle, you can certainly look at the exception code to find out
> the exact cause of the problem.

True, but it's Oracle specific, which is rather non-JDBC like.  Also, as you
pointed out, I'd have to have a complete understanding of all of the
possible error codes and which ones were related to unrecoverable
connections that would require a new connection to be established.  After
all, a socket error is an obvious form, but perhaps there would be other
internal JDBC errors or the like that would also mean a connection object
must be recreated since the current object is effectively dead for all
future queries.  Then we'd have to test against each of those codes, and as
new codes are added, I'd have to change my code again.  This is not the job
of the app programmer, it's the job of the JDBC programmer as the JDBC
library owns the Connection object and only it can tell the app programmer
if the Connection object is no longer functional.  (Or I have to do a "never
can fail" query and see if that fails or not, as you suggested).

> It doesn't matter. If it is an I/O exception, it could be caused by a
> bunch of reasons. The database can be down, the computer could have
> crashed, the network could be inaccessible etc, etc... In all of those
> cases it is pointless to try and restore the connection.

You are just being argumentative now.  Whether a connection can be
established at all IS up to the app code.  When I try to create a Connection
object, if it fails, then I can do whatever I think is reasonable.  For most
applications, an inability to get a connection the database would be highly
fatal as few db-related apps can do anything without the db available.  But
that's up to me.  What happens now is that there is such an error, the
socket could be re-established, but the Connection object and SQLException
give no clear, common indication that the Connection is broken and must be
re-established.

After all, nobody throws out a Connection just because an SQLException
occurred, which shows that most such exceptions are not fatal.

> And the other way around - if you get a SQL exception, it doesn't mean
> that the database is up, it could have crashed the next millisecond
> after sending a response to you. For all intents and purposes it would
> be the same situation as getting a SocketError to begin with.

So, this would be true after your "cannot fail" query.  You think you just
put a valid Connection back in your pool because it passed it, but by your
own statement here, you point out that this is not so.

> The point is that the only way to find out if your connection is good
> (meaning that the server is willing to talk to it) at this exact point
> in time is to ask the server (by sending a query).

Yes, and this is exactly what my code does.  It sends a query, gets an
SQLException, but then has no clear, common way to determine if that was
because of a fatal problem with the Connection object, or whether it was
just a typical SQL type of error.


> Besides, even if it wasn't that hard, it would still be pretty pointless
> - as I explained above - if the error you are getting is "soft" (or even
> if you are not getting any error at all), it doesn't mean that your
> connection is good be the time you are finished looking at that error.
> It only means it *was* good at the moment that response was sent... but
> what good does it do to you?

Since when has a programmer attempted to detect errors before doing an
operation?  That's just silly now.  I don't care about some future
transaction; I care about the transaction I just did, and it failed.
Without some knowledge about the type of failure, though, I will either
think I can reuse the Connection object or not.  If I reuse it and it was
typical SQLException, it will likely work just fine.  If it was an I/O
exception, it will never work, no matter what I do.  So, it's quite obvious
that I'd like to know the difference since in the latter case I'll know that
I should attempt to recreate the connection before I submit another query.

> It is not "defensive" - if you acknowledge that network/database errors
> are the reality of life, you have to keep them in mind when writing your
> code.

We do.  That's why we catch SQLException on every db call.  When I write
socket code, I catch IOException, and with the amazing future insights, I am
able to close such a socket and reopen it when the time comes.  I'm just
amazing that way; I don't just keep retrying reading a file after EOF, too.
But I suppose in your mind such indications shouldn't be given to the
programmer, and I should just keep on trying reading, and if the read fails,
I could write extra hack code just to see if that IOException was real or
not.

The point is, the Connection does report the exception just fine.  The
Connection KNOWS there was an I/O Exception.  The Connection KNOWS that the
underlying socket is useless.  It's just keeping that info to itself so my
app cannot take remedial action when such an error occurs, instead of using
your defensive programming method that assumes even a perfectly successful
query should be checked just to make sure something bad hasn't happened
since that successful query.  That's absurd for 99.99999999% of all coding.

> No, it was not a hack. Not at all.
> It is indeed has a purpose to test the connection...
> But you are not trying to say that any piece of code, that has a purpose
> is a hack, are you? ;-)

Nope, only a hack when it's a work around for a problem that we could easily
know.  If you add an extra query to the database just to see if a previous
query was successful or not, that's a hack.  The previous query's results
should be all that's necessary to know if the query was okay or not.
Additional queries only tell me if that query failed or not, and it's a hack
because you are doing an SQL query, using your computer's TCP stack and OS
code, using network resouces, using database resources, using extra memory,
using extra processing and wall clock time.  And yet if you did not get an
SQLException, the test has no purpose whatsoever, and if you did get an
SQLException, then it would be nice for that exception to have given me the
info that you are submitting the dummy query to determine.

> Because the output of the first query did not give you enough
> information. You need additional information to make a decision. What's
> wrong with it?

Because for the most part, the Connection object KNOWS that information,
it's just not letting the programmer know.

David


pgsql-jdbc by date:

Previous
From: Felipe Schnack
Date:
Subject: Re: patch: tiny patch to correct stringbuffer size estimate
Next
From: pginfo
Date:
Subject: Re: jdbc batch performance problem