Re: libpq and connection failures - Mailing list pgsql-interfaces

From jtv@xs4all.nl
Subject Re: libpq and connection failures
Date
Msg-id 24362.202.47.227.25.1120635552.squirrel@202.47.227.25
In response to Re: libpq and connection failures  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: libpq and connection failures  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-interfaces
Tom Lane wrote:
> I think it's probably better to have the default assumption be
> "connection possibly recoverable" than have it be "summarily kill
> connection at first hint of trouble".  The latter seems less robust
> not more so.

Not after the connection failure has made its way into a PGresult, surely?
Doesn't seem consistent with the design choice of aborting transactions
on error, for starters.  Are you saying that a session is still in usable
shape when you have no way of establishing whether the last command
succeeded?

If you're in an explicit transaction when this happens, it's in an unknown
state[*] so you have to abort anyway.  All the client hears is "there's
been an error of some sort, but the connection may or may not be fine,
thank you."  You don't necessarily know what level of transaction nesting
you're in though, so you may have to fire off aborts until you're pretty
sure you're out of all of them.  Frankly I'd rather call PQreset() just to
save myself the trouble.

(*) Yes, there are cases where the transaction is left in a reliable
state.  Such as on a read-only query, which the application could retry
(I'll assume it cares about the results or it wouldn't have queried) at
the cost of greater code complexity.  To me this is one of the cases where
simplicity and clarity matter a damn sight more than optimizing out the
reconnect on the off chance that the application knows how to handle the
situation despite not receiving even the basic knowledge that the error
was something to do with the connection, not the query.  Tom, when I said
way back when that I wanted to do recovery and retry after a connection
was lost, weren't you the one who said "this scares the hell out of me"
because you couldn't be sure whether the last command committed?

I thought the mantra when it came to networking went "don't second-guess
the OS."  If you get negative bytes out of a socket, there are a few known
errno values that mean it's a transient thing.  Fine, if you identify more
of those then chuck them in there.  There are several cases of that in
there already.  But otherwise, why not assume that the system gave you an
error code because it decided it saw a failure?
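
To make the policy above concrete, here is a rough sketch of what I mean. It is not libpq's actual code: the helper names and the io_verdict enum are made up for illustration, and the set of "transient" errno values (EINTR, EAGAIN, EWOULDBLOCK) is an assumption about which ones belong on the list.  Anything not on the list is taken at face value as a failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical verdicts for one read attempt; not a libpq type. */
typedef enum { IO_OK, IO_RETRY, IO_FAILED } io_verdict;

/*
 * Whitelist of errno values assumed to be transient.  If more are
 * identified later, they get added here; everything else is a failure.
 */
bool is_transient_socket_error(int err)
{
    /* On many systems EAGAIN and EWOULDBLOCK are the same value;
     * an if-chain tolerates that, a switch would not. */
    if (err == EINTR || err == EAGAIN || err == EWOULDBLOCK)
        return true;
    return false;
}

/*
 * Classify the outcome of a read: nbytes is what recv()/read() returned,
 * err is the errno value captured immediately after the call.
 */
io_verdict classify_read_result(long nbytes, int err)
{
    if (nbytes > 0)
        return IO_OK;       /* got data */
    if (nbytes == 0)
        return IO_FAILED;   /* peer closed the connection */
    /* Negative return: trust the OS unless the code is known-transient. */
    return is_transient_socket_error(err) ? IO_RETRY : IO_FAILED;
}
```

A caller would retry on IO_RETRY and treat IO_FAILED as a dead connection, rather than passing along an error that may or may not be about the connection.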


Jeroen



