Thread: Getting results after networking error

Getting results after networking error

From: jtv@xs4all.nl
Hi all,

Here's something I've been trying to figure out about libpq, and I hope
perhaps someone here can shed some light on it.

I sometimes issue multiple queries at once using semicolons, so one call
returns multiple results.  Now let's say my network connection to the
backend breaks and not all the results can be returned--I get a result
with some error code (wish I knew which, but that's another story)
followed by a NULL result to indicate that no more results are following,
right?

Now let's say that I issue my next query and the network connection
recovers in the process.  I happily start retrieving results again, but
are they only results for the new query, or do I go through the remaining
results of the last query first?


Jeroen




Re: Getting results after networking error

From: Tom Lane
jtv@xs4all.nl writes:
> I sometimes issue multiple queries at once using semicolons, so one call
> returns multiple results.  Now let's say my network connection to the
> backend breaks and not all the results can be returned--I get a result
> with some error code (wish I knew which, but that's another story)
> followed by a NULL result to indicate that no more results are following,
> right?

> Now let's say that I issue my next query and the network connection
> recovers in the process.  I happily start retrieving results again, but
> are they only results for the new query, or do I go through the remaining
> results of the last query first?

I don't think there is a unique answer to that, without a whole lot of
assumptions about the nature of the failure and the behavior of the
network transport layer.  Did the backend see any send() failures?
Did the transport layer permanently lose any data already given to it,
or just delay transmission?

On the whole I think the odds of re-syncing successfully are pretty bad,
and you'd be best off to pull the plug and start a new connection if you
see any networking failure.
        regards, tom lane


Re: Getting results after networking error

From: jtv@xs4all.nl
Tom Lane wrote:

>> Now let's say that I issue my next query and the network connection
>> recovers in the process.  I happily start retrieving results again, but
>> are they only results for the new query, or do I go through the remaining
>> results of the last query first?
>
> I don't think there is a unique answer to that, without a whole lot of
> assumptions about the nature of the failure and the behavior of the
> network transport layer.  Did the backend see any send() failures?
> Did the transport layer permanently lose any data already given to it,
> or just delay transmission?

Well, I'm sort of assuming that if some data was actually lost, libpq
would call that a synchronization error and give up on the connection.  So
I'm talking specifically about the case where no data was lost, but an
error was returned by the network stack and the error later went away.


> On the whole I think the odds of re-syncing successfully are pretty bad,
> and you'd be best off to pull the plug and start a new connection if you
> see any networking failure.

I guess that makes sense.  But how do I know that the failure is a
networking failure and not, say, an SQL-level failure?  In my program, I
mean, without human intervention?


Jeroen




Re: Getting results after networking error

From: Tom Lane
jtv@xs4all.nl writes:
> Tom Lane wrote:
>> On the whole I think the odds of re-syncing successfully are pretty bad,
>> and you'd be best off to pull the plug and start a new connection if you
>> see any networking failure.

> I guess that makes sense.  But how do I know that the failure is a
> networking failure and not, say, an SQL-level failure?  In my program, I
> mean, without human intervention?

I think that a reasonable API for this is "if PQstatus(conn) is
CONNECTION_BAD then you had a networking problem".  I am not at all sure
how well libpq honors that definition currently ... but feel free to
send patches ;-)
        regards, tom lane


Re: Getting results after networking error

From: jtv@xs4all.nl
Tom Lane wrote:

> I think that a reasonable API for this is "if PQstatus(conn) is
> CONNECTION_BAD then you had a networking problem".  I am not at all sure
> how well libpq honors that definition currently ... but feel free to
> send patches ;-)

As I posted in July,

  http://archives.postgresql.org/pgsql-interfaces/2005-07/msg00003.php

the current situation is that libpq goes out of its way to maintain
CONNECTION_OK in these cases.  I'd still like to see the fix I outlined
there, which is to treat socket errors as fatal by default (apart from the
special cases for EAGAIN, EINTR and friends that are already there, of
course).

I'm attaching a patch; something similar may be called for in pqSendSome()
as well.  A next step would be to factor the code duplication out of the
function, eliminating the redundant error message in the process.


Jeroen
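
The rule described in this message (treat socket errors as fatal unless they are one of the known retryable cases) might be sketched as below. This is not the attached patch and not libpq's actual code. `socket_error_is_transient` is a hypothetical name, and which errno values count as "friends" of EAGAIN and EINTR is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical classifier for errno values seen after a failed
 * recv()/send() on the connection's socket.
 * Returns true for errors that only mean "try again later"; every other
 * error is treated as fatal for the connection.
 */
static bool socket_error_is_transient(int err)
{
    switch (err) {
    case EINTR:         /* interrupted by a signal: retry the call */
    case EAGAIN:        /* non-blocking socket has no data yet */
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:   /* same meaning as EAGAIN on some platforms */
#endif
        return true;
    default:            /* ECONNRESET, EPIPE, ETIMEDOUT, ...: give up */
        return false;
    }
}
```

Under this rule the read and write paths would mark the connection CONNECTION_BAD whenever the classifier returns false, which is what lets a PQstatus() check distinguish networking failures from SQL-level ones.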

Attachment