Thread: Re: [GENERAL] libpq error codes

Re: [GENERAL] libpq error codes

From: Denis Perchine

> EOF, no more and no less.  It is not for the kernel to decide whether
> the connection closure represents an application-level error or not.
> Sounds like someone has managed to blow this badly in recent Linux TCP
> stacks.  Care to file a kernel bug report?

Yep.

> In the meantime, it's probably reasonable for libpq to treat EPIPE from
> read() the same as EOF --- if I recall correctly, it already tests for
> ECONNRESET instead of EOF from kernels that have that variety of
> braindamage, so adding a defense against this variety is fair game.
> If you look in src/interfaces/libpq/fe-misc.c the places to fix should
> be obvious (but note there are two or three of them, not just one).
> Please try it out and submit a patch after you've verified it fixes
> your problem.
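
A minimal sketch of the check being proposed, assuming a hypothetical helper named classify_read() and a hypothetical read_outcome enum; this is not the actual code in fe-misc.c. It only illustrates the idea from the quoted text: a read() that fails with ECONNRESET or EPIPE is handled the same way as a read() that returns 0 (EOF).

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical outcome codes, invented for this illustration. */
typedef enum {
    READ_GOT_DATA,     /* read() returned at least one byte */
    READ_RETRY,        /* transient condition, caller may try again */
    READ_PEER_CLOSED,  /* EOF, or an errno we choose to treat like EOF */
    READ_HARD_ERROR    /* any other failure */
} read_outcome;

/*
 * Classify the result of a read() call.
 * nread is the return value of read(); saved_errno is errno captured
 * immediately after the call (only meaningful when nread < 0).
 */
static read_outcome classify_read(ssize_t nread, int saved_errno)
{
    if (nread > 0)
        return READ_GOT_DATA;
    if (nread == 0)
        return READ_PEER_CLOSED;      /* plain EOF */

    switch (saved_errno) {
    case EINTR:
    case EAGAIN:
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
        return READ_RETRY;
    case ECONNRESET:                  /* some kernels report this instead of EOF */
    case EPIPE:                       /* the case discussed in this thread */
        return READ_PEER_CLOSED;
    default:
        return READ_HARD_ERROR;
    }
}
```

A caller would capture errno right after read() and pass both values in, then run its normal end-of-stream handling whenever the result is READ_PEER_CLOSED.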

Now I get:

db=> select count(*) from pg_class;
 count
-------
 28531
(1 row)

db=> select count(*) from pg_class;
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

That looks much more reasonable, but I still do not get the shutdown messages.
With the enclosed patch, EPIPE is handled the same way as ECONNRESET.
Shouldn't I emulate EOF on EPIPE?

--
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------

Attachment

Re: Re: [GENERAL] libpq error codes

From: Tom Lane

Denis Perchine <dyp@perchine.com> writes:
> db=> select count(*) from pg_class;
> pqReadData() -- backend closed the channel unexpectedly.
>         This probably means the backend terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>

> That looks much more reasonable, but I still do not get the shutdown messages.
> With the enclosed patch, EPIPE is handled the same way as ECONNRESET.
> Shouldn't I emulate EOF on EPIPE?

You *are* emulating EOF --- with that check in place, pqReadData
should respond to EPIPE just like it does to a normal EOF.  I don't
understand why you aren't seeing the same results I do.

How exactly are you testing this?  I'm doing it with a plain "kill"
on the connected backend process.  If you were using "kill -9"
or some such, that'd explain it --- the backend doesn't have an
opportunity to send the "I'm shutting down" message in that case.

            regards, tom lane
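
The difference between a plain "kill" and "kill -9" can be illustrated with a small, self-contained sketch. It is not PostgreSQL code: farewell_seen_after(), on_term() and the message text are hypothetical names made up for the example, and it uses a Unix-domain socketpair rather than TCP, so it does not reproduce the Linux TCP behaviour discussed in this thread. It only shows that a process can write a last message from a SIGTERM handler, while SIGKILL gives it no chance to do so.

```c
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int child_fd = -1;                       /* socket used by the child */
static const char farewell[] = "shutting down"; /* hypothetical last message */

/* SIGTERM handler in the child: send the farewell, then exit.
 * write() and _exit() are async-signal-safe. */
static void on_term(int signo)
{
    (void) signo;
    if (write(child_fd, farewell, sizeof farewell - 1) < 0)
        _exit(1);
    _exit(0);
}

/*
 * Fork a child that holds one end of a socketpair, send it the signal
 * signo, and read the parent's end until EOF.
 * Returns 1 if the farewell arrived before EOF, 0 if not, -1 on setup failure.
 */
static int farewell_seen_after(int signo)
{
    int sv[2];
    char buf[64];
    char ready;
    size_t got = 0;
    ssize_t n;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                     /* child */
        close(sv[0]);
        child_fd = sv[1];
        signal(SIGTERM, on_term);
        if (write(sv[1], "R", 1) != 1)  /* tell the parent the handler is installed */
            _exit(1);
        for (;;)
            pause();                    /* wait to be signalled */
    }

    close(sv[1]);                       /* parent keeps only its own end */
    if (read(sv[0], &ready, 1) != 1) {
        close(sv[0]);
        waitpid(pid, NULL, 0);
        return -1;
    }

    kill(pid, signo);

    /* Collect whatever arrives until EOF (or an error) ends the loop. */
    while (got < sizeof buf &&
           (n = read(sv[0], buf + got, sizeof buf - got)) > 0)
        got += (size_t) n;

    close(sv[0]);
    waitpid(pid, NULL, 0);

    return got == sizeof farewell - 1 && memcmp(buf, farewell, got) == 0;
}
```

Under these assumptions, farewell_seen_after(SIGTERM) returns 1 because the handler runs before the child exits, and farewell_seen_after(SIGKILL) returns 0 because the child is terminated without running any of its own code.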

Re: Re: [GENERAL] libpq error codes

From: Denis Perchine

> > db=> select count(*) from pg_class;
> > pqReadData() -- backend closed the channel unexpectedly.
> >         This probably means the backend terminated abnormally
> >         before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> > !>
>
> > That looks much more reasonable, but I still do not get the shutdown messages.
> > With the enclosed patch, EPIPE is handled the same way as ECONNRESET.
> > Shouldn't I emulate EOF on EPIPE?
>
> You *are* emulating EOF --- with that check in place, pqReadData
> should respond to EPIPE just like it does to a normal EOF.  I don't
> understand why you aren't seeing the same results I do.

Hmm... It looks like I get EPIPE immediately after the connection is reset, whereas
you are able to read the rest of the data. It looks like a Linux kernel problem again.

> How exactly are you testing this?  I'm doing it with a plain "kill"
> on the connected backend process.  If you were using "kill -9"
> or some such, that'd explain it --- the backend doesn't have an
> opportunity to send the "I'm shutting down" message in that case.

kill -TERM for sure.

--
Sincerely Yours,
Denis Perchine


Re: Re: [GENERAL] libpq error codes

From: Tom Lane

Denis Perchine <dyp@perchine.com> writes:
>>>> That looks much more reasonable, but I still do not get the shutdown messages.
>>>> With the enclosed patch, EPIPE is handled the same way as ECONNRESET.
>>>> Shouldn't I emulate EOF on EPIPE?
>>
>> You *are* emulating EOF --- with that check in place, pqReadData
>> should respond to EPIPE just like it does to a normal EOF.  I don't
>> understand why you aren't seeing the same results I do.

> Hmm... It looks like I get EPIPE immediately after the connection is reset, whereas
> you are able to read the rest of the data. It looks like a Linux kernel problem again.

Ooh, you mean it doesn't give you the rest of the data before reporting
EPIPE?  That seems so broken it's hard to believe --- a whole lot of
programs would be falling over, not just Postgres.  There's probably
something else happening here, but I'm not real sure what.  Might be
worth checking to see exactly what's in libpq's input buffer at the
time it sees the EPIPE error.

            regards, tom lane
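
One way to carry out the suggested check is to print the raw bytes held in the input buffer at the moment the error is seen. The sketch below is a generic helper, not part of libpq: hex_dump() is a hypothetical name, and how the buffer pointer and length are obtained from the connection structure is deliberately left out, since that depends on libpq internals not shown in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format up to len bytes of buf as space-separated two-digit hex values
 * into out (capacity outsz, always NUL-terminated when outsz > 0).
 * Returns the number of input bytes that fit into out.
 */
static size_t hex_dump(const unsigned char *buf, size_t len,
                       char *out, size_t outsz)
{
    size_t i;
    size_t used = 0;

    if (outsz == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        /* two hex digits, plus a separating space for every byte but the first */
        size_t need = (i == 0) ? 2 : 3;

        if (used + need + 1 > outsz)   /* keep room for the terminating NUL */
            break;
        used += (size_t) snprintf(out + used, outsz - used,
                                  i == 0 ? "%02x" : " %02x", buf[i]);
    }
    return i;
}
```

Calling this on the unread part of the buffer just before the error is reported, and writing the result to stderr, would show whether the shutdown message had already been received when EPIPE came back.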

Re: Re: [GENERAL] libpq error codes

From: Denis Perchine

> Ooh, you mean it doesn't give you the rest of the data before reporting
> EPIPE?  That seems so broken it's hard to believe --- a whole lot of
> programs would be falling over, not just Postgres.  There's probably
> something else happening here, but I'm not real sure what.  Might be
> worth checking to see exactly what's in libpq's input buffer at the
> time it sees the EPIPE error.

Sad to say, but you are right... I've sent a mail to linux-kernel@...
Hopefully there will be a positive reply.

But here is a patch that works OK.
With it I get:

db=> select count(*) from pg_class;
NOTICE:  AbortTransaction and not in in-progress state
FATAL 1:  The system is shutting down
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

That matches what you get.

--
Sincerely Yours,
Denis Perchine


Attachment