Re: Expansion of our checks for connection-loss errors - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Expansion of our checks for connection-loss errors
Date
Msg-id 2668788.1602207715@sss.pgh.pa.us
Whole thread Raw
In response to Re: Expansion of our checks for connection-loss errors  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Expansion of our checks for connection-loss errors
List pgsql-hackers
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> At Thu, 08 Oct 2020 15:15:54 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
>> Accordingly, I propose the attached patch (an expansion of
>> Fujii-san's) that causes us to test for all five errnos anyplace
>> we had been checking for ECONNRESET.

> +1 for the direction.

> In terms of connection errors, connect(2) and bind(2) can return
> EADDRNOTAVAIL. bind(2) and listen(2) can return EADDRINUSE. FWIW I
> recetnly saw pgbench getting EADDRNOTAVAIL. (They have mapping from
> respective WSA errors in TranslateSocketError())

I do not think we have any issues with connection-time errors;
or at least, if we do, the spots being touched here certainly
shouldn't need to worry about them.  These places are dealing
with already-established connections.

> I'd make errno_is_connection_loss use ALL_CONNECTION_LOSS_ERRNOS to
> avoid duplication definition of the errno list.

Hmm, might be worth doing, but I'm not sure.  I am worried about
whether compilers will generate equally good code that way.

> -    if (ret < 0 && WSAGetLastError() == WSAECONNRESET)
> +    if (ret < 0 && errno_is_connection_loss(WSAGetLastError()))

> Don't we need to use TranslateSocketError() before?

Oh, I missed that.  But:

> Perhaps I'm confused but SOCK_ERROR doesn't seem portable between
> Windows and Linux.

In that case, nothing would have worked on Windows for the last
ten years, so you're mistaken.  I think the actual explanation
why this works, and why that test in parallel.c probably still
works even with my mistake, is that win32_port.h makes sure that
our values of ECONNRESET etc match WSAECONNRESET etc.

IOW, we'd not actually need TranslateSocketError at all, except
that it maps some not-similarly-named error codes for conditions
that don't exist in Unix into ones that do.  We probably do want
TranslateSocketError in this parallel.c test so that anything that
it maps to one of the errno_is_connection_loss codes will be
recognized; but the basic cases would work anyway, unless I
misunderstand this stuff entirely.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Expansion of our checks for connection-loss errors
Next
From: Tom Lane
Date:
Subject: Re: Assertion failure with LEFT JOINs among >500 relations