Thread: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Jelte Fennema
Date:
Previously successfully opened TCP connections can still fail on reads
with ETIMEDOUT. This should be considered a connection failure, so that
the connection in libpq is marked as CONNECTION_BAD. The reason I got an
ETIMEDOUT was, because I had set a low tcp_user_timeout in the
connection string. However, it can probably also happen due to
keepalive limits being reached.
Attachment

Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Tom Lane
Date:
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
> Previously successfully opened TCP connections can still fail on reads
> with ETIMEDOUT. This should be considered a connection failure, so that
> the connection in libpq is marked as CONNECTION_BAD. The reason I got an
> ETIMEDOUT was, because I had set a low tcp_user_timeout in the
> connection string. However, it can probably also happen due to
> keepalive limits being reached.

I'm dubious about the portability of this patch, because we don't
use ETIMEDOUT elsewhere.  strerror.c thinks it may not exist,
which is probably overly conservative because POSIX has required
it since SUSv2.  The bigger problem is that it's not accounted for in
the WSAxxx mapping done in port/win32_port.h and TranslateSocketError.
That'd have to be fixed for this to behave reasonably on Windows,
I think.

            regards, tom lane



Re: [EXTERNAL] Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Jelte Fennema
Date:
Attached is a new patch that I think addresses your concerns.

From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Thursday, September 30, 2021 16:04
To: Jelte Fennema <Jelte.Fennema@microsoft.com>
Cc: pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: [EXTERNAL] Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS
 
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
> Previously successfully opened TCP connections can still fail on reads
> with ETIMEDOUT. This should be considered a connection failure, so that
> the connection in libpq is marked as CONNECTION_BAD. The reason I got an
> ETIMEDOUT was, because I had set a low tcp_user_timeout in the
> connection string. However, it can probably also happen due to
> keepalive limits being reached.

I'm dubious about the portability of this patch, because we don't
use ETIMEDOUT elsewhere.  strerror.c thinks it may not exist,
which is probably overly conservative because POSIX has required
it since SUSv2.  The bigger problem is that it's not accounted for in
the WSAxxx mapping done in port/win32_port.h and TranslateSocketError.
That'd have to be fixed for this to behave reasonably on Windows,
I think.

                        regards, tom lane
Attachment

Re: [EXTERNAL] Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Tom Lane
Date:
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
> Attached is a new patch that I think addresses your concerns.

You missed TranslateSocketError ...

Pushed to HEAD only with that fix.

            regards, tom lane



Re: [EXTERNAL] Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Jelte Fennema
Date:
Oops sorry, I did make that change locally but apparently didn't update my .patch file after committing, so I uploaded
anintermediary one... 
Thanks for fixing that.

I saw you added this section to the commit message:

> Perhaps this should be back-patched, but I'm hesitant to do so given
> the lack of previous complaints, and the hazard that there's a small
> ABI break on Windows from redefining the symbol.  Even if we decide
> to do that, it'd be prudent to let this bake awhile in HEAD first.

Personally, I would love to see this backpatched. Since together with a second bug I reported[1] it's causing high
querytimeouts in Citus even if tcp_user_timeout is set to a low value. I do understand your worry though. Would a patch
likethe one I attached now be a better fit for a backport? 

Jelte

[1]:
https://www.postgresql.org/message-id/flat/AM5PR83MB017870DE81FC84D5E21E9D1EF7AA9%40AM5PR83MB0178.EURPRD83.prod.outlook.com
Attachment

Re: [EXTERNAL] Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Tom Lane
Date:
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
> Personally, I would love to see this backpatched. Since together with a second bug I reported[1] it's causing high
querytimeouts in Citus even if tcp_user_timeout is set to a low value. I do understand your worry though. Would a patch
likethe one I attached now be a better fit for a backport? 

The only way that could work as-intended is if c.h includes whatever
header provides TCP_USER_TIMEOUT, which does not seem like a great idea.

            regards, tom lane



Re: [EXTERNAL] Re: Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS

From
Jelte Fennema
Date:
I would still love to get a version of this patch backported. And I just thought of an idea to do so without breaking
theWindows ABI, by slightly modifying my previous idea. See the attached patch. 
Attachment