Re: Windows: Wrong error message at connection termination - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Windows: Wrong error message at connection termination
Date
Msg-id CA+hUKG+nzfwuKYKXd43jjG2txuc-po4GOyZFA3z6s9oy3P=4kA@mail.gmail.com
In response to Windows: Wrong error message at connection termination  (Lars Kanis <lars@greiz-reinsdorf.de>)
Responses Re: Windows: Wrong error message at connection termination
Re: Windows: Wrong error message at connection termination
List pgsql-hackers
On Thu, Nov 18, 2021 at 10:13 AM Lars Kanis <lars@greiz-reinsdorf.de> wrote:
> Unfortunately each connection is closed hard by a Windows PostgreSQL server with TCP flag RST. That in turn is
> another Winsock API behavior, that is that every socket, that wasn't closed by the application is closed hard with the
> RST flag at process termination. I didn't find any official documentation about this behavior.

Interesting discovery.  I think you might get the same behaviour from
a Unix system if you enable SO_LINGER with a timeout of 0 before you
exit[1].  I suppose
if a TCP implementation is partially in user space (I have no idea if
this is true for Windows, I never use it, but I recall that Winsock
was at some point a DLL) and can't handle the existence of any socket
state after the process is gone, you might want to nuke everything and
tell the peer immediately that you're doing so on exit?
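
A minimal sketch of that SO_LINGER idea, assuming a loopback
connection and Python's standard socket module.  The helper names
abortive_close and observe_peer_close are made up for illustration,
and the explicit close() with a zero linger timeout only stands in for
what a process exit would do; it is not what PostgreSQL does.

```python
import socket
import struct
import sys
import threading


def abortive_close(conn):
    """Close conn with SO_LINGER enabled and a zero timeout (RST, not FIN)."""
    # struct linger is {l_onoff, l_linger}: two ints on Unix-like systems,
    # two unsigned shorts on Windows.
    fmt = "HH" if sys.platform == "win32" else "ii"
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(fmt, 1, 0))
    conn.close()


def observe_peer_close(abortive):
    """Report what a client that never writes sees when the server closes."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]

    def server():
        conn, _ = listener.accept()
        if abortive:
            abortive_close(conn)
        else:
            conn.close()  # orderly shutdown, sends FIN

    t = threading.Thread(target=server)
    t.start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        t.join()  # the server side has closed before the client reads
        try:
            data = client.recv(1024)
            outcome = "eof" if data == b"" else "data"
        except ConnectionResetError:
            outcome = "reset"
    listener.close()
    return outcome


if __name__ == "__main__":
    print("orderly close: ", observe_peer_close(abortive=False))
    print("abortive close:", observe_peer_close(abortive=True))
```

The client never writes anything here, so any reset it observes was
initiated by the closing side alone.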

I realise now that the experiments we did a while back to try to
understand this across a few different operating systems[2] had missed
this subtlety, because that Python script had an explicit close()
call, whereas PostgreSQL exits.  It still revealed that the client
isn't allowed to read any data after its write failed, which is a
known source of error messages being eaten.  What I missed is that the
client doesn't only get an RST and enter this
no-you-can't-have-the-error-message-I-have-received state in response
to data it sends (the usual way you expect to get an RST), as in that
test.  It also gets there proactively when the server process exits, as
you've explained.  In other words, the client doesn't need to try to
write in order to reach this error-eating state.
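
A rough way to probe that state, again as a sketch under assumptions
rather than the script from [2]: the function name
read_after_peer_reset, the message text and the 0.2 second pause are
all invented for illustration, and the zero-timeout SO_LINGER close
stands in for the server process exiting.  The server sends a message
and then resets the connection, and the client, which never writes,
tries to read what was sent.

```python
import socket
import struct
import sys
import threading
import time


def read_after_peer_reset(message=b"FATAL: terminating connection\n"):
    """Return (bytes the client could read, whether it saw a reset)."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]

    def server():
        conn, _ = listener.accept()
        conn.sendall(message)  # the "error message" meant for the client
        time.sleep(0.2)        # give the data time to reach the client
        # struct linger: two ints on Unix-like systems, two shorts on Windows
        fmt = "HH" if sys.platform == "win32" else "ii"
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack(fmt, 1, 0))
        conn.close()           # abortive close: RST instead of FIN

    t = threading.Thread(target=server)
    t.start()
    received = b""
    saw_reset = False
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        t.join()  # the reset has been issued; the client has written nothing
        try:
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                received += chunk
        except ConnectionResetError:
            saw_reset = True
    listener.close()
    return received, saw_reset


if __name__ == "__main__":
    print(read_after_peer_reset())
```

What this prints depends on the operating system.  A stack that keeps
already-received data readable after the reset returns the message,
while one that discards it returns an empty byte string, which is the
error-eating case.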

[1] https://stackoverflow.com/questions/3757289/when-is-tcp-option-so-linger-0-required
[2] https://www.postgresql.org/message-id/flat/20190306030706.GA3967%40f01898859afd.ant.amazon.com#32f9f16f9be8da5ee5c3b405d6d1829c


