closesocket behavior in different platforms - Mailing list pgsql-hackers

From vignesh C
Subject closesocket behavior in different platforms
Date
Msg-id CALDaNm2tEvr_Kum7SyvFn0=6H3P0P-Zkhnd=dkkX+Q=wKutZ=A@mail.gmail.com
Whole thread Raw
Responses Re: closesocket behavior in different platforms  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

In few scenarios the message displayed in psql console is not consistent in windows and linux. The execution results from few scenarios in windows and linux is listed below:

In CentOS
========================================================
After transaction idle timeout
postgres=# SET idle_in_transaction_session_timeout=300;
SET
postgres=# BEGIN;
BEGIN
postgres=# SELECT * FROM pg_class;
FATAL:  terminating connection due to idle-in-transaction timeout
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

After pg_terminate_backend from another session:
postgres=# select * from dual;
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

Similarly in pg_ctl kill TERM and drop database with (force).

In Windows
========================================================
After transaction idle timeout
postgres=# set idle_in_transaction_session_timeout=300;
SET
postgres=# begin;
BEGIN
postgres=# select * from dual
postgres-# ;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

After pg_terminate_backend from another session:
postgres=# select * from dual;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

Similarly in pg_ctl kill TERM and drop database with (force).
There may be some more scenarios which I'm missing.

It is noticed that in all the 4 cases the message "FATAL:  terminating connection due to administrator command" does not appear in windows.

However the following message is present in the server log file:
FATAL:  terminating connection due to administrator command

The reason for this looks like:
When the server closes a connection, it sends the ErrorResponse packet, and then closes the socket and terminates the backend process. If the packet is received before the server closes the connection, the error message is received in both windows and linux. If the packet is not received before the server closes the connection, the error message is not received in case of windows where as in linux it is received.

There have been a couple of discussion earlier also on this [1] & [2], but we could not find any alternate solution.

One of the options that msdn suggests in [3] is to use SO_LINGER option, we had tried this option with no luck in solving. One other thing that we had tried was to sleep for 1 second before closing the socket, this solution works if the client is active, whereas in case of inactive clients it does not solves the problem. One other thought that we had was to simultaneously check the connection from psql, when we are waiting for query input in gets_interactive function or have a separate thread to check the connection status periodically, this might work only in case of psql but will not work for application which uses libpq. Amit had also suggested one solution in [4], where he proposed 'I have also tried calling closesocket() explicitly in our function socket_close which has changed the error message to "could not receive data from server: Software caused connection abort (0x00002745/10053)".'
Or
Should we add some documentation for the above behavior.

Thoughts?

[1] https://www.postgresql.org/message-id/flat/CA%2BhUKGJowQypXSKsjws9A%2BnEQDD0-mExHZqFXtJ09N209rCO5A%40mail.gmail.com#0629f079bc59ecdaa0d6ac9f8f2c18ac
[2] https://www.postgresql.org/message-id/87k1iy44fd.fsf@news-spur.riddles.org.uk
[3] https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket
[4] https://www.postgresql.org/message-id/CAA4eK1%2BGNyjaPK77y%2Beuh5eAgM75pncG1JYZhxYZF%2BSgS6NpjA%40mail.gmail.com

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [HACKERS] Block level parallel vacuum
Next
From: Andy Fan
Date:
Subject: what is the purpose to use 3 function to compare cost add_path/set_cheapest/get_cheapest_fractional_path