Windows: Wrong error message at connection termination - Mailing list pgsql-hackers

From Lars Kanis
Subject Windows: Wrong error message at connection termination
Date
Msg-id 90b34057-4176-7bb0-0dbb-9822a5f6425b@greiz-reinsdorf.de
List pgsql-hackers

Dear hackers,

I recently had a hard time finding the root cause of some weird behavior with the async API of libpq when running client and server on Windows. When the connection aborts with an error, most notably an error during connection setup, it sometimes fails with the wrong error message:

Instead of:

    connection to server at "::1", port 5433 failed: FATAL:  role "a" does not exist

it fails with:

    connection to server at "::1", port 5433 failed: server closed the connection unexpectedly

I found out that the recv() function of the Winsock API has some weird behavior. If the connection receives a TCP RST flag, recv() immediately returns -1, regardless of whether all previously received data has been retrieved. So when the connection is closed hard, the behavior on the client side is timing dependent: the last packet is either dropped or delivered to libpq, depending on whether libpq calls recv() quickly enough.
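To make the outcomes concrete, here is a minimal sketch of a client-side read loop that distinguishes a clean end of stream from a reset. It is not libpq code: the names drain_socket and drain_result are hypothetical, a blocking socket is assumed, and the POSIX error convention is used. On Winsock the same case shows up as recv() returning SOCKET_ERROR with WSAGetLastError() reporting WSAECONNRESET.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result codes for this sketch (not part of libpq). */
typedef enum
{
    DRAIN_EOF,      /* peer closed gracefully; everything sent was read */
    DRAIN_RESET,    /* connection was reset; trailing data may be missing */
    DRAIN_ERROR,    /* some other recv() failure */
    DRAIN_FULL      /* buffer filled before the stream ended */
} drain_result;

/*
 * Read from a blocking socket until end of stream, an error, or a full
 * buffer. *len receives the number of bytes stored in buf.
 */
drain_result
drain_socket(int fd, char *buf, size_t cap, size_t *len)
{
    *len = 0;
    while (*len < cap)
    {
        ssize_t n = recv(fd, buf + *len, cap - *len, 0);

        if (n > 0)
        {
            *len += (size_t) n;
            continue;
        }
        if (n == 0)
            return DRAIN_EOF;
        if (errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        return (errno == ECONNRESET) ? DRAIN_RESET : DRAIN_ERROR;
    }
    return DRAIN_FULL;
}
```

With a graceful close the loop ends in DRAIN_EOF after the final error packet has been read. With a hard close on Windows it can end in DRAIN_RESET before that packet is seen, which is the case that produces the "server closed the connection unexpectedly" message.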

This behavior is described in the closesocket() documentation:
https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket

This is called a hard or abortive close, because the socket's virtual circuit is reset immediately, and any unsent data is lost. On Windows, any recv call on the remote side of the circuit will fail with WSAECONNRESET.

Unfortunately, a Windows PostgreSQL server closes every connection hard, with the TCP RST flag. That in turn is due to another Winsock API behavior: every socket that wasn't closed by the application is closed hard, with the RST flag, at process termination. I didn't find any official documentation of this behavior.

Explicitly closing the socket before process termination leads to a graceful close, even on Windows. That is what the attached patch does. I think delivering the correct error message to the user is much more important than closing the process in sync with the socket.
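As an illustration of the idea only, and not the code from the attached patch, an explicit close before exit could look like the sketch below. The helper name close_client_socket and the sock_t typedef are hypothetical, and the shutdown() call is an extra precaution assumed here: the point above needs only the explicit close.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#define SHUT_SEND_SIDE SD_SEND
#define CLOSE_SOCK(s) closesocket(s)
#else
#include <sys/socket.h>
#include <unistd.h>
typedef int sock_t;
#define SHUT_SEND_SIDE SHUT_WR
#define CLOSE_SOCK(s) close(s)
#endif

/*
 * Hypothetical helper: close a client socket explicitly before the
 * process exits, instead of leaving the socket to process teardown.
 * Returns the result of the close call (0 on success).
 */
int
close_client_socket(sock_t sock)
{
    /*
     * Signal that no more data will be sent. Bytes already queued are
     * still transmitted. A failure is ignored because the peer may
     * already be gone. (Assumption of this sketch, see lead-in.)
     */
    (void) shutdown(sock, SHUT_SEND_SIDE);

    /* Release the socket explicitly rather than at process termination. */
    return CLOSE_SOCK(sock);
}
```

Calling such a helper as the last step before the process exits lets the peer read the final error message and then see a normal end of stream.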


Some background: I'm the maintainer of ruby-pg, the PostgreSQL client library for Ruby. The next version of ruby-pg will switch to the async API for connection setup. Using this API changes the timing of socket operations and therefore often leads to the wrong message shown above. Previous versions used the sync API, which usually doesn't suffer from this issue. The original issue is here: https://github.com/ged/ruby-pg/issues/404

--

Kind Regards
Lars Kanis


