Re: Windows: Wrong error message at connection termination - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: Windows: Wrong error message at connection termination
Msg-id: 3615445.1637530289@sss.pgh.pa.us
In response to: Re: Windows: Wrong error message at connection termination (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers

Thomas Munro <thomas.munro@gmail.com> writes:
> Hmm.  Well, if I understand how this works (and I'm not too familiar
> with this Windows code so maybe I don't), the postmaster duplicates
> the socket into the child process (see
> {write,read}_inheritable_socket()) and then closes its own handle (see
> ServerLoop()'s call to StreamClose(port->sock)).  What if the
> postmaster kept the socket open, and then closed its copy after the
> child exits?

Ugh :-(.  For starters, we risk running out of FDs in the postmaster,
don't we?

I did some tracing just now and convinced myself that socket_close is
the first on_proc_exit callback registered in an ordinary backend,
and therefore the last action done by proc_exit_prepare.  The only
things that happen after that are PROFILE_PID_DIR setup (not relevant
in production builds), an elog(DEBUG) call, and any atexit callbacks
that third-party code might have registered.
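
To make the ordering argument concrete, here is a minimal standalone sketch of a last-in, first-out exit-callback list.  It is not the backend's actual code: register_exit_callback, run_exit_callbacks, and the three demo_* callbacks are hypothetical stand-ins, and the relative order of the two non-socket callbacks is made up for illustration.  The only property it is meant to show is the one relied on above: whatever is registered first runs last.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CALLBACKS 8

typedef void (*exit_callback) (int code);

/* Hypothetical model of an exit-callback list; not PostgreSQL's real one. */
static exit_callback callbacks[MAX_CALLBACKS];
static int	n_callbacks = 0;

/* Records the order in which the callbacks actually ran. */
static char trace[128];

static void
note(const char *name)
{
	strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, " ", sizeof(trace) - strlen(trace) - 1);
}

/* Append a callback; later registrations sit at higher indexes. */
static void
register_exit_callback(exit_callback fn)
{
	assert(n_callbacks < MAX_CALLBACKS);
	callbacks[n_callbacks++] = fn;
}

/* Run callbacks from the highest index down, i.e. in reverse order. */
static void
run_exit_callbacks(int code)
{
	while (--n_callbacks >= 0)
		callbacks[n_callbacks] (code);
	n_callbacks = 0;
}

/* Stand-in callbacks (assumed names); each just logs that it ran. */
static void demo_socket_close(int code)     { (void) code; note("socket_close"); }
static void demo_shmem_detach(int code)     { (void) code; note("shmem_detach"); }
static void demo_procarray_remove(int code) { (void) code; note("procarray_remove"); }

/* Register socket_close first, then the others, and report the run order. */
static const char *
demo_exit_order(void)
{
	trace[0] = '\0';
	n_callbacks = 0;
	register_exit_callback(demo_socket_close);
	register_exit_callback(demo_shmem_detach);
	register_exit_callback(demo_procarray_remove);
	run_exit_callbacks(0);
	return trace;
}
```

Calling demo_exit_order() from a main() and printing the result shows socket_close at the end of the trace, even though it was registered first.  The same sketch also shows the fragility: anything registered ahead of demo_socket_close would run after it.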

If you're willing to avert your eyes from the question of what atexit
callbacks might do, then it'd be okay to do closesocket in socket_close,
reasoning that the backend has certainly disconnected itself from shmem
and so on, and thus is effectively done even if it is still a live process
so far as the kernel is concerned.  So maybe Lars' proposed patch is
acceptable after all.  It feels a bit shaky, but when we're sitting atop
a piece-of-junk TCP stack, we can't really have the guarantees we'd like.

The main way in which it's shaky is that future rearrangements of the
shutdown sequence, or additions of new on_proc_exit callbacks, could
create a situation where socket_close is no longer the last interesting
action.  We could imagine doing something to make it less likely for
that to happen accidentally, but I'm not sure it's worth the trouble.

Essentially this is reverting 268313a95 of 2003-05-29.  The commit
message for that fails to cite any mailing-list discussion, but after
some digging in the archives I think I did it in response to

https://www.postgresql.org/message-id/flat/009c01c31ce9%24eeaf00f0%24fb02a8c0%40muskrat

where the complaint was that a DB couldn't be dropped because a
just-closed connection was still live so far as the server was concerned.
We didn't do anything to make PQfinish() synchronous, so the problem is
really still there; but the idea was that other client libraries could
make session-close synchronous if they wanted.  For that purpose,
being out of the ProcArray is really sufficient, and I think it's safe
to suppose that socket_close must run after that.

            regards, tom lane


