Thread: pgsql: Fix waitpid() emulation on Windows.

pgsql: Fix waitpid() emulation on Windows.

From
Thomas Munro
Date:
Fix waitpid() emulation on Windows.

Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid().  The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.

Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously.  The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.

This seems to explain a couple of failures on CI.  It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/d41a178b3a7ac0c2ca16f70129899ddabc2ce468

Modified Files
--------------
src/backend/postmaster/postmaster.c | 70 +++++++++++++++++++++----------------
1 file changed, 40 insertions(+), 30 deletions(-)