Thread: pgsql: Fix waitpid() emulation on Windows.

From: Thomas Munro

Fix waitpid() emulation on Windows.

Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid().  The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.

Fix by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously.  The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.
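
The effect of the handle's lifetime on PID reuse can be illustrated with
a small portable toy model.  This is only a sketch, not the postmaster
code: the toy_* names, the four-slot PID table and the lowest-free-PID
policy are all hypothetical, chosen to make recycling happen immediately.
The one property taken from the description above is that a PID stays
reserved for as long as a handle to the process remains open.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_PIDS 4			/* tiny PID space, so reuse is immediate */

/* Toy "kernel" state: a PID is reserved while either flag is set. */
typedef struct
{
	bool		running;
	bool		handle_open;
} ToyProc;

static ToyProc toy_kernel[TOY_MAX_PIDS];

/* Start a child on the lowest free PID; returns -1 if none is free. */
static int
toy_spawn(void)
{
	for (int pid = 0; pid < TOY_MAX_PIDS; pid++)
	{
		if (!toy_kernel[pid].running && !toy_kernel[pid].handle_open)
		{
			toy_kernel[pid].running = true;
			toy_kernel[pid].handle_open = true;
			return pid;
		}
	}
	return -1;
}

/*
 * The child exits.  In the old scheme (modelled loosely here), an
 * asynchronous callback released the handle right away, before the
 * parent had updated its book-keeping.
 */
static void
toy_child_exits(int pid, bool close_in_callback)
{
	toy_kernel[pid].running = false;
	if (close_in_callback)
		toy_kernel[pid].handle_open = false;	/* PID can be recycled now */
}

/* Parent-side reap: the handle is released synchronously, here. */
static void
toy_reap(int pid)
{
	toy_kernel[pid].handle_open = false;
}

/*
 * Spawn A, let it exit, then spawn B before the parent reaps A.
 * Returns true if B received A's PID while A was still being tracked.
 */
static bool
toy_pid_collision(bool close_in_callback)
{
	memset(toy_kernel, 0, sizeof(toy_kernel));

	int			a = toy_spawn();

	toy_child_exits(a, close_in_callback);

	int			b = toy_spawn();
	bool		collided = (a == b);

	toy_reap(a);
	return collided;
}
```

In this model, toy_pid_collision(true) reports a collision because the
callback frees the PID early, while toy_pid_collision(false) does not:
the second child gets a different PID, and the first PID becomes
available again only after toy_reap() has run.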

This seems to explain a couple of failures on CI.  It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de

Branch
------
REL_13_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/9f1c64018549e28a6f6aa5ad0d5917085520efdd

Modified Files
--------------
src/backend/postmaster/postmaster.c | 70 +++++++++++++++++++++----------------
1 file changed, 40 insertions(+), 30 deletions(-)