pgsql: Fix waitpid() emulation on Windows. - Mailing list pgsql-committers

From: Thomas Munro
Subject: pgsql: Fix waitpid() emulation on Windows.
Msg-id: E1pcFGn-003a4f-3d@gemulon.postgresql.org
List: pgsql-committers

Fix waitpid() emulation on Windows.

Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid().  The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.

Fix by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously.  The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.
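
The ordering problem described above can be illustrated with a toy model.
This is only a sketch, not code from postmaster.c and not the Windows API:
the names toy_spawn(), toy_close_handle(), track() and
entries_sharing_first_pid() are all hypothetical, and the "OS" is assumed
to hand out the lowest PID that has no open handle.  It compares closing
the handle as soon as the child exits (the old callback behaviour) with
closing it only when the parent reaps the child (the new behaviour).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PIDS     4   /* tiny PID space so reuse happens immediately */
#define MAX_CHILDREN 8

/* Toy "OS": a PID can be handed out again only once its handle is closed. */
static bool handle_open[MAX_PIDS];

static int toy_spawn(void)
{
    for (int pid = 0; pid < MAX_PIDS; pid++)
    {
        if (!handle_open[pid])
        {
            handle_open[pid] = true;
            return pid;
        }
    }
    return -1;                  /* PID space exhausted */
}

static void toy_close_handle(int pid)
{
    handle_open[pid] = false;
}

/* Toy parent bookkeeping: PIDs the parent believes are running children. */
static int tracked[MAX_CHILDREN];
static int ntracked;

static void track(int pid)
{
    assert(pid >= 0 && ntracked < MAX_CHILDREN);
    tracked[ntracked++] = pid;
}

static int count_tracked(int pid)
{
    int n = 0;

    for (int i = 0; i < ntracked; i++)
        if (tracked[i] == pid)
            n++;
    return n;
}

/*
 * Start a child, let it exit, start a second child before the parent has
 * reaped the first, and report how many tracked entries carry the first
 * child's PID at that moment.
 */
int entries_sharing_first_pid(bool close_in_waitpid)
{
    for (int i = 0; i < MAX_PIDS; i++)
        handle_open[i] = false;
    ntracked = 0;

    int first = toy_spawn();
    track(first);

    /* The first child exits here. */
    if (!close_in_waitpid)
        toy_close_handle(first);    /* old ordering: PID freed before reaping */

    /* The parent launches another child before it has reaped the first. */
    int second = toy_spawn();
    track(second);

    int sharing = count_tracked(first);

    /* The parent finally reaps the first child. */
    if (close_in_waitpid)
        toy_close_handle(first);    /* new ordering: PID freed only now */

    return sharing;
}
```

In this model entries_sharing_first_pid(false) returns 2, because the
second child is given the recycled PID while the first is still tracked,
and entries_sharing_first_pid(true) returns 1, because the open handle
keeps the first PID reserved until the parent has reaped it.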

This seems to explain a couple of failures on CI.  It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/9b6e0b9c37d644bc99f7c79e01b388f6a3648387

Modified Files
--------------
src/backend/postmaster/postmaster.c | 70 +++++++++++++++++++++----------------
1 file changed, 40 insertions(+), 30 deletions(-)

