Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED - Mailing list pgsql-hackers

From Alexander Lakhin
Subject Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED
Date
Msg-id 354d6027-e33f-ad66-6c48-27ca8d2458ca@gmail.com
Whole thread Raw
In response to Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED  (Andres Freund <andres@anarazel.de>)
Responses Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED
List pgsql-hackers
Hi,
14.03.2023 01:20, Andres Freund wrote:
>> I am yet to construct a reproduction of the case, but it seems to me that
>> the race condition is not impossible here.
> I suspect the issue could be made much more likely by adding a sleep before
> the pg_queue_signal(SIGCHLD) in pgwin32_deadchild_callback().

Thanks for the tip! With pg_usleep(50000) added there, I can reproduce the issue
reliably during a minute on average with the 099_check_pids.pl I posted before:
...
2023-03-15 07:26:14.301 GMT|[unknown]|[unknown]|3748|64117316.ea4|LOG:  
connection received: host=127.0.0.1 port=49902
2023-03-15 07:26:14.302 GMT|postgres|postgres|3748|64117316.ea4|LOG:  connection 
authorized: user=postgres database=postgres application_name=099_check-pids.pl
2023-03-15 07:26:14.304 GMT|postgres|postgres|3748|64117316.ea4|LOG:  statement: 
SELECT pg_backend_pid()
2023-03-15 07:26:14.305 GMT|postgres|postgres|3748|64117316.ea4|LOG:  
disconnection: session time: 0:00:00.005 user=postgres database=postgres 
host=127.0.0.1 port=49902
...
2023-03-15 07:26:25.592 GMT|[unknown]|[unknown]|3748|64117321.ea4|LOG:  
connection received: host=127.0.0.1 port=50407
TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED"), 
File: "C:\src\postgresql\src\backend\storage\ipc\pmsignal.c", Line: 329, PID: 3748
abort() has been called2023-03-15 07:26:25.608 
GMT|[unknown]|[unknown]|3524|64117321.dc4|LOG:  connection received: 
host=127.0.0.1 port=50408

The result depends on some OS conditions (it reproduced pretty well
immediately after VM reboot), but it's enough to test the patch proposed.
And I can confirm that the Assert is not observed anymore (with the sleep
added after CloseHandle(childinfo->procHandle)).

Best regards,
Alexander



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: psql \watch 2nd argument: iteration count
Next
From: Peter Eisentraut
Date:
Subject: Re: meson: Non-feature feature options