Re: Yet another way for pg_ctl stop to fail on Windows - Mailing list pgsql-hackers
| From | Alexander Lakhin |
|---|---|
| Subject | Re: Yet another way for pg_ctl stop to fail on Windows |
| Date | |
| Msg-id | 41b8c2d3-b86c-45f5-a198-4889c22665e0@gmail.com Whole thread |
| In response to | Re: Yet another way for pg_ctl stop to fail on Windows (Noah Misch <noah@leadboat.com>) |
| List | pgsql-hackers |
Hello Noah,
08.09.2024 19:53, Noah Misch wrote:
08.09.2024 19:53, Noah Misch wrote:
I see Microsoft suggests WaitNamedPipeA() as opposed to just polling. WaitNamedPipeA() should be more responsive. Given how rare this has been, it likely doesn't matter whether we use WaitNamedPipeA() or polling. I'd lean toward whichever makes the code simpler, probably polling.So if we aim to not only fix "pg_ctl stop", but to make pgkill() robust, it's the way to go, IMHO. I'm not sure about an infinite loop they show, I'd vote for a loop with the same characteristics as in pgwin32_open_handle().I agree with bounding the total time of each kill(), like pgwin32_open_handle() does for open().
While investigating a recent buildfarm failure [1] (which also happened
before: [2]):
# # --- C:/prog/bf/root/HEAD/pgsql/src/test/regress/expected/misc_functions.out 2026-04-08 01:57:31.191354200 +0000
# # +++ C:/prog/bf/root/HEAD/pgsql.build/testrun/pg_upgrade/002_pg_upgrade/data/results/misc_functions.out 2026-04-24 02:45:42.446717500 +0000
# # @@ -316,9 +316,10 @@
# # -- permissions are set properly.
# # --
# # SELECT pg_log_backend_memory_contexts(pg_backend_pid());
# # +WARNING: could not send signal to process 3940: Invalid argument
# # pg_log_backend_memory_contexts
# # --------------------------------
# # - t
# # + f
# # (1 row)
# #
# # SELECT pg_log_backend_memory_contexts(pid) FROM pg_stat_activity
# # @@ -347,9 +348,10 @@
# #
# # SET ROLE regress_log_memory;
# # SELECT pg_log_backend_memory_contexts(pg_backend_pid());
# # +WARNING: could not send signal to process 3940: Invalid argument
# # pg_log_backend_memory_contexts
# # --------------------------------
# # - t
# # + f
# # (1 row)
# #
# # RESET ROLE;
# # 1 of 248 tests failed.
# # The differences that caused some tests to fail can be viewed in the file "C:/prog/bf/root/HEAD/pgsql.build/testrun/pg_upgrade/002_pg_upgrade/data/regression.diffs".
# # A copy of the test summary that you see above is saved in the file "C:/prog/bf/root/HEAD/pgsql.build/testrun/pg_upgrade/002_pg_upgrade/data/regression.out".
# ------------------------------------
# Looks like you failed 1 test of 20.
I've managed to reproduced it locally, by running five
pg_upgrade/002_pg_upgrade tests (with
"SELECT pg_log_backend_memory_contexts(pg_backend_pid());" x100 in
misc_functions.sql) concurrently on one-core VM:
iteration 3
1/5 postgresql:pg_upgrade_5 / pg_upgrade_5/002_pg_upgrade OK 184.42s 5 subtests passed
2/5 postgresql:pg_upgrade_1 / pg_upgrade_1/002_pg_upgrade ERROR 184.79s exit status 1
pg_upgrade_1\002_pg_upgrade\log\002_pg_upgrade_old_node.log:
2026-04-26 13:13:44.238 UTC client backend[688] pg_regress/misc_functions WARNING: could not send signal to process 688: Invalid argument
With the logging added inside pgkill():
@@ -89,6 +93,9 @@ pgkill(int pid, int sig)
errno = EPERM;
return -1;
default:
+#ifndef FRONTEND
+elog(LOG, "!!!pgkill| CallNamedPipe() returned %d", GetLastError());
+#endif
errno = EINVAL; /* unexpected */
return -1;
}
and backtrace_functions = 'pgkill', I could see
2026-04-26 13:13:44.237 UTC client backend[688] pg_regress/misc_functions LOG: !!!pgkill| CallNamedPipe() returned 231
2026-04-26 13:13:44.237 UTC client backend[688] pg_regress/misc_functions BACKTRACE:
pgkill+0x15f [0x7ff6c05279ef] [C:\src\postgresql\src\port\kill.c:103]
SendProcSignal+0xb1 [0x7ff6c03202e1] [C:\src\postgresql\src\backend\storage\ipc\procsignal.c:341]
pg_log_backend_memory_contexts+0x7b [0x7ff6c040df1b] [C:\src\postgresql\src\backend\utils\adt\mcxtfuncs.c:301]
ExecInterpExpr+0x684 [0x7ff6c00ec264] [C:\src\postgresql\src\backend\executor\execExprInterp.c:630]
...
That is, pg_log_backend_memory_contexts(() fails due to the same
ERROR_PIPE_BUSY.
Moreover, I can see other instances of this error in the log, which go
unnoticed:
2026-04-26 13:12:19.049 UTC client backend[2948] pg_regress/subselect LOG: !!!pgkill| CallNamedPipe() returned 231
2026-04-26 13:12:19.049 UTC client backend[2948] pg_regress/subselect BACKTRACE:
pgkill+0x15f [0x7ff6c05279ef] [C:\src\postgresql\src\port\kill.c:103]
SendProcSignal+0xb1 [0x7ff6c03202e1] [C:\src\postgresql\src\backend\storage\ipc\procsignal.c:341]
SICleanupQueue+0x1e1 [0x7ff6c03252b1] [C:\src\postgresql\src\backend\storage\ipc\sinvaladt.c:677]
SIInsertDataEntries+0x91 [0x7ff6c0325511] [C:\src\postgresql\src\backend\storage\ipc\sinvaladt.c:411]
AtEOXact_Inval+0x86 [0x7ff6c04b0636] [C:\src\postgresql\src\backend\utils\cache\inval.c:1225]
CommitTransaction+0x2b4 [0x7ff6bffa5604] [C:\src\postgresql\src\backend\access\transam\xact.c:2477]
CommitTransactionCommandInternal+0x8e [0x7ff6bffa588e] [C:\src\postgresql\src\backend\access\transam\xact.c:3253]
CommitTransactionCommand+0x9 [0x7ff6bffa57e9] [C:\src\postgresql\src\backend\access\transam\xact.c:3213]
RemoveTempRelationsCallback+0x6a [0x7ff6bfff51fa] [C:\src\postgresql\src\backend\catalog\namespace.c:4710]
proc_exit_prepare+0xc6 [0x7ff6c03178d6] [C:\src\postgresql\src\backend\storage\ipc\ipc.c:199]
proc_exit+0x56 [0x7ff6c03177c6] [C:\src\postgresql\src\backend\storage\ipc\ipc.c:155]
PostgresMain+0xd9a [0x7ff6c034e2ea] [C:\src\postgresql\src\backend\tcop\postgres.c:5091]
BackendMain+0xd5 [0x7ff6c034acb5] [C:\src\postgresql\src\backend\tcop\backend_startup.c:124]
...
2026-04-26 13:12:45.010 UTC client backend[8104] pg_regress/replica_identity LOG: !!!pgkill| CallNamedPipe() returned 231
2026-04-26 13:12:45.010 UTC client backend[8104] pg_regress/replica_identity BACKTRACE:
pgkill+0x15f [0x7ff6c05279ef] [C:\src\postgresql\src\port\kill.c:103]
SendProcSignal+0xb1 [0x7ff6c03202e1] [C:\src\postgresql\src\backend\storage\ipc\procsignal.c:341]
SICleanupQueue+0x1e1 [0x7ff6c03252b1] [C:\src\postgresql\src\backend\storage\ipc\sinvaladt.c:677]
SIInsertDataEntries+0x91 [0x7ff6c0325511] [C:\src\postgresql\src\backend\storage\ipc\sinvaladt.c:411]
AtInplace_Inval+0x67 [0x7ff6c04b0727] [C:\src\postgresql\src\backend\utils\cache\inval.c:1270]
heap_inplace_update_and_unlock+0x266 [0x7ff6bff2a366] [C:\src\postgresql\src\backend\access\heap\heapam.c:6591]
systable_inplace_update_finish+0x20 [0x7ff6bff4c160] [C:\src\postgresql\src\backend\access\index\genam.c:904]
index_update_stats+0x21e [0x7ff6bffefefe] [C:\src\postgresql\src\backend\catalog\index.c:2982]
index_build+0x2fc [0x7ff6bffed2fc] [C:\src\postgresql\src\backend\catalog\index.c:3180]
index_create+0xc05 [0x7ff6bffef035] [C:\src\postgresql\src\backend\catalog\index.c:1291]
DefineIndex+0x12fc [0x7ff6c006544c] [C:\src\postgresql\src\backend\commands\indexcmds.c:1265]
ProcessUtilitySlow+0x924 [0x7ff6c0358524] [C:\src\postgresql\src\backend\tcop\utility.c:1566]
standard_ProcessUtility+0x9f0 [0x7ff6c0359ec0] [C:\src\postgresql\src\backend\tcop\utility.c:1078]
ProcessUtility+0x70 [0x7ff6c0357aa0] [C:\src\postgresql\src\backend\tcop\utility.c:531]
ProcessUtilitySlow+0x3a0 [0x7ff6c0357fa0] [C:\src\postgresql\src\backend\tcop\utility.c:1262]
standard_ProcessUtility+0x9f0 [0x7ff6c0359ec0] [C:\src\postgresql\src\backend\tcop\utility.c:1078]
ProcessUtility+0x70 [0x7ff6c0357aa0] [C:\src\postgresql\src\backend\tcop\utility.c:531]
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2026-04-24%2001%3A26%3A26
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2025-09-26%2018%3A15%3A06
Best regards,
Alexander
pgsql-hackers by date: