Hi,

Right now, using pg_ctl -w to shut down clusters with somewhat larger
shared_buffers will very frequently fail, because the shutdown
checkpoint takes much longer than the default 60s timeout. Obviously
that can be addressed by manually setting PGCTLTIMEOUT to something
higher, but forcing many users to do that doesn't seem right. And while
many users probably don't want to time out aggressively on the shutdown
checkpoint, I'd assume most do want to time out aggressively if the
server doesn't actually start the checkpoint.

I wonder if we need to split the timeout into two: one value for the
postmaster to acknowledge the action, and one for that action to
complete. It seems to me that that'd be useful for all of starting,
restarting and stopping.
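
To make the two-timeout idea concrete, here is a rough sketch of what such
a wait loop could look like. It is not pg_ctl's actual code: the names
WaitResult, Observation, wait_two_phase, FakePostmaster and simulate are
all hypothetical, and the timeouts are counted in polling steps rather
than seconds to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes for this sketch; not pg_ctl's own. */
typedef enum { WAIT_DONE, WAIT_NOT_ACKED, WAIT_NOT_COMPLETED } WaitResult;

/* Hypothetical result of looking at the postmaster once. */
typedef enum { OBS_NO_CHANGE, OBS_ACKNOWLEDGED, OBS_COMPLETED } Observation;

typedef Observation (*probe_fn) (void *arg);

/*
 * Wait for an action using two separate limits, both in polling steps:
 * ack_timeout bounds the wait until the postmaster acknowledges the
 * request, complete_timeout bounds the wait after the acknowledgement.
 */
WaitResult
wait_two_phase(probe_fn probe, void *arg, int ack_timeout, int complete_timeout)
{
    bool acked = false;
    int  waited = 0;            /* polling steps spent in the current phase */

    assert(probe != NULL);

    for (;;)
    {
        Observation obs = probe(arg);

        if (obs == OBS_COMPLETED)
            return WAIT_DONE;

        if (obs == OBS_ACKNOWLEDGED && !acked)
        {
            acked = true;       /* switch to the (longer) second timeout */
            waited = 0;
        }

        if (waited >= (acked ? complete_timeout : ack_timeout))
            return acked ? WAIT_NOT_COMPLETED : WAIT_NOT_ACKED;

        waited++;
        /* A real caller would sleep for one polling interval here. */
    }
}

/* Fake postmaster for trying the loop out: acknowledges at probe number
 * ack_at and completes at probe number done_at (-1 means "never"). */
typedef struct { int calls; int ack_at; int done_at; } FakePostmaster;

static Observation
fake_probe(void *arg)
{
    FakePostmaster *pm = arg;
    int n = pm->calls++;

    if (pm->done_at >= 0 && n >= pm->done_at)
        return OBS_COMPLETED;
    if (pm->ack_at >= 0 && n >= pm->ack_at)
        return OBS_ACKNOWLEDGED;
    return OBS_NO_CHANGE;
}

WaitResult
simulate(int ack_at, int done_at, int ack_timeout, int complete_timeout)
{
    FakePostmaster pm = {0, ack_at, done_at};

    return wait_two_phase(fake_probe, &pm, ack_timeout, complete_timeout);
}
```

With a short first limit and a long second one, simulate(2, 100, 5, 600)
succeeds even though completion takes far longer than the first limit,
while simulate(-1, -1, 5, 600) gives up quickly because the request is
never acknowledged.
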

I think we have all the necessary information in the pid file; we would
just need to check for PM_STATUS_STARTING for start, and
PM_STATUS_STOPPING for restart/stop.
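
As an illustration of that check, the sketch below tests the status line
of a pid file that has already been read into memory. It assumes the
layout described in src/include/utils/pidfile.h, with the status on line
8 and the strings "starting" and "stopping"; the macros are redefined
here only so the example stands alone, and pidfile_status_is is a
hypothetical helper, not an existing function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed to mirror utils/pidfile.h; repeated so the sketch is standalone. */
#define LOCK_FILE_LINE_PM_STATUS 8
#define PM_STATUS_STARTING "starting"
#define PM_STATUS_STOPPING "stopping"

/*
 * Return true if line LOCK_FILE_LINE_PM_STATUS of the pid file contents
 * begins with the given status string.  Returns false if the file has
 * fewer lines than that, e.g. because it is still being written.
 */
bool
pidfile_status_is(const char *contents, const char *status)
{
    const char *line = contents;

    assert(contents != NULL && status != NULL);

    /* Skip forward to the start of the status line. */
    for (int i = 1; i < LOCK_FILE_LINE_PM_STATUS; i++)
    {
        line = strchr(line, '\n');
        if (line == NULL)
            return false;
        line++;
    }

    return strncmp(line, status, strlen(status)) == 0;
}
```

A waiting client could treat the first sighting of PM_STATUS_STOPPING
(or PM_STATUS_STARTING for start) as the acknowledgement, and only then
switch to the longer completion timeout.
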

Comments?

Greetings,
Andres Freund