I was looking at the notify processing in async.c and noticed that
kill is called whether or not the process has already been signalled,
and whether or not 'this' process has signalled it. That seems
unnecessary to me, especially on Win32, where pgkill is implemented
as a CallNamedPipe.
My understanding is that a signal is a fairly expensive operation at
the best of times, and particularly so when it is turned from
fire-and-forget into an RPC with scheduling.
I appreciate that the signal also serves to detect whether a process
is dead, but it is questionable whether that check should be done by
peers at all, when the information is out of date the moment it is
obtained and a crash can be detected definitively in the master
process anyway.
So:

1) Why do the RPC at all, rather than detect death from the master
process?

2) Why not use the existing compare-and-set atomic infrastructure to
maintain a 'pending signal' flag (or flags) in struct PGPROC, and
elide signals that are flagged but not yet marked as processed by the
target process? (A sketch of what I mean follows this list.)

3) If we do both of the above, would it not be cleaner to use an fd
with a local datagram socket rather than a signal on nearly all
systems, and a semaphore on Win32, so it's all picked up in select or
WaitForMultipleObjects? (Also sketched further below.)
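
Something along the following lines is what I have in mind for (2).
It is only a sketch: the notifyPending field does not exist in the
real struct PGPROC, the function names are invented, and I have used
C11 atomics rather than the backend's own primitives for brevity:

    #include <stdatomic.h>
    #include <signal.h>
    #include <sys/types.h>

    /* Stripped-down stand-in for struct PGPROC; the notifyPending
     * field is hypothetical, not an existing member. It would be
     * initialised with ATOMIC_FLAG_INIT at proc setup. */
    typedef struct FakePGPROC
    {
        pid_t       pid;
        atomic_flag notifyPending;  /* set => a signal is in flight */
    } FakePGPROC;

    /* Sender: issue kill() only when no signal is already pending.
     * test_and_set returns the previous state, so 'true' means some
     * peer already signalled the target and we can elide ours. */
    static void
    signal_backend_once(FakePGPROC *proc)
    {
        if (!atomic_flag_test_and_set(&proc->notifyPending))
            kill(proc->pid, SIGUSR2);
    }

    /* Receiver: clear the flag before reading the queue, so a notify
     * that arrives mid-processing still earns a fresh signal. */
    static void
    notify_interrupt_handler(FakePGPROC *myproc)
    {
        atomic_flag_clear(&myproc->notifyPending);
        /* ... ProcessIncomingNotify() ... */
    }

The ordering matters: the receiver clears before processing, never
after, or a notify landing in that window would be silently dropped.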
I know the comment in async.c is: 'but we do still send a SIGUSR2
signal, just in case that backend missed the earlier signal for some
reason.' But that seems somewhat lame - multiple signals may get
compressed into one, but does any system actually *lose* them?
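
A datagram wakeup as in (3) would make that question moot, since
nothing is lost or raced. The sort of thing I mean is below, with all
names illustrative and MSG_DONTWAIT standing in for a proper
non-blocking setup:

    #include <sys/socket.h>
    #include <sys/select.h>

    static int wakeup_fds[2];   /* [0] read end, [1] write end */

    /* Hypothetical setup: one AF_UNIX datagram socketpair per
     * backend. */
    static int
    wakeup_init(void)
    {
        return socketpair(AF_UNIX, SOCK_DGRAM, 0, wakeup_fds);
    }

    /* Sender side: a one-byte datagram replaces kill(). */
    static void
    wakeup_send(int write_fd)
    {
        char b = 0;
        (void) send(write_fd, &b, 1, MSG_DONTWAIT);
    }

    /* Backend side: the read end simply joins the select() the
     * backend already does on its client connection. */
    static void
    backend_wait(int client_fd)
    {
        fd_set rfds;
        int    maxfd = (client_fd > wakeup_fds[0]) ? client_fd
                                                   : wakeup_fds[0];

        FD_ZERO(&rfds);
        FD_SET(client_fd, &rfds);
        FD_SET(wakeup_fds[0], &rfds);

        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) > 0 &&
            FD_ISSET(wakeup_fds[0], &rfds))
        {
            char buf[64];
            /* Drain everything queued: pending datagrams coalesce
             * into one wakeup, just like compressed signals. */
            while (recv(wakeup_fds[0], buf, sizeof(buf),
                        MSG_DONTWAIT) > 0)
                ;
            /* ... process incoming notifies ... */
        }
    }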
It also occurred to me that we should not kill as we go, but
accumulate a set of pids to signal and then signal each one after the
iteration is complete, so that we do as little work as possible while
the pg_notify resources are held, and certainly no system calls if we
can help it. Something like the sketch below.
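
Roughly like this; the Listener list and the lock/unlock callbacks
are placeholders for the real async.c structures, and the fixed-size
array is only to keep the sketch short:

    #include <signal.h>
    #include <stddef.h>
    #include <sys/types.h>

    typedef struct Listener
    {
        pid_t            pid;
        struct Listener *next;
    } Listener;

    static void
    notify_all(Listener *listeners,
               void (*lock)(void), void (*unlock)(void))
    {
        pid_t  pending[256];
        size_t npending = 0;

        /* Collect pids under the lock: no system calls here. */
        lock();
        for (Listener *l = listeners; l && npending < 256; l = l->next)
            pending[npending++] = l->pid;
        unlock();

        /* Signal only after the shared notify resources are free. */
        for (size_t i = 0; i < npending; i++)
            kill(pending[i], SIGUSR2);
    }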
James