Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> Yet I don't think I've ever heard a programming recommendation to
>> save/restore errno in signal handlers...
> Agreed. I don't like this way.
Hmm, I don't like your patch and you don't like mine. Time to redesign
rather than patch ;-)
> I asked a Unix guru, and got a suggestion that we do not need to call
> wait() (and CleanupProc()) inside the signal handler. Instead we could
> have a null signal handler (it just calls pqsignal()) for SIGCHLD. If
> select() returns EINTR then we just call wait() and
> CleanupProc(). Moreover this would eliminate sigprocmask() or
> sigblock() calls currently done to avoid race conditions before going
> into the critical region. Of course we have to call wait() and
> CleanupProc() before select() to make sure that we have no waiting
> children.
This looks like it could be a really clean solution. In fact, there'd
be no need to check for EINTR from select(); we could just fall through,
knowing that the reaping will be done as soon as we loop around to the
top of the loop. The code becomes just
    for (;;) { reap; select; handle any input found by select; }
Do we even need a signal handler at all for SIGCHLD? I suppose the
select might not get interrupted (at least on some platforms) if there
isn't one.
Actually I guess there still is a race condition: there is a window
between the last wait() of the reap loop and the select() wherein an
SIGCHLD won't be serviced right away, because we hit the select() before
noticing it. We could maybe use a timeout on the select to fix that.
Don't really like it though, since the timeout couldn't be very long,
but we don't want the postmaster wasting cycles when there's nothing
to do. Is there another way around this?
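
To make the timeout idea concrete, a minimal C sketch follows. The
helpers wait_for_input and loop_until_reaped are hypothetical, and the
iteration cap exists only so the example terminates. A child that exits
inside the window is picked up at most one timeout later, which is
exactly the cycles-versus-latency tradeoff described above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for input on fd for at most timeout_ms milliseconds. */
static int wait_for_input(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/*
 * Reap, then select with a timeout, repeatedly. Stops as soon as at least
 * one child has been reaped (returns the count), returns 0 if max_iters
 * passes elapse with nothing to reap, or -1 on a real select() error.
 */
static int loop_until_reaped(int fd, long timeout_ms, int max_iters)
{
    int i;

    for (i = 0; i < max_iters; i++)
    {
        int status;
        int reaped = 0;

        while (waitpid(-1, &status, WNOHANG) > 0)
            reaped++;           /* CleanupProc() would be called here */
        if (reaped > 0)
            return reaped;

        /* A child dying right here waits at most timeout_ms to be noticed. */
        if (wait_for_input(fd, timeout_ms) < 0 && errno != EINTR)
            return -1;

        /* Handling of any input found by select() is omitted. */
    }
    return 0;
}
```

With a short timeout the loop wakes up many times a second even when
idle; with a long one, dead children linger unreaped for that long.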
regards, tom lane