Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> 1) Is there still some reason have negative value in postmaster.pid?
>
> Just to distinguish postmasters from standalone backends in the error
> messages. I think that's still useful.
I'm not sure what you mean. The value is used only in the CreatePidFile
function, and if the directory is already locked by some process, I don't
see any useful reason to know whether that process is a postmaster or a
standalone backend.
(PS: Is a standalone backend the same as the --single switch?)
>> 2) Why 100? What race condition should happen? This piece of code looks
>> like kind of magic.
>
> There are at least two race cases identified in the comments in the
> loop.
Yes, there are, but they do not make sense to me. If I want to open the
file and another process removes it, why should I try to create it again
when that other process is about to do so itself?
The only reason I can see is that a user deletes the file manually, but
in that case I don't believe the administrator would hit exactly the
right moment.
Or, if it still makes sense to do it this way, I would expect a sleep
between attempts instead of a loop whose duration depends on CPU speed.
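
To illustrate what I mean by a sleep between attempts, here is a rough
sketch, not the actual PostgreSQL code. The function name
create_exclusive_with_retry and its parameters are hypothetical, and the
step where the real loop inspects the existing file is only marked by a
comment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict -std modes */
#endif

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to create "path" exclusively, at most
 * max_tries times, sleeping delay_ms milliseconds between attempts
 * when the file already exists.
 *
 * Returns an open file descriptor on success. Returns -1 on failure,
 * with errno set to EEXIST if every attempt found an existing file,
 * or to the error reported by open() otherwise.
 */
static int
create_exclusive_with_retry(const char *path, int max_tries, long delay_ms)
{
    struct timespec delay;
    int         i;

    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (i = 0; i < max_tries; i++)
    {
        int         fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

        if (fd >= 0)
            return fd;          /* we own the file now */
        if (errno != EEXIST)
            return -1;          /* unrelated failure, do not retry */

        /*
         * A real implementation would examine the existing file here
         * (read the PID, check whether its owner is alive, and so on).
         * This sketch only waits a fixed wall-clock time, so the retry
         * window does not depend on CPU speed.
         */
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL);
    }

    errno = EEXIST;
    return -1;
}
```

With, say, max_tries = 100 and delay_ms = 10, the whole retry window is
bounded at about one second regardless of how fast the machine is.
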
>> 3) Why pid checking and cleanup is in postgres? I think it is role of
>> pg_ctl or init scripts.
>
> Let's see, instead of one place in the postgres code we should do it in
> N places in different init scripts, and just trust to luck that a
> particular installation is using an init script that knows to do that?
> I don't think so. Besides, how is the init script going to remove it
> again? It won't still be running when the postmaster exits.
I'm sorry, I meant: why is there cleanup of a pid file that was left
behind after another postmaster crashed? Many applications only check
whether a pid file exists and, if so, exit. The rest is left to the start
script or some other monitoring facility.
>> 4) The following condition is buggy, because atoi function does not have
>> defined result if parameter is not valid number.
>
>> if (other_pid <= 0)
>
> It's not actually trying to validate the syntax of the lock file, only
> to make certain it doesn't trigger any unexpected behavior in kill().
I'm not sure we are talking about the same place. kill() is called after
this if. Leaving aside that atoi need not return 0 when it fails, the
following condition is more accurate:
if (other_pid == 0)
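
For comparison only, this is roughly what a parse with defined failure
behavior could look like if it were ever wanted, using strtol instead of
atoi. It is a sketch under my own assumptions: parse_lock_pid is a
hypothetical helper, not an existing PostgreSQL function, and it treats a
negative value as the standalone-backend marker discussed in point 1.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the first line of a lock file into a PID.
 *
 * On success, stores the magnitude of the value in *pid_out and
 * returns 0. A negative value in the file (assumed here to mark a
 * standalone backend) is accepted and its sign is dropped.
 *
 * Returns -1 if the text has no digits, has trailing junk other than
 * a newline, is out of int range, or is zero.
 */
static int
parse_lock_pid(const char *buf, long *pid_out)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(buf, &end, 10);

    if (end == buf)
        return -1;              /* no digits at all */
    if (errno == ERANGE || val > INT_MAX || val < -INT_MAX)
        return -1;              /* does not fit in an int-sized PID */
    if (*end != '\0' && *end != '\n')
        return -1;              /* trailing garbage after the number */
    if (val == 0)
        return -1;              /* 0 is never a valid target for kill() */

    *pid_out = (val < 0) ? -val : val;
    return 0;
}
```

Unlike atoi, strtol reports through "end" whether any digits were
consumed, so a non-numeric file is rejected explicitly rather than by
relying on whatever atoi happens to return.
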
> I don't think I've yet seen any reports that suggest that more syntax
> checking of the lock file would be a useful activity.
Yes, I agree.
Zdenek