Re: Questions about pid file creation code - Mailing list pgsql-hackers

From Zdenek Kotala
Subject Re: Questions about pid file creation code
Date
Msg-id 461170EF.1050401@sun.com
In response to Re: Questions about pid file creation code  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Questions about pid file creation code
Re: Questions about pid file creation code
List pgsql-hackers
Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> 1) Is there still some reason to have a negative value in postmaster.pid?
> 
> Just to distinguish postmasters from standalone backends in the error
> messages.  I think that's still useful.

I'm not sure what you mean. The negative value is used only in the
CreatePidFile function, and if the directory is already locked by some
process, I don't see any useful reason to know whether that process is a
postmaster or a standalone backend.

(PS: Is a standalone backend the same thing as the --single switch?)

>> 2) Why 100? What race condition should happen? This piece of code looks 
>> like kind of magic.
> 
> There are at least two race cases identified in the comments in the
> loop.

Yes, there are. But they do not make sense to me. If I try to open the
file and another process removes it, why would I want to try to create it
again when that other process is going to do so?

The only reason I can see is that a user deletes the file manually, but
in that case I don't believe an administrator would hit exactly the right
moment.

Or, if it still makes sense to do it this way, I would expect a sleep
rather than a loop whose duration depends on CPU speed.
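
To make the suggestion concrete, here is a rough sketch of a bounded retry
loop that pauses between attempts. It is not the PostgreSQL code; the
function name open_or_create_lockfile, the 10 ms pause, and the file mode
are assumptions made up for illustration. Only the retry count of 100 comes
from the discussion above.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict C modes */

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

#define MAX_ATTEMPTS 100          /* the count questioned in this thread */

/*
 * Hypothetical helper: create the lock file exclusively, or open the
 * existing one for reading.  If the file exists at create time but has
 * vanished by the time we open it, pause briefly and try again.
 * Returns a file descriptor, or -1 with errno set.
 */
static int
open_or_create_lockfile(const char *path, int *created)
{
    struct timespec pause = {0, 10 * 1000 * 1000};   /* 10 ms, arbitrary */
    int             attempt;

    for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++)
    {
        int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

        if (fd >= 0)
        {
            *created = 1;         /* we own a fresh lock file */
            return fd;
        }
        if (errno != EEXIST)
            return -1;            /* real failure, do not retry */

        fd = open(path, O_RDONLY);
        if (fd >= 0)
        {
            *created = 0;         /* caller must inspect the old file */
            return fd;
        }
        if (errno != ENOENT)
            return -1;

        /* The file disappeared between the two calls: wait, then retry. */
        nanosleep(&pause, NULL);
    }

    errno = EAGAIN;
    return -1;
}
```

With a pause like this, the total waiting time is bounded by wall-clock
time (about one second here) rather than by how fast the CPU can spin
through the loop.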

>> 3) Why is pid checking and cleanup in postgres? I think it is the role of
>> pg_ctl or init scripts.
> 
> Let's see, instead of one place in the postgres code we should do it in
> N places in different init scripts, and just trust to luck that a
> particular installation is using an init script that knows to do that?
> I don't think so.  Besides, how is the init script going to remove it
> again?  It won't still be running when the postmaster exits.

I'm sorry, I meant: why is there cleanup of a pid file that was left behind
after another postmaster crashed? Many applications only check whether a
pid file exists and, if it does, exit. The rest is left to the start script
or some other monitoring facility.
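
For reference, the kind of check being discussed (is the process named in a
leftover pid file still running?) can be sketched as below. This is an
illustration only, not the PostgreSQL implementation; the name pid_is_alive
is hypothetical, and treating EPERM as "alive" is a choice made for this
sketch.

```c
#define _POSIX_C_SOURCE 200809L   /* for kill() under strict C modes */

#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return 1 if a process with this PID appears to
 * exist, 0 otherwise.  Signal 0 performs the permission and existence
 * checks without delivering a signal.
 */
static int
pid_is_alive(pid_t pid)
{
    /* kill() gives 0 and negative values a process-group meaning. */
    if (pid <= 0)
        return 0;

    if (kill(pid, 0) == 0)
        return 1;

    /* EPERM: the process exists but belongs to another user. */
    return errno == EPERM;
}
```

An application that only tests for the existence of the pid file skips this
step and leaves the stale-file decision to whatever starts it.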

>> 4) The following condition is buggy, because the atoi function does not
>> have a defined result if the parameter is not a valid number.
> 
>>   if (other_pid <= 0)
> 
> It's not actually trying to validate the syntax of the lock file, only
> to make certain it doesn't trigger any unexpected behavior in kill().

I'm not sure we are talking about the same place. kill() is called after
this if. Setting aside that atoi need not return 0 when it fails, the
following condition would be more accurate:
  if (other_pid == 0)
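
To illustrate the atoi concern: atoi gives no way to tell a failed
conversion from a literal 0, and its behavior is undefined when the value
is out of range. A stricter parse with strtol could look like the sketch
below. The helper name parse_lockfile_pid is hypothetical, and the sketch
assumes that pid_t is no wider than int. Negative values are accepted
because the file may contain one, as discussed in point 1.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of PostgreSQL: parse the first line of a
 * lock file into a PID.  Returns 0 on success, -1 if the text is not a
 * nonzero integer in range.  Assumes pid_t is no wider than int.
 */
static int
parse_lockfile_pid(const char *text, pid_t *pid_out)
{
    char *end;
    long  val;

    errno = 0;
    val = strtol(text, &end, 10);

    if (end == text)                     /* no digits at all */
        return -1;
    if (errno == ERANGE)                 /* does not fit in a long */
        return -1;
    if (*end != '\0' && *end != '\n')    /* trailing garbage */
        return -1;
    if (val == 0 || val > INT_MAX || val < -INT_MAX)
        return -1;                       /* 0 or out of range for a PID */

    *pid_out = (pid_t) val;
    return 0;
}
```

For example, parse_lockfile_pid("abc", &pid) and parse_lockfile_pid("0", &pid)
both return -1, whereas atoi would return 0 for each without saying why.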


> I don't think I've yet seen any reports that suggest that more syntax
> checking of the lock file would be a useful activity.

Yes, I agree.
    Zdenek

