On Thu, May 08, 2003 at 16:30:11 -0400,
mlaks <mlaks@bellatlantic.net> wrote:
> Bruno, Thanks for your help.
>
> I checked: grep in /etc/rc.d/init.d agrees with what you said. Those
> /var/lock and /var/run files are commonly used by all of the services!
>
> Here's my problem:
>
> I had 4 out of 5 machines that got creamed this weekend, and all I needed to
> go in for was to erase the file /var/lib/pgsql/data/postmaster.pid.
> The same thing (with only one machine) happened about a month ago.
>
> I notice that in his script Lamar does this
>
> pid=`pidof -s postmaster`
> if [ $pid ]
> then
> echo $"Postmaster already running."
> else
> #all systems go -- remove any stale lock files
> rm -f /tmp/.s.PGSQL.* > /dev/null
> then he starts up.
>
> What I would be doing is simply adding in
>
> rm -f /var/lib/pgsql/data/postmaster.pid line.
>
> It looks like he isn't worried about getting rid of the /tmp/.s.PGSQL.* files as
> long as he ran pidof first.
> (Is /tmp/.s.PGSQL. also a kind of lock file? I don't know. Do you know
> what sets it up?)
Well, if there is no process with the pid recorded in postmaster.pid, then it
is safe to remove the file. If there is one, you have to confirm that it isn't
a postmaster before removing it.
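
That check could be sketched roughly as follows. This is only an
illustration: it assumes the first line of postmaster.pid holds the pid and
that "ps -p PID -o comm=" prints the command name on your system. The
function name stale_pidfile is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 only when the pid file looks stale, i.e.
# no live process has that pid, or the live process is not a postmaster.
stale_pidfile() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1            # no pid file: nothing to remove
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;             # not a plain number: leave it alone
    esac
    # Empty output means no process with that pid exists.
    comm=$(ps -p "$pid" -o comm= 2>/dev/null) || comm=
    case $comm in
        '') return 0 ;;                      # no such process: stale
        *postmaster*|*postgres*) return 1 ;; # a live postmaster: not stale
        *) return 0 ;;                       # pid reused by something else
    esac
}

# Possible use in an init script (path taken from the message above):
#   stale_pidfile /var/lib/pgsql/data/postmaster.pid &&
#       rm -f /var/lib/pgsql/data/postmaster.pid
```

Unlike an unconditional rm -f, this leaves the file alone whenever a
postmaster might still be using the data area.
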
> Also - what do you do about those files
>
> /tmp/.s.PGSQL.* ?
These are placeholders for the Unix domain sockets used for local connections.
>
> and what do you do about the possibility of supervise starting more than one
> of the postmasters?
I run more than one postmaster under supervise. It is simpler to set up than
making a bunch of different init scripts. Just make sure each postmaster uses
a different port and data area.
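
As a rough sketch of what that can look like, the function below prints a
supervise "run" script for one postmaster. The function name
print_run_script, the data directory, and the port are all made up for
illustration, and it assumes daemontools' setuidgid is installed and that
the database user is called postgres.

```shell
#!/bin/sh
# Hypothetical generator: prints a daemontools "run" script for one
# postmaster. $1 = data directory, $2 = TCP port (illustrative values).
print_run_script() {
    datadir=$1
    port=$2
    cat <<EOF
#!/bin/sh
# Started and restarted by supervise; runs as the postgres user.
exec setuidgid postgres postmaster -D "$datadir" -p "$port" 2>&1
EOF
}

# Example: a second postmaster on its own port and data area.
#   print_run_script /var/lib/pgsql/data2 5433
```

Each service directory gets its own run script, so two postmasters never
share a port or a data area.
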
> I like the idea of supervise starting me up again even without a reboot, and I
> just want to catch this problem and solve it.
>
> Thanks, mitchell