Re: FATAL: bogus data in lock file "postmaster.pid": "" - Mailing list pgsql-hackers

From Tom Lane
Subject Re: FATAL: bogus data in lock file "postmaster.pid": ""
Msg-id 15392.1346119150@sss.pgh.pa.us
In response to Re: FATAL: bogus data in lock file "postmaster.pid": ""  (Bruce Momjian <bruce@momjian.us>)
Responses Re: FATAL: bogus data in lock file "postmaster.pid": ""
List pgsql-hackers
Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Aug 27, 2012 at 07:39:35PM -0400, Tom Lane wrote:
>> I could get behind that, but I don't think the delay should be more than
>> 100ms or so.

> I took Alvaro's approach of a sleep.  The file test was already in a
> loop that went 100 times.  Basically, if the lock file exists, this
> postmaster isn't going to succeed, so I figured there is no reason to
> rush in the testing.  I gave it 5 tries with one second between
> attempts.  Either the file is being populated, or it is stale and empty.

How did "100ms" translate to 5 seconds?

> I checked pg_ctl and that has a default wait of 60 seconds, so 5 seconds
> to exit out of the postmaster should be fine.

pg_ctl is not the only consideration here.  In particular, there are a
lot of initscripts out there (all of Red Hat's, for instance) that don't
use pg_ctl and expect the postmaster to come up (or not) in a couple of
seconds.

I don't see a need for more than about one retry with 100ms delay.
There is no evidence that the case we're worried about has ever occurred
in the real world anyway, so slowing down error failures to make really
really really sure there's not a competing postmaster doesn't seem like
a good tradeoff.
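
For illustration only, the "one retry with 100ms delay" policy could be sketched as below. This is a minimal Python sketch, not the postmaster's actual C code: the function name `read_lock_file_pid` and its parameters `retries` and `delay_s` are hypothetical, and it assumes only that the first line of the lock file holds the owning PID. An empty file gets one more read after a short sleep; if it is still empty, the caller fails quickly rather than waiting several seconds.

```python
import os
import time


def read_lock_file_pid(path, retries=1, delay_s=0.1):
    """Return the PID on the first line of the lock file (hypothetical helper).

    Returns None if the file does not exist.
    Raises ValueError if the file is still empty after the retries,
    or if the first line is not a number.
    """
    for attempt in range(retries + 1):
        try:
            with open(path, "r") as f:
                first_line = f.readline().strip()
        except FileNotFoundError:
            return None  # no lock file, so no competing process

        if first_line:
            # int() raises ValueError on non-numeric contents
            return int(first_line)

        # Empty file: either another process is still writing it,
        # or it is stale. Wait briefly before the next read.
        if attempt < retries:
            time.sleep(delay_s)

    raise ValueError(
        'bogus data in lock file "%s": ""' % os.path.basename(path)
    )
```

With the defaults, the worst case adds roughly 100ms to a failed startup, which stays well inside the couple of seconds that an initscript might allow.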

I'm not terribly impressed with that errhint, either.
        regards, tom lane


