Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Aug 16, 2010 at 11:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We could perhaps address that risk another way: after having written
>> postmaster.pid, try to read it back to verify that it contains what we
>> wrote, and abort if not.  Then, if we can't read it during startup,
>> it's okay to assume there is no conflicting postmaster.
> What if it was readable when written but has since become unreadable?
Yup, that's the weak spot in any such assumption. One might also draw
an analogy to the case of failing to open postmaster.pid because of a
permissions change, which seems at least as likely as a data change.
And we treat that as fatal, for good reason I think.
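
For concreteness, the write-then-read-back check under discussion might
look roughly like the sketch below. It is in Python purely for
illustration and is not PostgreSQL's actual lock-file code; the function
name write_and_verify_lockfile and the sample contents are hypothetical.
Note that a read immediately after the write is probably served from the
OS cache, so it says nothing about whether the file stays readable later,
which is exactly the objection above.

```python
import os


def write_and_verify_lockfile(path: str, contents: str) -> None:
    """Create `path` exclusively, write `contents`, then read it back.

    Hypothetical helper for illustration only. Raises FileExistsError if
    the file already exists, and RuntimeError if the read-back differs
    from what was written.
    """
    data = contents.encode("ascii")

    # O_EXCL makes creation fail if another process already made the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to storage

    # Read the file back and compare it with what we intended to write.
    with open(path, "rb") as f:
        readback = f.read()

    if readback != data:
        # Abort: do not leave a lock file behind that we cannot trust.
        os.unlink(path)
        raise RuntimeError("lock file contents do not match what was written")
```

A caller would invoke this once at startup and treat either exception as
a reason not to proceed.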
> My basic feeling on this is that manual intervention to start the
> server is really undesirable and we should try hard to avoid needing
> it. That having been said, accidentally starting two postmasters at
> the same time that are accessing the same data files would be several
> orders of magnitude worse. We can't afford to compromise on any
> interlock mechanisms that are necessary to prevent that from
> happening.
Yeah. At the same time, it's really, really bad to encourage people to
remove postmaster.pid manually as the first attempt to fix anything.
That completely destroys whatever interlock you thought you had. So
it's not too hard to make a case that avoiding this scenario would
actually make things safer, not less so.
The bottom line here is that it's not clear to me whether changing this
would be a net reliability improvement or not. Maybe better to leave
it alone.
regards, tom lane