Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The problem is that after a system crash and reboot, an old
> postmaster.pid file might be left behind. The postmaster can only
> safely remove this lock file if it is *certain* that it doesn't
> represent another live postmaster process. Otherwise it is honor-
> bound to commit hara-kiri instead of starting up. It can tell
> whether or not the PID in the file belongs to a live process and
> whether that process belongs to the postgres userid (by attempting
> kill(PID, 0) and seeing what it gets). If not, it can remove the
> file with a clear conscience.
Right -- we did run into this in spades when our backup server,
running dozens of instances of PostgreSQL in "warm standby" to confirm
the integrity of the files received, crashed hard. I wasn't sure if
this was the problem being addressed. One obvious solution, which we
now rigorously observe, is to use a different OS user for each
PostgreSQL instance. I assume that pg_ctl is safe in such an
environment?
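
For anyone following along, here is a rough sketch of the kill(PID, 0)
probe described above. It is not the postmaster's actual code. It assumes
a POSIX system and a non-root caller, and the names probe_pid and
lock_file_is_stale are made up for illustration. Signal 0 delivers
nothing; the call only reports whether the PID exists and whether we are
allowed to signal it.

```python
import os


def probe_pid(pid: int) -> str:
    """Classify a PID by sending signal 0 (nothing is actually delivered)."""
    if pid <= 0:
        # Zero and negative values address process groups, not one process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no process with this PID exists.
        return "no-such-process"
    except PermissionError:
        # EPERM: the PID is live but owned by a different user.
        return "other-user"
    # No error: the PID is live and we are permitted to signal it.
    return "same-user-alive"


def lock_file_is_stale(pid: int) -> bool:
    """Hypothetical helper: treat the lock as stale unless the PID is a
    live process that we are permitted to signal."""
    return probe_pid(pid) != "same-user-alive"
```

With one OS user per instance, a PID recycled by some other instance's
process should come back as "other-user", so the lock file can be
removed. With a shared user it comes back as "same-user-alive", and the
probe alone cannot tell a recycled PID from a real postmaster.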
> The long and the short of it is that it's best to not use pg_ctl.
> As mentioned, it doesn't buy much of anything for an initscript
> anyway.
It must buy something in our environment, because our attempts to use
the sample script with minimal modification were problematic.
Unfortunately I forget the details, but our problems vanished when we
switched to pg_ctl. (Well, except for that one unfortunate episode
mentioned above.)
> The whole thing is really only a problem for initscript authors (who
> all know about it by now ;-))
Well, one of them (at least) didn't quite understand the whole issue
until receiving your email. Thanks for the clear description.
-Kevin