2013/1/27 Tom Lane <tgl@sss.pgh.pa.us>:
> Craig Ringer <craig@2ndQuadrant.com> writes:
>> That's what it sounds like - confirming that PostgreSQL is really fully
>> shut down.
>
>> I'm not sure how you could do that over a protocol connection, myself.
>> I'd just read the postmaster pid from the pidfile on disk and then `kill
>> -0` it in a delay loop until the `kill` command returns failure. This
>> could be a useful convenience utility but I'm not convinced it should be
>> added to pg_isready because it requires local and possibly privileged
>> execution, unlike pg_isready's network based operation. Privileges could
>> be avoided by using an aliveness test other than `kill -0`, but you
>> absolutely have to be local to verify that the postmaster has fully
>> terminated - and it wouldn't make sense for a non-local process to care
>> about this anyway.
>
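For reference, the pidfile-plus-`kill -0` loop described above could be sketched roughly like this in Python. It's only an illustration: the function names, the timeout and the polling interval are made up for the example, and it assumes the first line of postmaster.pid holds the PID. As noted further down the thread, a vanished postmaster does not by itself prove that the cluster is idle.

```python
import os
import time
from pathlib import Path


def postmaster_pid(data_dir):
    """Return the PID from the first line of postmaster.pid, or None if there is no file."""
    try:
        first_line = (Path(data_dir) / "postmaster.pid").read_text().splitlines()[0]
    except (FileNotFoundError, IndexError):
        return None
    # abs() is defensive: a non-positive value would make os.kill()
    # address a process group rather than a single process.
    return abs(int(first_line.strip()))


def process_exists(pid):
    """Same idea as `kill -0 pid`: signal 0 only checks that the process exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def wait_for_postmaster_exit(data_dir, timeout=60.0, interval=0.5):
    """Poll until the recorded PID is gone. Returns False if the timeout expires first."""
    pid = postmaster_pid(data_dir)
    if pid is None:
        return True
    deadline = time.monotonic() + timeout
    while process_exists(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

Treating `PermissionError` as "still alive" is what lets an unprivileged caller use this, at the cost of not being able to tell whose process it is.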
> This problem is actually quite a bit more difficult than it looks.
> In particular, the mere fact that the postmaster process is gone does
> not prove that the cluster is idle: it's possible that the postmaster
> crashed leaving orphan backends behind, and the orphans are still busily
> modifying on-disk state. A real postmaster knows how to check for that
> (by looking at the nattch count of the shmem segment cited in the old
> lockfile) but I can't see any shell script getting it right.
>
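To make the nattch idea concrete, here is a rough, Linux-only sketch in Python. It is not what the postmaster does internally (that is C code calling shmctl); it just reads the same counter from /proc/sysvipc/shm. It assumes that line 7 of postmaster.pid holds "key shmid", which should be checked against the server version in use, and all function names are invented for the example.

```python
from pathlib import Path


def shmem_id_from_lockfile(lockfile_text):
    """Return the shmid from line 7 of postmaster.pid (assumed "key shmid"), else None."""
    lines = lockfile_text.splitlines()
    if len(lines) < 7:
        return None
    fields = lines[6].split()
    if len(fields) != 2:
        return None
    return int(fields[1])


def nattch_for_shmid(sysvipc_shm_text, shmid):
    """Find nattch for shmid in /proc/sysvipc/shm content. None means no such segment."""
    rows = sysvipc_shm_text.splitlines()
    if not rows:
        return None
    # Locate columns by header name instead of hard-coding positions.
    header = rows[0].split()
    id_col = header.index("shmid")
    nattch_col = header.index("nattch")
    for row in rows[1:]:
        fields = row.split()
        if len(fields) > max(id_col, nattch_col) and int(fields[id_col]) == shmid:
            return int(fields[nattch_col])
    return None


def attached_process_count(data_dir, proc_path="/proc/sysvipc/shm"):
    """Number of processes attached to the segment named in the old lockfile.

    Returns None when the lockfile or the segment cannot be found, in which
    case nothing can be concluded from this check alone.
    """
    try:
        lockfile_text = (Path(data_dir) / "postmaster.pid").read_text()
    except FileNotFoundError:
        return None
    shmid = shmem_id_from_lockfile(lockfile_text)
    if shmid is None:
        return None
    return nattch_for_shmid(Path(proc_path).read_text(), shmid)
```

A result greater than zero with no live postmaster would suggest orphan backends are still attached. A result of None only means the check could not be made, not that the cluster is idle.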
> So ATM I wouldn't trust any method short of "try to start a new
> postmaster and see if it works", which of course is not terribly helpful
> if your objective is to get to a stopped state.
>
> We could consider transposing the shmem logic into a new pg_ctl command.
> It might be better though to have a new switch in the postgres
> executable that just runs postmaster startup as far as detecting
> lockfile conflicts, and reports what it found (without ever launching
> any child processes that could confuse matters). Then "pg_ctl isdone"
> could be a frontend for that, instead of duplicating logic.
>
> regards, tom lane

+1

Pavel