Re: Reliably determining whether the server came up - Mailing list pgsql-admin

From Mischa Sandberg
Subject Re: Reliably determining whether the server came up
Date
Msg-id 1227035210.4923124a948e9@legacywebmail.telus.net
In response to Re: Reliably determining whether the server came up  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Reliably determining whether the server came up
List pgsql-admin
Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> Mischa Sandberg <mischa_sandberg@telus.net> writes:
> > Quoting Tom Lane <tgl@sss.pgh.pa.us>:
> >> I'd bet that the pg_ctl status part is failing.  I get exit status 1
> >> from it if there's no server running.
>
> > Yes, that was part of the problem with the original startup script;
> > postmaster hadn't even gotten as far as writing postmaster.pid,
> > I guess. But pg_ctl status returning 1 could also mean that the
> > server had come up, hit a critical problem and exited. Hence my problem;
> > this has to detect server failure, reliably, as well.
>
> You could sleep for a second or so *before* you start looking for the
> pidfile.

The systems are under erratic load, due to concurrent
CPU and disk I/O spikes around start-up time.
A 1-2 second sleep is not enough to be a guarantee :-(

I'm probably not explaining the issues well;
I'm caught between two constraints that aren't really pg's problem,
and wide clusters with automated admin, variable hardware
and spikes of db restarts are no doubt an oddball edge case.
There are workarounds; I was hoping for something
clean and obvious (to all but me).

Switching back to tailing the log files and moving on.
Thanks everyone.
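
For the archives, one shape a workaround could take: poll against a
deadline instead of sleeping a fixed interval, and check on every pass
whether the server process has already died. This is only a sketch, not
what the startup script actually does. It assumes the server is started
as a direct child process (not through pg_ctl, which returns once it has
launched the server in the background), so that an early exit is visible
to the parent. The names wait_for_startup and tcp_ready are made up for
illustration, and a successful TCP connect only shows that the listening
socket is open, not that the server is ready to run queries.

```python
import socket
import subprocess
import time


def wait_for_startup(proc, is_ready, timeout=60.0, interval=0.5):
    """Poll a freshly started child until it is ready, has exited, or time runs out.

    proc     -- a subprocess.Popen object for the server process
    is_ready -- a zero-argument callable returning True once the server answers
    Returns one of the strings "ready", "exited" or "timeout".
    """
    deadline = time.monotonic() + timeout
    while True:
        # Check for death first: a server that came up, hit a critical
        # problem and exited must never be reported as ready.
        if proc.poll() is not None:
            return "exited"
        if is_ready():
            return "ready"
        if time.monotonic() >= deadline:
            return "timeout"
        time.sleep(interval)


def tcp_ready(host, port):
    """Build a readiness check that tries to open a TCP connection.

    Hypothetical helper: host and port are whatever the server listens on.
    """
    def check():
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            return False
    return check
```

A caller would start the server with subprocess.Popen, pass the result
to wait_for_startup together with something like tcp_ready("localhost",
5432), and treat "exited" and "timeout" as separate failures. The
deadline can be generous, since a healthy start returns as soon as the
check passes and a crashed one returns as soon as the process is gone.
A check that scans the server log for the ready message could be
swapped in for tcp_ready without touching the loop.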
--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.

