Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > On Wed, Nov 17, 2010 at 19:57, Bruce Momjian <bruce@momjian.us> wrote:
> >> Is FATAL, in general, enough to conclude the server is running?
>
> > No - specifically, we will send FATAL when "the database system is
> > starting up", which is exactly the one we want to *avoid*.
>
> > I think we should only exclude the password case. I guess we could
> > also do all fatal *except* <list>, but that seems more fragile.
>
> I believe that the above argument is exactly backwards. What we want
> here is to check the result of postmaster.c's canAcceptConnections(),
> and there are only a finite number of error codes that can result from
> rejections there. If we get past that, there are a large number of
> possible failures, but all of them indicate that the postmaster is in
> principle willing to accept connections. Checking for password errors
> only is utterly wrong: any other type of auth failure would be the same
> for this purpose, as would "no such database", "no such user", "too many
> connections", etc etc etc.

Agreed.  So how do we pass that info to libpq without the effort
exceeding the value of fixing this problem?  Should we parse
pg_controldata output instead?  pg_upgrade could use machine-readable
output from that too.

> What we actually want here, and don't have, is the fabled pg_ping
> protocol...

Well, we are basically figuring out how to implement that with this fix,
whether it ends up as part of pg_ctl or as a separate binary.
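
To make the classification Tom describes concrete, here is a rough
sketch in C, not a proposed patch. It assumes the caller has somehow
obtained the SQLSTATE of the failed connection attempt, which is exactly
the part we do not yet know how to get out of libpq. The enum and
function names are hypothetical. The only code treated as a
canAcceptConnections() rejection is 57P03 (cannot_connect_now), which I
believe covers the starting-up, shutting-down, and recovery cases. Every
other error, including "too many connections", counts as the postmaster
being willing to accept connections.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for illustration; not an existing libpq type. */
typedef enum
{
    SERVER_ACCEPTING,    /* postmaster got far enough to process the request */
    SERVER_REJECTING,    /* postmaster is up but not accepting connections */
    SERVER_UNREACHABLE   /* no error report came back from a server at all */
} ServerState;

/*
 * SQLSTATEs assumed to come from a canAcceptConnections() rejection.
 * 57P03 is cannot_connect_now.  This list is an assumption for the
 * sketch and should be checked against postmaster.c.
 */
static const char *const reject_states[] = {"57P03"};

/*
 * Classify a failed connection attempt by its SQLSTATE.
 * A NULL sqlstate means no server error report was received.
 */
ServerState
classify_sqlstate(const char *sqlstate)
{
    size_t      i;

    if (sqlstate == NULL)
        return SERVER_UNREACHABLE;

    for (i = 0; i < sizeof(reject_states) / sizeof(reject_states[0]); i++)
    {
        if (strcmp(sqlstate, reject_states[i]) == 0)
            return SERVER_REJECTING;
    }

    /* Auth failure, no such database, too many connections, etc. */
    return SERVER_ACCEPTING;
}
```

With this shape, pg_ctl -w would keep waiting on SERVER_REJECTING and
SERVER_UNREACHABLE and stop waiting on SERVER_ACCEPTING, so the list of
special cases stays short and finite rather than being "all FATAL
except <list>".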

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +