Thread: pg_ctl non-idempotent behavior change

pg_ctl non-idempotent behavior change

From
Jeff Janes
Date:
After 87306184580c9c49717, if the postmaster dies without cleaning up (i.e. power outage), running "pg_ctl start" just gives this message and then exits:

pg_ctl: another server might be running

Under the old behavior, it would try to start the server anyway, and succeed, then go through recovery and give you back a functional system.

From reading the archive, I can't really tell if this change in behavior was intentional.

Anyway it seems like a bad thing to me.  Now the user has a system that will not start up, and is given no clue that they need to remove "postmaster.pid" and try again.

The behavior here under the new "-I" flag seems no better in this situation.  It claims the server is running, when it only "might" be running (and in fact is not running).

Cheers,

Jeff

Re: pg_ctl non-idempotent behavior change

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> After 87306184580c9c49717, if the postmaster dies without cleaning up (i.e.
> power outage), running "pg_ctl start" just gives this message and then
> exits:

> pg_ctl: another server might be running

> Under the old behavior, it would try to start the server anyway, and
> succeed, then go through recovery and give you back a functional system.

> From reading the archive, I can't really tell if this change in behavior
> was intentional.

Hmm.  I rather thought we had agreed not to change the default behavior,
but the commit message fairly clearly says that the default behavior is
being changed.  This case shows that that change was inadequately
thought through.

> Anyway it seems like a bad thing to me.  Now the user has a system that
> will not start up, and is given no clue that they need to remove
> "postmaster.pid" and try again.

Yeah, this is not tolerable.  We could think about improving the logic
to have a stronger check on whether the old server is really there or
not (ie it should be doing something more like pg_ping and less like
just checking if the pidfile is there).  But given how close we are to
beta, maybe the best thing is to revert that change for now and put it
back on the to-think-about-for-9.4 list.  Peter?

regards, tom lane
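
A minimal sketch of what "a stronger check than just seeing whether the pidfile is there" could look like, written in Python for brevity. This is not pg_ctl's actual logic. It assumes a Unix-like system, relies on the first line of postmaster.pid holding the postmaster's PID, and uses a hypothetical helper name, pidfile_status. Because PIDs can be reused by unrelated processes, a "live" result still only means the server might be running; an actual connection attempt, along the lines of the pg_ping suggestion, would be stronger.

```python
import os
from pathlib import Path


def pidfile_status(data_dir):
    """Classify postmaster.pid in data_dir as 'absent', 'stale', or 'live'.

    Hypothetical helper for illustration only; Unix-only, because it
    relies on the semantics of signal 0.
    """
    pidfile = Path(data_dir) / "postmaster.pid"
    try:
        # The first line of postmaster.pid is the postmaster's PID.
        pid = int(pidfile.read_text().splitlines()[0].strip())
    except FileNotFoundError:
        return "absent"
    except (IndexError, ValueError):
        # Empty or garbled file: no usable PID, so treat it as left over.
        return "stale"

    if pid <= 0:
        # Never pass 0 or a negative value to kill(); those address
        # process groups rather than a single process.
        return "stale"

    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return "stale"   # no such process: the file outlived its server
    except PermissionError:
        return "live"    # a process exists but belongs to another user
    return "live"
```

With a check like this, the power-outage case described above would be reported as "stale" rather than as another server that might be running, so a start attempt could proceed into recovery as it did under the old behavior.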



Re: pg_ctl non-idempotent behavior change

From
Peter Eisentraut
Date:
On Sat, 2013-04-27 at 14:24 -0400, Tom Lane wrote:
> Yeah, this is not tolerable.  We could think about improving the logic
> to have a stronger check on whether the old server is really there or
> not (ie it should be doing something more like pg_ping and less like
> just checking if the pidfile is there).  But given how close we are to
> beta, maybe the best thing is to revert that change for now and put it
> back on the to-think-about-for-9.4 list.  Peter?

Reverted.




Re: pg_ctl non-idempotent behavior change

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > After 87306184580c9c49717, if the postmaster dies without cleaning up (i.e.
> > power outage), running "pg_ctl start" just gives this message and then
> > exits:
> 
> > pg_ctl: another server might be running
> 
> > Under the old behavior, it would try to start the server anyway, and
> > succeed, then go through recovery and give you back a functional system.
> 
> > From reading the archive, I can't really tell if this change in behavior
> > was intentional.
> 
> Hmm.  I rather thought we had agreed not to change the default behavior,
> but the commit message fairly clearly says that the default behavior is
> being changed.  This case shows that that change was inadequately
> thought through.
> 
> > Anyway it seems like a bad thing to me.  Now the user has a system that
> > will not start up, and is given no clue that they need to remove
> > "postmaster.pid" and try again.
> 
> Yeah, this is not tolerable.  We could think about improving the logic
> to have a stronger check on whether the old server is really there or
> not (ie it should be doing something more like pg_ping and less like
> just checking if the pidfile is there).  But given how close we are to
> beta, maybe the best thing is to revert that change for now and put it
> back on the to-think-about-for-9.4 list.  Peter?

Are we going to unrevert this patch for 9.5?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: pg_ctl non-idempotent behavior change

From
Bruce Momjian
Date:
On Mon, Aug  4, 2014 at 05:07:47PM -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Jeff Janes <jeff.janes@gmail.com> writes:
> > > After 87306184580c9c49717, if the postmaster dies without cleaning up (i.e.
> > > power outage), running "pg_ctl start" just gives this message and then
> > > exits:
> > 
> > > pg_ctl: another server might be running
> > 
> > > Under the old behavior, it would try to start the server anyway, and
> > > succeed, then go through recovery and give you back a functional system.
> > 
> > > From reading the archive, I can't really tell if this change in behavior
> > > was intentional.
> > 
> > Hmm.  I rather thought we had agreed not to change the default behavior,
> > but the commit message fairly clearly says that the default behavior is
> > being changed.  This case shows that that change was inadequately
> > thought through.
> > 
> > > Anyway it seems like a bad thing to me.  Now the user has a system that
> > > will not start up, and is given no clue that they need to remove
> > > "postmaster.pid" and try again.
> > 
> > Yeah, this is not tolerable.  We could think about improving the logic
> > to have a stronger check on whether the old server is really there or
> > not (ie it should be doing something more like pg_ping and less like
> > just checking if the pidfile is there).  But given how close we are to
> > beta, maybe the best thing is to revert that change for now and put it
> > back on the to-think-about-for-9.4 list.  Peter?
> 
> Are we going to unrevert this patch for 9.5?

Seems no one is thinking of restoring this patch and working on the
issue.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: pg_ctl non-idempotent behavior change

From
Peter Eisentraut
Date:
On 10/11/14 6:54 PM, Bruce Momjian wrote:
>> Are we going to unrevert this patch for 9.5?
> Seems no one is thinking of restoring this patch and working on the
> issue.

I had postponed work on this issue and set out to create a test
infrastructure so that all the subtle behavioral dependencies mentioned
in the thread could be expressed in code rather than prose.