Thread: recovery_target_action doesn't work for anything but shutdown
Hi, Unless I'm missing something recovery_target_action = promote/pause don't work. There's the following block of code in readRecoveryCommandFile():/* * Override any inconsistent requests. Not that this isa change * of behaviour in 9.5; prior to this we simply ignored a request * to pause if hot_standby = off, which was surprisingbehaviour. */if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE && recoveryTargetActionSet && standbyState== STANDBY_DISABLED) recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN; the problem is that when the recovery command file is read standbyState will always still be STANDBY_DISABLED. Which makes sense, because we can't even know we're in recovery before readRecoveryCommandFile(). I guess what you actually intended to test was StandbyModeRequested? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2015-03-12 15:52:02 +0100, Andres Freund wrote: > I guess what you actually intended to test was StandbyModeRequested? Err, EnableHotStandby. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 12, 2015 at 11:53 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2015-03-12 15:52:02 +0100, Andres Freund wrote: >> I guess what you actually intended to test was StandbyModeRequested? > > Err, EnableHotStandby. Indeed. Without !EnableHotStandby, a node in recovery would simply be shut down even if a pause has been requested when hot_standby = on. This goes as well in the sense of the comment. -- Michael
On 12/03/15 15:53, Andres Freund wrote: > On 2015-03-12 15:52:02 +0100, Andres Freund wrote: >> I guess what you actually intended to test was StandbyModeRequested? > > Err, EnableHotStandby. > Yeah pause does not work currently. This change was made by committer and I think the intended change was to make it a little safer by silently changing to shutdown instead of silently changing to promote which is essentially what was happening before the patch. But yes the check should have been against EnableHotStandby it seems. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 13/03/15 15:06, Petr Jelinek wrote: > On 12/03/15 15:53, Andres Freund wrote: >> On 2015-03-12 15:52:02 +0100, Andres Freund wrote: >>> I guess what you actually intended to test was StandbyModeRequested? >> >> Err, EnableHotStandby. >> > > Yeah pause does not work currently. This change was made by committer > Actually the code is from me, committer just requested the change, sorry. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2015-03-12 15:52:02 +0100, Andres Freund wrote: > /* > * Override any inconsistent requests. Not that this is a change > * of behaviour in 9.5; prior to this we simply ignored a request > * to pause if hot_standby = off, which was surprising behaviour. > */ > if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE && > recoveryTargetActionSet && > standbyState == STANDBY_DISABLED) > recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN; While it's easy enough to fix I rather dislike the whole intent here though. *Silently* switching the mode of operation in a rather significant way seems like a bad idea to me. At the very least we need to emit a LOG message about this; but I think it'd be much better to error out instead. <9.5's behaviour was already quite surprising. But changing things to a different surprising behaviour seems like a bad idea. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Mar 15, 2015 at 2:27 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2015-03-12 15:52:02 +0100, Andres Freund wrote:
> /*
> * Override any inconsistent requests. Not that this is a change
> * of behaviour in 9.5; prior to this we simply ignored a request
> * to pause if hot_standby = off, which was surprising behaviour.
> */
> if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE &&
> recoveryTargetActionSet &&
> standbyState == STANDBY_DISABLED)
> recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN;
While it's easy enough to fix I rather dislike the whole intent here
though. *Silently* switching the mode of operation in a rather
significant way seems like a bad idea to me. At the very least we need
to emit a LOG message about this; but I think it'd be much better to
error out instead.
<9.5's behaviour was already quite surprising. But changing things to a
different surprising behaviour seems like a bad idea.
+1. Especially for "sensitive" operations like this, having predictable-behavior-or-error is usually the best choice.
On 15/03/15 14:51, Magnus Hagander wrote: > On Sun, Mar 15, 2015 at 2:27 PM, Andres Freund <andres@2ndquadrant.com > <mailto:andres@2ndquadrant.com>> wrote: > > On 2015-03-12 15:52:02 +0100, Andres Freund wrote: > > /* > > * Override any inconsistent requests. Not that this is a > change > > * of behaviour in 9.5; prior to this we simply ignored a > request > > * to pause if hot_standby = off, which was surprising > behaviour. > > */ > > if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE && > > recoveryTargetActionSet && > > standbyState == STANDBY_DISABLED) > > recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN; > > While it's easy enough to fix I rather dislike the whole intent here > though. *Silently* switching the mode of operation in a rather > significant way seems like a bad idea to me. At the very least we need > to emit a LOG message about this; but I think it'd be much better to > error out instead. > > <9.5's behaviour was already quite surprising. But changing things to a > different surprising behaviour seems like a bad idea. > > > +1. Especially for "sensitive" operations like this, having > predictable-behavior-or-error is usually the best choice. > Thinking about it again now, it does seem that ignoring user setting because it's in conflict with another user setting is a bad idea and I think we in general throw errors on those. So +1 from me also. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2015-03-15 14:51:46 +0100, Magnus Hagander wrote: > On Sun, Mar 15, 2015 at 2:27 PM, Andres Freund <andres@2ndquadrant.com> > wrote: > > > On 2015-03-12 15:52:02 +0100, Andres Freund wrote: > > > /* > > > * Override any inconsistent requests. Not that this is a change > > > * of behaviour in 9.5; prior to this we simply ignored a request > > > * to pause if hot_standby = off, which was surprising behaviour. > > > */ > > > if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE && > > > recoveryTargetActionSet && > > > standbyState == STANDBY_DISABLED) > > > recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN; > > > > While it's easy enough to fix I rather dislike the whole intent here > > though. *Silently* switching the mode of operation in a rather > > significant way seems like a bad idea to me. At the very least we need > > to emit a LOG message about this; but I think it'd be much better to > > error out instead. > > > > <9.5's behaviour was already quite surprising. But changing things to a > > different surprising behaviour seems like a bad idea. > > > > +1. Especially for "sensitive" operations like this, having > predictable-behavior-or-error is usually the best choice. Yea. Looking further, it's even worse right now. We'll change the target to shutdown when hot_standby = off, but iff it was set in the config file. But the default value is (and was, although configured differently) documented to be 'pause'; so if it's not configured explicitly we still will promote. At least I can't read that out of the docs. Personally I think we just should change the default to 'shutdown' for all cases. That makes documentation and behaviour less surprising. And makes experimenting less dangerous, since you can just start again. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Mar 15, 2015 at 3:16 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Yea.On 2015-03-15 14:51:46 +0100, Magnus Hagander wrote:
> On Sun, Mar 15, 2015 at 2:27 PM, Andres Freund <andres@2ndquadrant.com>
> wrote:
>
> > On 2015-03-12 15:52:02 +0100, Andres Freund wrote:
> > > /*
> > > * Override any inconsistent requests. Not that this is a change
> > > * of behaviour in 9.5; prior to this we simply ignored a request
> > > * to pause if hot_standby = off, which was surprising behaviour.
> > > */
> > > if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE &&
> > > recoveryTargetActionSet &&
> > > standbyState == STANDBY_DISABLED)
> > > recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN;
> >
> > While it's easy enough to fix I rather dislike the whole intent here
> > though. *Silently* switching the mode of operation in a rather
> > significant way seems like a bad idea to me. At the very least we need
> > to emit a LOG message about this; but I think it'd be much better to
> > error out instead.
> >
> > <9.5's behaviour was already quite surprising. But changing things to a
> > different surprising behaviour seems like a bad idea.
> >
>
> +1. Especially for "sensitive" operations like this, having
> predictable-behavior-or-error is usually the best choice.
Looking further, it's even worse right now. We'll change the target to
shutdown when hot_standby = off, but iff it was set in the config
file. But the default value is (and was, although configured
differently) documented to be 'pause'; so if it's not configured
explicitly we still will promote. At least I can't read that out of the
docs.
Personally I think we just should change the default to 'shutdown' for
all cases. That makes documentation and behaviour less surprising. And
makes experimenting less dangerous, since you can just start again.
On 15 March 2015 at 14:16, Andres Freund <andres@2ndquadrant.com> wrote: > Personally I think we just should change the default to 'shutdown' for > all cases. That makes documentation and behaviour less surprising. And > makes experimenting less dangerous, since you can just start again. We need to look at the specific situation, not make a generic decision. If hot_standby = off, we are unable to unpause, once paused. Changing the default doesn't alter that problem. We have two choices: 1) override to a sensible setting, 2) throw an error. (2) sounds clean at first but we must look deeper. We know that the *only* possible other setting is 'shutdown', so it seems more user friendly to do the thing we *know* they want (1), rather than pretend that we don't. (1) is completely predictable and not at all surprising. Add a LOG message if you wish, but don't throw an error. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
On 2015-03-15 20:10:49 +0000, Simon Riggs wrote: > On 15 March 2015 at 14:16, Andres Freund <andres@2ndquadrant.com> wrote: > > > Personally I think we just should change the default to 'shutdown' for > > all cases. That makes documentation and behaviour less surprising. And > > makes experimenting less dangerous, since you can just start again. > > We need to look at the specific situation, not make a generic decision. > > If hot_standby = off, we are unable to unpause, once paused. > > Changing the default doesn't alter that problem. > > We have two choices: 1) override to a sensible setting, 2) throw an error. > > (2) sounds clean at first but we must look deeper. We know that the > *only* possible other setting is 'shutdown', so it seems more user > friendly to do the thing we *know* they want (1), rather than pretend > that we don't. > > (1) is completely predictable and not at all surprising. Add a LOG > message if you wish, but don't throw an error. Sorry, I don't buy this. If I have "recovery_target_action = 'pause'" in the config file, I want it to pause. Not do something else, just because postgres doesn't have a interface to unpause without SQL access. That makes some sense to developers, but is pretty much ununderstandable for mere mortals. Even worse, right now (after the bugfix), the behaviour is: postgresql.conf: # hot_standby = off recovery.conf: #recovery_target_action = 'pause' => promote (the downgrading is only active when explicitly configured) # hot_standby = off recovery.conf: recovery_target_action = 'pause' => shutdown (despite an explicit setting of pause) hot_standby = on recovery.conf: # recovery_target_action = 'pause' => pause hot_standby = on recovery.conf: recovery_target_action = 'pause' => pause To me that's just utterly confusing. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 16, 2015 at 7:38 AM, Andres Freund wrote: > On 2015-03-15 20:10:49 +0000, Simon Riggs wrote: >> On 15 March 2015 at 14:16, Andres Freund wrote: >> >> > Personally I think we just should change the default to 'shutdown' for >> > all cases. That makes documentation and behaviour less surprising. And >> > makes experimenting less dangerous, since you can just start again. >> >> We need to look at the specific situation, not make a generic decision. >> >> If hot_standby = off, we are unable to unpause, once paused. >> >> Changing the default doesn't alter that problem. >> >> We have two choices: 1) override to a sensible setting, 2) throw an error. >> >> (2) sounds clean at first but we must look deeper. We know that the >> *only* possible other setting is 'shutdown', so it seems more user >> friendly to do the thing we *know* they want (1), rather than pretend >> that we don't. >> >> (1) is completely predictable and not at all surprising. Add a LOG >> message if you wish, but don't throw an error. > > Sorry, I don't buy this. If I have "recovery_target_action = 'pause'" in > the config file, I want it to pause. Not do something else, just because > postgres doesn't have a interface to unpause without SQL access. That > makes some sense to developers, but is pretty much ununderstandable for > mere mortals. +1 for removing this code block, and +1 for having a LOG message, with an additional paragraph in the docs mentioning that when using pause as recovery_target_action and hot_standby = off there is no direct interface to unpause the node. -- Michael
On 15 March 2015 at 22:38, Andres Freund <andres@2ndquadrant.com> wrote: > Sorry, I don't buy this. If I have "recovery_target_action = 'pause'" in > the config file, I want it to pause. You want it to enter a state where you cannot perform any action other than shutdown? Why would anyone want that? -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
On 2015-03-16 07:52:20 +0000, Simon Riggs wrote: > On 15 March 2015 at 22:38, Andres Freund <andres@2ndquadrant.com> wrote: > > > Sorry, I don't buy this. If I have "recovery_target_action = 'pause'" in > > the config file, I want it to pause. > > You want it to enter a state where you cannot perform any action other > than shutdown? > > Why would anyone want that? You actually still could promote. But I'd be perfectly happy if postgres said ERROR: recovery_target_action = 'pause' in "%s" cannot be used without hot_standby DETAIL: Recovery pauses cannot be resumed without SQL level access. HINT: Configure hot_standby and try again. or something roughly like that. If postgres tells me what to change to achieve what I want, I have a much higher chance of getting there. Especially if it just does something else otherwise. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 16, 2015 at 11:39:26AM +0100, Andres Freund wrote: > On 2015-03-16 07:52:20 +0000, Simon Riggs wrote: > > On 15 March 2015 at 22:38, Andres Freund <andres@2ndquadrant.com> wrote: > > > > > Sorry, I don't buy this. If I have "recovery_target_action = 'pause'" in > > > the config file, I want it to pause. > > > > You want it to enter a state where you cannot perform any action other > > than shutdown? > > > > Why would anyone want that? > > You actually still could promote. But I'd be perfectly happy if postgres > said > ERROR: recovery_target_action = 'pause' in "%s" cannot be used without hot_standby > DETAIL: Recovery pauses cannot be resumed without SQL level access. > HINT: Configure hot_standby and try again. > > or something roughly like that. If postgres tells me what to change to > achieve what I want, I have a much higher chance of getting > there. Especially if it just does something else otherwise. Where are we on this? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Mar 12, 2015 at 11:53 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2015-03-12 15:52:02 +0100, Andres Freund wrote: >> I guess what you actually intended to test was StandbyModeRequested? > > Err, EnableHotStandby. Pushed the fix. Regards, -- Fujii Masao
On Mon, Mar 16, 2015 at 7:39 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2015-03-16 07:52:20 +0000, Simon Riggs wrote: >> On 15 March 2015 at 22:38, Andres Freund <andres@2ndquadrant.com> wrote: >> >> > Sorry, I don't buy this. If I have "recovery_target_action = 'pause'" in >> > the config file, I want it to pause. >> >> You want it to enter a state where you cannot perform any action other >> than shutdown? >> >> Why would anyone want that? > > You actually still could promote. But I'd be perfectly happy if postgres > said > ERROR: recovery_target_action = 'pause' in "%s" cannot be used without hot_standby > DETAIL: Recovery pauses cannot be resumed without SQL level access. > HINT: Configure hot_standby and try again. This works for me. Regards, -- Fujii Masao