Thread: Add shutdown_at_recovery_target option to recovery.conf
Hi, I recently wanted several times to have slave server prepared at certain point in time to reduce the time it takes for it to replay remaining WALs (say I have pg_basebackup -x on busy db for example). Currently the way to do it is to have pause_at_recovery_target true (default) and wait until pg accepts connection and the shut it down. The issue is that this is ugly, and also there is a chance that somebody else connects and does bad things (tm) before my process does. So I wrote simple patch that adds option to shut down the cluster once recovery_target is reached. The server will still be able to continue WAL replay if needed later or can be configured to start standalone. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Wed, Sep 10, 2014 at 1:45 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: > Hi, > > I recently wanted several times to have slave server prepared at certain > point in time to reduce the time it takes for it to replay remaining WALs > (say I have pg_basebackup -x on busy db for example). In your example, you're thinking to perform the recovery after taking the backup and stop it at the consistent point (i.e., end of backup) by using the proposed feature? Then you're expecting that the future recovery will start from that consistent point and which will reduce the recovery time? This is true if checkpoint is executed at the end of backup. But there might be no occurrence of checkpoint during backup. In this case, even future recovery would need to start from very start of backup. That is, we cannot reduce the recovery time. So, for your purpose, for example, you might also need to add new option to pg_basebackup so that checkpoint is executed at the end of backup if the option is set. > Currently the way to do it is to have pause_at_recovery_target true > (default) and wait until pg accepts connection and the shut it down. The > issue is that this is ugly, and also there is a chance that somebody else > connects and does bad things (tm) before my process does. > > So I wrote simple patch that adds option to shut down the cluster once > recovery_target is reached. The server will still be able to continue WAL > replay if needed later or can be configured to start standalone. What about adding something like action_at_recovery_target=pause|shutdown instead of increasing the number of parameters? Regards, -- Fujii Masao
On 10/09/14 13:13, Fujii Masao wrote: > On Wed, Sep 10, 2014 at 1:45 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: >> Hi, >> >> I recently wanted several times to have slave server prepared at certain >> point in time to reduce the time it takes for it to replay remaining WALs >> (say I have pg_basebackup -x on busy db for example). > > In your example, you're thinking to perform the recovery after taking > the backup and stop it at the consistent point (i.e., end of backup) by > using the proposed feature? Then you're expecting that the future recovery > will start from that consistent point and which will reduce the recovery time? > > This is true if checkpoint is executed at the end of backup. But there might > be no occurrence of checkpoint during backup. In this case, even future > recovery would need to start from very start of backup. That is, we cannot > reduce the recovery time. So, for your purpose, for example, you might also > need to add new option to pg_basebackup so that checkpoint is executed > at the end of backup if the option is set. > For my use-case it does not matter much as I am talking here of huge volumes where it would normally take hours to replay so being behind one checkpoint is not too bad, but obviously I am not sure that it's good enough for project in general. Adding checkpoint for pg_basebackup might be useful addition, yes. Also I forgot to write another use-case which making sure that I actually do have all the WAL present to get to certain point in time (this one could be done via patch to pg_receivexlog I guess, but I see advantage in having the changes already applied compared to just having the wal files). >> So I wrote simple patch that adds option to shut down the cluster once >> recovery_target is reached. The server will still be able to continue WAL >> replay if needed later or can be configured to start standalone. > > What about adding something like action_at_recovery_target=pause|shutdown > instead of increasing the number of parameters? > That will also increase number of parameters as we can't remove the current pause one if we want to be backwards compatible. Also there would have to be something like action_at_recovery_target=none or off or something since the default is that pause is on and we need to be able to turn off pause without having to have shutdown on. What more, I am not sure I see any other actions that could be added in the future as promote action already works and listen (for RO queries) also already works independently of this. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 11 September 2014 16:02, Petr Jelinek <petr@2ndquadrant.com> wrote: >> What about adding something like action_at_recovery_target=pause|shutdown >> instead of increasing the number of parameters? >> > > That will also increase number of parameters as we can't remove the current > pause one if we want to be backwards compatible. Also there would have to be > something like action_at_recovery_target=none or off or something since the > default is that pause is on and we need to be able to turn off pause without > having to have shutdown on. What more, I am not sure I see any other actions > that could be added in the future as promote action already works and listen > (for RO queries) also already works independently of this. I accept your argument, though I have other thoughts. If someone specifies shutdown_at_recovery_target = true pause_at_recovery_target = true it gets a little hard to work out what to do; we shouldn't allow such lack of clarity. In recovery its easy to do this if (recoveryShutdownAtTarget) recoveryPauseAtTarget = false; but it won't be when these become GUCs, so I think Fuji's suggestion is a good one. No other comments on patch, other than good idea. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi Petr,
I have spent sometime to review the patch, overall patch looks good, it applies fine and make check run without issue. If recovery target is specified and shutdown_at_recovery_target is set to true, it shutdown the server at specified recovery point. I do have few points to share i.e.
1. It seems that following log message need to be more descriptive about reason for shutdown i.e.
2. As Simon suggesting following recovery settings are not clear i.e.
It is going to make true both but patch describe as following i.e.
3. As it don’t rename reconvery.conf, subsequent attempt (without any changes in reconvery.conf) to start of server keep showing the following i.e.
Is that right ?. Thanks.
Regards,
Muhammad Asif Naeem
I have spent sometime to review the patch, overall patch looks good, it applies fine and make check run without issue. If recovery target is specified and shutdown_at_recovery_target is set to true, it shutdown the server at specified recovery point. I do have few points to share i.e.
1. It seems that following log message need to be more descriptive about reason for shutdown i.e.
+ if (recoveryShutdownAtTarget && reachedStopPoint)
+ {
+ ereport(LOG, (errmsg("shutting down")));
2. As Simon suggesting following recovery settings are not clear i.e.
shutdown_at_recovery_target = true
pause_at_recovery_target = true
It is going to make true both but patch describe as following i.e.
+ Setting this to true will set <link linkend="pause-at-recovery-target">
+ <varname>pause_at_recovery_target</></link> to false.
...
LOG: redo starts at 0/1803620
DEBUG: checkpointer updated shared memory configuration values
LOG: consistent recovery state reached at 0/1803658
LOG: restored log file "000000010000000000000002" from archive
DEBUG: got WAL segment from archive
LOG: restored log file "000000010000000000000003" from archive
DEBUG: got WAL segment from archive
LOG: restored log file "000000010000000000000004" from archive
DEBUG: got WAL segment from archive
LOG: restored log file "000000010000000000000005" from archive
DEBUG: got WAL segment from archive
LOG: restored log file "000000010000000000000006" from archive
DEBUG: got WAL segment from archive
…
Regards,
Muhammad Asif Naeem
On Thu, Oct 16, 2014 at 7:24 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 11 September 2014 16:02, Petr Jelinek <petr@2ndquadrant.com> wrote:
>> What about adding something like action_at_recovery_target=pause|shutdown
>> instead of increasing the number of parameters?
>>
>
> That will also increase number of parameters as we can't remove the current
> pause one if we want to be backwards compatible. Also there would have to be
> something like action_at_recovery_target=none or off or something since the
> default is that pause is on and we need to be able to turn off pause without
> having to have shutdown on. What more, I am not sure I see any other actions
> that could be added in the future as promote action already works and listen
> (for RO queries) also already works independently of this.
I accept your argument, though I have other thoughts.
If someone specifies
shutdown_at_recovery_target = true
pause_at_recovery_target = true
it gets a little hard to work out what to do; we shouldn't allow such
lack of clarity.
In recovery its easy to do this
if (recoveryShutdownAtTarget)
recoveryPauseAtTarget = false;
but it won't be when these become GUCs, so I think Fuji's suggestion
is a good one.
No other comments on patch, other than good idea.
--
Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 29/10/14 20:27, Asif Naeem wrote: > I have spent sometime to review the patch, overall patch looks good, it > applies fine and make check run without issue. If recovery target is > specified and shutdown_at_recovery_target is set to true, it shutdown > the server at specified recovery point. I do have few points to share i.e. > Thanks! > 1. It seems that following log message need to be more descriptive about > reason for shutdown i.e. > > + if (recoveryShutdownAtTarget && reachedStopPoint) > + { > + ereport(LOG, (errmsg("shutting down"))); > Agreed, I just wasn't sure on what exactly to writes, I originally had there "shutting down by user request" or some such but that's misleading. > 2. As Simon suggesting following recovery settings are not clear i.e. > > shutdown_at_recovery_target = true > pause_at_recovery_target = true Hmm I completely missed Simon's email, strange. Well other option would be to throw error if both are set to true - error will have to happen anyway if action_at_recovery_target is set to shutdown while pause_at_recovery_target is true (I think we need to keep pause_at_recovery_target for compatibility). But considering all of you think something like action_at_recovery_target is better solution, I will do it that way then. > > 3. As it don’t rename reconvery.conf, subsequent attempt (without any > changes in reconvery.conf) to start of server keep showing the following > i.e. > > ... > LOG: redo starts at 0/1803620 > DEBUG: checkpointer updated shared memory configuration values > LOG: consistent recovery state reached at 0/1803658 > LOG: restored log file "000000010000000000000002" from archive > DEBUG: got WAL segment from archive > LOG: restored log file "000000010000000000000003" from archive > DEBUG: got WAL segment from archive > LOG: restored log file "000000010000000000000004" from archive > DEBUG: got WAL segment from archive > LOG: restored log file "000000010000000000000005" from archive > DEBUG: got WAL segment from archive > LOG: restored log file "000000010000000000000006" from archive > DEBUG: got WAL segment from archive > … > Yes, it will still replay everything since last checkpoint, that's by design since otherwise we would have to write checkpoint on shutdown and that would mean the instance can't be used as hot standby anymore and I consider that an important feature to have. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi, Attached is the v2 of the patch with the review comments addressed (see below). On 29/10/14 21:08, Petr Jelinek wrote: > On 29/10/14 20:27, Asif Naeem wrote: >> 1. It seems that following log message need to be more descriptive about >> reason for shutdown i.e. >> >> + if (recoveryShutdownAtTarget && reachedStopPoint) >> + { >> + ereport(LOG, (errmsg("shutting >> down"))); >> > > Agreed, I just wasn't sure on what exactly to writes, I originally had > there "shutting down by user request" or some such but that's misleading. > I changed it to "shutting down at recovery target" hope that's better. >> 2. As Simon suggesting following recovery settings are not clear i.e. >> >> shutdown_at_recovery_target = true >> pause_at_recovery_target = true > > Hmm I completely missed Simon's email, strange. Well other option would > be to throw error if both are set to true - error will have to happen > anyway if action_at_recovery_target is set to shutdown while > pause_at_recovery_target is true (I think we need to keep > pause_at_recovery_target for compatibility). > > But considering all of you think something like > action_at_recovery_target is better solution, I will do it that way then. Done, there is now action_at_recovery_target which can be set to either pause, continue or shutdown, defaulting to pause (which is same as old behavior of pause_at_recovery_target defaulting to true). I also added check that prohibits using both pause_at_recovery_target and action_at_recovery_target in the same config, mainly to avoid users shooting themselves in the foot. > >> >> 3. As it don’t rename reconvery.conf, subsequent attempt (without any >> changes in reconvery.conf) to start of server keep showing the following >> i.e. >> >> ... >> LOG: redo starts at 0/1803620 >> DEBUG: checkpointer updated shared memory configuration values >> LOG: consistent recovery state reached at 0/1803658 >> LOG: restored log file "000000010000000000000002" from archive >> DEBUG: got WAL segment from archive >> LOG: restored log file "000000010000000000000003" from archive >> DEBUG: got WAL segment from archive >> LOG: restored log file "000000010000000000000004" from archive >> DEBUG: got WAL segment from archive >> LOG: restored log file "000000010000000000000005" from archive >> DEBUG: got WAL segment from archive >> LOG: restored log file "000000010000000000000006" from archive >> DEBUG: got WAL segment from archive >> … >> > > Yes, it will still replay everything since last checkpoint, that's by > design since otherwise we would have to write checkpoint on shutdown and > that would mean the instance can't be used as hot standby anymore and I > consider that an important feature to have. > I added note to the documentation that says this will happen. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 31 October 2014 15:18, Petr Jelinek <petr@2ndquadrant.com> wrote: > Attached is the v2 of the patch with the review comments addressed (see > below). ... > Done, there is now action_at_recovery_target which can be set to either > pause, continue or shutdown, defaulting to pause (which is same as old > behavior of pause_at_recovery_target defaulting to true). One comment only: I think the actions should be called: pause, promote and shutdown, since "continue" leads immediately to promotion of the server. I'm good with this patch otherwise. Barring objections I will commit tomorrow. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 18/11/14 12:57, Simon Riggs wrote: > On 31 October 2014 15:18, Petr Jelinek <petr@2ndquadrant.com> wrote: > >> Attached is the v2 of the patch with the review comments addressed (see >> below). > ... >> Done, there is now action_at_recovery_target which can be set to either >> pause, continue or shutdown, defaulting to pause (which is same as old >> behavior of pause_at_recovery_target defaulting to true). > > One comment only: I think the actions should be called: pause, promote > and shutdown, since "continue" leads immediately to promotion of the > server. > > I'm good with this patch otherwise. Barring objections I will commit tomorrow. > OK, promote works for me as well, I attached patch that changes continue to promote so you don't have to find and replace everything yourself. The changed doc wording probably needs to be checked. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 18 November 2014 22:05, Petr Jelinek <petr@2ndquadrant.com> wrote: > OK, promote works for me as well, I attached patch that changes continue to > promote so you don't have to find and replace everything yourself. The > changed doc wording probably needs to be checked. I've reworded docs a little. Which led me to think about shutdown more. If we ask for PAUSE and we're not in HotStandby it just ignores it, which means it changes into PROMOTE. My feeling is that we should change that into a SHUTDOWN, not a PROMOTE. I can raise that separately if anyone objects. Also, for the Shutdown itself, why are we not using kill(PostmasterPid, SIGINT)? That gives a clean, fast shutdown rather than what looks like a crash. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > If we ask for PAUSE and we're not in HotStandby it just ignores it, > which means it changes into PROMOTE. My feeling is that we should > change that into a SHUTDOWN, not a PROMOTE. To me, that seems like a definite improvement. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 19/11/14 14:13, Simon Riggs wrote: > On 18 November 2014 22:05, Petr Jelinek <petr@2ndquadrant.com> wrote: > >> OK, promote works for me as well, I attached patch that changes continue to >> promote so you don't have to find and replace everything yourself. The >> changed doc wording probably needs to be checked. > > I've reworded docs a little. > > Which led me to think about shutdown more. > > If we ask for PAUSE and we're not in HotStandby it just ignores it, > which means it changes into PROMOTE. My feeling is that we should > change that into a SHUTDOWN, not a PROMOTE. I can raise that > separately if anyone objects. Ok that seems reasonable, I can write updated patch which does that. > Also, for the Shutdown itself, why are we not using > kill(PostmasterPid, SIGINT)? > > That gives a clean, fast shutdown rather than what looks like a crash. > My first (unsubmitted) version did that, there was some issue with latches when doing that, but I think that's no longer problem as the point at which the shutdown happens was moved away from the problematic part of code. Other than that, it's just child killing postmaster feels bit weird, but I don't have strong objection to it. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 19 November 2014 13:13, Simon Riggs <simon@2ndquadrant.com> wrote: > I've reworded docs a little. Done > If we ask for PAUSE and we're not in HotStandby it just ignores it, > which means it changes into PROMOTE. My feeling is that we should > change that into a SHUTDOWN, not a PROMOTE. Done > > Also, for the Shutdown itself, why are we not using > kill(PostmasterPid, SIGINT)? Done Other plan is to throw a FATAL message. > That gives a clean, fast shutdown rather than what looks like a crash. I've also changed the location of where we do RECOVERY_TARGET_ACTION_SHUTDOWN, so its in the same place as where we pause. I've also moved the check to see if we should throw FATAL because aren't yet consistent to *before* we do any actionOnRecoveryTarget stuff. It seems essential that we know that earlier rather than later. Thoughts? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 2014-11-19 15:47:05 +0000, Simon Riggs wrote: > > Also, for the Shutdown itself, why are we not using > > kill(PostmasterPid, SIGINT)? > > Done I don't think that's ok. The postmaster is the one that should be in control, not some subprocess. I fail to see the win in simplicity over using exit (like we already do for the normal end of recovery!) is. The issue with the log line seems perfectly easily to avoid by just checking the exit code in postmaster.c. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 19 November 2014 15:57, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-11-19 15:47:05 +0000, Simon Riggs wrote: >> > Also, for the Shutdown itself, why are we not using >> > kill(PostmasterPid, SIGINT)? >> >> Done > > I don't think that's ok. The postmaster is the one that should be in > control, not some subprocess. > > I fail to see the win in simplicity over using exit (like we already do > for the normal end of recovery!) is. The issue with the log line seems > perfectly easily to avoid by just checking the exit code in > postmaster.c. We need to be able to tell the difference between a crashed Startup process and this usage. As long as we can tell, I don't mind how we do it. Suggestions please. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 19/11/14 17:04, Simon Riggs wrote: > On 19 November 2014 15:57, Andres Freund <andres@2ndquadrant.com> wrote: >> On 2014-11-19 15:47:05 +0000, Simon Riggs wrote: >>>> Also, for the Shutdown itself, why are we not using >>>> kill(PostmasterPid, SIGINT)? >>> >>> Done >> >> I don't think that's ok. The postmaster is the one that should be in >> control, not some subprocess. >> >> I fail to see the win in simplicity over using exit (like we already do >> for the normal end of recovery!) is. The issue with the log line seems >> perfectly easily to avoid by just checking the exit code in >> postmaster.c. > > We need to be able to tell the difference between a crashed Startup > process and this usage. > > As long as we can tell, I don't mind how we do it. > > Suggestions please. > Different exit code? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-11-19 16:04:49 +0000, Simon Riggs wrote: > On 19 November 2014 15:57, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-11-19 15:47:05 +0000, Simon Riggs wrote: > >> > Also, for the Shutdown itself, why are we not using > >> > kill(PostmasterPid, SIGINT)? > >> > >> Done > > > > I don't think that's ok. The postmaster is the one that should be in > > control, not some subprocess. Just as an example why I think this is wrong: Some user could just trigger replication to resume and we'd be in some awkward state. > > I fail to see the win in simplicity over using exit (like we already do > > for the normal end of recovery!) is. The issue with the log line seems > > perfectly easily to avoid by just checking the exit code in > > postmaster.c. > > We need to be able to tell the difference between a crashed Startup > process and this usage. Exit code, as suggested above. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 19/11/14 16:47, Simon Riggs wrote: > On 19 November 2014 13:13, Simon Riggs <simon@2ndquadrant.com> wrote: >> >> Also, for the Shutdown itself, why are we not using >> kill(PostmasterPid, SIGINT)? > > Done > > Other plan is to throw a FATAL message. > >> That gives a clean, fast shutdown rather than what looks like a crash. > > I've also changed the location of where we do > RECOVERY_TARGET_ACTION_SHUTDOWN, so its in the same place as where we > pause. > Another problem with how you did these two changes is that if you just pause when you send kill then during clean shutdown the recovery will be resumed and will finish renaming the recovery.conf to recovery.done, bumping timeline etc, and we don't want that since that prevents resuming recovery in the future. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Nov 19, 2014 at 10:49 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> If we ask for PAUSE and we're not in HotStandby it just ignores it, >> which means it changes into PROMOTE. My feeling is that we should >> change that into a SHUTDOWN, not a PROMOTE. > > To me, that seems like a definite improvement. But changing the default will force us to set action_at_recovery_target to 'promote' when we want to just recover the database to the specified point. This extra step is not necessary so far, but required in 9.5, which would surprise the users and may cause some troubles like "Oh, in 9.5, PITR failed and the server shut down unexpectedly even though I just ran the PITR procedures that I used to use so far....". Of course probably I can live with the change of the default if it's really worthwhile and we warn the users about that, though. Regards, -- Fujii Masao
On 19 November 2014 16:11, Petr Jelinek <petr@2ndquadrant.com> wrote: >> We need to be able to tell the difference between a crashed Startup >> process and this usage. >> >> As long as we can tell, I don't mind how we do it. >> >> Suggestions please. >> > > Different exit code? Try this one. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 19 November 2014 16:41, Fujii Masao <masao.fujii@gmail.com> wrote: > On Wed, Nov 19, 2014 at 10:49 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> If we ask for PAUSE and we're not in HotStandby it just ignores it, >>> which means it changes into PROMOTE. My feeling is that we should >>> change that into a SHUTDOWN, not a PROMOTE. >> >> To me, that seems like a definite improvement. > > But changing the default will force us to set > action_at_recovery_target to 'promote' > when we want to just recover the database to the specified point. If you explicitly request pause and then can't pause, ISTM the action we actually perform shouldn't be the exact opposite of what was requested. So if the user explicitly requests pause and we aren't in HS then they should get Shutdown, which is closest action. If the user doesn't request anything at all then we can default to Promote, just like we do now. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 19/11/14 19:51, Simon Riggs wrote: > On 19 November 2014 16:11, Petr Jelinek <petr@2ndquadrant.com> wrote: > >>> We need to be able to tell the difference between a crashed Startup >>> process and this usage. >>> >>> As long as we can tell, I don't mind how we do it. >>> >>> Suggestions please. >>> >> >> Different exit code? > > Try this one. > Ok this seems ok, I did couple of fixes - used exit code 3 as 2 is used in some places - given the "if (pid == StartupPID)" it would probably never conflict in practice, but better be safe than sorry in this case IMHO. And you forgot to actually set the postmaster into one of the Shutdown states so I added that. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 19 November 2014 22:47, Petr Jelinek <petr@2ndquadrant.com> wrote: > On 19/11/14 19:51, Simon Riggs wrote: >> >> On 19 November 2014 16:11, Petr Jelinek <petr@2ndquadrant.com> wrote: >> >>>> We need to be able to tell the difference between a crashed Startup >>>> process and this usage. >>>> >>>> As long as we can tell, I don't mind how we do it. ... > Ok this seems ok, I did couple of fixes - used exit code 3 as 2 is used in > some places - given the "if (pid == StartupPID)" it would probably never > conflict in practice, but better be safe than sorry in this case IMHO. > And you forgot to actually set the postmaster into one of the Shutdown > states so I added that. Like it. Patch looks good now. Will commit shortly. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 20, 2014 at 5:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 19 November 2014 16:41, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Wed, Nov 19, 2014 at 10:49 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> If we ask for PAUSE and we're not in HotStandby it just ignores it, >>>> which means it changes into PROMOTE. My feeling is that we should >>>> change that into a SHUTDOWN, not a PROMOTE. >>> >>> To me, that seems like a definite improvement. >> >> But changing the default will force us to set >> action_at_recovery_target to 'promote' >> when we want to just recover the database to the specified point. > > If you explicitly request pause and then can't pause, ISTM the action > we actually perform shouldn't be the exact opposite of what was > requested. > > So if the user explicitly requests pause and we aren't in HS then they > should get Shutdown, which is closest action. > > If the user doesn't request anything at all then we can default to > Promote, just like we do now. Yes, this is what I was trying to suggest. +1 to do that. Regards, -- Fujii Masao
On 25 November 2014 at 14:06, Fujii Masao <masao.fujii@gmail.com> wrote: > On Thu, Nov 20, 2014 at 5:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On 19 November 2014 16:41, Fujii Masao <masao.fujii@gmail.com> wrote: >>> On Wed, Nov 19, 2014 at 10:49 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>>> On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>>> If we ask for PAUSE and we're not in HotStandby it just ignores it, >>>>> which means it changes into PROMOTE. My feeling is that we should >>>>> change that into a SHUTDOWN, not a PROMOTE. >>>> >>>> To me, that seems like a definite improvement. >>> >>> But changing the default will force us to set >>> action_at_recovery_target to 'promote' >>> when we want to just recover the database to the specified point. >> >> If you explicitly request pause and then can't pause, ISTM the action >> we actually perform shouldn't be the exact opposite of what was >> requested. >> >> So if the user explicitly requests pause and we aren't in HS then they >> should get Shutdown, which is closest action. >> >> If the user doesn't request anything at all then we can default to >> Promote, just like we do now. > > Yes, this is what I was trying to suggest. +1 to do that. Implemented. Patch committed. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 25/11/14 21:15, Simon Riggs wrote: > On 25 November 2014 at 14:06, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Thu, Nov 20, 2014 at 5:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> On 19 November 2014 16:41, Fujii Masao <masao.fujii@gmail.com> wrote: >>>> On Wed, Nov 19, 2014 at 10:49 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>>>> On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>>>> If we ask for PAUSE and we're not in HotStandby it just ignores it, >>>>>> which means it changes into PROMOTE. My feeling is that we should >>>>>> change that into a SHUTDOWN, not a PROMOTE. >>>>> >>>>> To me, that seems like a definite improvement. >>>> >>>> But changing the default will force us to set >>>> action_at_recovery_target to 'promote' >>>> when we want to just recover the database to the specified point. >>> >>> If you explicitly request pause and then can't pause, ISTM the action >>> we actually perform shouldn't be the exact opposite of what was >>> requested. >>> >>> So if the user explicitly requests pause and we aren't in HS then they >>> should get Shutdown, which is closest action. >>> >>> If the user doesn't request anything at all then we can default to >>> Promote, just like we do now. >> >> Yes, this is what I was trying to suggest. +1 to do that. > > Implemented. > > Patch committed. > Thanks! -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Petr Jelinek 2014-11-25 <5474EFEA.2040000@2ndquadrant.com> > >Patch committed. > > Thanks! I'm a bit late to the party, but wouldn't recovery_target_action = ... have been a better name for this? It'd be in line with the other recovery_target_* parameters, and also a bit shorter than the imho somewhat ugly "action_at_recovery_target". Christoph -- cb@df7cb.de | http://www.df7cb.de/
On Wed, Nov 26, 2014 at 7:10 PM, Christoph Berg <cb@df7cb.de> wrote: > Re: Petr Jelinek 2014-11-25 <5474EFEA.2040000@2ndquadrant.com> >> >Patch committed. >> >> Thanks! > > I'm a bit late to the party, but wouldn't > > recovery_target_action = ... > > have been a better name for this? It'd be in line with the other > recovery_target_* parameters, and also a bit shorter than the imho > somewhat ugly "action_at_recovery_target". Looks like a good idea to me. Why not as well mark pause_at_recovery_target as deprecated in the docs and remove it in, let's say, 2 releases? -- Michael
On Wed, Nov 26, 2014 at 5:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 25 November 2014 at 14:06, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Thu, Nov 20, 2014 at 5:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> On 19 November 2014 16:41, Fujii Masao <masao.fujii@gmail.com> wrote: >>>> On Wed, Nov 19, 2014 at 10:49 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>>>> On Wed, Nov 19, 2014 at 8:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>>>> If we ask for PAUSE and we're not in HotStandby it just ignores it, >>>>>> which means it changes into PROMOTE. My feeling is that we should >>>>>> change that into a SHUTDOWN, not a PROMOTE. >>>>> >>>>> To me, that seems like a definite improvement. >>>> >>>> But changing the default will force us to set >>>> action_at_recovery_target to 'promote' >>>> when we want to just recover the database to the specified point. >>> >>> If you explicitly request pause and then can't pause, ISTM the action >>> we actually perform shouldn't be the exact opposite of what was >>> requested. >>> >>> So if the user explicitly requests pause and we aren't in HS then they >>> should get Shutdown, which is closest action. >>> >>> If the user doesn't request anything at all then we can default to >>> Promote, just like we do now. >> >> Yes, this is what I was trying to suggest. +1 to do that. > > Implemented. > > Patch committed. Isn't it better to add the sample setting of this parameter into recovery.conf.sample? Regards, -- Fujii Masao
Christoph Berg <cb@df7cb.de> writes: > Re: Petr Jelinek 2014-11-25 <5474EFEA.2040000@2ndquadrant.com> >> >Patch committed. Before I go and rebase that recovery.conf -> GUC patch on top of this... is it final? >> >> Thanks! > > I'm a bit late to the party, but wouldn't > > recovery_target_action = ... > > have been a better name for this? It'd be in line with the other > recovery_target_* parameters, and also a bit shorter than the imho > somewhat ugly "action_at_recovery_target". FWIW, I too think that "recovery_target_action" is a better name. -- Alex
On 28/11/14 17:46, Alex Shulgin wrote: > > Christoph Berg <cb@df7cb.de> writes: > >> Re: Petr Jelinek 2014-11-25 <5474EFEA.2040000@2ndquadrant.com> >>>> Patch committed. > > Before I go and rebase that recovery.conf -> GUC patch on top of > this... is it final? > I think so, perhaps sans the name mentioned below. >>> >>> Thanks! >> >> I'm a bit late to the party, but wouldn't >> >> recovery_target_action = ... >> >> have been a better name for this? It'd be in line with the other >> recovery_target_* parameters, and also a bit shorter than the imho >> somewhat ugly "action_at_recovery_target". > > FWIW, I too think that "recovery_target_action" is a better name. > I agree. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: >>> I'm a bit late to the party, but wouldn't >>> >>> recovery_target_action = ... >>> >>> have been a better name for this? It'd be in line with the other >>> recovery_target_* parameters, and also a bit shorter than the imho >>> somewhat ugly "action_at_recovery_target". >> >> FWIW, I too think that "recovery_target_action" is a better name. > > I agree. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/12/14 18:59, Robert Haas wrote: > On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: >>>> I'm a bit late to the party, but wouldn't >>>> >>>> recovery_target_action = ... >>>> >>>> have been a better name for this? It'd be in line with the other >>>> recovery_target_* parameters, and also a bit shorter than the imho >>>> somewhat ugly "action_at_recovery_target". >>> >>> FWIW, I too think that "recovery_target_action" is a better name. >> >> I agree. > > +1. > Here is patch which renames action_at_recovery_target to recovery_target_action everywhere. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 4 December 2014 at 22:13, Petr Jelinek <petr@2ndquadrant.com> wrote: > On 02/12/14 18:59, Robert Haas wrote: >> >> On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek <petr@2ndquadrant.com> >> wrote: >>>>> >>>>> I'm a bit late to the party, but wouldn't >>>>> >>>>> recovery_target_action = ... >>>>> >>>>> have been a better name for this? It'd be in line with the other >>>>> recovery_target_* parameters, and also a bit shorter than the imho >>>>> somewhat ugly "action_at_recovery_target". >>>> >>>> FWIW, I too think that "recovery_target_action" is a better name. >>> >>> I agree. >> >> >> +1. >> > > Here is patch which renames action_at_recovery_target to > recovery_target_action everywhere. Thanks; I'll fix it up on Monday. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
<div dir="ltr"><br /><br />On Thu, Dec 4, 2014 at 10:13 PM, Petr Jelinek <<a href="mailto:petr@2ndquadrant.com">petr@2ndquadrant.com</a>>wrote:<br />> On 02/12/14 18:59, Robert Haas wrote:<br/>>><br />>> On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek <<a href="mailto:petr@2ndquadrant.com">petr@2ndquadrant.com</a>><br/>>> wrote:<br />>>>>><br />>>>>>I'm a bit late to the party, but wouldn't<br />>>>>><br />>>>>> recovery_target_action= ...<br />>>>>><br />>>>>> have been a better name for this? It'd bein line with the other<br />>>>>> recovery_target_* parameters, and also a bit shorter than the imho<br/>>>>>> somewhat ugly "action_at_recovery_target".<br />>>>><br />>>>><br />>>>>FWIW, I too think that "recovery_target_action" is a better name.<br />>>><br />>>><br/>>>> I agree.<br />>><br />>><br />>> +1.<br />>><br />><br />>Here is patch which renames action_at_recovery_target to<br />> recovery_target_action everywhere.<br />Thanks,Looks good to me.<br /><br />A couple of things that would be good to document as well in recovery-config.sgml:<br/>- the fact that pause_at_recovery_target is deprecated.<br />- the fact that both parameters cannotbe used at the same time.<br />Let's not surprise the users with behaviors they may expect or guess and document thisstuff precisely..<br /><br />This would make docs consistent with this block of code in xlog.c:<br /> if (recoveryPauseAtTargetSet&& actionAtRecoveryTargetSet)<br /> ereport(ERROR,<br /> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),<br /> errmsg("cannot set both \"%s\"and \"%s\" recovery parameters",<br /> "pause_at_recovery_target",<br/> "action_at_recovery_target"),<br /> errhint("The \"pause_at_recovery_target\" is deprecated.")));<br /><br />Regards,<br />-- <br/>Michael</div>
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Thu, Dec 4, 2014 at 10:45 PM, Michael Paquier<span dir="ltr"><<a href="mailto:michael.paquier@gmail.com" target="_blank">michael.paquier@gmail.com</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0px 0px0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span class=""><br /><br />On Thu, Dec4, 2014 at 10:13 PM, Petr Jelinek <<a href="mailto:petr@2ndquadrant.com" target="_blank">petr@2ndquadrant.com</a>>wrote:<br />> On 02/12/14 18:59, Robert Haas wrote:<br />>><br />>>On Fri, Nov 28, 2014 at 11:59 AM, Petr Jelinek <<a href="mailto:petr@2ndquadrant.com" target="_blank">petr@2ndquadrant.com</a>><br/>>> wrote:<br />>>>>><br />>>>>> I'ma bit late to the party, but wouldn't<br />>>>>><br />>>>>> recovery_target_action = ...<br/>>>>>><br />>>>>> have been a better name for this? It'd be in line with the other<br/>>>>>> recovery_target_* parameters, and also a bit shorter than the imho<br />>>>>>somewhat ugly "action_at_recovery_target".<br />>>>><br />>>>><br />>>>>FWIW, I too think that "recovery_target_action" is a better name.<br />>>><br />>>><br/>>>> I agree.<br />>><br />>><br />>> +1.<br />>><br />><br />>Here is patch which renames action_at_recovery_target to<br />> recovery_target_action everywhere.<br /></span>Thanks,Looks good to me.<br /><br />A couple of things that would be good to document as well in recovery-config.sgml:<br/>- the fact that pause_at_recovery_target is deprecated.<br />- the fact that both parameters cannotbe used at the same time.<br />Let's not surprise the users with behaviors they may expect or guess and document thisstuff precisely..<br /><span class=""></span></div></blockquote></div>Btw, you are missing as well the addition of thisparameter in recovery.conf.sample (mentioned by Fujii-san upthread). It would be nice to have a description paragraphas well similarly to what is written for pause_at_recovery_target.<br />-- <br /><div class="gmail_signature">Michael<br/></div></div></div>
On Thu, Dec 4, 2014 at 8:45 AM, Michael Paquier <michael.paquier@gmail.com> wrote: >> Here is patch which renames action_at_recovery_target to >> recovery_target_action everywhere. > Thanks, Looks good to me. > > A couple of things that would be good to document as well in > recovery-config.sgml: > - the fact that pause_at_recovery_target is deprecated. Why not just remove it altogether? I don't think the backward-compatibility break is enough to get excited about, or to justify the annoyance of carrying an extra parameter that does (part of) the same thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Sat, Dec 6, 2014 at 12:49 AM, Robert Haas<span dir="ltr"><<a href="mailto:robertmhaas@gmail.com" target="_blank">robertmhaas@gmail.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">OnThu, Dec 4, 2014 at 8:45 AM, Michael Paquier<br /> <<a href="mailto:michael.paquier@gmail.com">michael.paquier@gmail.com</a>>wrote:<br /> >> Here is patch which renamesaction_at_recovery_target to<br /> >> recovery_target_action everywhere.<br /> > Thanks, Looks good to me.<br/> ><br /> > A couple of things that would be good to document as well in<br /> > recovery-config.sgml:<br/> > - the fact that pause_at_recovery_target is deprecated.<br /><br /></span>Why not just removeit altogether? I don't think the<br /> backward-compatibility break is enough to get excited about, or to<br /> justifythe annoyance of carrying an extra parameter that does (part<br /> of) the same thing.<br /></blockquote></div>Atleast we won't forget removing in the future something that has been marked as deprecated for years.So +1 here for a simple removal, and a mention in the future release notes.<br />-- <br /><div class="gmail_signature">Michael<br/></div></div></div>
On 05/12/14 16:49, Robert Haas wrote: > On Thu, Dec 4, 2014 at 8:45 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >>> Here is patch which renames action_at_recovery_target to >>> recovery_target_action everywhere. >> Thanks, Looks good to me. >> >> A couple of things that would be good to document as well in >> recovery-config.sgml: >> - the fact that pause_at_recovery_target is deprecated. > > Why not just remove it altogether? I don't think the > backward-compatibility break is enough to get excited about, or to > justify the annoyance of carrying an extra parameter that does (part > of) the same thing. > Ok this patch does that, along with the rename to recovery_target_action and addition to the recovery.conf.sample. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Mon, Dec 8, 2014 at 9:59 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: > Ok this patch does that, along with the rename to recovery_target_action and > addition to the recovery.conf.sample. This needs a rebase as at least da71632 and b8e33a8 are conflicting. -- Michael
On 08/12/14 02:03, Michael Paquier wrote: > On Mon, Dec 8, 2014 at 9:59 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: >> Ok this patch does that, along with the rename to recovery_target_action and >> addition to the recovery.conf.sample. > This needs a rebase as at least da71632 and b8e33a8 are conflicting. > Simon actually already committed something similar, so no need. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 08/12/14 02:06, Petr Jelinek wrote: > On 08/12/14 02:03, Michael Paquier wrote: >> On Mon, Dec 8, 2014 at 9:59 AM, Petr Jelinek <petr@2ndquadrant.com> >> wrote: >>> Ok this patch does that, along with the rename to >>> recovery_target_action and >>> addition to the recovery.conf.sample. >> This needs a rebase as at least da71632 and b8e33a8 are conflicting. >> > > Simon actually already committed something similar, so no need. > ...except for the removal of pause_at_recovery_target it seems, so I attached just that -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Mon, Dec 8, 2014 at 10:18 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: > On 08/12/14 02:06, Petr Jelinek wrote: >> Simon actually already committed something similar, so no need. >> > > ...except for the removal of pause_at_recovery_target it seems, so I > attached just that Thanks! Removal of this parameter is getting at least two votes, and nobody expressed against it, so IMO we should remove it now instead or later, or else I'm sure we would simply forget it. My2c. -- Michael
On 2014-12-08 02:18:11 +0100, Petr Jelinek wrote: > On 08/12/14 02:06, Petr Jelinek wrote: > >On 08/12/14 02:03, Michael Paquier wrote: > >>On Mon, Dec 8, 2014 at 9:59 AM, Petr Jelinek <petr@2ndquadrant.com> > >>wrote: > >>>Ok this patch does that, along with the rename to > >>>recovery_target_action and > >>>addition to the recovery.conf.sample. > >>This needs a rebase as at least da71632 and b8e33a8 are conflicting. > >> > > > >Simon actually already committed something similar, so no need. > > > > ...except for the removal of pause_at_recovery_target it seems, so I > attached just that I intend to push this one unless somebody protests soon. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 2014-12-08 02:18:11 +0100, Petr Jelinek wrote: > ...except for the removal of pause_at_recovery_target it seems, so I > attached just that Pushed. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services