Thread: Fast promotion, loose ends
We never reached a consensus on the user interface of the new 'fast promotion'. We should settle that before beta. The thread died here: http://www.postgresql.org/message-id/CA+U5nMKmDD7hGCYzOo=iFM=eK5OPDXCEzmq79fgLWr0TJk=sXw@mail.gmail.com Simon said in that email that he's waiting for further comments, so let me reiterate my view on this: The current situation is that to do 'fast promotion', you have to use "pg_ctl promote -m fast". 'slow' promotion is the default. I didn't like that, because: 1. The slow method has no advantage over the fast method. Therefore I think the fast method should be the default, when promoting a standby. To be conservative, just in case there are bugs in the fast method, we'd still use the slow method for crash recovery and end of point-in-time recovery. That provides an escape hatch, so that if the fast method fails because of a bug, you can still start up the database. 2. There is no way to perform 'fast promotion' using the trigger file. That feature is only available using "pg_ctl promote". When "pg_ctl promote" was introduced, it was not meant to replace the trigger file method, as that is also very useful in many situations. (as explained here: http://www.postgresql.org/message-id/5112A54B.8090500@vmware.com) Putting points 1 and 2 together, I think the best solution is to always perform 'fast' promotion when in standby mode, and remove pg_ctl -m fast/smart option altogether. There is no need to expose that to users. - Heikki
On 22 April 2013 08:13, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > We never reached a consensus on the user interface of the new 'fast > promotion'. We should settle that before beta. The thread died here: > > http://www.postgresql.org/message-id/CA+U5nMKmDD7hGCYzOo=iFM=eK5OPDXCEzmq79fgLWr0TJk=sXw@mail.gmail.com > > Simon said in that email that he's waiting for further comments, so let me > reiterate my view on this: I'm very happy to discuss this some more, thanks for bringing it up. > The current situation is that to do 'fast promotion', you have to use > "pg_ctl promote -m fast". 'slow' promotion is the default. > > I didn't like that, because: > > 1. The slow method has no advantage over the fast method. Therefore I think > the fast method should be the default, when promoting a standby. To be > conservative, just in case there are bugs in the fast method, we'd still use > the slow method for crash recovery and end of point-in-time recovery. That > provides an escape hatch, so that if the fast method fails because of a bug, > you can still start up the database. In general, I agree. My only "advantage" for slow is that we know it works. > 2. There is no way to perform 'fast promotion' using the trigger file. That > feature is only available using "pg_ctl promote". When "pg_ctl promote" was > introduced, it was not meant to replace the trigger file method, as that is > also very useful in many situations. (as explained here: > http://www.postgresql.org/message-id/5112A54B.8090500@vmware.com) I realise that it *is* possible to trigger fast promotion using a file, since that is exactly the method we use to implement the feature by pg_ctl. In fact, pg_ctl promote is just a nice paint job over the trigger file concept. So, to initiate promotion, you can create a file called $DATADIR/fast_promote or $DATADIR/promote Does that solve the issue? > Putting points 1 and 2 together, I think the best solution is to always > perform 'fast' promotion when in standby mode, and remove pg_ctl -m > fast/smart option altogether. There is no need to expose that to users. We can make pg_ctl promote write the fast_promote file by default and remove the command line option altogether. I would like to retain the option to promote with a full checkpoint, for safety reasons, so continue to use both fast_promote and promote files under the covers. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 22.04.2013 10:58, Simon Riggs wrote: > On 22 April 2013 08:13, Heikki Linnakangas<hlinnakangas@vmware.com> wrote: >> 2. There is no way to perform 'fast promotion' using the trigger file. That >> feature is only available using "pg_ctl promote". When "pg_ctl promote" was >> introduced, it was not meant to replace the trigger file method, as that is >> also very useful in many situations. (as explained here: >> http://www.postgresql.org/message-id/5112A54B.8090500@vmware.com) > > I realise that it *is* possible to trigger fast promotion using a > file, since that is exactly the method we use to implement the feature > by pg_ctl. > > In fact, pg_ctl promote is just a nice paint job over the trigger file concept. > > So, to initiate promotion, you can create a file called > $DATADIR/fast_promote or $DATADIR/promote > > Does that solve the issue? Hmm. That requires write access to $DATADIR, so that's not quite the same thing as the trigger_file recovery.conf option. But if you really want to keep the 'slow' option, we could add an option to recovery.conf for it. I'm thinking something like this: trigger_file = '/mnt/foo/trigger' # file to watch for trigger_file_method = 'slow' # slow / fast, 'fast' is default. You would have to decide which method to use when you create recovery.conf file, rather than when you promote, but I think that would OK. >> Putting points 1 and 2 together, I think the best solution is to always >> perform 'fast' promotion when in standby mode, and remove pg_ctl -m >> fast/smart option altogether. There is no need to expose that to users. > > We can make pg_ctl promote write the fast_promote file by default and > remove the command line option altogether. Sounds good to me. > I would like to retain the option to promote with a full checkpoint, > for safety reasons, so continue to use both fast_promote and promote > files under the covers. Ok. As long as 'fast' is the default and you get the same functionality with the trigger file method and pg_ctl promote, that works for me. Thanks! - Heikki
On 04/22/2013 02:58 AM, Simon Riggs wrote: > So, to initiate promotion, you can create a file called > $DATADIR/fast_promote or $DATADIR/promote Pardon my naiveté, but could it also be an option to read the method from the promotion file? echo "slow" > /my/promotion/path That would work without any default naming scheme, and only incurs a read on the file-handle. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-676-8870 sthomas@optionshouse.com ______________________________________________ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
On 22 April 2013 09:29, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > Hmm. That requires write access to $DATADIR, so that's not quite the same > thing as the trigger_file recovery.conf option. Well, you also (elsewhere) requested that we must keep recovery.conf in $DATADIR, so it needs to be writable. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 22 April 2013 16:09, Shaun Thomas <sthomas@optionshouse.com> wrote: > On 04/22/2013 02:58 AM, Simon Riggs wrote: > >> So, to initiate promotion, you can create a file called >> $DATADIR/fast_promote or $DATADIR/promote > > > Pardon my naiveté, but could it also be an option to read the method from > the promotion file? > > echo "slow" > /my/promotion/path > > That would work without any default naming scheme, and only incurs a read on > the file-handle. We could do that and a similar mechanism existed in pg_standby, but was removed in 9.0. Given where we are and the shape of the code I'd rather not do that. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 22.04.2013 18:45, Simon Riggs wrote: > On 22 April 2013 16:09, Shaun Thomas<sthomas@optionshouse.com> wrote: >> On 04/22/2013 02:58 AM, Simon Riggs wrote: >> >>> So, to initiate promotion, you can create a file called >>> $DATADIR/fast_promote or $DATADIR/promote >> >> >> Pardon my naiveté, but could it also be an option to read the method from >> the promotion file? >> >> echo "slow"> /my/promotion/path >> >> That would work without any default naming scheme, and only incurs a read on >> the file-handle. Yeah, that would be one way to do it. There's a little race condition if you create the file like above; postgres might read it just when it's created, but before the "slow" word has been written to it. But that would probably be acceptable, and an application could always do create and write the file and then rename it into place, if that's a problem. > We could do that and a similar mechanism existed in pg_standby, but > was removed in 9.0. That code is still in pg_standby. Maybe you were thinking of some other feature? - Heikki
On 22 April 2013 19:04, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >> We could do that and a similar mechanism existed in pg_standby, but >> was removed in 9.0. > > > That code is still in pg_standby. Maybe you were thinking of some other > feature? It wasn't removed from pg_standby. But since all the features of pg_standby except that have been moved into the main server and/or new extensions, I described that situation as being a feature removal. Just a passing comment, no huge loss. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 22.04.2013 18:44, Simon Riggs wrote: > On 22 April 2013 09:29, Heikki Linnakangas<hlinnakangas@vmware.com> wrote: > >> Hmm. That requires write access to $DATADIR, so that's not quite the same >> thing as the trigger_file recovery.conf option. > > Well, you also (elsewhere) requested that we must keep recovery.conf > in $DATADIR, so it needs to be writable. That's a slightly different requirement. $DATADIR must be writable by the process that restores the backup or puts the server into standby mode, while trigger_file needs to be writable by the process that triggers failover. Those are not necessarily the same thing. I'm thinking of a heartbeat process that triggers failover by creating a file on an NFS server or similar. Possibly the same location where the WAL archive is located. $DATADIR would be stored elsewhere. - Heikki
On 24 April 2013 08:23, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 22.04.2013 18:44, Simon Riggs wrote: >> >> On 22 April 2013 09:29, Heikki Linnakangas<hlinnakangas@vmware.com> >> wrote: >> >>> Hmm. That requires write access to $DATADIR, so that's not quite the same >>> thing as the trigger_file recovery.conf option. >> >> >> Well, you also (elsewhere) requested that we must keep recovery.conf >> in $DATADIR, so it needs to be writable. > > > That's a slightly different requirement. $DATADIR must be writable by the > process that restores the backup or puts the server into standby mode, while > trigger_file needs to be writable by the process that triggers failover. > Those are not necessarily the same thing. I'm thinking of a heartbeat > process that triggers failover by creating a file on an NFS server or > similar. Possibly the same location where the WAL archive is located. > $DATADIR would be stored elsewhere. The default you've requested is fast promotion and I've agreed to that. The ability to write a file called "promote" to $DATADIR is there as a protection in case we need it in the field, its not going to be the primary mechanism any more. So if you're not intending to use it ever, it doesn't seem worth discussing the fact you don't like its location. But if you do want to discuss it, I think it's unreasonable of you to demand recovery.conf cannot be outside $DATADIR and then also demand related files be relocatable outside $DATADIR as well, the exact opposite. We had the chance to avoid making $DATADIR writable externally and that's gone now, at least for now. Here's the patch I was intending to apply. Please let me know if you have comments. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 24.04.2013 10:57, Simon Riggs wrote: > On 24 April 2013 08:23, Heikki Linnakangas<hlinnakangas@vmware.com> wrote: >> On 22.04.2013 18:44, Simon Riggs wrote: >>> >>> On 22 April 2013 09:29, Heikki Linnakangas<hlinnakangas@vmware.com> >>> wrote: >>> >>>> Hmm. That requires write access to $DATADIR, so that's not quite the same >>>> thing as the trigger_file recovery.conf option. >>> >>> Well, you also (elsewhere) requested that we must keep recovery.conf >>> in $DATADIR, so it needs to be writable. >> >> That's a slightly different requirement. $DATADIR must be writable by the >> process that restores the backup or puts the server into standby mode, while >> trigger_file needs to be writable by the process that triggers failover. >> Those are not necessarily the same thing. I'm thinking of a heartbeat >> process that triggers failover by creating a file on an NFS server or >> similar. Possibly the same location where the WAL archive is located. >> $DATADIR would be stored elsewhere. > > The default you've requested is fast promotion and I've agreed to that. > > The ability to write a file called "promote" to $DATADIR is there as a > protection in case we need it in the field, its not going to be the > primary mechanism any more. So if you're not intending to use it ever, > it doesn't seem worth discussing the fact you don't like its location. Ok, works for me. > But if you do want to discuss it, I think it's unreasonable of you to > demand recovery.conf cannot be outside $DATADIR and then also demand > related files be relocatable outside $DATADIR as well, the exact > opposite. We had the chance to avoid making $DATADIR writable > externally and that's gone now, at least for now. As I said above, it's a different situation. recovery.conf has always been in $DATADIR, and it's always been possible to point trigger_file elsewhere, and you've always gotten full functionality using the trigger_file. I just want to maintain that status quo. Which your patch achieves, so I'm happy that that. > Here's the patch I was intending to apply. Please let me know if you > have comments. Regarding the change in pg_ctl: > /* > - * Use two different kinds of promotion file so we can understand > - * the difference between smart and fast promotion. > + * For 9.3 onwards, use fast promotion as the default option. > + * Promotion with a full checkpoint is still possible by writing > + * a file called "promote", e.g. > + * snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data); > */ > - if (shutdown_mode >= FAST_MODE) > - snprintf(promote_file, MAXPGPATH, "%s/fast_promote", pg_data); > - else > - snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data); > + snprintf(promote_file, MAXPGPATH, "%s/fast_promote", pg_data); Should there be a version check there? I guess we've never guaranteed a newer pg_ctl to work with an older server version, but it seems likely that someone would try to do that, especially with "pg_ctl promote". With the above change, creating $DATADIR/fast_promote in a 9.2 server's data dir will do nothing. I'd suggest that we keep the filename unchanged, "promote", and only change the behavior in the server side, so that it performs fast promotion. If you want to have a "slow" promote file, we can call that "slow_promote" or "checkpoint_then_promote" or something. Aside from that, I assume you'll clean up the now-dead -m fast option etc. - Heikki
On 24 April 2013 09:10, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > Regarding the change in pg_ctl: > >> /* >> - * Use two different kinds of promotion file so we can understand >> - * the difference between smart and fast promotion. >> + * For 9.3 onwards, use fast promotion as the default option. >> + * Promotion with a full checkpoint is still possible by writing >> + * a file called "promote", e.g. >> + * snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data); >> */ >> - if (shutdown_mode >= FAST_MODE) >> - snprintf(promote_file, MAXPGPATH, "%s/fast_promote", >> pg_data); >> - else >> - snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data); >> + snprintf(promote_file, MAXPGPATH, "%s/fast_promote", pg_data); > > > Should there be a version check there? I guess we've never guaranteed a > newer pg_ctl to work with an older server version, but it seems likely that > someone would try to do that, especially with "pg_ctl promote". With the > above change, creating $DATADIR/fast_promote in a 9.2 server's data dir will > do nothing. I'd suggest that we keep the filename unchanged, "promote", and > only change the behavior in the server side, so that it performs fast > promotion. If you want to have a "slow" promote file, we can call that > "slow_promote" or "checkpoint_then_promote" or something. pg_ctl already checks versions, so I don't see the point. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 24.04.2013 11:23, Simon Riggs wrote: > On 24 April 2013 09:10, Heikki Linnakangas<hlinnakangas@vmware.com> wrote: >> Regarding the change in pg_ctl: >> >>> /* >>> - * Use two different kinds of promotion file so we can understand >>> - * the difference between smart and fast promotion. >>> + * For 9.3 onwards, use fast promotion as the default option. >>> + * Promotion with a full checkpoint is still possible by writing >>> + * a file called "promote", e.g. >>> + * snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data); >>> */ >>> - if (shutdown_mode>= FAST_MODE) >>> - snprintf(promote_file, MAXPGPATH, "%s/fast_promote", >>> pg_data); >>> - else >>> - snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data); >>> + snprintf(promote_file, MAXPGPATH, "%s/fast_promote", pg_data); >> >> >> Should there be a version check there? I guess we've never guaranteed a >> newer pg_ctl to work with an older server version, but it seems likely that >> someone would try to do that, especially with "pg_ctl promote". With the >> above change, creating $DATADIR/fast_promote in a 9.2 server's data dir will >> do nothing. I'd suggest that we keep the filename unchanged, "promote", and >> only change the behavior in the server side, so that it performs fast >> promotion. If you want to have a "slow" promote file, we can call that >> "slow_promote" or "checkpoint_then_promote" or something. > > pg_ctl already checks versions, so I don't see the point. The point is, if you do "pgsql93/bin/pg_ctl -D $92DATADIR promote", it will create fast_promote file and return success. But it won't actually promote the server. I think that's bad. If pg_ctl already has a check against that, fine, but I don't think it does. Please make sure you test that before applying. - Heikki
On 24 April 2013 09:32, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >> pg_ctl already checks versions, so I don't see the point. > > > The point is, if you do "pgsql93/bin/pg_ctl -D $92DATADIR promote", it will > create fast_promote file and return success. But it won't actually promote > the server. I think that's bad. > > If pg_ctl already has a check against that, fine, but I don't think it does. > Please make sure you test that before applying. If it doesn't check, that is not the only thing that would be broken. The original commit of pg_ctl promote would also be broken. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 24.04.2013 11:46, Simon Riggs wrote: > On 24 April 2013 09:32, Heikki Linnakangas<hlinnakangas@vmware.com> wrote: > >>> pg_ctl already checks versions, so I don't see the point. >> >> >> The point is, if you do "pgsql93/bin/pg_ctl -D $92DATADIR promote", it will >> create fast_promote file and return success. But it won't actually promote >> the server. I think that's bad. >> >> If pg_ctl already has a check against that, fine, but I don't think it does. >> Please make sure you test that before applying. > > If it doesn't check, that is not the only thing that would be broken. > The original commit of pg_ctl promote would also be broken. Yeah, it would've been good if the "pg_ctl promote" patch would've added a version check. Nevertheless, don't you think it would be good to avoid changing the filename of the "promote" file, so that we don't have any more such breakage? I don't see any advantage in changing it. - Heikki
On 24 April 2013 09:53, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 24.04.2013 11:46, Simon Riggs wrote: >> >> On 24 April 2013 09:32, Heikki Linnakangas<hlinnakangas@vmware.com> >> wrote: >> >>>> pg_ctl already checks versions, so I don't see the point. >>> >>> >>> >>> The point is, if you do "pgsql93/bin/pg_ctl -D $92DATADIR promote", it >>> will >>> create fast_promote file and return success. But it won't actually >>> promote >>> the server. I think that's bad. >>> >>> If pg_ctl already has a check against that, fine, but I don't think it >>> does. >>> Please make sure you test that before applying. >> >> >> If it doesn't check, that is not the only thing that would be broken. >> The original commit of pg_ctl promote would also be broken. > > > Yeah, it would've been good if the "pg_ctl promote" patch would've added a > version check. Nevertheless, don't you think it would be good to avoid > changing the filename of the "promote" file, so that we don't have any more > such breakage? I don't see any advantage in changing it. Apart from all the reasons I already gave, no. The filename isn't changing. We are adding a new capability and changing the default. --Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services