Thread: Enable WAL archiving even in standby
Hi, I'd propose the attached WIP patch which allows us to enable WAL archiving even in standby. The patch adds "always" as the valid value of archive_mode. If it's set to "always", the archiver is started when the server is in standby mode and all the WAL files that walreceiver wrote to the disk are archived by using archive_command. Then, even after the server is promoted to master, the archiver keeps archiving WAL files. The patch doesn't change the meanings of the setting values "on" and "off" of archive_mode. I think that this feature is useful for the case, e.g., where large database needs to be replicated between remote servers. Imagine the situation where the replicated database gets corrupted completely in the remote standby. How should we address this problematic situation and restart the standby? One approach is to take a fresh backup from the master and restore it onto the standby. But since the database is large and there is long distance between two servers, this approach might take a surprisingly long time. Another approach is to restore the backup which was taken from the standby before. But most of many WAL files which the backup needs might exist only in the master (because WAL archiving cannot be enabled in the standby) and they need to be transfered from the master to the standby via long-distance network. So I think that this approach also would take a fairly long time. To shorten that time, you may think that archive_command in the master can be set so that it transfers WAL files from the master to the standby's archival storage. I agree that this setting can accelerate the database restore process. But this causes every WAL files to be transfered between remote servers twice (one is by streaming replication, another is by archive_command), and which is a waste of network bandwidth. Enabling WAL archiving in standby is one of solutions for this situation. We can expect that most of WAL files that are required for the backup taken from the standby would exist in the standby's archival storage. Back to the patch. If archive_mode is set to "always", archive_command is always used to archive WAL files even during recovery. Do we need to separate the command into two for master and standby, respectively? We can add something like standby_archive_command parameter which is used to archive only WAL files walreceiver writes. The other WAL files are archived by archive_command. I'm not sure if it's really worth separating the command that way. Is there any use case? The patch doesn't allow us to enable WAL archiving *only* during recovery. Should we support yet another archive_mode like "standby" which allows the archiver to be running only during recovery, but makes it end just after the server is promoted to master? I'm not sure if there is really use case for that. I've not included the update of document in the patch yet. If we agree to support this feature, I will do the remaining work. Regards, -- Fujii Masao
Attachment
On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > I'd propose the attached WIP patch which allows us to enable WAL archiving > even in standby. The patch adds "always" as the valid value of archive_mode. > If it's set to "always", the archiver is started when the server is in standby > mode and all the WAL files that walreceiver wrote to the disk are archived by > using archive_command. Then, even after the server is promoted to master, > the archiver keeps archiving WAL files. The patch doesn't change the meanings > of the setting values "on" and "off" of archive_mode. I like the feature, but I don't much like this as a control mechanism. Having archive_command and standby_archive_command, as you propose further down, seems saner. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Aug 15, 2014 at 4:30 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> I'd propose the attached WIP patch which allows us to enable WAL archiving >> even in standby. The patch adds "always" as the valid value of archive_mode. >> If it's set to "always", the archiver is started when the server is in standby >> mode and all the WAL files that walreceiver wrote to the disk are archived by >> using archive_command. Then, even after the server is promoted to master, >> the archiver keeps archiving WAL files. The patch doesn't change the meanings >> of the setting values "on" and "off" of archive_mode. > > I like the feature, but I don't much like this as a control mechanism. > Having archive_command and standby_archive_command, as you propose > further down, seems saner. Okay, that's fine. One question is; Which WAL files should be archived by standby_archive_command? There are following kinds of WAL files. (1) WAL files which were fully written and closed by walreceiver Curently they are not archived at all. (2) WAL file which is being written by walreceiver This file will be closed before it's fully written because of, forexample, standby promotion. Currently this is archived by archive_command. (3) WAL file with new timeline, which is copied from (2) At the end of recovery, after new timeline is assigned, this latest WAL file with new timeline is created by being copied from (2) (i.e., latest WAL file with old timeline).WAL data of end-of-recovery checkpoint is written to this latest WAL file. Currently this is archived byarchive_command. (4) Timeline history files When standby is promoted to the master, the imeline is incremented and the timeline historyfile is created. Currently the timeline history files are archived by archive_command. (5) WAL files generated in normal processing mode Currently they are archived by archive_command. I'm thinking to use standby_archive_command only for (1) because the others are currently archived by archive_command. That means that even if there are type (1) WAL files which have not been archived yet after the standby promotion (i.e., the situation where WAL archiving was delayed for some reasons in the standby), they are archived by standby_archive_command. IOW, the archiver uses both archive commands as the situation demands. OTOH, maybe there are people who want to use standby_archive_command for all the WAL files with old timeline, i.e., (1) and (2). Thought? Regards, -- Fujii Masao
From: "Fujii Masao" <masao.fujii@gmail.com> > I'd propose the attached WIP patch which allows us to enable WAL archiving > even in standby. The patch adds "always" as the valid value of > archive_mode. > If it's set to "always", the archiver is started when the server is in > standby > mode and all the WAL files that walreceiver wrote to the disk are archived > by > using archive_command. Then, even after the server is promoted to master, > the archiver keeps archiving WAL files. The patch doesn't change the > meanings > of the setting values "on" and "off" of archive_mode. > > I think that this feature is useful for the case, e.g., where large > database > needs to be replicated between remote servers. Imagine the situation where > the replicated database gets corrupted completely in the remote standby. > How should we address this problematic situation and restart the standby? > > One approach is to take a fresh backup from the master and restore it onto > the standby. But since the database is large and there is long distance > between two servers, this approach might take a surprisingly long time. > > Another approach is to restore the backup which was taken from the standby > before. But most of many WAL files which the backup needs might exist only > in the master (because WAL archiving cannot be enabled in the standby) and > they need to be transfered from the master to the standby via > long-distance > network. So I think that this approach also would take a fairly long time. > To shorten that time, you may think that archive_command in the master can > be set so that it transfers WAL files from the master to the standby's > archival storage. I agree that this setting can accelerate the database > restore > process. But this causes every WAL files to be transfered between remote > servers twice (one is by streaming replication, another is by > archive_command), > and which is a waste of network bandwidth. Great. This is exactly what I hoped for disaster recovery, although I haven't looked at the patch yet. > Back to the patch. If archive_mode is set to "always", archive_command is > always used to archive WAL files even during recovery. Do we need to > separate > the command into two for master and standby, respectively? We can add > something like standby_archive_command parameter which is used to archive > only WAL files walreceiver writes. The other WAL files are archived by > archive_command. I'm not sure if it's really worth separating the command > that way. Is there any use case? I don't see any reason to separate parameters. I want the spec simple. > I've not included the update of document in the patch yet. If we agree to > support this feature, I will do the remaining work. Could you consider adding a new section for disaster recovery that describes concrete parameter settings (e.g. how do we discard old archive WAL files after taking a base backup from standby, because backup label file is not created?). Good luck! Regards MauMau
On Tue, Aug 19, 2014 at 7:33 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Fri, Aug 15, 2014 at 4:30 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> I'd propose the attached WIP patch which allows us to enable WAL archiving >>> even in standby. The patch adds "always" as the valid value of archive_mode. >>> If it's set to "always", the archiver is started when the server is in standby >>> mode and all the WAL files that walreceiver wrote to the disk are archived by >>> using archive_command. Then, even after the server is promoted to master, >>> the archiver keeps archiving WAL files. The patch doesn't change the meanings >>> of the setting values "on" and "off" of archive_mode. >> >> I like the feature, but I don't much like this as a control mechanism. >> Having archive_command and standby_archive_command, as you propose >> further down, seems saner. > > Okay, that's fine. One question is; Which WAL files should be archived by > standby_archive_command? There are following kinds of WAL files. > > (1) WAL files which were fully written and closed by walreceiver > Curently they are not archived at all. > > (2) WAL file which is being written by walreceiver > This file will be closed before it's fully written because of, > for example, standby promotion. > Currently this is archived by archive_command. > > (3) WAL file with new timeline, which is copied from (2) > At the end of recovery, after new timeline is assigned, > this latest WAL file with new timeline is created by being copied > from (2) (i.e., latest WAL file with old timeline). WAL data of > end-of-recovery checkpoint is written to this latest WAL file. > Currently this is archived by archive_command. > > (4) Timeline history files > When standby is promoted to the master, the imeline is incremented > and the timeline history file is created. > Currently the timeline history files are archived by archive_command. > > (5) WAL files generated in normal processing mode > Currently they are archived by archive_command. > > I'm thinking to use standby_archive_command only for (1) because > the others are currently archived by archive_command. That means > that even if there are type (1) WAL files which have not been archived > yet after the standby promotion (i.e., the situation where WAL archiving > was delayed for some reasons in the standby), they are archived by > standby_archive_command. IOW, the archiver uses both archive commands > as the situation demands. > > OTOH, maybe there are people who want to use standby_archive_command > for all the WAL files with old timeline, i.e., (1) and (2). Thought? Boy, that's quite confusing. I didn't think we ever ran archive_command on the standby right now, so then it would make sense to have a way to do that. And it makes sense for it to be separate from the mode used on the master to avoid breaking existing configurations, so that a user assuming that a certain setting will only take effect after promotion will not be disappointed. However, if what you're saying is that we do archiving on the standby sometimes but not others, I'm not quite sure what the best way forward is. It seems rather inconsistent to do it for some types of WAL files but not others. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Aug 13, 2014 at 12:42 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > Hi, > > I'd propose the attached WIP patch which allows us to enable WAL archiving > even in standby. ... > I think that this feature is useful for the case, e.g., where large database > needs to be replicated between remote servers. Imagine the situation where > the replicated database gets corrupted completely in the remote standby. > How should we address this problematic situation and restart the standby? > > One approach is to take a fresh backup from the master and restore it onto > the standby. But since the database is large and there is long distance > between two servers, this approach might take a surprisingly long time. > > Another approach is to restore the backup which was taken from the standby > before. But most of many WAL files which the backup needs might exist only > in the master (because WAL archiving cannot be enabled in the standby) and > they need to be transfered from the master to the standby via long-distance > network. So I think that this approach also would take a fairly long time. > To shorten that time, you may think that archive_command in the master can > be set so that it transfers WAL files from the master to the standby's > archival storage. I agree that this setting can accelerate the database restore > process. But this causes every WAL files to be transfered between remote > servers twice (one is by streaming replication, another is by archive_command), > and which is a waste of network bandwidth. Well, in theory one can also use pg_receivexlog to get the WAL files from master, and then run them through the PITR on the slave without the archive_cleanup command. I'm not sure you can do the same if the source of the WAL files is a cascading slave, but I see no reasons why not to. However, I find the patch useful, since it allows accomplishing things in a much more straightforward way. > > Back to the patch. If archive_mode is set to "always", archive_command is > always used to archive WAL files even during recovery. Do we need to separate > the command into two for master and standby, respectively? We can add > something like standby_archive_command parameter which is used to archive > only WAL files walreceiver writes. The other WAL files are archived by > archive_command. I'm not sure if it's really worth separating the command > that way. Is there any use case? I don't see a good use case for doing things only on standby, but I can imagine that some different archiving methods might be used depending on the role of the archiver: on master, one may do, for instance, additional copy to the NFS partition. Does it make sense to expose the server role ('is_master') via an additional variable available to the recovery_command (i.e. %m)? > > The patch doesn't allow us to enable WAL archiving *only* during recovery. > Should we support yet another archive_mode like "standby" which allows > the archiver to be running only during recovery, but makes it end just after > the server is promoted to master? I'm not sure if there is really use case for > that. I do not see much use for this as well. > > I've not included the update of document in the patch yet. If we agree to > support this feature, I will do the remaining work. I think it is useful, and I gave this patch a spin by, essentially, creating a cascaded archive-only slave. I made a base backup from master, then ran the standby from this base backup with archive_mode = 'always' and archive_command copying files to the archive_location, then created another base backup out of it (without including WAL files into the backup) and pointed the recovery command of the final slave into the archive created by the intermediate one. Recovery worked, as well as the promotion of the intermediate slave to the master, the final slave just caught up with the timeline changes (with recovery_timeline set to 'latest') and went on with the recovery. One thing I've noticed is that pg_basebackup copies the postgresql.conf from the standby verbatim, including the archive_mode, which means that if one runs the cascaded replica without changing the archive_mode, that replica will try to archive the WAL, and if both the source and the replica are running on the same machine (or attached to NFS using the same mount points) even the destination for archiving will be the same. Should not be a big problem if one follows the recommendation of not overwriting the files that already exist at the destination, but it would be nice to reset the archive_mode flag to off. I do not know much about the WAL-related code, but one thing that I found strange in the patch is a separate file xlogarchive.h instead of putting stuff into xlog.h? Does not make much sense for a single enum, are you planning to put more things there? There was a single hunk when applying this against the latest master: >Hunk #4 succeeded at 4789 (offset -1 lines). -- Regards, Alexey Klyukin
On 13 August 2014 11:42, Fujii Masao <masao.fujii@gmail.com> wrote: > I'd propose the attached WIP patch which allows us to enable WAL archiving > even in standby. The patch adds "always" as the valid value of archive_mode. > If it's set to "always", the archiver is started when the server is in standby > mode and all the WAL files that walreceiver wrote to the disk are archived by > using archive_command. Then, even after the server is promoted to master, > the archiver keeps archiving WAL files. The patch doesn't change the meanings > of the setting values "on" and "off" of archive_mode. Seems OK idea. Perhaps better to have a new parameter called archive_role = Origin | Replica | Always should be used so we match up with how triggers/rules work. rather than abuse archive_mode We can pass a attribute called %m (mode?) which tells the archive_command whether we are Master or Standby when called Remember to bump the number of background procs by 1 since we previously assumed archiver would never run in recovery -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services