Thread: Streaming Replication and archiving
I've been having a look at this, with one master + one replica and also one
master + 2 replicas. I gotta say this is a nice piece of functionality
(particularly the multiple replicas).

I've been using the wiki page
(http://wiki.postgresql.org/wiki/Streaming_Replication) as a guide, and I
notice that it recommends the master (and replicas) have a non-trivial
archive_command even after the backup step is completed. ISTM that after the
backup the master's archive_command can be set to '' or '/bin/true', as the
walsender does not make any use of the WAL archive (AFAICS anyway). Clearly
it might be desirable to have the archived segments around for other reasons
- but equally it might be desirable *not* to have to (e.g. disk space), or am
I overlooking something?

Cheers

Mark
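For concreteness, the master-side settings under discussion might look
something like this while the base backup is being taken (the archive
directory path here is made up):

    # postgresql.conf on the master -- hypothetical archive location
    archive_mode    = on
    archive_command = 'cp %p /var/lib/pgsql/wal_archive/%f'

and what Mark is proposing, once the backup has been shipped to the
replicas, is roughly:

    # discard completed segments instead of keeping them
    # (a sketch of the idea, not a recommendation -- see the rest of the thread)
    archive_command = '/bin/true'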
Mark Kirkwood <mark.kirkwood@catalyst.net.nz> writes:
> I've been using the wiki page
> (http://wiki.postgresql.org/wiki/Streaming_Replication) as a guide, and I
> notice that it recommends the master (and replicas) have a non-trivial
> archive_command even after the backup step is completed. ISTM that after the
> backup the master's archive_command can be set to '' or '/bin/true', as the
> walsender does not make any use of the WAL archive (AFAICS anyway). Clearly
> it might be desirable to have the archived segments around for other reasons
> - but equally it might be desirable *not* to have to (e.g. disk space), or am
> I overlooking something?

I think it's still necessary in case the connection between a slave and the
master breaks. If the outage lasts long enough that the slave ends up
requesting WALs the master no longer has, restore_command will get used.

IIUC from the mails here, the restore_command is run directly by the slave
itself, so access to the master's archives needs to be built into the
restore_command.

Take all this with a huge grain of salt; that's my understanding without
having had the time to read the patch or play with it yet.

Regards,
--
dim

PS: maybe spelling out the states and transitions, and what data flows from
where to where in each of them, would be good - at least in the
documentation, if not explicitly in the code?
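To illustrate Dimitri's point, the slave's restore_command has to be able to
reach the master's archive itself, so it might look something like this
(hostname and path are invented):

    # recovery.conf on the slave -- fetch a missing segment from the master's archive
    restore_command = 'scp master.example.com:/var/lib/pgsql/wal_archive/%f %p'

(%f is the name of the segment being requested and %p is the path PostgreSQL
wants it copied to.)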
Dimitri Fontaine wrote:
> Mark Kirkwood <mark.kirkwood@catalyst.net.nz> writes:
>> I've been using the wiki page
>> (http://wiki.postgresql.org/wiki/Streaming_Replication) as a guide, and I
>> notice that it recommends the master (and replicas) have a non-trivial
>> archive_command even after the backup step is completed. ISTM that after the
>> backup the master's archive_command can be set to '' or '/bin/true', as the
>> walsender does not make any use of the WAL archive (AFAICS anyway). Clearly
>> it might be desirable to have the archived segments around for other reasons
>> - but equally it might be desirable *not* to have to (e.g. disk space), or am
>> I overlooking something?
>
> I think it's still necessary in case the connection between a slave and the
> master breaks. If the outage lasts long enough that the slave ends up
> requesting WALs the master no longer has, restore_command will get used.
>
> IIUC from the mails here, the restore_command is run directly by the slave
> itself, so access to the master's archives needs to be built into the
> restore_command.
>
> Take all this with a huge grain of salt; that's my understanding without
> having had the time to read the patch or play with it yet.

Thanks Dimitri, I'd missed that thread. OK, the slave will need a suitable
restore_command in addition to primary_conninfo in recovery.conf, and then
extended communication failures (or shutting down the slave for a while!)
will not break the streaming setup (FWIW I tried this just now).

regards

Mark
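A minimal recovery.conf along the lines Mark describes might look like this
(connection details and paths are made up, and this is a sketch rather than
the exact syntax from the patch):

    # recovery.conf on the slave
    standby_mode     = 'on'
    primary_conninfo = 'host=master.example.com port=5432 user=repluser'
    restore_command  = 'scp master.example.com:/var/lib/pgsql/wal_archive/%f %p'

With both primary_conninfo and restore_command in place, the slave can fall
back to the archive whenever streaming is interrupted for too long.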
> Thanks Dimitri, I'd missed that thread. OK, the slave will need a suitable
> restore_command in addition to primary_conninfo in recovery.conf, and then
> extended communication failures (or shutting down the slave for a while!)
> will not break the streaming setup (FWIW I tried this just now).

Sure, but if the archived WAL segments are NOT needed, how are they supposed
to get deleted? It doesn't take long to run out of disk space if they're not
being rotated.

Masao-san, can you comment on this?

--Josh
Josh Berkus wrote:
>> Thanks Dimitri, I'd missed that thread. OK, the slave will need a suitable
>> restore_command in addition to primary_conninfo in recovery.conf, and then
>> extended communication failures (or shutting down the slave for a while!)
>> will not break the streaming setup (FWIW I tried this just now).
>
> Sure, but if the archived WAL segments are NOT needed, how are they supposed
> to get deleted? It doesn't take long to run out of disk space if they're not
> being rotated.

+1

From what I am seeing at the moment (8.5 devel from 2 days ago), the archived
segments are not deleted at all (I have several hundred now after a number of
pgbench runs over the last day or so).

regards

Mark
Mark Kirkwood <mark.kirkwood@catalyst.net.nz> writes:
> Josh Berkus wrote:
>> Sure, but if the archived WAL segments are NOT needed, how are they supposed
>> to get deleted? It doesn't take long to run out of disk space if they're not
>> being rotated.

> From what I am seeing at the moment (8.5 devel from 2 days ago), the archived
> segments are not deleted at all (I have several hundred now after a number of
> pgbench runs over the last day or so).

Huh? *Archived* segments aren't supposed to get deleted, at least not by any
automatic Postgres action. It would be up to the DBA how long he wants to
keep them around.

			regards, tom lane
Tom Lane wrote:
> Mark Kirkwood <mark.kirkwood@catalyst.net.nz> writes:
>> Josh Berkus wrote:
>>> Sure, but if the archived WAL segments are NOT needed, how are they supposed
>>> to get deleted? It doesn't take long to run out of disk space if they're not
>>> being rotated.
>
>> From what I am seeing at the moment (8.5 devel from 2 days ago), the archived
>> segments are not deleted at all (I have several hundred now after a number of
>> pgbench runs over the last day or so).
>
> Huh? *Archived* segments aren't supposed to get deleted, at least not by any
> automatic Postgres action. It would be up to the DBA how long he wants to
> keep them around.

Exactly - there was a comment in the 'retry from archive' thread that
suggested otherwise.

The likely typical use case for streaming replication makes a good case and
automated safe way of pruning these guys - I've seen a few cases where overly
aggressive cleanup has broken log shipping setups (usually 8.2, before the
restart option was available).

regards

Mark
Mark Kirkwood wrote:
> The likely typical use case for streaming replication makes a good case and
> automated safe way of pruning these guys

Sorry, stupid typo: should read '...makes a good case for an automated safe
way of pruning these...'
> Huh? *Archived* segments aren't supposed to get deleted, at least not by any
> automatic Postgres action. It would be up to the DBA how long he wants to
> keep them around.

OK. The docs indicated that the segments needed to be kept around in case the
slave fell behind. If that's not the case (as it appears not to be), then
they can just be deleted by a cron job, or the archive_command on the master
can be changed.

Presumably, however, if the slave falls sufficiently behind and there are no
archive logs, then the slave would not be able to resync with the master, no?

--Josh Berkus
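A cron-based cleanup of the kind Josh mentions could be as simple as the
following sketch (the archive path and retention window are arbitrary; the
window has to be at least as long as the oldest base backup and the
slowest-catching-up slave you still care about):

    # crontab on the archive host: drop archived segments older than 7 days
    0 3 * * *  find /var/lib/pgsql/wal_archive -type f -mtime +7 -delete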
On Thu, Jan 21, 2010 at 10:48 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Presumably, however, if the slave falls sufficiently behind and there are no
> archive logs, then the slave would not be able to resync with the master, no?

In that case, we would need to take a fresh base backup of the primary and
start the standby from that data.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
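The fresh backup Fujii-san refers to is the usual base backup procedure; a
rough sketch (hostnames and paths invented) would be:

    # on the master
    psql -c "SELECT pg_start_backup('resync standby', true)"
    rsync -a --exclude=pg_xlog --exclude=postmaster.pid \
        /var/lib/pgsql/data/ standby.example.com:/var/lib/pgsql/data/
    psql -c "SELECT pg_stop_backup()"

    # then recreate recovery.conf on the standby and start it up again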
On Thu, Jan 21, 2010 at 1:48 AM, Josh Berkus <josh@agliodbs.com> wrote:
>> Huh? *Archived* segments aren't supposed to get deleted, at least not by any
>> automatic Postgres action. It would be up to the DBA how long he wants to
>> keep them around.
>
> OK. The docs indicated that the segments needed to be kept around in case the
> slave fell behind. If that's not the case (as it appears not to be), then
> they can just be deleted by a cron job, or the archive_command on the master
> can be changed.

It's definitely the case. Generally speaking, each base backup image has an
oldest archived log which is needed to make it useful. And each standby
database -- which is just a recovered base backup that has been rolled
forward some distance already -- also has one.

What would be useful is a tool which, given a list of standby databases and a
list of base backup images, can apply a set of policy rules to determine
which base backups and archived logs to delete.

The policy might look something like "keep one base backup per week going
back a month and one per day going back seven days, and keep archived logs
going back far enough for any of these base backups or any of these live
replicas." Bonus points if you can say "also keep one base backup per month
going back three years with just enough archived logs to recover the base
backup to a consistent state".

--
greg
Greg Stark <gsstark@mit.edu> wrote:
> What would be useful is a tool which, given a list of standby databases and a
> list of base backup images, can apply a set of policy rules to determine
> which base backups and archived logs to delete.
>
> The policy might look something like "keep one base backup per week going
> back a month and one per day going back seven days, and keep archived logs
> going back far enough for any of these base backups or any of these live
> replicas."
>
> Bonus points if you can say "also keep one base backup per month going back
> three years with just enough archived logs to recover the base backup to a
> consistent state".

Hmmm... Our policy is:

- Keep the most recent base backup, and all WAL files from the point needed
  to use it through to current, on the backup server local to the source
  database.

- Keep the most recent two weekly backups, and all the WAL files from the
  point needed to start the earlier one through to current, on the central
  backup server (along with a warm standby instance running for each source
  to confirm that the backup and WAL files are usable).

- Keep the first weekly backup of each month, and just enough WAL files for a
  consistent start of each, on a mirrored SAN archive for one year.

Just in case you're looking for real-life policies currently in use.

By the way, the .backup files and the information from pg_controldata and
pg_ctl status currently provide just enough information for all of this to be
run from bash scripts without much human attention, but the effort required
to get there is far from trivial. I'm sure that something which made such a
policy easy to implement would be useful to some shops. Hopefully that can be
done without breaking current scripts.

-Kevin
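As an illustration of the kind of bash scripting Kevin describes: the backup
history file that archive_command stores alongside the WAL segments records
the first segment a given base backup needs, and anything older than that is
no longer required by that backup. A rough, single-timeline sketch (all file
names invented):

    #!/bin/bash
    # Prune archived WAL older than what the newest base backup needs.
    ARCHIVE=/var/lib/pgsql/wal_archive
    # newest backup history file, e.g. 000000010000000000000009.00000020.backup
    HISTORY=$(ls -1 "$ARCHIVE"/*.backup | sort | tail -n 1)
    # its "START WAL LOCATION: ... (file 000000010000000000000009)" line names
    # the oldest segment still required by that backup
    OLDEST=$(sed -n 's/^START WAL LOCATION:.*(file \([0-9A-F]*\)).*/\1/p' "$HISTORY")
    cd "$ARCHIVE"
    for f in [0-9A-F]*; do
        # plain 24-character segment names sort in WAL order on one timeline
        [ ${#f} -eq 24 ] && [ "$f" \< "$OLDEST" ] && rm -f "$f"
    done

This deliberately ignores older base backups, timeline changes, and lagging
slaves, which is exactly why a proper tool with policy support, as Greg
suggests, would be nice.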