Re: standby with a fixed lag behind the master - Mailing list pgsql-admin
From | Jerry Sievers |
---|---|
Subject | Re: standby with a fixed lag behind the master |
Date | |
Msg-id | 87k3xqacql.fsf@comcast.net Whole thread Raw |
In response to | standby with a fixed lag behind the master (Alexey Klyukin <alexk@commandprompt.com>) |
List | pgsql-admin |
Alexey Klyukin <alexk@commandprompt.com> writes: > Hello, > > I've recently come across the task of setting up a PostgreSQL 9.1 > standby server that is N hours behind the master, i.e. only > transactions that finished N hours in the past or older should be > replayed on a standby. The goal is to have a known good state server > to perform backups from and possibly revert to in case of unwanted > changes on primary. It seems that there is no mechanism in > PostgreSQL to just ask the standby to keep a fixed distance (in > terms of either WAL segments or time) between the primary, so these > are possible solutions: A not unreasonable problem to want options for solving. > 1. Use restore command on standby to fetch the current WAL segment > only if it has been created not less than N hours in the past > (according to ctime). This is the approach I implemented a few years ago by writing a custom version of pg_standby that implements all the standard logic plus checking for a configurable age of WAL segment before applying same. I chose to do the program in Python rather than trying to hack the feature into the C language native version but of course this was due to personal preference. I have seen this feature requested a few times before but not sure ever added to the community release. > 2. Pause the restore process on standby if the lag * is less than N > hours (with pg_xlog_replay_pause()) and resume if it is more than > that. > 3. Set recovery_target_time to current - 6 hours and > pause_at_recovery_target to true, periodically check whether the > recovery is paused, reset the recovery target time to a new value > (and restart the standby) if it is. > > * - the lag would be calculated as now() - pg_last_xact_replay_timestamp() on standby. > > Both 2 and 3 requires external cron job to pause/resume the > recovery, and 1, while being the easiest of all, doesn't work with > SR (unless it's combined with WAL shipping). I wonder if there are > other well established approaches at solving this problem and if > there is an interest for adding such feature to the -core? Right and I"m not sure such a strategy would play nice with hot-standby either. One strategy if you have enough storage, is to have multiple standby instances perhaps all running on the same standby host. One of them is a streamer and the other a lagger. You would go live on whichever instance was necessary to recover from whatever disaster you had. In practice, we've never had to use any of our lagged standbys for the intended purpose, always unlagging them and bringing online at latest transaction. But... If you are ever faced with a case of like junior DBA drops important table in production, deployment code mangles DB or whatever, you may be glad you had a lagged standby to migrate to. In the case of guarding against risk of disaster from a deployment gone bad, and given just one or more regular standbys in the typical near real-time mode, it's much easier to just halt one of them after apps are down and immediatly prior to the deployment. If things go badly, online that standby and start running on it. HTH > Thank you, > -- > Alexey Klyukin http://www.commandprompt.com > The PostgreSQL Company Command Prompt, Inc. > > > > > > -- > Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-admin > -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net p: 732.216.7255
pgsql-admin by date: