Re: standby with a fixed lag behind the master - Mailing list pgsql-admin

From Jerry Sievers
Subject Re: standby with a fixed lag behind the master
Date
Msg-id 87k3xqacql.fsf@comcast.net
Whole thread Raw
In response to standby with a fixed lag behind the master  (Alexey Klyukin <alexk@commandprompt.com>)
List pgsql-admin
Alexey Klyukin <alexk@commandprompt.com> writes:

> Hello,
>
> I've recently come across the task of setting up a PostgreSQL 9.1
> standby server that is N hours behind the master, i.e. only
> transactions that finished N hours in the past or older should be
> replayed on a standby. The goal is to have a known good state server
> to perform backups from and possibly revert to in case of unwanted
> changes on primary. It seems that there is no mechanism in
> PostgreSQL to just ask the standby to keep a fixed distance (in
> terms of either WAL segments or time) between the primary, so these
> are possible solutions:

A not unreasonable problem to want options for solving.

> 1. Use restore command on standby to fetch the current WAL segment
> only if it has been created not less than N hours in the past
> (according to ctime).

This is the approach I implemented a few years ago by writing a custom
version of pg_standby that implements all the standard logic plus
checking for a configurable age of WAL segment before applying same.
I chose to do the program in Python rather than trying to hack the
feature into the C language native version but of course this was due
to personal preference.

I have seen this feature requested a few times before but not sure
ever added to the community release.

> 2. Pause the restore process on standby if the lag * is less than N
> hours (with pg_xlog_replay_pause()) and resume if it is more than
> that.

> 3. Set recovery_target_time to current - 6 hours and
> pause_at_recovery_target to true, periodically check whether the
> recovery is paused, reset the recovery target time to a new value
> (and restart the standby) if it is.
>
> * - the lag would be calculated as now() - pg_last_xact_replay_timestamp() on standby.
>
> Both 2 and 3 requires external cron job to pause/resume the
> recovery, and 1, while being the easiest of all, doesn't work with
> SR (unless it's combined with WAL shipping). I wonder if there are
> other well established approaches at solving this problem and if
> there is an interest for adding such feature to the -core?

Right and I"m not sure such a strategy would play nice with
hot-standby either.

One strategy if you have enough storage, is to have multiple standby
instances perhaps all running on the same standby host.  One of them
is a streamer and the other a lagger.  You would go live on whichever
instance was necessary to recover from whatever disaster you had.

In practice, we've never had to use any of our lagged standbys for the
intended purpose, always unlagging them and bringing online  at latest
transaction.

But...  If you are ever faced with a case of like junior DBA drops
important table in production, deployment code mangles DB  or
whatever, you may be glad you had a lagged standby to migrate to.

In the case of guarding against risk of disaster from  a deployment
gone bad, and given just one or more regular standbys in the typical
near real-time mode, it's much easier to just halt one of them after
apps are down and immediatly prior to the deployment.  If things go
badly, online that standby and start running on it.

HTH

> Thank you,
> --
> Alexey Klyukin        http://www.commandprompt.com
> The PostgreSQL Company – Command Prompt, Inc.
>
>
>
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 732.216.7255

pgsql-admin by date:

Previous
From: "ktm@rice.edu"
Date:
Subject: Re: "Data import from Mysql to MS Sql Server"
Next
From: Tom Lane
Date:
Subject: Re: problems with access into system catalogs