Thread: what to do after a failover
I run a master and standby setup with PostgreSQL 11. The systems are identical in hardware and software setup. If the master goes down, I can do a pg_ctl promote on the standby and point my applications to the standby (the new master).
--
Once the original master is back online, when is an appropriate time to fail back over? And is there anything else I need to do besides the promote once the failover is done?
--
--- Get your facts first, then you can distort them as you please.--
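For reference, the promotion step described in the question can be sketched roughly as below. This is only an illustration: the data directory path is a placeholder, the `run` and `promote_standby` helpers are made up for this example, and psql is assumed to reach the promoted node through the usual PGHOST/PGPORT/PGUSER environment. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: promote a standby and check that it left recovery.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

promote_standby() {
    # $1 = data directory of the standby (placeholder path in the usage note)
    pgdata="$1"
    run pg_ctl -D "$pgdata" promote || return 1
    # After a successful promotion, pg_is_in_recovery() returns "f".
    run psql -At -c 'SELECT pg_is_in_recovery();'
}
```

Usage, with a hypothetical path: `promote_standby /var/lib/pgsql/11/data`. Applications should only be repointed once the query reports that the node is no longer in recovery.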
On Wed, Jan 08, 2020 at 11:06:28PM -0500, Rita wrote:
> I run a master and standby setup with Postgresql 11. The systems are
> identical from a hardware and software setup. If the master goes down I
> can do a pg_ctl promote on the standby and point my applications to use the
> standby (new master).
>
> Once the original master is online, when is an appropriate time to fail
> back over? And are there any other things besides promote after the
> failover is done?

Make sure that you still have an HA configuration able to handle multiple degrees of failure, with standbys always available after a promotion.

The options available to rebuild your HA configuration after a failover depend on the version of PostgreSQL you are using. After a failover, the simplest solution would be to always recreate a new standby from a base backup taken from the freshly-promoted primary, though it can be costly depending on your instance. You could also use pg_rewind (available in core since 9.5) to recycle the previous primary and reuse it as a standby of the newly promoted cluster. Note that there are community-based solutions for such things, like pg_auto_failover or pacemaker-based stuff, just to name two. These rely on more complex architectures, where a third node is present to monitor the others (any sane HA infra ought to do at least that, to be honest).
--
Michael
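As an illustration of the pg_rewind route mentioned above, here is a minimal sketch for PostgreSQL 11. It assumes that the old primary had wal_log_hints = on (or data checksums) before the failover, which pg_rewind requires, and that it can be shut down cleanly. The paths, the connection string and the helper names (`run`, `recovery_conf`, `rewind_old_primary`) are hypothetical. With DRY_RUN=1 (the default here) nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Sketch only: recycle the old primary as a standby of the promoted node.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

recovery_conf() {
    # PostgreSQL 11 still reads standby settings from recovery.conf.
    # $1 = connection string pointing at the promoted node
    cat <<EOF
standby_mode = 'on'
primary_conninfo = '$1'
recovery_target_timeline = 'latest'
EOF
}

rewind_old_primary() {
    # $1 = data directory of the old primary, $2 = conninfo of the new primary
    pgdata="$1"
    conninfo="$2"
    # pg_rewind needs the target cleanly shut down; the stop may fail
    # harmlessly if the server is already down.
    run pg_ctl -D "$pgdata" stop -m fast || true
    run pg_rewind --target-pgdata="$pgdata" --source-server="$conninfo" --progress || return 1
    if [ "$DRY_RUN" = "0" ]; then
        recovery_conf "$conninfo" > "$pgdata/recovery.conf" || return 1
    else
        printf '+ write %s/recovery.conf\n' "$pgdata"
    fi
    run pg_ctl -D "$pgdata" start
}
```

Usage, with made-up values: `rewind_old_primary /var/lib/pgsql/11/data 'host=newprimary user=postgres dbname=postgres'`. If the old primary crashed rather than stopped cleanly, pg_rewind in this version will refuse to run until the server has been started and stopped once so that crash recovery completes.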
Thanks for the response.
I am using PostgreSQL 11.
I want something simple, and I have a strong preference for stock tools. After the promotion, once the original master comes back online, I was thinking of doing a pg_basebackup to resync it. Any thoughts about that? I had a very hard time with pg_rewind and I didn't like its complexity.
On Thu, 9 Jan 2020 06:55:18 -0500 Rita <rmorgan466@gmail.com> wrote:
> Thanks for the response.
> I am using Postgresql 11.
> I want something simple and I have a strong preference toward using stock
> tools. After the promotion and the original master comes online, I was
> thinking of doing a pg_basebackup to sync. Any thoughts about that?

If you can afford that, this is the cleanest and easiest procedure you could find.

Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the whole instance from the newly promoted primary to the original one.

Regards,
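A rough sketch of that pg_basebackup procedure on PostgreSQL 11 follows. The host name, the replication role and the data directory are placeholders, and the `run` and `rebuild_standby` helpers exist only for this example. The old data directory is moved aside rather than deleted, which assumes there is enough disk space for both copies. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: rebuild the old primary as a standby from a fresh base backup.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

rebuild_standby() {
    # $1 = data directory on the old primary
    # $2 = host of the newly promoted primary
    # $3 = role with the REPLICATION privilege
    pgdata="$1"
    host="$2"
    user="$3"
    # The old primary must not be running; the stop may fail harmlessly
    # if it is already down.
    run pg_ctl -D "$pgdata" stop -m fast || true
    # pg_basebackup wants an empty or missing target directory, so keep
    # the old one aside until the new standby is confirmed healthy.
    run mv "$pgdata" "${pgdata}.old" || return 1
    # -X stream: fetch WAL while copying; -R: write recovery.conf (PG 11);
    # -P: show progress; --checkpoint=fast: do not wait for a spread checkpoint.
    run pg_basebackup -h "$host" -U "$user" -D "$pgdata" -X stream -R -P --checkpoint=fast || return 1
    run pg_ctl -D "$pgdata" start
}
```

Usage, with made-up values: `rebuild_standby /var/lib/pgsql/11/data newprimary replicator`. Once the standby is streaming, a row for it should show up in pg_stat_replication on the new primary, and the `.old` directory can then be removed.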
On Thu, Jan 09, 2020 at 03:14:59PM +0100, Jehan-Guillaume de Rorthais wrote:
> If you can afford that, this is the cleanest and easiest procedure you could
> find.
>
> Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
> whole instance from new promoted primary to the original one.

Simple is easier to understand. However, the larger your instance, the longer it takes to copy a base backup, and the longer your cluster runs with reduced availability. So be careful with what you choose.
--
Michael
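To put a number on that trade-off, a back-of-the-envelope estimate of the copy time can help. The helper below is hypothetical, and the figures in the example are invented; real throughput depends on network, disks and load on the new primary.

```shell
#!/bin/sh
# Sketch only: rough time needed to copy a base backup.
estimate_copy_seconds() {
    # $1 = instance size in GiB, $2 = sustained throughput in MiB/s
    # Integer arithmetic, rounded up to the next whole second.
    echo $(( ($1 * 1024 + $2 - 1) / $2 ))
}

# Example with invented figures: 500 GiB at 100 MiB/s
# -> 5120 seconds, a bit over 85 minutes without a standby.
estimate_copy_seconds 500 100
```

The instance size can be read on the new primary with something like `SELECT sum(pg_database_size(oid)) FROM pg_database;`, which leaves out WAL and is therefore a lower bound.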