Thread: what to do after a failover

From

Rita

Date:

09 January 2020, 04:06:28

I run a master and standby setup with Postgresql 11. The systems are identical from a hardware and software setup. If the master goes down I can do a pg_ctl promote on the standby and point my applications to use the standby (new master).

Once the original master is online, when is an appropriate time to fail back over? And are there any other things besides promote after the failover is done?

--- Get your facts first, then you can distort them as you please.--

Re: what to do after a failover

From

Michael Paquier

Date:

09 January 2020, 04:31:39

On Wed, Jan 08, 2020 at 11:06:28PM -0500, Rita wrote:
> I run a master and standby setup with Postgresql 11. The systems are
> identical from a hardware and software setup.  If the master goes down I
> can do a pg_ctl promote on the standby and point my applications to use the
> standby (new master).
>
> Once the original master is online, when is an appropriate time to fail
> back over? And are there any other things besides promote after the
> failover is done?

Make sure that you still have an HA configuration able to handle
multiple degrees of failure, with standbys always available after a
promotion.

The options available to rebuild your HA configuration after a
failover depend on the version of PostgreSQL you are using.  After a
failover the simplest solution would be to always recreate a new
standby from a base backup taken from the freshly-promoted primary,
though it can be costly depending on your instance.  You could also
use pg_rewind (available in core since 9.5) to recycle the previous
primary and reuse it as a standby of the newly promoted cluster.  Note
that there are community-based solutions for such things, like
pg_auto_failover or pacemaker-based stuff just to name two.  These
rely on more complex architectures, where a third node is present to
monitor the others (any sane HA infra ought to do at least that to be
honest).
--
Michael
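
For readers who want to see what the pg_rewind route looks like, here is a
minimal sketch that only builds the command line; it does not run anything.
The host name, user name and data directory are hypothetical placeholders,
and the helper name rewind_command is made up for illustration. It assumes
the old primary has been shut down cleanly and that the cluster runs with
wal_log_hints = on or with data checksums, which pg_rewind requires.

```python
import shlex
from typing import List


def rewind_command(target_pgdata: str, source_host: str, source_user: str,
                   source_port: int = 5432, dry_run: bool = True) -> List[str]:
    """Build a pg_rewind invocation that resyncs the old primary's data
    directory (target) against the freshly promoted primary (source)."""
    # libpq connection string pointing at the new primary (placeholder values).
    conninfo = (f"host={source_host} port={source_port} "
                f"user={source_user} dbname=postgres")
    cmd = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",
        f"--source-server={conninfo}",
        "--progress",
    ]
    if dry_run:
        # --dry-run does everything except actually modify the target directory.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Hypothetical values; adjust to your own environment.
    cmd = rewind_command("/var/lib/pgsql/11/data", "new-primary.example",
                         "rewind_user")
    print(shlex.join(cmd))
```

Starting with dry_run=True lets you check that the two clusters are
recognised before anything is rewritten. On PostgreSQL 11 a recovery.conf
still has to be written by hand afterwards so that the rewound node starts
as a standby.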

Re: what to do after a failover

From

Rita

Date:

09 January 2020, 11:55:18

Thanks for the response.

I am using Postgresql 11.

I want something simple and I have a strong preference toward using stock tools. After the promotion and the original master comes online, I was thinking of doing a pg_basebackup to sync. Any thoughts about that? I had a very hard time with pg_rewind and I didn't like its complexity.

Re: what to do after a failover

From

Jehan-Guillaume de Rorthais

Date:

09 January 2020, 14:14:59

On Thu, 9 Jan 2020 06:55:18 -0500
Rita <rmorgan466@gmail.com> wrote:

> Thanks for the response.
> I am using Postgresql 11.
> I want something simple and I have a strong preference toward using stock
> tools. After the promotion and the original master comes online, I was
> thinking of doing a pg_basebackup to sync. Any thoughts about that? 

If you can afford that, this is the cleanest and easiest procedure you could
find.

Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
whole instance from the newly promoted primary to the original one.

Regards,
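
As a rough illustration of the pg_basebackup procedure discussed above, the
following sketch lists the commands one might run on the original master
once it is back. It only builds and prints the commands. The host, user and
directory names are hypothetical, as are the helper names
basebackup_command and rebuild_plan. Moving the old data directory aside
(instead of deleting it) is an assumption made here for safety, not
something the thread prescribes.

```python
import shlex
from typing import List


def basebackup_command(primary_host: str, repl_user: str, pgdata: str,
                       port: int = 5432) -> List[str]:
    """Build a pg_basebackup call that seeds an empty PGDATA from the
    newly promoted primary."""
    return [
        "pg_basebackup",
        "-h", primary_host,   # new primary to copy from
        "-p", str(port),
        "-U", repl_user,      # role with the REPLICATION privilege
        "-D", pgdata,         # must be empty or not exist yet
        "-X", "stream",       # stream WAL while the backup runs
        "-R",                 # write recovery settings (recovery.conf on 11)
        "-P",                 # show progress
    ]


def rebuild_plan(primary_host: str, repl_user: str, pgdata: str) -> List[List[str]]:
    """Ordered steps to turn the original master into a standby."""
    old_copy = pgdata.rstrip("/") + ".old"
    return [
        ["pg_ctl", "-D", pgdata, "stop", "-m", "fast"],  # make sure it is down
        ["mv", pgdata, old_copy],                        # keep the old data aside
        basebackup_command(primary_host, repl_user, pgdata),
        ["pg_ctl", "-D", pgdata, "start"],               # comes up as a standby
    ]


if __name__ == "__main__":
    # Hypothetical values; adjust to your own environment.
    for step in rebuild_plan("new-primary.example", "replicator",
                             "/var/lib/pgsql/11/data"):
        print(shlex.join(step))
```

Each step can be passed to subprocess.run once the printed commands look
right for your setup.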

Re: what to do after a failover

From

Michael Paquier

Date:

10 January 2020, 00:11:24

On Thu, Jan 09, 2020 at 03:14:59PM +0100, Jehan-Guillaume de Rorthais wrote:
> If you can afford that, this is the cleanest and easiest procedure you could
> find.
>
> Note that pg_basebackup needs an empty PGDATA, so it will have to transfer the
> whole instance from the newly promoted primary to the original one.

Simple is easier to understand.  Now the larger your instance, the
longer it takes to copy a base backup and the longer you reduce the
availability of your cluster.  So be careful with what you choose.
--
Michael
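
To put a number on that trade-off, a back-of-the-envelope estimate can
help. The sketch below simply divides instance size by sustained copy
throughput. The 500 GiB size and 100 MiB/s rate are made-up example figures,
and the helper name estimate_copy_minutes is hypothetical. Real copies also
depend on disk speed, checkpoints and concurrent load, so treat the result
as a lower bound.

```python
def estimate_copy_minutes(instance_bytes: int, throughput_bytes_per_s: float) -> float:
    """Rough time, in minutes, to copy a base backup of the given size
    at a sustained throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return instance_bytes / throughput_bytes_per_s / 60


if __name__ == "__main__":
    GiB = 1024 ** 3
    MiB = 1024 ** 2
    # Made-up example: a 500 GiB instance copied at 100 MiB/s.
    minutes = estimate_copy_minutes(500 * GiB, 100 * MiB)
    print(f"{minutes:.1f} minutes without a standby")  # 85.3 minutes
```

During that window the cluster runs without a standby, which is the
availability cost to weigh against the simplicity of a full base backup.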
