Re: Auotmated postgres failover - Mailing list pgsql-general

From Jehan-Guillaume de Rorthais
Subject Re: Auotmated postgres failover
Date
Msg-id 20160122101801.38da3ec8@firost
Whole thread Raw
In response to Re: Auotmated postgres failover  (John R Pierce <pierce@hogranch.com>)
List pgsql-general
Le Thu, 21 Jan 2016 11:34:18 -0800,
John R Pierce <pierce@hogranch.com> a écrit :

> On 1/21/2016 11:07 AM, jwiencek3@comcast.net wrote:
> >
> >
> > I'm looking for a tool to automate PostgreSQL cluster management
> > failover in the event the master database were to become unavailable.
> >   Currently are manually issuing a "pg_ctl promote"  once we become
> > aware that the master database has crashed.
> >
> >
> > Is repmgr a via solution?  Please pass along experiences with repmgr.
> >
> > Are there any other  tools available to automatically issue the
> > "promote" in the event of a master database crash?

Yes, 3 different Pacemaker resource agents exist for PostgreSQL:

 * official one, in the package "resource-agents" on most linux distribs.
   This one is pretty complex and support multistate and stateless setup.
 * a simple, stupid, easy and stateless, agent:
   https://github.com/dalibo/pgsql-resource-agent/tree/master/stateless
   This one is fine for a 2 node cluster
 * a multistate-aware agent:
   https://github.com/dalibo/pgsql-resource-agent/tree/master/multistate
   This one is nice for multi-node cluster, searching for the best known slave
   to elect after a master lost.

Some important docs are available in the pgsql-resource-agent (PRA) repo:
 * https://github.com/dalibo/pgsql-resource-agent/blob/master/FENCING.md
 * the stateless:
   https://github.com/dalibo/pgsql-resource-agent/blob/master/stateless/README.md
 * the multistate:
   https://github.com/dalibo/pgsql-resource-agent/blob/master/multistate/README.md
   https://github.com/dalibo/pgsql-resource-agent/blob/master/multistate/INSTALL.md
   https://github.com/dalibo/pgsql-resource-agent/blob/master/multistate/docs/Quick_Start.md

> repmgr is a tool you could use in conjunction with a generic cluster
> management system like linuxha/heartbeat, vcs, etc.
>
> the most difficult part is reliably determining that A) the master has
> crashed, and B) fencing the failed old master so it doesn't wake up and
> think its still in charge.

+1
--
Jehan-Guillaume de Rorthais
Dalibo


pgsql-general by date:

Previous
From: Craig Ringer
Date:
Subject: Re: adding a bdr node using bcv backup
Next
From: Geoff Winkless
Date:
Subject: Re: Let's Do the CoC Right