Re: High Availability with Postgres - Mailing list pgsql-general

From Greg Smith
Subject Re: High Availability with Postgres
Date
Msg-id 4C202930.9020807@2ndquadrant.com
In response to Re: High Availability with Postgres  (John R Pierce <pierce@hogranch.com>)
Responses Re: High Availability with Postgres
Re: High Availability with Postgres
List pgsql-general
John R Pierce wrote:
> the commercial cluster software vendors insist on using dedicated
> connections for the heartbeat messages between the cluster members and
> insist on having fencing capabilities (for instance, disabling the
> fiber switch port of the formerly active server and enabling the port
> for the to-be-activated server).  with linux-ha and heartbeat, you're
> on your own.

This is worth highlighting.  As John points out, it's straightforward to
build a shared storage implementation using PostgreSQL and either one of
the commercial clustering systems or using Linux-HA.  And until
PostgreSQL gets fully synchronous replication, it's a viable alternate
solution for "must not lose a transaction" deployments when the storage
used is much more reliable than the nodes.

The hard part of shared storage failover is always solving the "shoot
the other node in the head" problem, to keep a down node from coming
back once it's no longer the active one.  In order to do that well, you
really need to lock the now unavailable node from accessing the storage
at the hardware level--"fencing"--with disabling its storage port being
one way to handle that.  Figure out how you're going to do that reliably
in a way that's integrated into a proper cluster manager, and there's no
reason you can't do this with PostgreSQL.

There's a description of the fencing options for Linux-HA at
http://www.clusterlabs.org/doc/crm_fencing.html ; the cheap way to solve
this problem is to have a UPS that disables the power going to the shot
node.  Once that's done, you can then safely fail over the shared storage
to another system.  At that point, you can probably even turn the power
back on, presuming that the now rebooted system will be able to regain
access to the storage during a fresh system start.
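
To make the ordering above concrete, here is a minimal Python sketch of
the sequence: cut power to the failed node, confirm it is really off,
and only then move the storage.  Everything in it is hypothetical.  The
function name fail_over, the FencingError class, and the four callables
are placeholders for whatever your UPS or power switch and your cluster
manager actually provide; none of this is a Linux-HA or PostgreSQL API.

```python
from typing import Callable, List


class FencingError(RuntimeError):
    """Raised when the failed node cannot be confirmed powered off."""


def fail_over(
    power_off: Callable[[], None],
    is_powered_off: Callable[[], bool],
    take_over_storage: Callable[[], None],
    power_on: Callable[[], None],
    restore_power: bool = False,
) -> List[str]:
    """Fence the failed node first, then move the shared storage.

    All four callables are hypothetical hooks supplied by the caller,
    e.g. wrappers around a UPS outlet and a cluster manager.
    Returns the list of steps that completed, in order.
    """
    steps: List[str] = []

    # Step 1: shoot the other node by cutting its power.
    power_off()
    steps.append("power_off")

    # Step 2: never touch the storage unless the node is confirmed off.
    if not is_powered_off():
        raise FencingError("node not confirmed off; refusing to fail over")
    steps.append("confirmed_off")

    # Step 3: only now is it safe to give the storage to another system.
    take_over_storage()
    steps.append("storage_moved")

    # Step 4 (optional): restore power once the storage has moved.
    if restore_power:
        power_on()
        steps.append("power_on")

    return steps
```

The point of the sketch is the refusal in step 2: if the power-off
cannot be confirmed, the storage stays where it is, since two nodes
writing to the same volume is worse than an outage.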

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us

