Re: [GENERAL] Postgres HA - Mailing list pgsql-general

Subject Re: [GENERAL] Postgres HA
Msg-id 20170224100336.6E8CFD62@m0087795.ppops.net
In response to [GENERAL] Postgres HA  (Dylan Luong <Dylan.Luong@unisa.edu.au>)
List pgsql-general
Julyanto Sutandang <julyanto@equnix.id> wrote:
> Talking about High Availability, we should understand the basic concept of HA, it is avoiding SPOF (Single Point of Failure). When we use a Loadbalancer (LTM) and that load balancer is single, then you may get HA only for the PostgreSQL but there are another single point of failure, it is the LTM itself. In overall that topology is not HA.
>
> The best configuration for HA i know is using Linux-HA to watch between 2 servers and doing failover VIP (Virtual IP) when Master is down or out of service. The best configuration for HA, Servers should be on the same site and uses direct cable connection to ensure dedicated private bandwidth and there are no Single Point of Failure.
> LinuxHA or pacemaker or corosync will do the Virtual IP swing over from master host to slave host and promote the replica database in slave host become master.
>
> There is no single point of failure.


I'll agree with most of this, especially that avoiding SPOF is the goal.

However, I will point out that if you put both servers at the same site, you've created a SPOF. If you lose power
at that site, or construction workers take out the network cable to the building, or anything else happens that
takes the whole site down, the result is total failure.

If you really want to avoid that, then the 2 servers need to sit multiple miles/kilometers apart and be connected by
either a dedicated connection or 2 different network providers (I've heard differing opinions on the best way). The
link should be 100Mb or higher to adequately deal with the various replication issues that must be addressed. If the
load is high enough, you may need multiple bonded lines or even Gb.
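
As a rough illustration of the bandwidth point above, here is a back-of-the-envelope sizing sketch in Python. It is
only a sketch: the function names, the 2x headroom factor and the example write rates are assumptions made up for
illustration, and "100Mb" is read here as 100 megabits per second.

```python
# Rough link-sizing sketch. All numbers and names are illustrative
# assumptions, not measurements from any real system.

def required_mbit_per_s(write_mb_per_s: float, headroom: float = 2.0) -> float:
    """Link speed in megabits/s needed to ship `write_mb_per_s` megabytes/s
    of replicated writes, multiplied by an assumed safety factor for bursts
    and protocol overhead."""
    return write_mb_per_s * 8 * headroom


def catch_up_seconds(backlog_mb: float, link_mbit_per_s: float,
                     write_mb_per_s: float) -> float:
    """Seconds needed to drain a replication backlog while new writes keep
    arriving. Returns infinity when the link cannot outpace the write rate."""
    spare_mb_per_s = link_mbit_per_s / 8 - write_mb_per_s
    if spare_mb_per_s <= 0:
        return float("inf")
    return backlog_mb / spare_mb_per_s
```

Under these assumptions, a steady 5 MB/s of replicated writes would call for roughly an 80 Mbit/s link, and a 100
Mbit/s link would need a little over two minutes to clear a 1000 MB backlog at that write rate. Once the write rate
reaches the link's capacity, the backlog never drains, which is where bonding or a faster link comes in.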

Of course, the OP may not need that level of HA, but it is something that should be asked and answered by him and his
organization.

It's something we deal with for our larger customers. We currently solve the replication with DRBD (for the DB) and
csync2 (for the application code and logs). We are slowly considering whether we'd be better off abandoning DRBD
and using Pg's built-in replication, but the jury is still out on that as we haven't had enough time to fully figure it
out. We'd considered ZFS replication for a while, but found ZFS to be too slow for the DB despite its other useful
features like replication (at least for the hardware that we had).

Personally, I find HA easy to understand but hard to implement well -- especially without a large budget. :)

HTH,
Kevin

