Home > mailing lists

Re: [GENERAL] Postgres HA - Mailing list pgsql-general

From	dinesh kumar
Subject	Re: [GENERAL] Postgres HA
Date	February 23, 2017 14:36:20
Msg-id	CALnrH7rx0Hriz98UyaZTeyGh64jPbEYK20Mxm2txnhCX+rHCUw@mail.gmail.com Whole thread Raw
In response to	[GENERAL] Postgres HA (Dylan Luong <Dylan.Luong@unisa.edu.au>)
List	pgsql-general

Tree view

Hi Dylan,

On Thu, Feb 23, 2017 at 4:28 AM, Dylan Luong <Dylan.Luong@unisa.edu.au> wrote:

Hi

I am a DBA at the University of South Australia. For PostgreSQL High Availability, we currently have setup a Master/Slave across two datacenters using PostgreSQL (WAL) streaming replication. We use an LTM (load balancer) server that sits between the application servers and the PostgreSQL server that directs connections to the Master (and the Slave if failover occurs). We also have watchdog processes on the PostgreSQL servers that polls the LTM to determine who is Master and perform automatic failover if required. I am looking at options to improve our high availability.
I would like to know how other organizations in different industries (other than education) setup High Availability on their PostgreSQL environments.

Below is the approach we have followed, to achieve the maximum HA with async streaming replication.

1. Create an instance "I" with "N" number of nodes.

2. Set up the replication among the "N" nodes as "N-1" nodes points to 1 master.

3. Configured 2 physical replication slots to each "N-1" nodes.

4. One replication slot is to receive the archives, and one is for doing the replication. (Suspected recovery process is slower than the receiver process)

5. Configured a parallel WAL uploader (Customized program) on master to wal-backup server. (We needed this for the PITR)

6. Implemented a quorum on "N-1" slaves as one should become as master in worst cases. (Guaranteed data availability as per RTO settings)

7. Watchdog process which update the pgbouncer configuration from master to the latest master.

8. Used consul service discovery for identifying the master, slave heart beats.

9. Once Failover is completed, "N-1"(including old master) follows the new master by doing a fresh refresh. (Planning to use pg_rewind)

10. Covered the split brain problems by removing the service discovery keys from consul. (It's delivering the promising results, but need to spend more time on this).

The above mentioned approach what we have done is similar to your's, but we needed to handle with multiple slaves rather single one. In case, if you are looking for any open source tools to implement in your production servers, then prefer to add repmgr, pgHA, PAF tools into your list. These open source tools are great and deliver the results as demonstrated.

What tools do you use. Are they commercial licensed products? How is the architecture setup and how do you do recovery of new slave.
Your information is greatly appreciated.

Regards
Dylan

Dylan Luong
Information Strategy & Technology Services
University of South Australia
A Building, Room E2-07, Mawson Lakes Campus
MAWSON LAKES
South Australia 5095

Email:      dylan.luong@unisa.edu.au
Phone:    +61 8 83023629
Fax:         +61 8 83023577
WWW:    http://www.unisa.edu.au

Regards,

Dinesh

manojadinesh.blogspot.com

pgsql-general by date:

From: Jehan-Guillaume de Rorthais
Date: 23 February 2017, 14:28:29
Subject: Re: [GENERAL] Postgres HA

From: Francisco Olarte
Date: 23 February 2017, 15:33:27
Subject: Re: [GENERAL] Move rows from one database to other

Re: [GENERAL] Postgres HA - Mailing list pgsql-general

Previous

Next