Thread: Postgresql HA cluster

Postgresql HA cluster

From
Jason Grammenos
Date:

Hello,

 

I am new to PostgreSQL and having to migrate a Cassandra cluster to PostgreSQL.

Have a reasonable amount of experience with MySQL and use Master -> Master MySQL replication quite successfully.

 

I am trying to understand what configuration/option to pick for PostgreSQL that will provide the same or similar level of operational easy as the Cassandra cluster. What I mean by that is : Backups, Patching (rebooting nodes to patch)

With Cassandra any single node can be patched and rebooting without incurring downtime. This is also true with MySQL master->master replication, which we have fronted by a HAproxy reverse proxy (we can fail over between the two MySQL node, draining the connections as we do so and then patching and reboot).

 

I have found the following documentation:

https://www.postgresql.org/docs/current/different-replication-solutions.html

 

and have heard of “Traktor”, “pgEdge” and some other third-party tools.

My difficulty is that with very little knowledge of PostgreSQL I am having a hard time sorting through the various options and determining which are potential candidates for the kind of setup I am looking for (something approximating MySQL master->master replication).

 

Any advice or resources would be much appreciated.

Regards

Jason Grammenos

 

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.grammenos@agilitypr.com
agilitypr.com

Learn new PR tips from our free resources.

 

Attachment

RE: Postgresql HA cluster

From
Jason Grammenos
Date:

Thank you for the feedback,

 

I have used pacemaker for other purposes previously so am a little familiar with it. It appears that in this scenario pacemaker is being used to manage a floating ip as well as deal with split brain scenarios. What isn’t clear is how effective master-> master replication is being accomplished. Postgresql streaming replication to the best of my limited knowledge only replicates in one direction, from the active to the standby servers. The issue this presents to me is that once you failover from the active to the standby (or one of the standby’s depending on how many you have) none of the data written on the standby is replicated back to the formerly active server.

 

Let us say that I have only 2 postgresql servers (absolute minimum number) and I want to patch server A. Ideally, I would use a load balancer (or other failover mechanism like pacemaker) and repoint the floating ip to server B. Now traffic would “drain” down off server A, and slowly (or rapidly) move to B. During the move some clients would still be writing to A and some clients would be writing to B. Once they have all moved to B, server A would then be patched. Then the load balancer would be used to repoint the floating ip again back to A, and the process would repeat, with traffic moving back to A. Just like in the first half of the failover some traffic would exist on both hosts as the failover progresses. Once completed all the traffic would be back on A.

 

In the above scenario, I do not understand how streaming replication would handle the part of the scenario when there are clients writing to A and B. It would seem that something like `pgpool-ii` or `pgEdge` would be required, but with my limited knowledge it is unclear if or which would be appropriate.

 

Regards

Jason

 

 

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.grammenos@agilitypr.com
agilitypr.com

Learn new PR tips from our free resources.

 

From: Олег Самойлов <splarv@ya.ru>
Sent: Thursday, October 12, 2023 11:08 AM
To: Jason Grammenos <jason.grammenos@agilitypr.com>; pgsql-general@lists.postgresql.org
Subject: Re: Postgresql HA cluster

 

For Postgresql HA cluster the most popular solution is streaming replication. There is an option how implement this. Web programmer approach is using haproxy+consul+patrony. The "old schoool" is using Pacemaker, all in one bottle and well tested. If you interesting how implement in Pacemaker, you may look at my project of testbed that continuously test different HA clusters by random failures. I don't see such for  haproxy+consul+patrony.
https://github.com/domclick/tuchanka

 

 

11.10.2023, 21:02, "Jason Grammenos" <jason.grammenos@agilitypr.com>:

Hello,

 

I am new to PostgreSQL and having to migrate a Cassandra cluster to PostgreSQL.

Have a reasonable amount of experience with MySQL and use Master -> Master MySQL replication quite successfully.

 

I am trying to understand what configuration/option to pick for PostgreSQL that will provide the same or similar level of operational easy as the Cassandra cluster. What I mean by that is : Backups, Patching (rebooting nodes to patch)

With Cassandra any single node can be patched and rebooting without incurring downtime. This is also true with MySQL master->master replication, which we have fronted by a HAproxy reverse proxy (we can fail over between the two MySQL node, draining the connections as we do so and then patching and reboot).

 

I have found the following documentation:

https://www.postgresql.org/docs/current/different-replication-solutions.html

 

and have heard of “Traktor”, “pgEdge” and some other third-party tools.

My difficulty is that with very little knowledge of PostgreSQL I am having a hard time sorting through the various options and determining which are potential candidates for the kind of setup I am looking for (something approximating MySQL master->master replication).

 

Any advice or resources would be much appreciated.

Regards

Jason Grammenos

 

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.grammenos@agilitypr.com
agilitypr.com

Learn new PR tips from our free resources.

 

Attachment

Re: Postgresql HA cluster

From
Jehan-Guillaume de Rorthais
Date:
On Fri, 13 Oct 2023 12:02:53 +0000
Jason Grammenos <jason.grammenos@agilitypr.com> wrote:

> Thank you for the feedback,
>
> I have used pacemaker for other purposes previously so am a little familiar
> with it.

So you might be familiar with shared-storage cluster, that are the simpler one
you could deploy (baring you have a reliable HA storage available). But it's
not a multi-primary cluster.

> It appears that in this scenario pacemaker is being used to manage a
> floating ip as well as deal with split brain scenarios.

There's also two different resource agents dealing with PostgreSQL itself:
pgsql and PAF. Both handle multi-status differently from the administration
point of view.

> What isn’t clear is how effective master-> master replication is being accomplished.

There's no master-master in PostgreSQL core. There's few external solutions out
there though, but double check you real needs, the real primary-standby
capacity to answer you needs, and the various constraints M-M imply before
rushing there.

> Postgresql streaming replication to the best of my limited knowledge only
> replicates in one direction, from the active to the standby servers. The
> issue this presents to me is that once you failover from the active to the
> standby (or one of the standby’s depending on how many you have) none of the
> data written on the standby is replicated back to the formerly active server.

It depend if this is a "controlled failover" (aka. "switchover") or a real
failover triggered by some failure. If this is a controlled failover, you can
hook back your old primary as a standby with no trouble. PAF even handle this
for you.

Moreover, even with a failure scenario, there's some solutions around to quickly
fix your old primary data and get it back in production quickly as a standby
(pg_rewind, PITR/pgbackrest, etc).

You just have to plan for failure and write you procedures accordingly to get
the cluster back on feet quickly after a failover.

> Let us say that I have only 2 postgresql servers (absolute minimum number)
> and I want to patch server A. Ideally, I would use a load balancer (or other
> failover mechanism like pacemaker) and repoint the floating ip to server B.
> Now traffic would “drain” down off server A, and slowly (or rapidly) move to
> B. During the move some clients would still be writing to A and some clients
> would be writing to B.

This doesn't exist as PostgreSQL has no multi-primary solution in core. You can
do rolling upgrade, but you'll have to pause the production during the
switchover between the primary and the standby.

> In the above scenario, I do not understand how streaming replication would
> handle the part of the scenario when there are clients writing to A and B.

It will not.

> It would seem that something like `pgpool-ii` or `pgEdge` would be required, but
> with my limited knowledge it is unclear if or which would be appropriate.

External multi-primary solution exists, pgpool-II, Bucardo, BDR, etc. But
you'll have to ask and evaluate these thrid party solutions yourself.

But really, double check first why a simple primary-standby architecture doesn't
meet your needs. The simpler the architecture is, the better. Even from the
application point of view.

Regards,



Re: Postgresql HA cluster

From
Laura Smith
Date:
------- Original Message -------
On Friday, October 13th, 2023 at 14:10, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:

> But really, double check first why a simple primary-standby architecture doesn't
> meet your needs. The simpler the architecture is, the better. Even from the
> application point of view.
>


From my perspective I do wonder why Postgres doesn't have an equivalent to MySQL Group Replication.

Although you can run MySQL GR as multi-primary, most people run it as primary-standby.

However the difference with Postgres is that MySQL Group does leader election. Whilst Postgres failover/failback is a
highlymanual affair. 



Re: Postgresql HA cluster

From
Jehan-Guillaume de Rorthais
Date:
On Fri, 13 Oct 2023 19:21:46 +0000
Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:

> ------- Original Message -------
> On Friday, October 13th, 2023 at 14:10, Jehan-Guillaume de Rorthais
> <jgdr@dalibo.com> wrote:
> 
> > But really, double check first why a simple primary-standby architecture
> > doesn't meet your needs. The simpler the architecture is, the better. Even
> > from the application point of view.
> >   
> 
> 
> From my perspective I do wonder why Postgres doesn't have an equivalent to
> MySQL Group Replication.
> 
> Although you can run MySQL GR as multi-primary, most people run it as
> primary-standby.
> 
> However the difference with Postgres is that MySQL Group does leader
> election. Whilst Postgres failover/failback is a highly manual affair.

PostgreSQL core only cares about primary-standby replication.

Auto-failover must involved various components way outside of the scope of
PostgreSQL itself: the system, the network, sometime the storage, a quorum
mechanism, sometime some fencing, etc.

There's various auto-failover, non manual, solutions in PostgreSQL ecosystems,
they just all live outside of the core.

Regards,



RE: Postgresql HA cluster

From
Jason Grammenos
Date:
Hello Jehan-Guillaume de Rorthais

Perhaps you could help me understand a few things better about how primary-standby works in postgresql with streaming
replication.

If you have 2 PostgreSQL nodes hooked up to a Load balancer (haproxy), and you move take node1 out of load balancing,
younow have connections on node1 and connections on node2, as the Load balancer drains the connections off node1 and
overto node2. How does PostgreSQL handle this scenario when there are writes happening on both nodes?
 

If instead you have 2 PostgreSQL nodes behind pacemaker (controlling a floating ip), what happens when you initiate a
failoverand move the floating ip? You want the connections to drain off node1 and move to node2. Again in this
scenario,both nodes would be sustaining writes at some point in time. How does the write only replica get changed out
ofwrite mode during this failover? How does the primary node get switched to read only after the connection drain is
complete?

Overall, I am trying to understand what it looks like operationally to run a 2 node postgresql "cluster" and how
patchingboth nodes would work (and mentally comparing it to what I currently do with mysql).
 

You recommended that primary-standby could be sufficient and is much simpler, so I am simply trying to wrap my head
aroundwhat exactly running it would look like. If primary standby is simple enough to failover, patch , reboot,
maintain,etc. Then you could be correct that master->master may not be needed.
 

Thank You
Jason

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.grammenos@agilitypr.com
agilitypr.com
Learn new PR tips from our free resources.

-----Original Message-----
From: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 
Sent: Friday, October 13, 2023 9:10 AM
To: Jason Grammenos <jason.grammenos@agilitypr.com>
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Postgresql HA cluster

On Fri, 13 Oct 2023 12:02:53 +0000
Jason Grammenos <jason.grammenos@agilitypr.com> wrote:

> Thank you for the feedback,
> 
> I have used pacemaker for other purposes previously so am a little 
> familiar with it.

So you might be familiar with shared-storage cluster, that are the simpler one you could deploy (baring you have a
reliableHA storage available). But it's not a multi-primary cluster.
 

> It appears that in this scenario pacemaker is being used to manage a 
> floating ip as well as deal with split brain scenarios.

There's also two different resource agents dealing with PostgreSQL itself:
pgsql and PAF. Both handle multi-status differently from the administration point of view.

> What isn’t clear is how effective master-> master replication is being accomplished.

There's no master-master in PostgreSQL core. There's few external solutions out there though, but double check you real
needs,the real primary-standby capacity to answer you needs, and the various constraints M-M imply before rushing
there.

> Postgresql streaming replication to the best of my limited knowledge 
> only replicates in one direction, from the active to the standby 
> servers. The issue this presents to me is that once you failover from 
> the active to the standby (or one of the standby’s depending on how 
> many you have) none of the data written on the standby is replicated back to the formerly active server.

It depend if this is a "controlled failover" (aka. "switchover") or a real failover triggered by some failure. If this
isa controlled failover, you can hook back your old primary as a standby with no trouble. PAF even handle this for
you.

Moreover, even with a failure scenario, there's some solutions around to quickly fix your old primary data and get it
backin production quickly as a standby (pg_rewind, PITR/pgbackrest, etc).
 

You just have to plan for failure and write you procedures accordingly to get the cluster back on feet quickly after a
failover.

> Let us say that I have only 2 postgresql servers (absolute minimum 
> number) and I want to patch server A. Ideally, I would use a load 
> balancer (or other failover mechanism like pacemaker) and repoint the floating ip to server B.
> Now traffic would “drain” down off server A, and slowly (or rapidly) 
> move to B. During the move some clients would still be writing to A 
> and some clients would be writing to B.

This doesn't exist as PostgreSQL has no multi-primary solution in core. You can do rolling upgrade, but you'll have to
pausethe production during the switchover between the primary and the standby.
 

> In the above scenario, I do not understand how streaming replication 
> would handle the part of the scenario when there are clients writing to A and B.

It will not.

> It would seem that something like `pgpool-ii` or `pgEdge` would be 
> required, but with my limited knowledge it is unclear if or which would be appropriate.

External multi-primary solution exists, pgpool-II, Bucardo, BDR, etc. But you'll have to ask and evaluate these thrid
partysolutions yourself. 
 

But really, double check first why a simple primary-standby architecture doesn't meet your needs. The simpler the
architectureis, the better. Even from the application point of view.
 

Regards,

Re: Postgresql HA cluster

From
Jehan-Guillaume de Rorthais
Date:
Hi Jason,

On Tue, 17 Oct 2023 19:59:00 +0000
Jason Grammenos <jason.grammenos@agilitypr.com> wrote:

[...]
> If you have 2 PostgreSQL nodes hooked up to a Load balancer (haproxy), and
> you move take node1 out of load balancing, you now have connections on node1
> and connections on node2, as the Load balancer drains the connections off
> node1 and over to node2. How does PostgreSQL handle this scenario when there
> are writes happening on both nodes?

As you write about "writes happening on both nodes", you describe a
Multi-primary architecture that PostgreSQL does not support.

A standby is in "recovery mode", receiving and applying WALs directly from
the primary (or archives).

A standby will only accept read-only (RO) queries (providing you set
"hot_standby=on") as it can not produce WAL itself. Any write attempt raises
an ERROR to the client. This is a really strict state, not an advisory one. You
just can't write on a standby, even as a superuser.

So, in the situation you describe, as long as the standby is not promoted to
primary, any write (RW) query on it will just fails hard.

Obviously, you should definitely NOT promote a standby while another primary is
still alive in the cluster. You would end up with a bad split brain scenario.

About haproxy, you should use tcp-check (eg. on a Systemd socket) or http-check
(eg. on Patroni API) to decide on which server the primary is hosted. It
avoids this transition period where both nodes have RW connections.

> If instead you have 2 PostgreSQL nodes behind pacemaker (controlling a
> floating ip), what happens when you initiate a failover and move the floating
> ip? You want the connections to drain off node1 and move to node2. Again in
> this scenario, both nodes would be sustaining writes at some point in time.

No, both nodes would not be sustaining writes in the same time. Two primaries
can not leave in the same cluster in the same time, this would be a split brain
scenario. If this happen, you have to pick one and rewind or restore it from
PITR, losing the transactions it wrote in the meantime. Of course, you can also
compare both nodes and tries to recover missing transaction before restoring
one of them, but it takes a lot of time...That's why clusters (not just for
PostgreSQL) tries hard to avoid split brain scenario using fencing, quorum,
watchdog, etc.

Before moving the IP address, you have to demote your primary as standby.
Unfortunately, this can not be achieved without restarting the primary
PostgreSQL as a standby one on its node. But because demote == restart as
standby, you effectively end all clients sessions. At this point, all your
nodes will be in standby mode, without primary to feed them.

So, during a switchover Pacemaker will:

* demote your current primary (ending all sessions)
* promote the other standby as primary
* move the IP address to the new primary (colocation rule in Pacemaker between
  the IP and the promoted role).

As soon as a primary appears in the cluster and the IP address (or haproxy)
points on it, all other standbies nodes will start replicating from it.

> How does the write only replica get changed out of write mode during this
> failover?

"write only"? Do you speak about the primary in RW mode or about a standby in
RO mode? 

For a primary: a PostgreSQL restart.
For a standby: promotion happen live, all existing sessions are kept and can
then issue write queries.

> How does the primary node get switched to read only after the
> connection drain is complete?

A PostgreSQL restart.

> Overall, I am trying to understand what it looks like operationally to run a
> 2 node postgresql "cluster" and how patching both nodes would work (and
> mentally comparing it to what I currently do with mysql).
> 
> You recommended that primary-standby could be sufficient and is much simpler,
> so I am simply trying to wrap my head around what exactly running it would
> look like. If primary standby is simple enough to failover, patch , reboot,
> maintain, etc. Then you could be correct that master->master may not be
> needed.

And if you have a strong, safe and high available storage, a shared storage
cluster for the primary is even simpler IMHO (because you move all the
complexity off your hands to the storage guys ones).

For details, I recommend reading this chapter in documentation:
https://www.postgresql.org/docs/current/high-availability.html

Regards,