RE: Postgresql HA cluster - Mailing list pgsql-general

From Jason Grammenos
Subject RE: Postgresql HA cluster
Date
Msg-id YQBPR0101MB6282FD930D4A3964C65754ECE4D6A@YQBPR0101MB6282.CANPRD01.PROD.OUTLOOK.COM
Whole thread Raw
In response to Re: Postgresql HA cluster  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses Re: Postgresql HA cluster
List pgsql-general
Hello Jehan-Guillaume de Rorthais

Perhaps you could help me understand a few things better about how primary-standby works in postgresql with streaming
replication.

If you have 2 PostgreSQL nodes hooked up to a Load balancer (haproxy), and you move take node1 out of load balancing,
younow have connections on node1 and connections on node2, as the Load balancer drains the connections off node1 and
overto node2. How does PostgreSQL handle this scenario when there are writes happening on both nodes?
 

If instead you have 2 PostgreSQL nodes behind pacemaker (controlling a floating ip), what happens when you initiate a
failoverand move the floating ip? You want the connections to drain off node1 and move to node2. Again in this
scenario,both nodes would be sustaining writes at some point in time. How does the write only replica get changed out
ofwrite mode during this failover? How does the primary node get switched to read only after the connection drain is
complete?

Overall, I am trying to understand what it looks like operationally to run a 2 node postgresql "cluster" and how
patchingboth nodes would work (and mentally comparing it to what I currently do with mysql).
 

You recommended that primary-standby could be sufficient and is much simpler, so I am simply trying to wrap my head
aroundwhat exactly running it would look like. If primary standby is simple enough to failover, patch , reboot,
maintain,etc. Then you could be correct that master->master may not be needed.
 

Thank You
Jason

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.grammenos@agilitypr.com
agilitypr.com
Learn new PR tips from our free resources.

-----Original Message-----
From: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 
Sent: Friday, October 13, 2023 9:10 AM
To: Jason Grammenos <jason.grammenos@agilitypr.com>
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Postgresql HA cluster

On Fri, 13 Oct 2023 12:02:53 +0000
Jason Grammenos <jason.grammenos@agilitypr.com> wrote:

> Thank you for the feedback,
> 
> I have used pacemaker for other purposes previously so am a little 
> familiar with it.

So you might be familiar with shared-storage cluster, that are the simpler one you could deploy (baring you have a
reliableHA storage available). But it's not a multi-primary cluster.
 

> It appears that in this scenario pacemaker is being used to manage a 
> floating ip as well as deal with split brain scenarios.

There's also two different resource agents dealing with PostgreSQL itself:
pgsql and PAF. Both handle multi-status differently from the administration point of view.

> What isn’t clear is how effective master-> master replication is being accomplished.

There's no master-master in PostgreSQL core. There's few external solutions out there though, but double check you real
needs,the real primary-standby capacity to answer you needs, and the various constraints M-M imply before rushing
there.

> Postgresql streaming replication to the best of my limited knowledge 
> only replicates in one direction, from the active to the standby 
> servers. The issue this presents to me is that once you failover from 
> the active to the standby (or one of the standby’s depending on how 
> many you have) none of the data written on the standby is replicated back to the formerly active server.

It depend if this is a "controlled failover" (aka. "switchover") or a real failover triggered by some failure. If this
isa controlled failover, you can hook back your old primary as a standby with no trouble. PAF even handle this for
you.

Moreover, even with a failure scenario, there's some solutions around to quickly fix your old primary data and get it
backin production quickly as a standby (pg_rewind, PITR/pgbackrest, etc).
 

You just have to plan for failure and write you procedures accordingly to get the cluster back on feet quickly after a
failover.

> Let us say that I have only 2 postgresql servers (absolute minimum 
> number) and I want to patch server A. Ideally, I would use a load 
> balancer (or other failover mechanism like pacemaker) and repoint the floating ip to server B.
> Now traffic would “drain” down off server A, and slowly (or rapidly) 
> move to B. During the move some clients would still be writing to A 
> and some clients would be writing to B.

This doesn't exist as PostgreSQL has no multi-primary solution in core. You can do rolling upgrade, but you'll have to
pausethe production during the switchover between the primary and the standby.
 

> In the above scenario, I do not understand how streaming replication 
> would handle the part of the scenario when there are clients writing to A and B.

It will not.

> It would seem that something like `pgpool-ii` or `pgEdge` would be 
> required, but with my limited knowledge it is unclear if or which would be appropriate.

External multi-primary solution exists, pgpool-II, Bucardo, BDR, etc. But you'll have to ask and evaluate these thrid
partysolutions yourself. 
 

But really, double check first why a simple primary-standby architecture doesn't meet your needs. The simpler the
architectureis, the better. Even from the application point of view.
 

Regards,

pgsql-general by date:

Previous
From: Hafeez Rahim
Date:
Subject: Index based search have issues with 11.20
Next
From: Tom Lane
Date:
Subject: Re: Index based search have issues with 11.20