Thread: Patroni vs pgpool II
Hi Guys,

Hope you are doing well. Can someone please suggest which one (Patroni vs Pgpool-II) is better for achieving HA/auto failover and load balancing for DB servers? Along with this, can you please share the names of companies/clients using these tools for large PG databases?

Thanks.

Regards,

Inzamam Shafiq
We're satisfied with PgPool for HA. I can't give names, and ours is only a few hundred GB, though.
Born in Arizona, moved to Babylonia.
On Mon, 3 Apr 2023 06:33:46 +0000 Inzamam Shafiq <inzamam.shafiq@hotmail.com> wrote:

[...]
> Can someone please suggest what is one (Patroni vs PGPool II) is best for
> achieving HA/Auto failover, Load balancing for DB servers. Along with this,
> can you please share the company/client names using these tools for large PG
> databases?

Load balancing is best achieved from the application side. The most popular auto failover solution is Patroni. Other solutions involve Pacemaker to either:

* build a shared-storage cluster with a standalone instance moving from node to node (but this can include standbys)
* build a cluster with a promotable resource using e.g. the PAF resource agent, which will decide where to start the standbys and which one to promote.

No matter the solution you pick, be prepared to learn and train. A lot.
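As an aside on "load balancing is best achieved from the application side": libpq-based drivers can already do primary/standby routing with multi-host connection strings, so many applications need nothing more than two URIs. A sketch (host names are hypothetical; `target_session_attrs=prefer-standby` requires a PostgreSQL 14 or newer libpq):

```
# writes: connect to whichever listed host is the current primary
postgresql://pg0.example,pg1.example:5432/app?target_session_attrs=read-write

# reads: prefer a standby, fall back to the primary if none is reachable
postgresql://pg0.example,pg1.example:5432/app?target_session_attrs=prefer-standby
```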
> Can someone please suggest what is one (Patroni vs PGPool II) is best for achieving HA/Auto failover, Load balancing for DB servers. Along with this, can you please share the company/client names using these tools for large PG databases?
Having used pgpool in multiple production deployments I swore to never use it again, ever.
The first reason is that you need a doctorate degree to try to understand how it actually works, what the pcp commands do in each scenario and how to correctly write the failover scripts.
It is basically a daemon glued together with scripts for which you are entirely responsible. Any small mistake in the failover scripts and the cluster enters a broken state.
Even once you have it set up as it should be, yes, it will fail over correctly, but it won't auto-heal without manual intervention.
You also often end up in weird situations where the backends are up but pgpool reports them as down, and similar scenarios, and then you need to run a precise sequence of pcp commands to recover,
or destroy your whole cluster in the process if you mistype.
I haven't used patroni yet but it surely can't be worse.
Best regards, cen
Hi,

> Hi Guys,
>
> Hope you are doing well.
>
> Can someone please suggest what is one (Patroni vs PGPool II) is best for achieving HA/Auto failover, Load balancing for DB servers.

I am not sure if Patroni provides a load balancing feature.

> Along with this, can you please share the company/client names using these tools for large PG databases?

I can't give you names, but we (SRA OSS) have many customers using PostgreSQL and some of them are using Pgpool-II.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
> BUT, even if there is a solution that parses queries to make a decision it > I would not recommend anyone to use it unless all consequences are > understood. > Specifically, not every read-only query could be salefy sent to a replica, > because they could be lagging behind the primary. > Only application (developers) could decide whether for a specific query > they could afford slightly outdated results. Most of the popular > application frameworks support configuring two connection strings for this > purpose. I think Pgpool-II users well understand the effect of replication lagging because I've never heard complains like "hey, why my query result is sometimes outdated?" Moreover Pgpool-II provides many load balancing features depending on user's needs. For example users can: - just turn off load balancing - turn off load balancing only for specific application name - turn off load balancing only for specific database - turn off load balancing if current transaction includes write query Best reagards, -- Tatsuo Ishii SRA OSS LLC English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Sent: Wednesday, April 5, 2023 12:38 PM
To: cyberdemn@gmail.com <cyberdemn@gmail.com>
Cc: inzamam.shafiq@hotmail.com <inzamam.shafiq@hotmail.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: Patroni vs pgpool II
> I would not recommend anyone to use it unless all consequences are
> understood.
> Specifically, not every read-only query could be safely sent to a replica,
> because they could be lagging behind the primary.
> Only application (developers) could decide whether for a specific query
> they could afford slightly outdated results. Most of the popular
> application frameworks support configuring two connection strings for this
> purpose.
I think Pgpool-II users understand the effect of replication lag well,
because I've never heard complaints like "hey, why is my query result
sometimes outdated?"
Moreover, Pgpool-II provides many load balancing features depending on
users' needs. For example, users can:
- just turn off load balancing
- turn off load balancing only for a specific application name
- turn off load balancing only for a specific database
- turn off load balancing if the current transaction includes a write query
Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
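For concreteness, load-balancing controls of this kind map onto pgpool.conf parameters roughly like this (a hedged sketch against the Pgpool-II 4.x documentation; the database and application names are hypothetical):

```ini
# Distribute read-only queries across backends
load_balance_mode = on

# Pin specific databases / application_names to the primary,
# effectively disabling load balancing for them
database_redirect_preference_list = 'reportdb:primary'
app_name_redirect_preference_list = 'batchjob:primary'

# Once a transaction has issued a write, stop load balancing
# the rest of that transaction
disable_load_balance_on_write = 'transaction'
```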
> But, I heard PgPool is still affected by Split brain syndrome.

Can you elaborate more? If 3 or more pgpool watchdog nodes (the number of nodes must be odd) are configured, a split brain can be avoided.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp

> Regards,
>
> Inzamam Shafiq
> Sr. DBA
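The "3 or more watchdog nodes" setup being discussed is driven by pgpool.conf; a hedged sketch (parameter names follow the Pgpool-II 4.x documentation; the VIP address is hypothetical):

```ini
use_watchdog = on

# Only fail over when a majority of watchdog nodes is alive and agrees
failover_when_quorum_exists = on
failover_require_consensus = on

# Virtual IP held by the current watchdog leader
delegate_ip = '192.0.2.10'
```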
On Wed, 05 Apr 2023 16:50:15 +0900 (JST) Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> > But, I heard PgPool is still affected by Split brain syndrome.
>
> Can you elaborate more? If more than 3 pgpool watchdog nodes (the
> number of nodes must be odd) are configured, a split brain can be
> avoided.

Split brain is a hard situation to avoid. I suppose the OP is talking about a PostgreSQL split brain situation. I'm not sure how PgPool's watchdog would avoid that.

To avoid split brain, you need to implement a combination of quorum and (self-)fencing.

Patroni's quorum is in the DCS's hands. Patroni's self-fencing can be achieved with the (hardware) watchdog. You can also implement node fencing through the "pre_promote" script to fence the old primary node before promoting the new one.

If you need HA with a high level of anti-split-brain security, you'll not be able to avoid some sort of fencing, no matter what.

Good luck.
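The Patroni mechanisms mentioned here (hardware-watchdog self-fencing and the pre_promote hook) are configured in patroni.yml; a minimal hedged sketch, with a hypothetical script path:

```yaml
# Self-fencing: Patroni keeps the kernel watchdog device open; if the
# leader can no longer guarantee its leader lock in time, the host
# resets itself instead of keeping a rogue primary alive.
watchdog:
  mode: required          # refuse to run as leader without a watchdog
  device: /dev/watchdog
  safety_margin: 5

postgresql:
  # Node fencing hook: runs just before this node is promoted; a
  # non-zero exit status aborts the promotion. Script path is hypothetical.
  pre_promote: /etc/patroni/fence_old_primary.sh
```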
>>> But, I heard PgPool is still affected by Split brain syndrome.
>>
>> Can you elaborate more? If more than 3 pgpool watchdog nodes (the
>> number of nodes must be odd) are configured, a split brain can be
>> avoided.
>
> Split brain is a hard situation to avoid. I suppose the OP is talking about
> a PostgreSQL split brain situation. I'm not sure how PgPool's watchdog would
> avoid that.

Ok, "split brain" here means that two or more PostgreSQL primary servers exist.

Pgpool-II's watchdog has a feature called "quorum failover" to avoid the situation. To make this work, you need to configure 3 or more Pgpool-II nodes. Suppose they are w0, w1 and w2. Also suppose there are two PostgreSQL servers, pg0 (primary) and pg1 (standby). The goal is to avoid both pg0 and pg1 becoming primary servers.

Pgpool-II periodically monitors PostgreSQL healthiness by checking whether it can reach the PostgreSQL servers. Suppose w0 and w1 detect that pg0 is healthy but pg1 is not, while w2 thinks the opposite, i.e. pg0 is unhealthy but pg1 is healthy (this could happen if w0, w1 and pg0 are in network A, but w2 and pg1 are in a different network B, and A and B cannot reach each other).

In this situation, if w2 promotes pg1 because pg0 seems to be down, then the system ends up with two primary servers: split brain.

With quorum failover enabled, w0, w1 and w2 communicate with each other to vote on who is correct (if a node cannot communicate with another, it regards that watchdog as down). In the case above, w0 and w1 are the majority and will win. Thus w0 and w1 just detach pg1 and keep on using pg0 as the primary. On the other hand, since w2 loses, it gives up promoting pg1, and thus the split brain is avoided.

Note that in the configuration above, clients access the cluster via a VIP. The VIP is always controlled by the majority side's watchdog, so clients will not access pg1 because it is set to down status by w0 and w1.

> To avoid split brain, you need to implement a combination of quorum and
> (self-)fencing.
>
> Patroni's quorum is in the DCS's hands. Patroni's self-fencing can be achieved
> with the (hardware) watchdog. You can also implement node fencing through the
> "pre_promote" script to fence the old primary node before promoting the new one.
>
> If you need HA with a high level of anti-split-brain security, you'll not be
> able to avoid some sort of fencing, no matter what.
>
> Good luck.

Well, if you define fencing as STONITH (Shoot The Other Node In The Head), Pgpool-II does not have that feature. However, I am not sure STONITH is always mandatory. I think that depends on what you want to avoid by using fencing. If the purpose is to avoid having two primary servers at the same time, Pgpool-II achieves that as described above.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
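The quorum-failover vote described above can be modelled in a few lines, just to make the majority rule concrete (a toy sketch, not Pgpool-II code; node names follow the example in the mail):

```python
from collections import Counter

def quorum_failover(votes):
    """votes: dict mapping watchdog name -> the backend that watchdog
    believes should be primary. Returns the backend chosen by a strict
    majority of ALL watchdogs, or None when no majority exists."""
    tally = Counter(votes.values())
    winner, count = tally.most_common(1)[0]
    # A strict majority is required, so a minority partition
    # (like w2 alone) can never promote its own candidate.
    if count * 2 > len(votes):
        return winner
    return None

# The scenario from the mail: w0 and w1 still see pg0 as primary,
# the partitioned w2 wants pg1.
print(quorum_failover({"w0": "pg0", "w1": "pg0", "w2": "pg1"}))
```

With an even two-node split (one vote each) no majority exists and the function returns None, which is why an odd number of watchdog nodes is recommended.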
On 4/6/23 23:16, Tatsuo Ishii wrote:
> [...]
> With quorum failover is enabled, w0, w1, and w2 communicate each other
> to vote who is correct (if it cannot communicate, it regards other
> watchdog is down). In the case above w0 and w1 are majority and will
> win. [...]

And this concept is quite old. (It's also what Windows clustering uses.)
On Thu, Apr 6, 2023 at 9:17 PM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> With quorum failover is enabled, w0, w1, and w2 communicate each other
> to vote who is correct (if it cannot communicate, it regards other
> watchdog is down). In the case above w0 and w1 are majority and will
> win.

Communication takes time – network latencies. What if, during this communication, the situation changes? What if some of them cannot communicate with each other due to network issues?

What if pg1 is currently primary, pg0 is standby, both are healthy, but due to network issues both pg1 and w2 are unreachable from the other nodes? Will pg1 remain primary while w0 and w1 decide to promote pg0?
> Communication takes time – network latencies. What if during this
> communication, the situation becomes different?

We have to accept it (and do our best to mitigate any consequences of the problem). I think there's no system that presupposes zero communication latency.

> What if some of them cannot communicate with each other due to network issues?

Can you elaborate more? There are many scenarios for communication breakdown. I hesitate to discuss all of them on this forum since it is for discussions on PostgreSQL, not Pgpool-II. You are welcome to join and continue the discussion on the pgpool mailing list.

> What if pg1 is currently primary, pg0 is standby, both are healthy, but
> due to network issues, both pg1 and w2 are not reachable to other
> nodes? Will pg1 remain primary, and w0 and w1 decide to promote pg0?

pg1 will remain primary, but it is set to "quarantine" state from pgpool's point of view, which means clients cannot access pg1 via pgpool. w0 and w1 will decide to promote pg0.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
On Thu, Apr 6, 2023 at 11:13 PM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> You are welcome to join and continue the discussion on the pgpool
> mailing list.

I truly believe that this problem – HA – is PostgreSQL's, not a 3rd party's. And it's a shame that Postgres itself doesn't solve this. So we're discussing it here.

> > What if pg1 is currently primary, pg0 is standby, both are healthy, but
> > due to network issues, both pg1 and w2 are not reachable to other
> > nodes? Will pg1 remain primary, and w0 and w1 decide to promote pg0?
>
> pg1 will remain primary but it is set to "quarantine" state from
> pgpool's point of view, which means clients cannot access pg1 via
> pgpool.

So we have a split brain here – two primaries. Especially if some clients communicate with PG directly. And even if there are no such clients, archive_command is going to run on both nodes; monitoring will show two primaries, confusing humans (e.g., SREs) and various systems; and if we have many standby nodes, some of them might continue replicating from the old primary if they happen to be in the same network partition, and so on. I don't see how all these things can be solved with this approach.
> I truly believe that this problem – HA – is PostgreSQL's, not 3rd
> party's. And it's a shame that Postgres itself doesn't solve this. So
> we're discussing it here.

Let's see what other subscribers on this forum say.

>> > What if pg1 is currently primary, pg0 is standby, both are healthy, but
>> > due to network issues, both pg1 and w2 are not reachable to other
>> > nodes? Will pg1 remain primary, and w0 and w1 decide to promote pg0?
>>
>> pg1 will remain primary but it is set to "quarantine" state from
>> pgpool's point of view, which means clients cannot access pg1 via
>> pgpool.
>
> So we have a split brain here – two primaries. Especially if some
> clients communicate with PG directly.

Clients are not allowed to communicate with PostgreSQL directly. That's the prerequisite of using Pgpool-II.

> And even if there are no such clients, archive_command is going to
> work on both nodes,

What's the problem with this? Moreover, you can write logic to disable this in the failover command.

> monitoring will show two primaries confusing
> humans (e.g, SREs) and various systems,

That's why pgpool provides its own monitoring tools. A clustering system is different from standalone PostgreSQL. Existing PostgreSQL tools usually only take account of standalone PostgreSQL. Users have to realize the difference.

> if we have many standby nodes,
> some of them might continue replicating from the old primary if they
> happen to be in the same network partition, and so on.

As for existing standbys in the same network as pg0, you can either manually or automatically make them follow pg0.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
On Fri, 07 Apr 2023 13:16:59 +0900 (JST) Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> [...]
>
> Note that in the configuration above, clients access the cluster via
> VIP. VIP is always controlled by majority watchdog, clients will not
> access pg1 because it is set to down status by w0 and w1.
>
> [...]
>
> Well, if you define fencing as STONITH (Shoot The Other Node in the
> Head), Pgpool-II does not have the feature.

And I believe that's part of what Cen was complaining about:

«
It is basically a daemon glued together with scripts for which you are
entirely responsible. Any small mistake in failover scripts and the
cluster enters a broken state.
»

If you want to build something clean, including fencing, you'll have to handle/develop it yourself in scripts.

> However I am not sure STONITH is always mandatory.

Sure, it really depends on how risky you can go and how much complexity you can afford. Some clusters can live with a 10-minute split brain while some others cannot survive a 5-second split brain.

> I think that depends what you want to avoid using fencing. If the purpose is
> to avoid having two primary servers at the same time, Pgpool-II achieves that
> as described above.

How can you be so sure?

See https://www.alteeve.com/w/The_2-Node_Myth

«
* Quorum is a tool for when things are working predictably
* Fencing is a tool for when things go wrong
»

Regards,
> And I believe that's part of what Cen was complaining about:
>
> «
> It is basically a daemon glued together with scripts for which you are
> entirely responsible for. Any small mistake in failover scripts and
> cluster enters a broken state.
> »
>
> If you want to build something clean, including fencing, you'll have to
> handle/dev it by yourself in scripts

That's a design decision. This gives maximum flexibility to users. Please note that we provide step-by-step installation/configuration documents which have been used by production systems.

https://www.pgpool.net/docs/44/en/html/example-cluster.html

>> However I am not sure STONITH is always mandatory.
>
> Sure, it really depends on how risky you can go and how much complexity you
> can afford. Some clusters can live with a 10 minute split brain while some
> others cannot survive a 5s split brain.
>
>> I think that depends what you want to avoid using fencing. If the purpose is
>> to avoid having two primary servers at the same time, Pgpool-II achieves that
>> as described above.
>
> How could you be so sure?
>
> See https://www.alteeve.com/w/The_2-Node_Myth
>
> «
> * Quorum is a tool for when things are working predictably
> * Fencing is a tool for when things go wrong
> »

I think the article does not apply to Pgpool-II.

-------------------------------------------------------------------
3-Node

When node 1 stops responding, node 2 declares it lost, reforms a cluster with the quorum node, node 3, and is quorate. It begins recovery by mounting the filesystem under NFS, which replays journals and cleans up, then starts NFS and takes the virtual IP address. Later, node 1 recovers from its hang. At the moment of recovery, it has no concept that time has passed and so has no reason to check whether it is still quorate or whether its locks are still valid. It just finishes doing whatever it was doing at the moment it hung. In the best case scenario, you now have two machines claiming the same IP address. At worst, you have uncoordinated writes to storage and you corrupt your data.
-------------------------------------------------------------------

> Later, node 1 recovers from its hang.

Pgpool-II does not allow an automatic recovery. If node 1 hangs, once it is recognized as "down" by the other nodes it will not be used without manual intervention. Thus the disaster described above will not happen in pgpool.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
On Fri, 07 Apr 2023 18:04:05 +0900 (JST) Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> > If you want to build something clean, including fencing, you'll have to
> > handle/dev it by yourself in scripts
>
> That's a design decision. This gives maximum flexibility to users.

Sure, no problem with that. But people have to realize that the downside is that it leaves the whole complexity and reliability of the cluster in the hands of the administrator. And these things are much more complicated and racy than simply promoting a node. Even dealing with a simple vIP can become a nightmare if not done correctly.

> Please note that we provide step-by-step installation/configuration
> documents which have been used by production systems.
>
> https://www.pgpool.net/docs/44/en/html/example-cluster.html

These scripts rely on SSH, which is really bad. What if you have an SSH failure in the mix? Moreover, even if SSH weren't a weakness by itself, the script doesn't even try to shut down the old node or stop the old primary.

You can add to the mix that both Pgpool and SSH rely on TCP for availability checks and actions. You'd better have very low TCP timeouts/retries... When a service loses quorum on a resource, it is supposed to shut down as fast as possible... or even self-fence using a watchdog device if the shutdown action doesn't return fast enough.

> >> However I am not sure STONITH is always mandatory.
> >
> > Sure, it really depends on how risky you can go and how much complexity
> > you can afford. Some clusters can live with a 10 minute split brain while
> > some others cannot survive a 5s split brain.
> >
> >> I think that depends what you want to avoid using fencing. If the purpose
> >> is to avoid having two primary servers at the same time, Pgpool-II achieves
> >> that as described above.
> >
> > How could you be so sure?
> >
> > See https://www.alteeve.com/w/The_2-Node_Myth
>
> I think the article does not apply to Pgpool-II.

It is a simple example using NFS. The point here is that when things are getting unpredictable, quorum is just not enough. So yes, it does apply to Pgpool. Quorum is nice when nodes can communicate with each other, when they have enough time and/or minimal load to complete actions correctly. My point is that a proper anti-split-brain solution requires both quorum and fencing.

> [...]
> > Later, node 1 recovers from its hang.
>
> Pgpool-II does not allow an automatic recover.

Neither does this example. There's no automatic recovery. It just states that node 1 was unable to answer in a timely fashion, just long enough for a new quorum to be formed and a new primary elected. But node 1 was not dead, and when node 1 is able to answer again, boom. A service being mute for some period of time is really common. There are various articles and conference talks about clusters failing over wrongly because of e.g. a high load on the primary... The last one was during FOSDEM, IIRC.

> If node 1 hangs and once it is recognized as "down" by other nodes, it will
> not be used without manual intervention. Thus the disaster described above
> will not happen in pgpool.

Ok, so I suppose **all** connections, scripts, software, backups, maintenance and admins must go through Pgpool to be sure to hit the correct primary.

This might be acceptable in some situations, but I wouldn't call that an anti-split-brain solution. It's some kind of «software hiding the rogue node behind a curtain and pretending it doesn't exist anymore».

Regards,
> [...] These scripts rely on SSH, which is really bad. What if you have a SSH
> failure in the mix? Moreover, even if SSH wouldn't be a weakness by itself,
> the script doesn't even try to shut down the old node or stop the old primary.
That does not matter, when only PgPool does the writing to the database.
> You can add to the mix that both Pgpool and SSH rely on TCP for availability
> checks and actions. You better have very low TCP timeout/retry... When a
> service loses quorum on a resource, it is supposed to shut down as fast as
> possible... or even self-fence itself using a watchdog device if the shutdown
> action doesn't return fast enough.
Scenario:
S0 - Running Postgresql as primary, and also PgPool.
S1 - Running Postgresql as secondary, and also PgPool.
S2 - Running only PgPool. Has the VIP.
There's no need for Postgresql or PgPool on S0 to shut down if it loses contact with S1 and S2, since those two will also notice that S0 has disappeared. In that case, they'll vote S0 into degraded state and promote S1 to be the Postgresql primary.
A good question is what happens when S0 and S1 lose connection to S2 (meaning that S2 loses connection to them, too). S0 and S1 then "should" vote that S0 take over the VIP. But, if S2 is still up and can connect to "the world", does it voluntarily decide to give up the VIP since it's all alone?
> Scenario:
> S0 - Running Postgresql as primary, and also PgPool.
> S1 - Running Postgresql as secondary, and also PgPool.
> S2 - Running only PgPool. Has the VIP.
>
> [...]
>
> A good question is what happens when S0 and S1 lose connection to S2
> (meaning that S2 loses connection to them, too). S0 and S1 then
> "should" vote that S0 take over the VIP. But, if S2 is still up and
> can connect to "the world", does it voluntarily decide to give up the
> VIP since it's all alone?

Yes, because S2's pgpool is not the leader anymore. In this case S2 voluntarily gives up the VIP.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
>> If node 1 hangs and once it is recognized as "down" by other nodes, it will
>> not be used without manual intervention. Thus the disaster described above
>> will not happen in pgpool.
>
> Ok, so I suppose **all** connections, scripts, softwares, backups, maintenances
> and admins must go through Pgpool to be sure to hit the correct primary.
>
> This might be acceptable in some situation, but I wouldn't call that an
> anti-split-brain solution. It's some kind of «software hiding the rogue node
> behind a curtain and pretend it doesn't exist anymore»

You can call Pgpool-II whatever you like. The important thing for me (and probably for users) is whether it can solve users' problems or not.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
On Fri, 07 Apr 2023 21:16:04 +0900 (JST) Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> >> If node 1 hangs and once it is recognized as "down" by other nodes, it will
> >> not be used without manual intervention. Thus the disaster described above
> >> will not happen in pgpool.
> >
> > Ok, so I suppose **all** connections, scripts, softwares, backups,
> > maintenances and admins must go through Pgpool to be sure to hit the
> > correct primary.
> >
> > This might be acceptable in some situation, but I wouldn't call that an
> > anti-split-brain solution. It's some kind of «software hiding the rogue node
> > behind a curtain and pretend it doesn't exist anymore»
>
> You can call Pgpool-II whatever you like.

I didn't mean to be rude here. Please accept my apologies if my words offended you.

I consider "proxy-based" fencing architecture fragile because you just don't know what is happening on your rogue node until a meatware comes along to deal with it. Moreover, you must trust your scripts, configurations, procedures, admins, applications, users, replication, network, Pgpool, etc. not to fail on you in the meantime...

In the Pacemaker world, where everything MUST be **predictable**, the only way to predict the state of a rogue node is to fence it from the cluster. Either cut it from the network, shut it down, or set up the watchdog so it resets itself if needed. At the end, you know your old primary is off, or idle, or screaming into the void with no one to hear it. It can't harm your other nodes, data or apps anymore, no matter what.

> Important thing for me (and probably for users) is, if it can solve user's
> problem or not.

In my humble (and biased) opinion, Patroni, PAF or shared-storage clusters are solving users' problems with regard to HA. All with PROs and CONs. All rely on strong, safe, well-known and well-developed clustering concepts. Some consider them complex pieces of software to deploy and maintain, but this is because HA is complex. No miracle here.

Solutions like Pgpool or Repmgr try hard to re-implement HA concepts but leave most of this complexity and safety to the user's discretion. Unfortunately, it is not the role of the user to deal with such things. This kind of architecture probably answers a need, a gray zone, where it is good enough. I've seen a similar approach in the past with pgbouncer + bash scripting calling itself a "fencing" solution [1]. I'm fine with it as far as people are clear about the limitations.

Kind regards,

[1] e.g. https://www.postgresql.eu/events/pgconfeu2016/sessions/session/1348-ha-with-repmgr-barman-and-pgbouncer/