Re: Built-in Raft replication - Mailing list pgsql-hackers
| From | Ashutosh Bapat |
|---|---|
| Subject | Re: Built-in Raft replication |
| Date | |
| Msg-id | CAExHW5tq8ShEfboFa52wDMWmyaW+6kX_6x8MZ-0TP65tOhJ1wQ@mail.gmail.com |
| In response to | Re: Built-in Raft replication (Andrey Borodin <x4mmm@yandex-team.ru>) |
| Responses | Re: Built-in Raft replication |
| List | pgsql-hackers |
On Wed, Apr 16, 2025 at 10:29 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> > We may build an extension which
> > has a similar role in PostgreSQL world as zookeeper in Hadoop.
>
> Patroni, pg_consul and others already use zookeeper, etcd and similar
> systems for consensus. Is it any better as an extension than as etcd?

I feel so. An extension runs within a PostgreSQL process and speaks the
same protocol as PostgreSQL, whereas etcd is a separate process with a
separate protocol.

> > It can
> > be then used for other distributed systems as well - like shared
> > nothing clusters based on FDW.
>
> I didn’t get the FDW analogy. Why should other distributed systems
> choose a Postgres extension over Zookeeper?

By other distributed systems I mean PostgreSQL distributed systems:
FDW-based native sharding, native replication, or a system which uses
both.

> > There's already a proposal to bring
> > CREATE SERVER to the world of logical replication - so I see these two
> > worlds uniting in future.
>
> Again, I’m lost here. Which two worlds?

Logical replication and FDW-based native sharding.

> > The
> > distributed system based on logical replication or FDW or both will
> > use this ensemble to manage its shared state. The same ensemble can be
> > shared across multiple distributed clusters if it has scaling
> > capabilities.
>
> Yes, shared DCS are common these days. AFAIK, we use one Zookeeper
> instance per hundred Postgres clusters to coordinate pg_consuls.
>
> Actually, scalability is the opposite of the topic of this thread. Let
> me explain.
> Currently, Postgres automatic failover tools rely on databases with
> built-in automatic failover. Konstantin is proposing to shorten this
> loop and make Postgres use its own built-in automatic failover.
>
> So, existing tooling allows you to have 3 hosts for DCS, with a
> majority of 2 hosts able to elect a new leader in case of failover.
> And you can have only 2 hosts for Postgres - Primary and Standby. You
> can have 2 big Postgres machines with 64 CPUs, and 3 one-CPU hosts for
> Zookeeper/etcd.
>
> If you use built-in failover you have to resort to 3 big Postgres
> machines because you need a 2/3 majority. Of course, you can install a
> MySQL-style arbiter - a host that has no real PGDATA and only
> participates in voting. But this is a solution to a problem induced by
> built-in autofailover.

Users find it a waste of resources to deploy 3 big PostgreSQL instances
just for HA when 2 suffice, even if that means also deploying 3
lightweight DCS instances. Letting only some of the nodes act as DCS
members, while the others remain purely PostgreSQL nodes, would reduce
that waste.

--
Best Wishes,
Ashutosh Bapat
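For reference, the majority arithmetic behind this tradeoff: a Raft group of n voting members needs floor(n/2) + 1 votes to elect a leader, so 2 voters tolerate no failures while 3 tolerate one, which is why built-in failover pushes deployments toward a third voting node (full instance or arbiter). A minimal sketch in C of this standard quorum rule, not code from any patch in this thread:

```c
#include <stdio.h>

/* Raft elects a leader only with a strict majority of voting members:
 * quorum(n) = n/2 + 1 (integer division), so a group of n voters
 * survives the loss of n - quorum(n) of them. */
static int
quorum(int voters)
{
    return voters / 2 + 1;
}

int
main(void)
{
    for (int n = 2; n <= 5; n++)
        printf("voters=%d  quorum=%d  tolerated failures=%d\n",
               n, quorum(n), n - quorum(n));
    return 0;
}
```

Running this prints that 2 voters tolerate 0 failures and 3 voters tolerate 1, matching the argument above: a 2-node Primary/Standby pair cannot form a quorum by itself, while an external 3-node DCS (or a lightweight third voter) can.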