Thread: Cluster for an appliance-type deployment

Cluster for an appliance-type deployment

From
Matthias Leisi
Date:
Dear all,

I’m reasonably experienced with Postgres with simple (single-server, "rebuild and restore") requirements, but would need some hints on what to look for in a more complex situation: deploying Postgres as the backend for a (virtual) appliance.

This appliance can scale horizontally from a single VM to dozens (theoretically more, but most installations are small-ish). It is feasible to configure VMs for particular purposes (e.g. "you are [also] a DB node"), but basically all instances will/should be able to perform their main tasks in addition to being a DB node. As the VMs may be installed in very different environments, network-based solutions are less feasible and we would prefer a DB-level solution. We assume that for most cases, primary/standby configurations would be sufficient in terms of availability / latency / throughput.

We must also assume that there is no person who would be able to touch things if an error occurs. Data consistency and (as much as possible) automated recovery from error situations ("VM down", "network lost", ...) are therefore more important than "n nines". We can assume that the VMs can talk to each other over TCP (e.g. using SSH tunnels, direct Postgres connections, or some other suitable protocol). Scripting "around" the database is available to initialize instances and for similar tasks.

Would Postgres’ own log shipping (file-based WAL archiving plus streaming replication, possibly with synchronous_commit = remote_write) be sufficient for such a set of requirements?
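For concreteness, the kind of minimal primary/standby setup I have in mind could be sketched like this (hostnames, paths and the archive location are placeholders, and the synchronous settings would be optional):

```ini
# primary: postgresql.conf
wal_level = replica
max_wal_senders = 10
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
# optionally wait for the standby to receive WAL before commit returns:
# synchronous_commit = remote_write
# synchronous_standby_names = 'standby1'

# standby: postgresql.conf (plus an empty standby.signal file in the data dir)
primary_conninfo = 'host=primary1 port=5432 user=replicator'
restore_command = 'cp /archive/%f %p'
```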

What aspects would you consider important for such a scenario? 

PS: In case it matters, OpenBSD is the base platform of the VMs.

Thanks for your inputs!

— Matthias




Re: Cluster for an appliance-type deployment

From
Christian Ramseyer
Date:

On 06.11.23 20:26, Matthias Leisi wrote:
> Dear all,
> 
> I’m reasonably experienced with Postgres with simple (single-server,
> "rebuild and restore") requirements, but would need some hints on what
> to look for in a more complex situation: deploying Postgres as the
> backend for a (virtual) appliance.
> 
> This appliance can scale horizontally from a single VM to dozens
> (theoretically more, but most installations are small-ish). It is
> feasible to configure VMs for particular purposes (e.g. "you are
> [also] a DB node"), but basically all instances will/should be able to
> perform their main tasks in addition to being a DB node. As the VMs
> may be installed in very different environments, network-based
> solutions are less feasible and we would prefer a DB-level solution.
> We assume that for most cases, primary/standby configurations would be
> sufficient in terms of availability / latency / throughput.
> 
> We must also assume that there is no person who would be able to touch
> things if an error occurs. Data consistency and (as much as possible)
> automated recovery from error situations ("VM down", "network lost",
> ...) are therefore more important than "n nines". We can assume that
> the VMs can talk to each other over TCP (e.g. using SSH tunnels,
> direct Postgres connections, or some other suitable protocol).
> Scripting "around" the database is available to initialize instances
> and for similar tasks.
> 
> Would Postgres’ own log shipping (file-based WAL archiving plus
> streaming replication, possibly with synchronous_commit =
> remote_write) be sufficient for such a set of requirements?
> 
> What aspects would you consider important for such a scenario?


The replication that ships with Postgres gives you one writable primary 
server and a number of standbys, but it has no built-in tooling to 
automatically detect or recover from a failure. From 
https://www.postgresql.org/docs/current/warm-standby-failover.html:

---

PostgreSQL does not provide the system software required to identify a 
failure on the primary and notify the standby database server. Many such 
tools exist and are well integrated with the operating system facilities 
required for successful failover, such as IP address migration.

Once failover to the standby occurs, there is only a single server in 
operation. This is known as a degenerate state. The former standby is 
now the primary, but the former primary is down and might stay down. To 
return to normal operation, a standby server must be recreated, either 
on the former primary system when it comes up, or on a third, possibly 
new, system. The pg_rewind utility can be used to speed up this process 
on large clusters. Once complete, the primary and standby can be 
considered to have switched roles. Some people choose to use a third 
server to provide backup for the new primary until the new standby 
server is recreated, though clearly this complicates the system 
configuration and operational processes.

---
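As a rough sketch of that recovery path: once a standby has been promoted, the former primary can be rewound and rejoined as a standby. Hostnames and the data directory below are placeholders (the path follows the OpenBSD package default), and pg_rewind requires wal_log_hints = on or data checksums enabled at initdb time.

```shell
# On the failed former primary, after it comes back up (Postgres stopped):
pg_rewind \
  --target-pgdata=/var/postgresql/data \
  --source-server='host=new-primary port=5432 user=postgres dbname=postgres'

# Mark it as a standby of the new primary, then start it:
touch /var/postgresql/data/standby.signal
echo "primary_conninfo = 'host=new-primary port=5432 user=replicator'" \
  >> /var/postgresql/data/postgresql.auto.conf
pg_ctl -D /var/postgresql/data start
```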

So you need to add additional parts from the ecosystem to detect 
failure, handle failover, potentially move an IP address along with the 
primary, etc. Popular tools are repmgr, Patroni, CloudNativePG, BDR, 
Pacemaker+Corosync and endless others. They will address many of your 
requirements, but some work and understanding is still required to make 
them fully unattended, and to have them expand dynamically with new 
replicas when an appliance is added.

Postgres is an amazing product and I like to use it for almost 
everything, but in this scenario we are of course making our lives hard 
with ACID compliance. In NoSQL/"eventual consistency" land, there are 
products that are a lot friendlier to a setup like this - stuff like 
Cassandra, etcd and CouchDB come to mind. I'd compare the pros and cons 
of such alternatives, the big con of course being much weaker 
consistency and durability guarantees - the question is whether the 
application needs them.

But enough about non-Postgres topics on this list :) To go with 
Postgres, on a hunch I'd try Patroni first: it does a lot of the 
advanced failover stuff, has a great track record, and supposedly runs 
on BSD: https://openports.pl/path/databases/patroni
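To give an idea of the moving parts, a minimal patroni.yml might look roughly like this. The cluster name, node name, addresses and credentials are placeholders, and Patroni additionally needs a distributed configuration store (here etcd) reachable from every node:

```yaml
scope: appliance-cluster
name: node1

etcd3:
  hosts: 10.0.0.1:2379

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.1:8008

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.1:5432
  data_dir: /var/postgresql/data
  authentication:
    replication:
      username: replicator
      password: secret

bootstrap:
  dcs:
    postgresql:
      parameters:
        wal_level: replica
```

Patroni then handles leader election, promotion and re-attaching recovered nodes on its own, which maps well to the "no operator available" requirement.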


Cheers
Christian


-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64