Thread: Cluster for an appliance-type deployment
Dear all,
This appliance can scale horizontally from a single to dozens of VMs (theoretically more, but most installations are small-ish). It is feasible to configure VMs for particular purposes (eg „you are [also] a DB node“), but basically all instances will/should be able to perform their main tasks besides (also) being a DB node. As the VMs may be installed in very different environments, network-based solutions are less feasible and we would prefer a DB-level solution. We assume that for most cases, primary/stand-by configurations would be sufficient in terms of availability / latency / throughput.
We must also assume that there is no person who would be able to touch things if an error occurs. Data consistency and (as much as possible) automated recovery from error situations („VM down“, „network lost“, …) are therefore more important than „n nines“. We can assume that the VMs can talk to each other over TCP (eg using SSH tunnels, direct Postgres connections, or some other suitable protocol). Scripting „around“ the database is available to initialize instances and for similar tasks.
Would Postgres’ own log-shipping (file-based + streaming replication, possibly with synchronous_commit = remote_write) be sufficient for such a set of requirements?
What aspects would you consider important for such a scenario?
PS: In case it matters, OpenBSD is the base platform of the VMs.
Thanks for your inputs!
— Matthias
On 06.11.23 20:26, Matthias Leisi wrote:
> I’m reasonably experienced with Postgres with simple (single, „rebuild
> and restore“) requirements, but would need some hints on what to look
> for in a more complex situation - deploying Postgres as the backend for
> a (virtual) appliance.
> [...]
> Would Postgres’ own log-shipping (file-based + streaming replication,
> possibly with remote_write) be sufficient for such a set of requirements?
>
> What aspects would you consider important for such a scenario?

The replication that ships with Postgres gives you one writeable primary server and a number of standbys, but it has no tools to automatically discover or recover from failure.
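To make the built-in mechanism concrete, here is a rough sketch of setting up a primary/standby pair with streaming replication. Host names, the replication role, and the data directory are assumptions (on OpenBSD the data directory is typically /var/postgresql/data); pg_hba.conf also needs a matching "replication" entry, which is not shown:

```shell
# On the primary: create a role the standby can replicate as.
psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';"
psql -c "ALTER SYSTEM SET wal_level = 'replica';"   # the default on modern versions

# remote_write only takes effect together with synchronous_standby_names;
# without it, commits behave as asynchronous.
psql -c "ALTER SYSTEM SET synchronous_commit = 'remote_write';"

# On the standby: clone the primary; -R writes primary_conninfo and
# creates standby.signal so the instance starts as a standby.
pg_basebackup -h primary.example.org -U replicator \
    -D /var/postgresql/data -R -X stream -P
```

After the base backup finishes, starting Postgres on the standby makes it follow the primary; scripting exactly these steps is how a new appliance VM could join as a replica.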
From https://www.postgresql.org/docs/current/warm-standby-failover.html:

---
PostgreSQL does not provide the system software required to identify a failure on the primary and notify the standby database server. Many such tools exist and are well integrated with the operating system facilities required for successful failover, such as IP address migration.

Once failover to the standby occurs, there is only a single server in operation. This is known as a degenerate state. The former standby is now the primary, but the former primary is down and might stay down. To return to normal operation, a standby server must be recreated, either on the former primary system when it comes up, or on a third, possibly new, system. The pg_rewind utility can be used to speed up this process on large clusters. Once complete, the primary and standby can be considered to have switched roles. Some people choose to use a third server to provide backup for the new primary until the new standby server is recreated, though clearly this complicates the system configuration and operational processes.
---

So you need to add additional parts from the ecosystem to detect failure, handle failover, potentially move an IP address with the primary, etc. Popular tools are repmgr, Patroni, CloudNativePG, BDR, pacemaker+corosync and endless others. They will address many of your requirements, but some work and understanding is still required to make them fully unattended, and to have them expand dynamically with new replicas when an appliance is added.

Postgres is an amazing product and I like to use it for almost everything, but in this scenario we are of course making our lives hard with ACID compliance. In NoSQL/"eventual consistency" land, there are products that are a lot friendlier to a setup like this - things like Cassandra, etcd and CouchDB come to mind.
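For illustration, the manual failover-and-rejoin sequence the docs describe looks roughly like this; it is precisely the part the tools above automate. Host names and the data directory are assumptions, and pg_rewind requires wal_log_hints=on or data checksums to have been enabled beforehand:

```shell
# 1. Promote the surviving standby; it becomes the new primary.
pg_ctl promote -D /var/postgresql/data

# 2. When the failed ex-primary comes back, rewind it to the new
#    primary's timeline (much faster than a full re-clone).
pg_rewind --target-pgdata=/var/postgresql/data \
    --source-server="host=new-primary.example.org user=postgres dbname=postgres"

# 3. Turn the rewound ex-primary into a standby of the new primary.
echo "primary_conninfo = 'host=new-primary.example.org user=replicator'" \
    >> /var/postgresql/data/postgresql.auto.conf
touch /var/postgresql/data/standby.signal
pg_ctl start -D /var/postgresql/data
```

The hard, unattended part is not these commands but deciding reliably *when* to run step 1 without a human in the loop - split-brain avoidance is why the failover tools exist.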
I'd compare the pros and cons of such alternatives, the big con of course being much weaker consistency and durability guarantees - the question is whether the application needs them. But enough about non-Postgres topics on this list :)

To go with Postgres, on a hunch I'd try Patroni first: it does a lot of the advanced failover work, has a great track record, and supposedly runs on BSD: https://openports.pl/path/databases/patroni

Cheers
Christian

--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64