Thread: Replication
Looking for replication solutions, I find:

Slony-I
Seems good, single master only, master is a single point of failure,
no good failover system for electing a new master or having a failed
master rejoin the cluster. Slave databases are mostly for safety or
for parallelizing queries for performance. Suffers from O(N^2)
communications (N = cluster size).

Slony-II
Seems brilliant, a solid theoretical foundation, at the forefront of
computer science. But can't find project status -- when will it be
available? Is it a pipe dream, or a nearly-ready reality?

PGReplication
Appears to be a page that someone forgot to erase from the old GBorg site.

PGCluster
Seems pretty good, but the web site is not current, there are releases in
use that are not on the web site, and it also seems to always be a couple
of steps behind the current release of Postgres. Two single points of
failure: the load balancer and the data replicator.

Is this a good summary of the status of replication? Have I missed any
important solutions or mischaracterized anything?

Thanks!
Craig
Craig James wrote:
> Looking for replication solutions, I find:
>
> Slony-I
> Seems good, single master only, master is a single point of failure,
> no good failover system for electing a new master or having a failed
> master rejoin the cluster. Slave databases are mostly for safety or
> for parallelizing queries for performance. Suffers from O(N^2)
> communications (N = cluster size).

Yep

> Slony-II
> Seems brilliant, a solid theoretical foundation, at the forefront of
> computer science. But can't find project status -- when will it be
> available? Is it a pipe dream, or a nearly-ready reality?

Dead

> PGReplication
> Appears to be a page that someone forgot to erase from the old GBorg site.

Dead

> PGCluster
> Seems pretty good, but the web site is not current, there are releases in
> use that are not on the web site, and it also seems to always be a couple
> of steps behind the current release of Postgres. Two single points of
> failure: the load balancer and the data replicator.

Slow as all get out for writes, but a cool idea.

> Is this a good summary of the status of replication? Have I missed any
> important solutions or mischaracterized anything?

Log shipping, and closed-source solutions.
Which replication problem are you trying to solve?

On Thu, 14 Jun 2007, Craig James wrote:
> Looking for replication solutions, I find:
> [snip]
On 6/15/07, Craig James <craig_james@emolecules.com> wrote:
[snip]
> Is this a good summary of the status of replication? Have I missed any
> important solutions or mischaracterized anything?

* Mammoth Replicator, commercial.

* Continuent uni/cluster, commercial
(http://www.continuent.com/index.php?option=com_content&task=view&id=212&Itemid=169).

* pgpool-II. Supports load balancing and replication by implementing a
proxy that duplicates all updates to all slaves. It can also partition
data, and it can semi-intelligently route queries to the appropriate
servers.

* Cybertec. This is a commercial packaging of PGCluster-II from an
Austrian company.

* Greenplum Database (formerly Bizgres MPP), commercial. Not so much a
replication solution as a way to parallelize queries, and targeted at the
data warehousing crowd. Similar to ExtenDB, but tightly integrated with
PostgreSQL.

* DRBD (http://www.drbd.org/), a device driver that replicates disk
blocks to other nodes. This works for failover only, not for scaling
reads. Easy migration of devices if combined with an NFS export.

* Skytools (https://developer.skype.com/SkypeGarage/DbProjects/SkyTools),
a collection of replication tools from the Skype people. Purports to be
simpler to use than Slony.

Lastly, and perhaps most promisingly, there's the Google Summer of Code
effort by Florian Pflug
(http://code.google.com/soc/postgres/appinfo.html?csaid=6545828A8197EBC6)
to implement true log-based replication, where PostgreSQL's transaction
logs are used to keep live slave servers up to date with a master. In
theory, such a system would be extremely simple to set up and use,
especially since it should, as far as I can see, also transparently
replicate the schema for you.

Alexander.
>>> On Thu, Jun 14, 2007 at 6:14 PM, in message
<4671CBBA.6010104@emolecules.com>, Craig James
<craig_james@emolecules.com> wrote:
> Looking for replication solutions, I find:
>
> Slony-I
> Slony-II
> PGReplication
> PGCluster

You wouldn't guess it from the name, but pgpool actually supports
replication:

http://pgpool.projects.postgresql.org/
Thanks to all who replied and filled in the blanks. The problem with the
web is you never know if you've missed something.

Joshua D. Drake wrote:
>> Looking for replication solutions, I find...
>> Slony-II
> Dead

Wow, I'm surprised. Is it dead for lack of need, lack of resources, too
complex, or all of the above? It sounded like such a promising theoretical
foundation.

Ben wrote:
> Which replication problem are you trying to solve?

Most of our data is replicated offline using custom tools tailored to our
loading pattern, but we have a small amount of "global" information, such
as user signups, system configuration, advertisements, and such, that goes
into a single small (~5-10 MB) "global database" used by all servers.

We need "nearly-real-time replication" and instant failover. That is, it's
far more important for the system to keep working than it is to lose a
little data. Transactional integrity is not important. Actual hardware
failures are rare, and if a user just happens to sign up, or do "save
preferences", at the instant the global-database server goes down, it's
not a tragedy. But it's not OK for the entire web site to go down when the
one global-database server fails.

Slony-I can keep several slave databases up to date, which is nice. And I
think I can combine it with a PGPool instance on each server, with the
master as primary and a few Slony copies as secondary. That way, if the
master goes down, the PGPool servers all switch to their secondary Slony
slaves, and read-only access can continue. If the master crashes, users
will be able to do most activities, but new users can't sign up, and
existing users can't change their preferences, until either the master
server comes back, or one of the slaves is promoted to master.

The problem is, there don't seem to be any "vote a new master" type of
tools for Slony-I, and also, if the original master comes back online, it
has no way to know that a new master has been elected. So I'd have to
write a bunch of SOAP services or something to do all of this.

I would consider PGCluster, but it seems to be a patch to Postgres itself.
I'm reluctant to introduce such a major piece of technology into our
entire system, when only one tiny part of it needs the replication service.

Thanks,
Craig
> Most of our data is replicated offline using custom tools tailored to
> our loading pattern, but we have a small amount of "global" information,
> such as user signups, system configuration, advertisements, and such,
> that goes into a single small (~5-10 MB) "global database" used by all
> servers.

Slony provides near-instantaneous failovers (in the single-digit seconds
range). You can script an automatic failover if the master server becomes
unreachable. That leaves you the problem of restarting your app (or making
it reconnect) to the new master.

5-10 MB of data implies such a fast initial replication that making the
server rejoin the cluster by setting it up from scratch is not an issue.

> The problem is, there don't seem to be any "vote a new master" type of
> tools for Slony-I, and also, if the original master comes back online,
> it has no way to know that a new master has been elected. So I'd have
> to write a bunch of SOAP services or something to do all of this.

You don't need SOAP services, and you do not need to elect a new master.
If dbX goes down and dbY takes over, you should be able to decide on a
static takeover pattern easily enough.

The point here is that the servers need to react to a problem, but you
probably want to get the admin on duty to look at the situation as quickly
as possible anyway. With 5-10 MB of data in the database, a complete
rejoin from scratch to the cluster is measured in minutes.

Furthermore, you need to check out pgpool. I seem to remember that it has
some bad habits in routing queries. (E.g. it wants to apply write queries
to all nodes, but Slony makes the other nodes read-only. Furthermore,
anything inside a BEGIN is sent to the master node, which is bad with some
ORMs that by default wrap any access in a transaction.)

Andreas
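PS: By "script an automatic failover" I mean nothing fancier than a
watchdog along these lines. This is an untested sketch; it assumes the
psycopg2 driver is installed, the host name and DSN are made up, and
promote_standby.sh is a placeholder for whatever actually promotes your
standby and repoints the application:

#!/usr/bin/env python
# Minimal failover-watchdog sketch: ping the master, and if it stays
# unreachable for several consecutive checks, run a local promotion script.
import subprocess
import time

import psycopg2  # assumes psycopg2 is installed

MASTER_DSN = "host=db-master dbname=global user=monitor connect_timeout=5"
CHECK_INTERVAL = 2            # seconds between health checks
FAILURES_BEFORE_FAILOVER = 5  # tolerate short network blips

def master_is_alive():
    """Return True if we can connect and run a trivial query on the master."""
    try:
        conn = psycopg2.connect(MASTER_DSN)
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            cur.fetchone()
        finally:
            conn.close()
        return True
    except psycopg2.Error:
        return False

def main():
    failures = 0
    while True:
        if master_is_alive():
            failures = 0
        else:
            failures += 1
            if failures >= FAILURES_BEFORE_FAILOVER:
                # Hypothetical script that promotes the designated standby
                # and pages the admin on duty.
                subprocess.call(["/usr/local/bin/promote_standby.sh"])
                break
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    main()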
Andreas Kostyrka wrote:
> Slony provides near-instantaneous failovers (in the single-digit seconds
> range). You can script an automatic failover if the master server
> becomes unreachable.

But Slony slaves are read-only, correct? So the system isn't fully
functional once the master goes down.

> That leaves you the problem of restarting your app (or making it
> reconnect) to the new master.

Don't you have to run a Slony app to convert one of the slaves into the
master?

> 5-10 MB of data implies such a fast initial replication that making the
> server rejoin the cluster by setting it up from scratch is not an issue.

The problem is to PREVENT it from rejoining the cluster. If you have some
semi-automatic process that detects the dead server and converts a slave
to the master, and in the mean time the dead server manages to reboot
itself (or its network gets fixed, or whatever the problem was), then you
have two masters sending out updates, and you're screwed.

>> The problem is, there don't seem to be any "vote a new master" type of
>> tools for Slony-I, and also, if the original master comes back online,
>> it has no way to know that a new master has been elected. So I'd have
>> to write a bunch of SOAP services or something to do all of this.
>
> You don't need SOAP services, and you do not need to elect a new master.
> If dbX goes down and dbY takes over, you should be able to decide on a
> static takeover pattern easily enough.

I can't see how that is true. Any self-healing distributed system needs
something like the following:

 - A distributed system of nodes that check each other's health
 - A way to detect that a node is down and to transmit that information
   across the nodes
 - An election mechanism that nominates a new master if the master fails
 - A way for a node coming online to determine if it is a master or a slave

Any solution less than this can cause corruption, because you can have two
nodes that both think they're master, or end up with no master and no
process for electing a master. As far as I can tell, Slony doesn't do any
of this. Is there a simpler solution? I've never heard of one.

> The point here is that the servers need to react to a problem, but you
> probably want to get the admin on duty to look at the situation as
> quickly as possible anyway.

No, our requirement is no administrator interaction. We need instant,
automatic recovery from failure so that the system stays online.

> Furthermore, you need to check out pgpool. I seem to remember that it
> has some bad habits in routing queries. (E.g. it wants to apply write
> queries to all nodes, but Slony makes the other nodes read-only.
> Furthermore, anything inside a BEGIN is sent to the master node, which
> is bad with some ORMs that by default wrap any access in a transaction.)

I should have been more clear about this. I was planning to use PGPool in
the PGPool-1 mode (not the new PGPool-2 features that allow replication),
so it would only be acting as a failover mechanism. Slony would be used as
the replication mechanism.

I don't think I can use PGPool as the replicator, because then it becomes
a new single point of failure that could bring the whole system down. If
you're using it for INSERT/UPDATE, then there can only be one PGPool
server.

I was thinking I'd put a PGPool server on every machine, in failover mode
only. It would have the Slony master as the primary connection, and a
Slony slave as the failover connection. The applications would route all
INSERT/UPDATE statements directly to the Slony master, and all SELECT
statements to the PGPool on localhost. When the master failed, all of the
PGPool servers would automatically switch to one of the Slony slaves.

This way, the system would keep running on the Slony slaves (so it would
be read-only) until a sysadmin could get the master Slony back online. And
when the master came online, the PGPool servers would automatically
reconnect and write access would be restored.

Does this make sense?

Craig
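To make the routing concrete, this is roughly what I have in mind in the
application code. Just a sketch, not tested; the host names, the PGPool
port, and the "preferences" table are made up for illustration, and it
assumes psycopg2:

# Sketch of the read/write split described above (hypothetical host names
# and port; the PGPool instance on localhost handles read failover).
import psycopg2

WRITE_DSN = "host=db-master port=5432 dbname=global user=app"  # Slony origin
READ_DSN = "host=127.0.0.1 port=9999 dbname=global user=app"   # local PGPool

def save_preferences(user_id, prefs):
    # Writes always go straight to the Slony master; if it is down, this
    # raises and the application degrades to read-only behavior.
    conn = psycopg2.connect(WRITE_DSN)
    try:
        cur = conn.cursor()
        cur.execute(
            "UPDATE preferences SET prefs = %s WHERE user_id = %s",
            (prefs, user_id),
        )
        conn.commit()
    finally:
        conn.close()

def load_preferences(user_id):
    # Reads go through the local PGPool, which fails over from the master
    # to a Slony slave, so SELECTs keep working when the master is down.
    conn = psycopg2.connect(READ_DSN)
    try:
        cur = conn.cursor()
        cur.execute("SELECT prefs FROM preferences WHERE user_id = %s",
                    (user_id,))
        row = cur.fetchone()
        return row[0] if row else None
    finally:
        conn.close()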
Craig James wrote:
> Andreas Kostyrka wrote:
>> Slony provides near-instantaneous failovers (in the single-digit
>> seconds range). You can script an automatic failover if the master
>> server becomes unreachable.
>
> But Slony slaves are read-only, correct? So the system isn't fully
> functional once the master goes down.

That is what promotion is for.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
http://www.commandprompt.com/
On 6/15/07, Craig James <craig_james@emolecules.com> wrote:
> I don't think I can use PGPool as the replicator, because then it becomes
> a new single point of failure that could bring the whole system down. If
> you're using it for INSERT/UPDATE, then there can only be one PGPool
> server.

Are you sure? I have been considering this possibility, too, but I didn't
find anything in the documentation. The main mechanism of the proxy is
taking received updates and playing them on multiple servers with 2PC, and
the proxies should not need to keep any state about this, so why couldn't
you install multiple proxies?

Alexander.
Hello,

On Thu, 2007-06-14 at 16:14 -0700, Craig James wrote:
> PGCluster
> Seems pretty good, but the web site is not current,

http://www.pgcluster.org is somewhat up to date, and
http://pgfoundry.org/projects/pgcluster is current (at least the downloads
page :) ).

Regards,
--
Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/
On Thu, 14 Jun 2007 17:38:01 -0700
Craig James <craig_james@emolecules.com> wrote:

> I would consider PGCluster, but it seems to be a patch to Postgres
> itself. I'm reluctant to introduce such a major piece of technology

Yes, it is. Most of the time it is not far behind the actual version of
PostgreSQL. The project's biggest drawbacks, as I see them:

- horrible documentation
- configuration changes without any warning or help to the "user" (as long
  as there are only RCs, I can't really blame the developers for that... :) )
- there are only RCs, no "stable" version available for current PostgreSQL
  releases

I think this project needs someone who speaks English very well and has
the time and will to coordinate and document all the code that is written.
Otherwise the idea and the solution seem to be very good. If someone, with
a lot of luck and plenty of trial and error, manages to set up a working
system, it will be stable and keep working for a long time.

> into our entire system, when only one tiny part of it needs the
> replication service.
>
> Thanks,
> Craig

Regards,
Akos

--
Best regards,
Gábriel Ákos
-=E-Mail: akos.gabriel@i-logic.hu | Web: http://www.i-logic.hu =-
-=Tel/fax: +3612367353 | Mobil: +36209278894 =-
Hi,

Joshua D. Drake wrote:
>> Slony-II
>> Seems brilliant, a solid theoretical foundation, at the forefront of
>> computer science. But can't find project status -- when will it be
>> available? Is it a pipe dream, or a nearly-ready reality?
>
> Dead

Not quite... there's still Postgres-R, see www.postgres-r.org. And I'm
continuously working on it, despite not having updated the website for
almost a year now...

I planned on releasing the next development snapshot together with 8.3;
since that seems to be delayed, that seems realistic ;-)

Regards

Markus
Markus Schiltknecht wrote:
> Not quite... there's still Postgres-R, see www.postgres-r.org. And I'm
> continuously working on it, despite not having updated the website for
> almost a year now...
>
> I planned on releasing the next development snapshot together with 8.3;
> since that seems to be delayed, that seems realistic ;-)

Is Postgres-R the same thing as Slony-II? There's a lot of info and news
around about Slony-II, but your web page doesn't seem to mention it.

While researching replication solutions, I had a heck of a time sorting
out the dead or outdated web pages (like the stuff on GBorg) from the
active projects.

Either way, it's great to know you're working on it.

Craig
Hi,

Craig James wrote:
> Is Postgres-R the same thing as Slony-II? There's a lot of info and
> news around about Slony-II, but your web page doesn't seem to mention it.

Hm... true. Good point. Maybe I should add a FAQ:

Postgres-R has been the name of the research project by Bettina Kemme et
al. Slony-II was the name Neil and Gavin gave their attempt to continue
that project. I've based my work on the old (6.4.2) Postgres-R source code
- and I'm still calling it Postgres-R, probably Postgres-R (8) to
distinguish it from the original one. But I'm thinking about changing the
name completely... however, I'm a developer, not a marketing guru.

> While researching replication solutions, I had a heck of a time sorting
> out the dead or outdated web pages (like the stuff on GBorg) from the
> active projects.

Yeah, that's one of the main problems with replication for PostgreSQL. I
hope Postgres-R (or whatever name I'll come up with in the future) can
change that.

> Either way, it's great to know you're working on it.

Maybe you want to join its mailing list [1]? I'll try to get some
discussion going there in the near future.

Regards

Markus

[1]: Postgres-R on pgFoundry: http://pgfoundry.org/projects/postgres-r/
On Thu, 2007-06-14 at 16:14 -0700, Craig James wrote:
> Looking for replication solutions, I find:
>
> Slony-I
> Seems good, single master only, master is a single point of failure,
> no good failover system for electing a new master or having a failed
> master rejoin the cluster. Slave databases are mostly for safety or
> for parallelizing queries for performance. Suffers from O(N^2)
> communications (N = cluster size).

There's MOVE SET, which transfers the origin (master) from one node to
another without losing any committed transactions. There's also FAILOVER,
which can set a new origin even if the old origin is completely gone;
however, you will lose the transactions that haven't been replicated yet.

To have a new node join the cluster, you SUBSCRIBE SET, and you can MOVE
SET to it later if you want that node to be the master.

Regards,
Jeff Davis
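For instance, a controlled switchover or an emergency failover is just a
short slonik script. The sketch below drives it from Python so a watchdog
could invoke it; the cluster name, node numbers, and conninfo strings are
made up, I'm quoting the slonik syntax from memory, and you should check
the Slony-I admin guide before relying on any of it:

# Sketch: drive a Slony-I origin change by feeding a script to the slonik
# command-line tool. Cluster name, node numbers, and connection strings
# are hypothetical; verify the slonik syntax against the Slony-I docs.
import subprocess

SLONIK_PREAMBLE = """
cluster name = global_cluster;
node 1 admin conninfo = 'host=db1 dbname=global user=slony';
node 2 admin conninfo = 'host=db2 dbname=global user=slony';
"""

def run_slonik(commands):
    """Pipe a slonik script (preamble plus commands) into slonik."""
    subprocess.run(["slonik"], input=SLONIK_PREAMBLE + commands,
                   text=True, check=True)

def move_set(set_id, old_origin, new_origin):
    """Controlled switchover: both nodes are up, no transactions lost."""
    run_slonik(
        "lock set (id = %d, origin = %d);\n"
        "move set (id = %d, old origin = %d, new origin = %d);\n"
        % (set_id, old_origin, set_id, old_origin, new_origin)
    )

def failover(dead_node, backup_node):
    """Emergency failover: the old origin is gone; transactions it had
    not yet replicated are lost."""
    run_slonik(
        "failover (id = %d, backup node = %d);\n" % (dead_node, backup_node)
    )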
On Mon, Jun 18, 2007 at 08:54:46PM +0200, Markus Schiltknecht wrote:
> Postgres-R has been the name of the research project by Bettina Kemme et
> al. Slony-II was the name Neil and Gavin gave their attempt to continue
> that project.

This isn't quite true. Slony-II was originally conceived by Jan as an
attempt to implement some of the Postgres-R ideas. For our uses, however,
Postgres-R had a rather knotty design problem built into it: under
high-contention workloads, it will automatically increase the number of
ROLLBACKs users experience. Jan had some ideas on how to solve this by
moving around the GC events and doing slightly different things with them.

To that end, Afilias sponsored a small workshop in Toronto during one of
the coldest weeks the city has ever seen. This should have been a clue,
perhaps. ;-)

Anyway, the upshot of this was that two or three different approaches were
attempted in prototypes. AFAIK, Neil and Gavin got the farthest, but just
about everyone involved in the original workshop independently concluded
that the approach we were attempting to get to work was doomed -- it might
go, but the overhead was great enough that it wouldn't be any benefit.

Part of the problem, as near as I could tell, was that we had no group
communication protocol that would really work. Spread needed a _lot_ of
work (where "lot of work" may mean "rewrite"), and I just didn't have the
humans to put on that problem. Another part of the problem was that, for
high-contention workloads like the ones we happened to be working on, an
optimistic approach like Postgres-R is probably always going to be a loser.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
                                                        --Brad Holland
Hi,

Andrew Sullivan wrote:
> This isn't quite true. Slony-II was originally conceived by Jan as an
> attempt to implement some of the Postgres-R ideas.

Oh, right, thanks for that correction.

> Part of the problem, as near as I could tell, was that we had no group
> communication protocol that would really work. Spread needed a _lot_ of
> work (where "lot of work" may mean "rewrite"), and I just didn't have
> the humans to put on that problem. Another part of the problem was
> that, for high-contention workloads like the ones we happened to be
> working on, an optimistic approach like Postgres-R is probably always
> going to be a loser.

Hm... for high contention on single rows, sure, yes - you would mostly get
rollbacks for conflicting transactions. But the optimism there is
justified, as I think most real-world transactions don't conflict (or else
you can work around such high single-row contention).

You are right in that the serialization of the GCS can be a bottleneck.
However, there's a lot of research going on in that area, and I'm
convinced that Postgres-R has its value.

Regards

Markus