Thread: High-availability
Hi all, After realizing that 'clustering' in the PgSQL docs means multiple DBs behind one server, and NOT multple machines, I am back at square one, feeling somewhat the fool. :P Can anyone point me to docs/websites that discuss options on replicating in (as close as possible to) realtime? Ideally with load balancing while both/all servers are up, and failover/resyncing when a member fails and is restored. Is this even possible on PostgreSQL? Being a quite small company, proprietary hardware and fancy software licenses are not possible (ie: 'use oracle' won't help). I've looked at slony, but it looks more like a way to push occasional copies to slaves, and isn't meant to be real time. Am I wrong by chance? Thanks for any help/tips/pointers! Madi
On 6/1/07, Madison Kelly <linux@alteeve.com> wrote: > After realizing that 'clustering' in the PgSQL docs means multiple > DBs behind one server, and NOT multple machines, I am back at square > one, feeling somewhat the fool. :P I remember being similarly disappointed in this rampant co-opting of the word "cluster" back in 7.4 or so. :) A gaggle of geese, a murder of crows, a cluster of databases, I guess. > Can anyone point me to docs/websites that discuss options on > replicating in (as close as possible to) realtime? Ideally with load > balancing while both/all servers are up, and failover/resyncing when a > member fails and is restored. The PostgreSQL documentation gives a pretty good overview of the options: http://www.postgresql.org/docs/8.2/interactive/high-availability.html That said, there is to my knowledge no single, integrated product that will do all you ask. None are capable of anything near real-time, automatic failover tends to be left as an exercise for the reader, and there is a lot of work to get it up and running, and requires particular care in maintenance and monitoring once it's up. There are several commercial (Mammoth Replicator comes to mind) and several open-source projects. Among the open-source ones (Slony-I, pgpool, PGCluster), I believe Slony-I is the most mature. There are a few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that are not ready for prime time yet; of these, I believe pgpool-II is the most promising. As mentioned in a different thread today, work is being done to implement WAL-based master-slave replication, which I think should prove more scalable and more transparent than the current third-party products: http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php > I've looked at slony, but it looks more like a way to push occasional > copies to slaves, and isn't meant to be real time. Am I wrong by chance? Slony is indeed intended for near-real-time replication; it's asynchronous, so slaves always lag behind the master. The amount of discrepancy depends on a bunch of factors -- individual node performance, network performance, and system load. Alexander.
On 6/3/07, Madison Kelly <linux@alteeve.com> wrote: > > Slony is indeed intended for near-real-time replication; it's > > asynchronous, so slaves always lag behind the master. The amount of > > discrepancy depends on a bunch of factors -- individual node > > performance, network performance, and system load. > > That was *exactly* the kind of link I was trying to find. You're welcome. As a side-note, I sat up pgpool-II today, and was pleasantly surprised about how easy it all was; within two minutes I had two databases in perfect sync on my laptop. It has limitations (such as in its handling of sequences), but compared to Slony it's like a breath of fresh mountain air. Pgpool-II also supports table partitioning, where you define each database to have a subset of the data. Pgpool-II then intercepts every SQL statement and routes it to the correct server. It doesn't work with referential integrity, I think, which is a major limitation, but it's the nature of the beast. Alexander.
On 6/3/07, Alexander Staubo <alex@purefiction.net> wrote: > As a side-note, I sat up pgpool-II today, and was pleasantly surprised > about how easy it all was; within two minutes I had two databases in > perfect sync on my laptop. It has limitations (such as in its handling > of sequences), but compared to Slony it's like a breath of fresh > mountain air. Err, the setup is, I mean. Once you have Slony up and running, it's pretty smooth. Alexander.
Alexander Staubo wrote: > On 6/1/07, Madison Kelly <linux@alteeve.com> wrote: >> After realizing that 'clustering' in the PgSQL docs means multiple >> DBs behind one server, and NOT multple machines, I am back at square >> one, feeling somewhat the fool. :P > > I remember being similarly disappointed in this rampant co-opting of > the word "cluster" back in 7.4 or so. :) A gaggle of geese, a murder > of crows, a cluster of databases, I guess. > >> Can anyone point me to docs/websites that discuss options on >> replicating in (as close as possible to) realtime? Ideally with load >> balancing while both/all servers are up, and failover/resyncing when a >> member fails and is restored. > > The PostgreSQL documentation gives a pretty good overview of the options: > > http://www.postgresql.org/docs/8.2/interactive/high-availability.html > > That said, there is to my knowledge no single, integrated product that > will do all you ask. None are capable of anything near real-time, > automatic failover tends to be left as an exercise for the reader, and > there is a lot of work to get it up and running, and requires > particular care in maintenance and monitoring once it's up. > > There are several commercial (Mammoth Replicator comes to mind) and > several open-source projects. Among the open-source ones (Slony-I, > pgpool, PGCluster), I believe Slony-I is the most mature. There are a > few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that > are not ready for prime time yet; of these, I believe pgpool-II is the > most promising. > > As mentioned in a different thread today, work is being done to > implement WAL-based master-slave replication, which I think should > prove more scalable and more transparent than the current third-party > products: > > http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php > >> I've looked at slony, but it looks more like a way to push occasional >> copies to slaves, and isn't meant to be real time. Am I wrong by chance? > > Slony is indeed intended for near-real-time replication; it's > asynchronous, so slaves always lag behind the master. The amount of > discrepancy depends on a bunch of factors -- individual node > performance, network performance, and system load. > > Alexander. That was *exactly* the kind of link I was trying to find. Thank you! Madi
Alexander Staubo wrote: >> As a side-note, I sat up pgpool-II today, and was pleasantly surprised >> about how easy it all was; within two minutes I had two databases in >> perfect sync on my laptop. It has limitations (such as in its handling >> of sequences), but compared to Slony it's like a breath of fresh >> mountain air. > > Err, the setup is, I mean. Once you have Slony up and running, it's > pretty smooth. I wonder what the OP means by "real-time". The standard definition is "within a deterministic time bound". Replication implies latency. Ignoring latency or wishing it away will not help. It is possible to manage latency. One strategy is to minimize it. There are others. Also remember the ancient proverb, applicable when two or more nodes are trying to agree on what time it is: "Man with two watches never knows correct time." I think of this category of issue as the Special Relativity of information. -- Lew
Madison Kelly wrote: > Being a quite small company, proprietary hardware and fancy software > licenses are not possible (ie: 'use oracle' won't help). How much data do you put in the DB? Oracle has a free version, but it has size limits. (Ducking the slings and arrows of outraged PG fans: I prefer Postgre, I really do.) -- Lew
Lew wrote: > Madison Kelly wrote: >> Being a quite small company, proprietary hardware and fancy software >> licenses are not possible (ie: 'use oracle' won't help). > > How much data do you put in the DB? Oracle has a free version, but it > has size limits. > > (Ducking the slings and arrows of outraged PG fans: I prefer Postgre, I > really do.) > Hrm, it's hard to say as we're (hoping!) to grow. At the moment, a few hundred megs. If the company gets off the ground, possibly much more. also, we've got a few (dozen or so) side projects that each have their own DBs. I think the risk of running into a barrier like a size limit would be too much. Even if we get off the ground, the storage needs of the DB will outgrow our revenue. I'd hate to be in a position where I am dependent on a (potentially) very expensive invoice while we are still running on a shoe-string. Thanks for the suggestion though! I will poke at the free/trial version and, if I am unable to load-balance pgSQL and we run into performance problems, I will have a better idea of what options I have (ie: bigger iron vs. an oracle license). Thanks! Madi
On Sun, Jun 03, 2007 at 01:35:49PM -0400, Lew wrote: > How much data do you put in the DB? Oracle has a free version, but it has > size limits. AFAIK, Oracle's free version doesn't include RAC, which is what would be needed to satisfy the request anyway. A -- Andrew Sullivan | ajs@crankycanuck.ca Users never remark, "Wow, this software may be buggy and hard to use, but at least there is a lot of code underneath." --Damien Katz
Madison Kelly wrote:
I think you'll typically find that you can get one or the other - synchronous replication, or load balancing...but not both. On the other hand, if you were really serious about having close to both, you could have a three node setup - two (a provider and subscriber) that run using Slony-I (and async replication) and one that runs using one of the aforementioned methods (i.e., DRBD and warm-standby synchronous replication). In such cases a "failover" would mean switching to the synchronous replication system. You should even be able to get SLONY to continuing to avail you with load balancing in such a case, without having to re-sync - though I haven't tried this myself... You'd still have a potential query that got stale data (when it went to a Slony-I subscriber), but you would never lose a committed transaction. You'd have the added benefit of a "shared nothing" environment as well...
As a side plug, we discuss and implement a few of these options in our PostgreSQL performance tuning course.. http://www.otg-nc.com/training-courses/coursedetail.php?courseid=47&cat_id=8
Hi all,If you're interested in the "less than ideal" case (no load balancing, but synchronous replication in a "warm standby" type mode), there are several options, such as shared disk (two systems sharing a SAN or NAS with heartbeat-style fail over - shared disk scenario), or DRBD (where block level changes to one device are mirrored in real-time over to another, with heartbeat style fail over - this is a "shared nothing" type scenario). It's not too hard to put together a "warm standby" synchronous replication mechanism with overhead that isn't too much more than what you incur by enabling PITR... Such systems can also have very fast failover on failure detection (via heartbeat2), and be synchronous.
After realizing that 'clustering' in the PgSQL docs means multiple DBs behind one server, and NOT multple machines, I am back at square one, feeling somewhat the fool. :P
Can anyone point me to docs/websites that discuss options on replicating in (as close as possible to) realtime? Ideally with load balancing while both/all servers are up, and failover/resyncing when a member fails and is restored.
I think you'll typically find that you can get one or the other - synchronous replication, or load balancing...but not both. On the other hand, if you were really serious about having close to both, you could have a three node setup - two (a provider and subscriber) that run using Slony-I (and async replication) and one that runs using one of the aforementioned methods (i.e., DRBD and warm-standby synchronous replication). In such cases a "failover" would mean switching to the synchronous replication system. You should even be able to get SLONY to continuing to avail you with load balancing in such a case, without having to re-sync - though I haven't tried this myself... You'd still have a potential query that got stale data (when it went to a Slony-I subscriber), but you would never lose a committed transaction. You'd have the added benefit of a "shared nothing" environment as well...
As a side plug, we discuss and implement a few of these options in our PostgreSQL performance tuning course.. http://www.otg-nc.com/training-courses/coursedetail.php?courseid=47&cat_id=8
Is this even possible on PostgreSQL?
Being a quite small company, proprietary hardware and fancy software licenses are not possible (ie: 'use oracle' won't help).
I've looked at slony, but it looks more like a way to push occasional copies to slaves, and isn't meant to be real time. Am I wrong by chance?
Thanks for any help/tips/pointers!
Chander Ganesan Open Technology Group, Inc. One Copley Parkway, Suite 210 Morrisville, NC 27560 Phone: 877-258-8987/919-463-0999 http://www.otg-nc.com Expert PostgreSQL Training - On-Site and Public Enrollment
Madi
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Chander Ganesan wrote: > Madison Kelly wrote: >> Hi all, >> >> After realizing that 'clustering' in the PgSQL docs means multiple >> DBs behind one server, and NOT multple machines, I am back at square >> one, feeling somewhat the fool. :P >> >> Can anyone point me to docs/websites that discuss options on >> replicating in (as close as possible to) realtime? Ideally with load >> balancing while both/all servers are up, and failover/resyncing when a >> member fails and is restored. > If you're interested in the "less than ideal" case (no load balancing, > but synchronous replication in a "warm standby" type mode), there are > several options, such as shared disk (two systems sharing a SAN or NAS > with heartbeat-style fail over - shared disk scenario), or DRBD (where > block level changes to one device are mirrored in real-time over to > another, with heartbeat style fail over - this is a "shared nothing" > type scenario). It's not too hard to put together a "warm standby" > synchronous replication mechanism with overhead that isn't too much more > than what you incur by enabling PITR... Such systems can also have very > fast failover on failure detection (via heartbeat2), and be synchronous. > > I think you'll typically find that you can get one or the other - > synchronous replication, or load balancing...but not both. On the other > hand, if you were really serious about having close to both, you could > have a three node setup - two (a provider and subscriber) that run using > Slony-I (and async replication) and one that runs using one of the > aforementioned methods (i.e., DRBD and warm-standby synchronous > replication). In such cases a "failover" would mean switching to the > synchronous replication system. You should even be able to get SLONY to > continuing to avail you with load balancing in such a case, without > having to re-sync - though I haven't tried this myself... You'd still > have a potential query that got stale data (when it went to a Slony-I > subscriber), but you would never lose a committed transaction. You'd > have the added benefit of a "shared nothing" environment as well... > > As a side plug, we discuss and implement a few of these options in our > PostgreSQL performance tuning course.. > http://www.otg-nc.com/training-courses/coursedetail.php?courseid=47&cat_id=8 > >> >> Is this even possible on PostgreSQL? >> >> Being a quite small company, proprietary hardware and fancy software >> licenses are not possible (ie: 'use oracle' won't help). >> >> I've looked at slony, but it looks more like a way to push >> occasional copies to slaves, and isn't meant to be real time. Am I >> wrong by chance? >> >> Thanks for any help/tips/pointers! >> > Chander Ganesan > Open Technology Group, Inc. > One Copley Parkway, Suite 210 > Morrisville, NC 27560 > Phone: 877-258-8987/919-463-0999 > http://www.otg-nc.com > *Expert PostgreSQL Training - On-Site and Public Enrollment* > >> Madi >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 5: don't forget to increase your free space map settings > Thank you for your reply! The more I learn, the more I am leaning towards the DRBD/shared-nothing setup. Our loads are not terribly heavy at this point. I hate the idea of having a nice server sitting there doing nothing 99% of the time, but it looks like the most viable way of setting up HA at this point. Given that I am learning as I go, I think the three-way setup you describe would be a bit too ambitious for me just now. That said, I do have a spare third server that I could use for just such a setup, should I feel comfortable enough down the road. Madi
On Mon, Jun 04, 2007 at 04:21:32PM +0200, Chander Ganesan wrote: > I think you'll typically find that you can get one or the other - > synchronous replication, or load balancing...but not both. On the other Hi, I am in very similar position, but I am more failover oriented. I am considering using pgcluster, which shall resolve both at the cost of slight transaction overhead. Does anyone have any experience with this? What problems may I expect in this setup? Kind regards, Bohdan
On Mon, 2007-06-04 at 10:21 -0400, Chander Ganesan wrote: > It's not too hard to put together a "warm standby" synchronous > replication mechanism with overhead that isn't too much more than what > you incur by enabling PITR... Such systems can also have very fast > failover on failure detection (via heartbeat2), and be synchronous. Do you have any performance measurements of either the replication overhead or the failover time? I'm interested in how well we cope with high transaction rates. Thanks. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote:
If I have time I'll see if we can run a more meaningful metric (need to generate a smaller database for that) the next time we have a performance tuning class (in August).
The failover time is tunable to some extent...via heartbeat2 (incurs < 1% performance penalty, but with sub-second failover this can go up a bit), and can be pretty quick (I usually set it up with around a 3 second failover time on node failure, then factor that in with the amount of time required for WAL auto-recovery)...it really depends a lot on what your metric is for "failure" (since node failover is probably the "worst worst case").
Aside from a bunch of customized pgbench benchmarks (on the 9.6 GB sample database we use), which are "better than nothing, but far from the best", not really. In my experience, the larger the database; slower the commit rate; and less frequently the checkpoints - the better the performance of synchronous warm-replication. In our tests, higher commit rates and more frequent checkpoints incur a higher penalty. Basically, the more WAL activity the higher the cost.On Mon, 2007-06-04 at 10:21 -0400, Chander Ganesan wrote:It's not too hard to put together a "warm standby" synchronous replication mechanism with overhead that isn't too much more than what you incur by enabling PITR... Such systems can also have very fast failover on failure detection (via heartbeat2), and be synchronous.Do you have any performance measurements of either the replication overhead or the failover time? I'm interested in how well we cope with high transaction rates. Thanks.
If I have time I'll see if we can run a more meaningful metric (need to generate a smaller database for that) the next time we have a performance tuning class (in August).
The failover time is tunable to some extent...via heartbeat2 (incurs < 1% performance penalty, but with sub-second failover this can go up a bit), and can be pretty quick (I usually set it up with around a 3 second failover time on node failure, then factor that in with the amount of time required for WAL auto-recovery)...it really depends a lot on what your metric is for "failure" (since node failover is probably the "worst worst case").
-- Chander Ganesan The Open Technology Group One Copley Parkway, Suite 210 Morrisville, NC 27560 Phone: 877-258-8987/919-463-0999 http://www.otg-nc.com
Although I rarely see it mentioned, Skype has some replication tools that they opensourced. https://developer.skype.com/SkypeGarage/DbProjects/SkyTools