Thread: Replication
Hi, I just saw that MySQL will propose at the end of the month a full synchronous replication system with auto-recovery. http://www.mysql.com/products/cluster/ We will need to see when a stable version is released. I use PostgreSQL and I would appreciate having the same features in PostgreSQL. Any comments? (no flame, please) Regards, Jean-Gérard Pailloncy
On Tue, Apr 20, 2004 at 11:26:24AM +0200, Pailloncy Jean-Gérard wrote: > Hi, > > I just see that Mysql will propose at the end of the month a full > synchronous replication system with auto-recovery. Well, sort of. It seems to be yet another 80/20 Solution From MySQL (tm). It looks like it's based on a new table type. It stores everything in memory, and then writes out asynchronously. This strikes me as pretty dangerous from the point of view of reliability: what if the box dies before the write is complete? (And don't tell me about super-redundant high-availability hardware. I _have_ all that. All hardware sucks; HA stuff just sucks less often at a higher price.) Also, it doesn't support the other table types. I don't want to contemplate the horrible mess you'd have to clean up if you had a transaction crossing three table types and get a hardware failure. I'm afraid I agree with the recently-posted Oracle Veep interview: this does not represent any serious challenge to the core ORAC market. > I use PostgreSQL and I would appreciate to have the same features in > PostgreSQL. Sure, so would I. Talk to Jan Wieck about what he plans to do about it, and maybe consider supporting that development work too ;-) A -- Andrew Sullivan | ajs@crankycanuck.ca
Andrew Sullivan wrote: > On Tue, Apr 20, 2004 at 11:26:24AM +0200, Pailloncy Jean-Gérard wrote: >> Hi, >> >> I just see that Mysql will propose at the end of the month a full >> synchronous replication system with auto-recovery. > > Well, sort of. It seems to be yet another 80/20 Solution From MySQL > (tm). > > It looks like it's based on a new table type. It stores everything > in memory, and then writes out asynchronously. This strikes me as > pretty dangerous from the point of view of reliability: what if the > box dies before the write is complete? (And don't tell me about > super-redundant high-availability hardware. I _have_ all that. All > hardware sucks; HA stuff just sucks less often at a higher price.) > Also, it doesn't support the other table types. I don't want to > contemplate the horrible mess you'd have to clean up if you had a > transaction crossing three table types and get a hardware failure. > > I'm afraid I agree with the recently-posted Oracle Veep interview: > this does not represent any serious challenge to the core ORAC > market. Quoting from the MySQL(tm) FAQ about MySQL(tm) Cluster(tm) available at http://www.mysql.com/products/cluster/faq.html <quote> Q: Does MySQL Cluster work with MyISAM and InnoDB? A: MySQL Cluster can include the MyISAM and InnoDB storage engines. Of these, the high-availability data must reside in the MySQL Cluster storage engine. The MySQL Cluster DB node stores MySQL Cluster data, the MySQL Server parses SQL and sends requests to the DB node. The MySQL Server does not store any data belonging to the MySQL Cluster storage engine. InnoDB/MyISAM data is still stored in the MySQL server and can be used in the standard way, but that data is not replicated, so that data is not visible from any other MySQL server that is connected to the MySQL Cluster. </quote> It is just another table handler made available for the SQL query engine. 
Touting loudly and on all available channels that "MySQL Cluster combines the world's most popular open source database with a parallel-server" naturally leads to the misinterpretation that all the wonderful new features like foreign keys, MVCC and rollback will now horizontally scale over multiple, highly available nodes. This is not true. The NDB table type does not have support for foreign keys, constraints, or triggers. It does support transactions, but these transactions are not the same transactions as the ones of the InnoDB table handler, so a COMMIT is not atomic across different table types. MySQL likes to point out that the largest systems like SAP R/3 do not use referential integrity on the database level. That is true so far, but having worked for many years as an SAP base consultant I can tell you that the reason for that is NOT performance. SAP spends that effort multiple times by implementing their own, custom integrity control and data domain system in the DB abstraction layer, to gain DB vendor independence. That abstraction layer is larger than PHP and Apache together, so this example is IMHO totally irrelevant for the typical MySQL user. Also, the NDB table type is based on an in-memory, partitioned storage engine (that's where the speed comes from) and to get high availability one needs at least two times the full database size in RAM (plus some for the OS and other overhead), and a higher factor to really achieve the 99.999%. So to serve let's say a 100 GB database, we're talking about 220-240 GB of RAM. Now that's 8 boxes with 32GB each? And according to a MySQL consultant I spoke with, the real bottleneck is the network, so these boxes like to have "better than Gigabit Ethernet" as a backbone. Those are some decent hardware requirements; make sure you have a forklift on your next shopping list. So what one gets with NDB on the bottom line is another table type that is useful for some special cases. 
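[Editor's sketch] The RAM estimate above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the two-replica factor and the 32 GB boxes come from the message, while the 10% and 20% overhead figures and the function names are assumptions picked so that the totals land on the quoted 220-240 GB range.

```python
import math

def ram_needed_gb(db_size_gb, replicas=2, overhead_percent=10):
    """Total cluster RAM if every replica keeps the full data set in memory.

    overhead_percent is an assumed allowance for the OS and other overhead;
    it is not a figure published for NDB.
    """
    return db_size_gb * replicas * (100 + overhead_percent) / 100

def boxes_needed(total_ram_gb, ram_per_box_gb=32):
    """Number of machines needed to hold total_ram_gb, rounded up."""
    return math.ceil(total_ram_gb / ram_per_box_gb)

if __name__ == "__main__":
    for overhead in (10, 20):
        total = ram_needed_gb(100, overhead_percent=overhead)
        print(overhead, total, boxes_needed(total))
```

With 20% overhead the total is 240 GB, which rounds up to the 8 boxes mentioned above; at 10% it is 220 GB and 7 boxes would just about do. A higher replica count for five-nines availability scales the total linearly.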
I can imagine for example systems that read sensor data, which cannot be interrupted. Sensors usually don't care much about referential integrity, so for the logging system this is in fact irrelevant; the data has to be stored now and corrected later. I think it is indeed a big plus for a system to make that logging data available inside the same SQL query engine where the more complicated bits and pieces of the application are implemented. But that is all, and that can pretty easily be achieved by doing bulk-loads of the log data into regular database tables. Unless one really needs the ability to query and analyse up to the last second of log data, running hardware and network equipment costing several hundred kilodollars just for the fun of a memory cluster solution is a bit overkill. As the Oracle VP of product strategy, Ken Jacobs, pointed out: "MySQL is trying to address certain product shortcomings by acquiring a third-party technology. This does not mean they now have a product that is competitive with Oracle - or even other - database products, whether clustered or not." Absolutely right, Mr. Jacobs: they have done that before by adding InnoDB, and now they have added some limited multimaster replication capabilities. But instead of developing an integrated solution that includes the InnoDB table handler, where this functionality would be useful, they just added a fifth wheel to the cart. > >> I use PostgreSQL and I would appreciate to have the same features in >> PostgreSQL. > > Sure, so would I. Talk to Jan Wieck about what he plans to do > about it, and maybe consider supporting that development work too ;-) Ken Jacobs further said "No one has anything at all like Oracle's Real Application Clusters". And that is right too. However well PostgreSQL by now compares on SQL features and standalone DB performance, on replication we are 2 years or more behind. 
Right now we need to get the Slony-I project out the door and let that settle a bit and maybe get enhanced over one more release. With that as the base, we will start designing a synchronous multimaster system that can be jump-started from a running, asynchronous replication setup. All this "high-availability" babble is IMHO totally pointless as long as there is no way of (re)creating a (failed) node from scratch without taking an outage. And that functionality is listed on the MySQL roadmap for 5.1 ... so somewhere in 2008? Slony does that for async master-slave today. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Wed, Apr 21, 2004 at 11:23:51AM -0400, Jan Wieck wrote: > for that is NOT performance. SAP spends that effort multiple times by > implementing their own, custom integrity control and data domain system > in the DB abstraction layer, to gain DB vendor independence. That > abstraction layer is larger than PHP and Apache together, so this > example is IMHO totally irrelevant for the typical MySQL user. Actually, I think it _is_ relevant. It's proof, IMNSHO, that the strategy of "doing it in the client" is completely bankrupt. It's one thing to do it this way if you have software which is a category-killer the way SAP is, because you can afford the overhead of all those developers doing all that extra work, and you can make your customers buy trillion-dollar hardware to run your bloated masterpiece. The Rest Of Us, however, need to do things efficiently, and that means doing the work in the place where it is least likely to need to be checked again. For most database applications, that's inside the database. (I'll not now start my rant on the mess caused by developers who are careless with this principle.) A -- Andrew Sullivan | ajs@crankycanuck.ca The plural of anecdote is not data. --Roger Brinner
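[Editor's sketch] To make "doing the work inside the database" concrete, here is a minimal illustration of declaring integrity rules once in the schema so that no client has to re-check them. It uses Python's built-in sqlite3 module purely because it runs without a server; the table names, column names, and helper functions are hypothetical, and the same idea applies to PostgreSQL foreign keys and CHECK constraints.

```python
import sqlite3

def build_db():
    """Create an in-memory database whose schema carries the integrity rules."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE orders (
               id INTEGER PRIMARY KEY,
               customer_id INTEGER NOT NULL REFERENCES customers(id),
               amount INTEGER NOT NULL CHECK (amount > 0)
           )"""
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
    conn.commit()
    return conn

def try_insert_order(conn, customer_id, amount):
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                (customer_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False

if __name__ == "__main__":
    conn = build_db()
    print(try_insert_order(conn, 1, 50))   # existing customer, positive amount
    print(try_insert_order(conn, 99, 50))  # no such customer
    print(try_insert_order(conn, 1, -5))   # violates the CHECK constraint
```

Every client, careful or not, gets the same answer from the database, which is the sense in which the check does not need to be repeated elsewhere.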
On Wednesday 21 April 2004 10:28 am, Andrew Sullivan wrote: > On Wed, Apr 21, 2004 at 11:23:51AM -0400, Jan Wieck wrote: > > for that is NOT performance. SAP spends that effort multiple times by > > implementing their own, custom integrity control and data domain system > > in the DB abstraction layer, to gain DB vendor independence. That > > abstraction layer is larger than PHP and Apache together, so this > > example is IMHO totally irrelevant for the typical MySQL user. > > Actually, I think it _is_ relevant. It's proof, IMNSHO, that the > strategy of "doing it in the client" is completely bankrupt. It's > one thing to do it this way if you have software which is a > category-killer the way SAP is, because you can afford the overhead > of all those developers doing all that extra work, and you can make > your customers buy trillion-dollar hardware to run your bloated > masterpiece. The Rest Of Us, however, need to do things efficiently, > and that means doing the work in the place where it is least likely > to need to be checked again. For most database applications, that's > inside the database. (I'll not now start my rant on the mess caused > by developers who are careless with this principle.) I concur. However, the problem SAP had some 18 years ago, when they invented their system, was the massive differences between databases. The scope they had in mind didn't allow for whole database layers to be redundant just for the sake of being able to talk to several database engines - ergo they wrote one layer and omitted using vendor-dependent database features. Nowadays most relevant databases are pretty compatible when it comes to constraints, so if you stick to the basics you should be fine now. 
UC -- Open Source Solutions 4U, LLC 2570 Fleetwood Drive Phone: +1 650 872 2425 San Bruno, CA 94066 Cell: +1 650 302 2405 United States Fax: +1 650 872 2417
Martha Stewart called it a Good Thing when ajs@crankycanuck.ca (Andrew Sullivan) wrote: > On Wed, Apr 21, 2004 at 11:23:51AM -0400, Jan Wieck wrote: >> for that is NOT performance. SAP spends that effort multiple times >> by implementing their own, custom integrity control and data domain >> system in the DB abstraction layer, to gain DB vendor >> independence. That abstraction layer is larger than PHP and Apache >> together, so this example is IMHO totally irrelevant for the >> typical MySQL user. > > Actually, I think it _is_ relevant. It's proof, IMNSHO, that the > strategy of "doing it in the client" is completely bankrupt. It's > one thing to do it this way if you have software which is a > category-killer the way SAP is, because you can afford the overhead > of all those developers doing all that extra work, and you can make > your customers buy trillion-dollar hardware to run your bloated > masterpiece. The Rest Of Us, however, need to do things > efficiently, and that means doing the work in the place where it is > least likely to need to be checked again. For most database > applications, that's inside the database. (I'll not now start my > rant on the mess caused by developers who are careless with this > principle.) There's a further issue, namely that these "bankrupt" ways were adopted literally decades ago, and involve the "trillion-dollar" investments. Way back when, R/2 was implemented on MVS using IMS, and SAP implemented their own toolset to manage that, including their own more-or-less-implicit transaction manager. Thirty years later, they have ported it to run on systems that didn't exist back then, but it's still, in essence, a "mainframe" thing, even if you're running it atop Windows NT and Microsoft's port of Sybase. It persists because there's 30 years worth of ABAP/4 code customized for a zillion purposes that SAP can keep on selling. 
The problem is not one of carelessness; it is that R/2 was written to run on databases like IMS, a pre-relational hierarchical system, and Adabas, which couldn't support having more than 256 tables, once upon a time... They had to create an ad-hoc variation on CICS, and starting from scratch would cost trillions. Even they couldn't afford that. Microsoft has been trying to do some "reinvention" of Windows in the new "Longhorn" thing, and despite having $Billion$ in the bank, it looks like those ambitions are dying the death of a thousand paper cuts. Jan's right, in that the typical MySQL user that's building an unambitious little PHP application doesn't care about the extra layers. They wouldn't have bought CICS, whether from IBM or from BEA, or something more modern, like Tuxedo; if they're using MySQL as a step up from MS Access, "doing things right" wasn't a notion that they had in their heads to even contemplate. It's like deciding to prefer Windows because if you visit Office Depot, Staples, CompUSA, and Circuit City, that's the only system you see boxed software for; it's not a measure of goodness, but merely the fact that it's visible, and can be made serviceable enough. Sleepycat DB can accurately claim to have hundreds of millions of deployments simply out of the fact that practically every Linux system links to it. (I see 67 programs in my /usr/bin that link to libdb3.so.3...) They're obviously far and away the "most popular open source database." Entertainingly enough, they have a replication system, too, and even an XA interface to support 2PC :-). No SQL, though... -- "cbbrowne","@","acm.org" http://www.ntlug.org/~cbbrowne/linux.html Why do we drive on parkways and park on driveways?
Quoth uwe@oss4u.com ("Uwe C. Schroeder"): > I concur. However the problem SAP had some 18 years ago when they > invented their system were massive differences between > databases. The scope they had in mind didn't allow for whole > database layers to be redundant just for the sake of being able to > talk to several database engines - ergo they wrote one layer and > omitted using vendor-dependent database features. Nowadays most > relevant databases are pretty compatible when it comes to > constraints, so if you stick to the basics you should be fine now. One of the issues was always that of locking. Different systems still have different semantics. -- output = reverse("gro.gultn" "@" "enworbbc") http://www.ntlug.org/~cbbrowne/nonrdbms.html I've implemented a parser combinator library in Generic C#, and indeed what is pretty clear in a functional language looks extremely scientific in an object-oriented one. -- Peter Sestoft
On Wed, Apr 21, 2004 at 10:08:07PM -0400, Christopher Browne wrote: > or something more modern, like Tuxedo; if they're using MySQL as a > step up from MS Access, ^^ That's spelled "down". Access is almost 100% SQL-92 compliant, allows subselects, and does pretty good query optimization. MySQL has nothing on it. And, no, I'm not some Microsofty. Michael -- Michael Darrin Chaney mdchaney@michaelchaney.com http://www.michaelchaney.com/
On Wed, Apr 21, 2004 at 10:08:07PM -0400, Christopher Browne wrote: > The problem is not one of carelessness; No, I agree that in SAP's case it wasn't carelessness. As you say, that was a long time ago. What I am arguing is that it is careless to design things using that approach today. And people do. A -- Andrew Sullivan | ajs@crankycanuck.ca
> > On Tue, Apr 20, 2004 at 11:26:24AM +0200, Pailloncy Jean-Gérard wrote: > > Hi, > > > > I just see that Mysql will propose at the end of the month a full > > synchronous replication system with auto-recovery. > > Well, sort of. It seems to be yet another 80/20 Solution From MySQL > (tm). > > It looks like it's based on a new table type. It stores everything > in memory, and then writes out asynchronously. This strikes me as > pretty dangerous from the point of view of reliability: what if the > box dies before the write is complete? (And don't tell me about > super-redundant high-availability hardware. I _have_ all that. All > hardware sucks; HA stuff just sucks less often at a higher price.) > Also, it doesn't support the other table types. I don't want to > contemplate the horrible mess you'd have to clean up if you had a > transaction crossing three table types and get a hardware failure. > > I'm afraid I agree with the recently-posted Oracle Veep interview: > this does not represent any serious challenge to the core ORAC > market. What is Oracle selling as their replication solution these days? When I still had a MetaLink userid they had posted a "Product Obsolescence Desupport Notice" for "Oracle Replication Services". The dates were something like: Desupport End Dates: Error Correction Support: 01-SEP-2002; Extended Assistance Support: 01-SEP-2005. "Oracle recommended customers upgrade/migrate to the following..." which was: no migration path exists, as no new versions will be released and no replacement product is available. Their ORAC, if I understand it correctly, is a "cluster" solution and not a "replication" solution. Guess I should visit their web site and see what they are peddling for replication over WAN links these days. > > > I use PostgreSQL and I would appreciate to have the same > features in > > PostgreSQL. > > Sure, so would I. 
Talk to Jan Wieck about what he plans to do > about it, and maybe consider supporting that development work too ;-) > > A > > -- > Andrew Sullivan | ajs@crankycanuck.ca >
On Thu, Apr 22, 2004 at 09:42:12AM -0400, Eric Comeau wrote: > > What is Oracle selling as their replication solution these days? [. . .] > Their ORAC if I understand it correctly is a "cluster" solution and > not a "replication" solution. This is an example of why I think most of the discussion about "replication" is so confusing. ORAC is certainly a kind of replication: it provides always-on, hot redundancy in a cluster of machines. It's multi-master, and something very close to asynchronous. It's a _very_ clever system, but it'll do you not one whit of good if your primary site fails. Also, it's not suitable for use on unreliable hardware: every cluster member failure causes a "remastering" event which causes everything to stop while remastering happens. Finally, it requires some nifty but expensive storage -- storage which itself could be a single point of failure, if it failed in the right ways. To solve all of that, Oracle also offers Data Guard. This is basically a standard log-shipping technique. The off-site "standby" databases can't be used while in standby mode. This has all the standard caveats of asynchronous WAN replication, not least of which is that if you processed a $100 million transaction right before your master failed, and then you recovered onto a slave which didn't have that last moment of data, you might find yourself making a $100 million mistake. So, Oracle Corp offers two different ways to keep you up nights. :) I'm sure they're both wonderful products. But they certainly don't have a one-size-fits-all approach. A -- Andrew Sullivan | ajs@crankycanuck.ca
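[Editor's sketch] The caveat about asynchronous log shipping can be shown with a toy model. This is not how Data Guard (or any real product) is implemented; the classes and function below are hypothetical and only illustrate the gap between what the primary has committed and what the standby has received at the moment of failover.

```python
class Primary:
    """Holds the log of committed transactions, in commit order."""

    def __init__(self):
        self.log = []

    def commit(self, txn):
        # The client sees success as soon as the primary records the commit;
        # nothing waits for the standby. That is what makes this asynchronous.
        self.log.append(txn)


class Standby:
    """Applies whatever part of the primary's log has been shipped so far."""

    def __init__(self):
        self.applied = []

    def receive(self, primary):
        # Ship and apply every log record the standby has not seen yet.
        self.applied.extend(primary.log[len(self.applied):])


def lost_on_failover(primary, standby):
    """Transactions committed on the primary that the standby never received."""
    return primary.log[len(standby.applied):]


if __name__ == "__main__":
    primary, standby = Primary(), Standby()
    primary.commit("order 1")
    primary.commit("order 2")
    standby.receive(primary)                  # the last successful shipment
    primary.commit("$100 million transfer")   # committed, not yet shipped
    # The primary fails here; recovering onto the standby loses this list:
    print(lost_on_failover(primary, standby))
```

A synchronous scheme would instead hold the commit until the standby acknowledges the record, trading latency over the WAN link for the guarantee that nothing acknowledged is lost.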