Thread: Proposal for a cascaded master-slave replication system
Dear community,

for some reason the post I sent yesterday night still did not show up on
the mailing lists. I have set up some links on the developers site under
http://developer.postgresql.org/~wieck/slony1.html

The concept will be the base for some of my work as a Software Engineer
here at Afilias USA INC. in the near future. Afilias is, like many of you,
in need of reliable and performant replication solutions for backup and
failover purposes. We started this work a couple of weeks ago by
defining the goals and required features for our usage of PostgreSQL.

Slony-I will be the first of 2 distinct replication systems designed
with the 24/7 datacenter in mind.

We want to build this system as a community project. The plan was from
the beginning to release the product under the BSD license. And we think
it is best to start it as such and to ask for suggestions during the
design phase already.

I would like to start developing the replication engine itself as soon
as possible. And as a PostgreSQL CORE developer I will surely put some of
my spare time into this as well. On the other hand, there is absolutely
no design other than "they mostly call some stored procedures" done for
the frontend tools yet, and I think that we need some really good admin
tools in the end.

I look forward to your comments.

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan Wieck wrote:
> http://developer.postgresql.org/~wieck/slony1.html

Very interesting read. Nice work!

> We want to build this system as a community project. The plan was from
> the beginning to release the product under the BSD license. And we think
> it is best to start it as such and to ask for suggestions during the
> design phase already.

I couldn't quite tell from the design doc -- do you intend to support
conditional replication at a row level?

I'm also curious, with cascaded replication, how do you handle the case
where a second level slave has a transaction failure for some reason, i.e.:

              M
             / \
            /   \
          Sa     Sb
         /  \   /  \
        Sc  Sd Se  Sf

What happens if data is successfully replicated to Sa, Sb, Sc, and Sd,
and then an exception/rollback occurs on Se?

Joe
On Nov 11, 2003, at 12:11 PM, Joe Conway wrote:
> Jan Wieck wrote:
>> http://developer.postgresql.org/~wieck/slony1.html
>
> Very interesting read. Nice work!

Ditto. I'll read it a bit closer later, but after a quick read it seems
quite complete and well thought out. I especially like that sequences
are being dealt with.

Thanks for putting the effort in, and making it a community project.

--------------------
Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com
Joe Conway wrote:
> I couldn't quite tell from the design doc -- do you intend to support
> conditional replication at a row level?

If you mean to configure the system to replicate rows to different
destinations (slaves) based on arbitrary qualifications, no. I had
thought about it, but it does not really fit into the "datacenter and
failover" picture, so it is not required to meet the goals and adds
unnecessary complexity.

This sort of feature is much more important for a replication system
designed for hundreds or thousands of sporadic, asynchronous
multi-master systems, the typical "salesman on the street" kind of
replication.

> I'm also curious, with cascaded replication, how do you handle the case
> where a second level slave has a transaction failure for some reason, i.e.:
>
>               M
>              / \
>             /   \
>           Sa     Sb
>          /  \   /  \
>         Sc  Sd Se  Sf
>
> What happens if data is successfully replicated to Sa, Sb, Sc, and Sd,
> and then an exception/rollback occurs on Se?

First, it does not replicate single transactions. It replicates batches
of them together. Since the transactions are already committed (and
possibly some others depending on them too), there is no way around it:
you lose Se.

If this is only a temporary failure, like a power failure, and the
database recovers fine on restart, including the last confirmed SYNC
event, it will catch up. SYNC events get confirmed after they commit
locally, but that is before the next checkpoint, so there is actually a
gap in which the slave could lose a committed transaction, and then it
is lost for sure. So as long as it comes back up without losing the last
confirmed SYNC, it will catch up.
Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
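The batching behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not Slony-I code: the `Slave` class, `apply_batch`, `catch_up` and the `confirmed_sync` counter are all hypothetical names, and the "commit" is simulated with an in-memory dictionary. The point it shows is that a slave applies everything up to one SYNC event as a unit, confirms the SYNC only after its local commit, and on restart resumes from the last SYNC it confirmed.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    """Toy slave node; all names here are illustrative, not Slony-I's."""
    confirmed_sync: int = 0                    # last SYNC confirmed to the master
    data: dict = field(default_factory=dict)   # stands in for the replicated tables

    def apply_batch(self, sync_id, changes):
        """Apply all changes up to one SYNC event as a single local transaction."""
        staged = dict(self.data)
        for key, value in changes:
            staged[key] = value
        # Simulated local commit: the whole batch becomes visible at once.
        self.data = staged
        # The SYNC is confirmed only after the local commit.
        self.confirmed_sync = sync_id


def catch_up(slave, master_log):
    """Replay every batch newer than the slave's last confirmed SYNC."""
    for sync_id, changes in master_log:
        if sync_id > slave.confirmed_sync:
            slave.apply_batch(sync_id, changes)


master_log = [
    (1, [("a", 1), ("b", 2)]),
    (2, [("a", 10)]),
    (3, [("c", 3)]),
]

se = Slave()
catch_up(se, master_log[:1])   # Se applies SYNC 1, then goes down
catch_up(se, master_log)       # after restart only SYNC 2 and 3 are replayed
print(se.confirmed_sync, se.data)
```

If the slave came back having lost its last confirmed SYNC, this simple scheme would no longer know where to resume, which corresponds to the "you lose Se" case.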
Jan Wieck wrote:
> If you mean to configure the system to replicate rows to different
> destinations (slaves) based on arbitrary qualifications, no. I had
> thought about it, but it does not really fit into the "datacenter and
> failover" picture, so it is not required to meet the goals and adds
> unnecessary complexity.
>
> This sort of feature is much more important for a replication system
> designed for hundreds or thousands of sporadic, asynchronous
> multi-master systems, the typical "salesman on the street" kind of
> replication.

OK, thanks. This actually fits any kind of distributed application. We
have one that lives in our datacenters, but needs to replicate across
both fast LAN/MAN and slow WAN. It is multimaster in the sense that
individual data rows can be originated anywhere, but they are read-only
in nodes other than where they were originated. Anyway, I'm using a
hacked copy of dbmirror at the moment.

> First, it does not replicate single transactions. It replicates batches
> of them together. Since the transactions are already committed (and
> possibly some other depending on them too), there is no way - you lose Se.

OK, got it. Thanks.

Joe
Hans-Jürgen Schönig wrote:
> Jan,
>
> First of all we really appreciate that this is going to be an Open
> Source project.
> There is something I wanted to add from a marketing point of view: I
> have done many public talks in the last 2 years or so. There is one
> question people keep asking me: "How about the pgreplication project?".
> In every training course, at any conference people keep asking for
> synchronous replication. We have offered these people some async
> solutions which are already out there but nobody seems to be interested
> in having it (my personal impression). People keep asking for a sync
> approach via email but nobody seems to care about an async approach.
> This does not mean that async is bad but we can see a strong demand for
> synchronous replication.
>
> Meanwhile we seem to be in a situation where PostgreSQL is rather
> competing against Oracle than against MySQL. In our case there are more
> people asking for Oracle -> Pg migration than for MySQL -> Pg. MySQL
> does not seem to be the great enemy because most people know that it is
> an inferior product anyway. What I want to point out is that some people
> want an alternative to Oracle's Real Application Cluster. They want load
> balancing and hot failover. Even data centers asking for replication did
> not want to have an async approach in the past.

Hans-Jürgen,

we are well aware of the high demand for multi-master replication
addressing load balancing and clustering. We have that need ourselves as
well and I plan to work on a follow-up project as soon as Slony-I is
released. But as of now, we see a higher priority for a reliable master
slave system that includes the cascading and backup features described
in my concept. There are a couple of different similar products out
there, I know. But show me one of them where you can fail over without
becoming the single point of failure? We've just recently seen ... or
better "were not able to see anything any more" how failures tend to
ripple through systems - half of the US East Coast was dark. So where is
the replication system where a slave becomes the "master", and not a
standalone server? Show me one that has a clear concept of failback, one
that has hot-join as a primary design goal. These are the features that
I expect if something is labeled "Enterprise Level".

As far as my ideas for multi-master go, it will be a synchronous
solution using group communication. My idea is "group commit" instead of
2-Phase ... and an early stage test hack replicated some updates 3
weeks ago. The big challenge will be to integrate the two systems so
that a node can start as an asynchronous Slony-I slave, catch up ... and
switch over to synchronous multimaster without stopping the cluster. I
have no clue yet how to do that, but I refuse to think smaller.

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan,

I am wondering if you are familiar with the work covered in 'Recovery in
Parallel Database Systems' by Svein-Olaf Hvasshovd (Vieweg)? The book is an
excellent, detailed description covering high-availability DB
implementations.

I think you're right on by not thinking smaller!!

Jordan Henderson

On Wednesday 12 November 2003 10:45, Jan Wieck wrote:
> [...]
> The big challenge will be to integrate the two systems so
> that a node can start as an asynchronous Slony-I slave, catch up ... and
> switch over to synchronous multimaster without stopping the cluster. I
> have no clue yet how to do that, but I refuse to think smaller.
>
> Jan
Re: [HACKERS] Proposal for a cascaded master-slave replication system
From: Hans-Jürgen Schönig
Jan,

This is EXACTLY what we have been waiting for (years) :) :) :).
If you need somebody for testing or documentation just drop me a line.

Cheers,

Hans

Jan Wieck wrote:
> [...]
> As far as my ideas for multi-master go, it will be a synchronous
> solution using group communication. My idea is "group commit" instead of
> 2-Phase ... and an early stage test hack has replicated some update 3
> weeks ago. The big challange will be to integrate the two systems so
> that a node can start as an asynchronous Slony-I slave, catch up ... and
> switch over to synchronous multimaster without stopping the cluster. I
> have no clue yet how to do that, but I refuse to think smaller.
>
> Jan

-- 
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706 or +43/660/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
In the last exciting episode, JanWieck@Yahoo.com (Jan Wieck) wrote:
> I look forward to your comments.

It is not evident from the paper what approach is taken to dealing
with the duplicate key conflicts.

The example:

  UPDATE table SET col1 = 'temp' where col = 'A';
  UPDATE table SET col1 = 'A' where col = 'B';
  UPDATE table SET col1 = 'B' where col = 'temp';

I can think of several approaches to this:

1.  The present eRserv code reads what is in the table at the time of
    the 'snapshot', and so tries to pass on:

      update table set col1 = 'B' where otherkey = 123;
      update table set col1 = 'A' where otherkey = 456;

    which breaks because at some point, col1 is not unique, irrespective
    of what order we apply the changes in.

2.  If the contents as at the time of the COMMIT are stored in the log
    table, then we would do all three updates in the destination DB, in
    order, as shown above.

    Either we have to:
     a) Store the updated fields in the replication tables somewhere, or
     b) Make the third UPDATE wait for the updates to be stored in a
        file somewhere.

3.  The replication code requires that any given key only be updated
    once in a 'snapshot', so that the updates may be unambiguously
    partitioned:

      UPDATE table SET col1 = 'temp' where col = 'A' ;  -- and otherkey = 123
      UPDATE table SET col1 = 'A' where col = 'B';      -- and otherkey = 456
      -- Must partition here before hitting #123 again --
      UPDATE table SET col1 = 'B' where col = 'temp';   -- and otherkey = 123

    The third UPDATE may have to be held up until the "partition" is set
    up, right?

4.  I seem to recall a recent discussion about the possibility of
    deferring the UNIQUE constraint 'til the END of a commit, with the
    result that we could simplify to

      update table set col1 = 'B' where otherkey = 123;
      update table set col1 = 'A' where otherkey = 456;

    and discover that the UNIQUE constraint was relaxed just long enough
    for us to make the TWO changes that in the end combined to being
    unique.

None of these look like they turn out totally happily, or am I missing
an approach?

-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://www.ntlug.org/~cbbrowne/languages.html
"Java and C++ make you think that the new ideas are like the old ones.
Java is the most distressing thing to hit computing since MS-DOS."
-- Alan Kay
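The difference between approach 1 and approach 2 can be reproduced with a small experiment. The sketch below uses Python's standard `sqlite3` module rather than PostgreSQL, purely because it is self-contained. The table name `t`, its two rows, and the assumption that `col` in the WHERE clauses above means the same column as `col1` are mine, not part of the proposal. SQLite checks the UNIQUE constraint immediately, per statement, which is the behaviour the example depends on.

```python
import sqlite3


def fresh_db():
    """Create an in-memory table with a UNIQUE column and two rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE t (otherkey INTEGER PRIMARY KEY, col1 TEXT UNIQUE)")
    db.executemany("INSERT INTO t VALUES (?, ?)", [(123, "A"), (456, "B")])
    db.commit()
    return db


def apply_all(db, statements):
    """Apply statements in one transaction; return True if all succeed."""
    try:
        with db:  # commits on success, rolls back on exception
            for sql in statements:
                db.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Approach 2: replay the original statements in their original order.
ordered = [
    "UPDATE t SET col1 = 'temp' WHERE col1 = 'A'",
    "UPDATE t SET col1 = 'A' WHERE col1 = 'B'",
    "UPDATE t SET col1 = 'B' WHERE col1 = 'temp'",
]

# Approach 1: ship only the net change per row, as seen at snapshot time.
net_changes = [
    "UPDATE t SET col1 = 'B' WHERE otherkey = 123",
    "UPDATE t SET col1 = 'A' WHERE otherkey = 456",
]

ordered_ok = apply_all(fresh_db(), ordered)
net_ok = apply_all(fresh_db(), net_changes)
net_reversed_ok = apply_all(fresh_db(), list(reversed(net_changes)))
print(ordered_ok, net_ok, net_reversed_ok)
```

The ordered replay goes through, while the two net-change statements hit the unique constraint in either order, which is the failure described under approach 1.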
Re: [HACKERS] Proposal for a cascaded master-slave replication system
From: Hans-Jürgen Schönig
Jan Wieck wrote:
> [...]
> Slony-I will be the first of 2 distinct replication systems designed
> with the 24/7 datacenter in mind.
>
> We want to build this system as a community project. The plan was from
> the beginning to release the product under the BSD license. And we think
> it is best to start it as such and to ask for suggestions during the
> design phase already.
> [...]
> I look forward to your comments.
>
> Jan

Jan,

First of all we really appreciate that this is going to be an Open
Source project.
There is something I wanted to add from a marketing point of view: I
have done many public talks in the last 2 years or so. There is one
question people keep asking me: "How about the pgreplication project?".
In every training course, at any conference, people keep asking for
synchronous replication. We have offered these people some async
solutions which are already out there, but nobody seems to be interested
in having them (my personal impression). People keep asking for a sync
approach via email but nobody seems to care about an async approach.
This does not mean that async is bad, but we can see a strong demand for
synchronous replication.

Meanwhile we seem to be in a situation where PostgreSQL is competing
against Oracle rather than against MySQL. In our case there are more
people asking for Oracle -> Pg migration than for MySQL -> Pg. MySQL
does not seem to be the great enemy because most people know that it is
an inferior product anyway. What I want to point out is that some people
want an alternative to Oracle's Real Application Cluster. They want load
balancing and hot failover. Even data centers asking for replication did
not want to have an async approach in the past.

I just wanted to mention that because personally I don't have the
impression that an additional async project is worth the effort.
Note: This does not mean that it is bad to have one more product ;).

Cheers,

Hans

-- 
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706 or +43/660/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
Jordan Henderson wrote:
> Jan,
>
> I am wondering if you are familiar with the work covered in 'Recovery in
> Parallel Database Systems' by Svein-Olaf Hvasshovd (Vieweg)? The book is an
> excellent detailed description covering high availability DB
> implementations.

No, but it sounds like something I always wanted to have.

> I think you're right on by not thinking smaller!!

Thanks

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Christopher Browne wrote:
> In the last exciting episode, JanWieck@Yahoo.com (Jan Wieck) wrote:
>> I look forward to your comments.
>
> It is not evident from the paper what approach is taken to dealing
> with the duplicate key conflicts.
>
> The example:
>
>   UPDATE table SET col1 = 'temp' where col = 'A';
>   UPDATE table SET col1 = 'A' where col = 'B';
>   UPDATE table SET col1 = 'B' where col = 'temp';
>
> I can think of several approaches to this:

One fundamental flaw in eRServer is that it tries to "combine" multiple
updates into one update at snapshot-time in the first place. The
application can do these three steps in one single transaction, so how
do you split that?

You can develop an automatic recovery for that. At the time you get a
dupkey error, you roll back but remember the _rserv_ts and table_id that
caused the dupkey. In the next sync attempt, you fetch the row with that
_rserv_ts and delete all rows from the slave table with that primary
key, plus fake INSERT log rows on the master for the same. Then you
prepare and apply and cross fingers that nobody touched the same row
again already between your last attempt and now ... which was how many
hours ago? And since you can only find one dupkey per round, you might
do this a few times with larger and larger lists of _rserv_ts,table_id.

The idea of not accumulating log forever, but just holding this status
table (the name log is misleading in eRServer, it holds flags telling
"the row with _rserv_ts=nnnn got INS|UPD|DEL'd") has one big advantage:
however long your slave does not sync, your master will not run out of
space.

But I don't think that there is value in the attempt to let a slave
catch up the last 4 days at once anyway. Drop it and use COPY. When your
slave does not come up before you have modified half your database, it
will be faster this way anyway.

Jan

> 1.  The present eRserv code reads what is in the table at the time of
>     the 'snapshot', and so tries to pass on:
> [...]
> None of these look like they turn out totally happily, or am I missing
> an approach?

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
On Wed, Nov 12, 2003 at 02:08:23PM +0100, Hans-Jürgen Schönig wrote:
> an inferior product anyway. What I want to point out is that some people
> want an alternative Oracle's Real Application Cluster. They want load
> balancing and hot failover. Even data centers asking for replication did
> not want to have an async approach in the past.

I think Jan has already outlined his more-distant-future idea, but
I'd also like to know whether the people who are asking for a
replacement for RAC are willing to invest in it? You could buy some
_awfully_ good development time for even a year's worth of licensing
for RAC. I get the impression from the Postgres-R list that their
biggest obstacle is development resources.

<rant> People often like to say they need hot-fail-capable, five
nines, 24/7/365 systems. For most applications, I just do not
believe that, and the truth is that the cost of getting from three
nines to four (never mind five) is so great that people cheat: one
paragraph has the "five nines" clause, and the next paragraph talks
about scheduled downtime. In a real "five nines" system (the phone
company, say, or the air traffic control system), the time for
scheduled downtime is just the cumulative possible outage at any node
when it is being switched with its replacement. Five minutes a year
is a pretty high bar to jump, and most people long ago concluded that
you don't actually need it for most applications. </rant>

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110
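For reference, the "five minutes a year" figure follows directly from the arithmetic. A quick sketch in Python, assuming a 365-day year and counting scheduled downtime against the budget as the rant argues it should be; the function name is just illustrative:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(nines):
    """Yearly downtime budget for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.1f} minutes/year")
```

Three nines allows close to nine hours of outage per year, four nines a little under an hour, and five nines roughly five and a quarter minutes, so each extra nine shrinks the budget by a factor of ten.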
Hans-Jürgen Schönig wrote:
> Meanwhile we seem to be in a situation where PostgreSQL is rather
> competing against Oracle than against MySQL. In our case there are more
> people asking for Oracle -> Pg migration than for MySQL -> Pg. MySQL
> does not seem to be the great enemy because most people know that it is
> an inferior product anyway.

I can confirm Hans' impressions --- I get very few questions about MySQL
vs. PostgreSQL, at least in the past few years. People still using
MySQL at this point know they are using something inferior to
PostgreSQL, and if they didn't, the new MySQL licensing has made it
abundantly clear. MySQL just isn't in the same league, and probably
will never be.

What people want is Informix/Oracle/MS-SQL => PostgreSQL.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Wed, Nov 12, 2003 at 07:46:11PM -0500, James Robinson wrote: > Speaking from a non-profit whose enterprise data sits inside postgres, > we would be willing to invest a few thousand dollars into the pot of > synchronous multi-master replication. Postgres-r sounded absolutely > marvelous to us back in the day that it was rumored to be one of the > possible deliverables of 7.4. As far as I know, Postgres-R hackers are eager for funding. There is a foundation which was established to fund such activities, in an effort to pool the resources that people had to donate. You could ask in the Postgres-R mailing list about it. A -- ---- Andrew Sullivan 204-4141 Yonge Street Afilias Canada Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
On Wed, Nov 12, 2003 at 04:43:03PM -0500, Andrew Sullivan wrote: > <rant> People often like to say they need hot-fail-capable, five BTW, this was not a rant at the person posting -- he was just reporting what he has heard. I've heard it plenty, too, and the people whence I've heard it are the rant targets. Since hot-failover replication really is indistinguishable from magic in the eyes of the correctly-shaped-hair crowd, they ask for it all over the place, figuring it'll be free. A -- ---- Andrew Sullivan 204-4141 Yonge Street Afilias Canada Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
On Tue, Nov 11, 2003 at 03:38:53PM -0500, Christopher Browne wrote: > In the last exciting episode, JanWieck@Yahoo.com (Jan Wieck) wrote: > > I look forward to your comments. > > It is not evident from the paper what approach is taken to dealing > with the duplicate key conflicts. > > The example: > > UPDATE table SET col1 = 'temp' where col = 'A'; > UPDATE table SET col1 = 'A' where col = 'B'; > UPDATE table SET col1 = 'B' where col = 'temp'; It's not a problem, because as the proposal states, the actual SQL is to be sent in order to the slave. That is, only consistent sets are sent: you can't have a condition on the slave that never could have obtained on the master. This means greater overhead for cases where the same row is altered repeatedly, but it's safe. A -- ---- Andrew Sullivan 204-4141 Yonge Street Afilias Canada Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
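The ordering argument in the reply above can be illustrated with a small sketch. It is not Slony-I code: it assumes an in-memory SQLite database stands in for both master and slave, and it uses a hypothetical table `t` with a unique column `col` in place of the table in the quoted example. Replaying the three updates in their original order reproduces the master's state, while reordering them hits the duplicate-key conflict the question was about.

```python
import sqlite3

# The swap from the quoted example, rewritten against a hypothetical
# table t(id, col) where col carries a UNIQUE constraint.
SWAP = [
    "UPDATE t SET col = 'temp' WHERE col = 'A'",
    "UPDATE t SET col = 'A' WHERE col = 'B'",
    "UPDATE t SET col = 'B' WHERE col = 'temp'",
]


def make_db() -> sqlite3.Connection:
    """Create a fresh in-memory database holding rows 'A' and 'B'."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col TEXT UNIQUE)")
    db.executemany("INSERT INTO t (id, col) VALUES (?, ?)", [(1, "A"), (2, "B")])
    db.commit()
    return db


def replay(db: sqlite3.Connection, statements: list) -> list:
    """Apply the statements in the given order and return the final rows."""
    for sql in statements:
        db.execute(sql)
    db.commit()
    return db.execute("SELECT id, col FROM t ORDER BY id").fetchall()


def replay_fails(statements: list) -> bool:
    """Return True if replaying on a fresh database violates the unique key."""
    try:
        replay(make_db(), statements)
    except sqlite3.IntegrityError:
        return True
    return False


master_rows = replay(make_db(), SWAP)  # state on the "master"
slave_rows = replay(make_db(), SWAP)   # same statements, same order
out_of_order_fails = replay_fails([SWAP[1], SWAP[0], SWAP[2]])

print(master_rows, slave_rows, out_of_order_fails)
```

Every intermediate state reached on the "slave" here is one that also existed on the "master", which is the property the reply relies on. The cost mentioned in the reply also shows up: three statements are replayed even though only two rows end up changed.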
Bruce Momjian wrote: > Hans-Jürgen Schönig wrote: >> Meanwhile we seem to be in a situation where PostgreSQL is rather >> competing against Oracle than against MySQL. In our case there are more >> people asking for Oracle -> Pg migration than for MySQL -> Pg. MySQL >> does not seem to be the great enemy because most people know that it is >> an inferior product anyway. > > I can confirm Hans' impressions --- I get very few questions about MySQL > vs. PostgreSQL, at least in the past few years. People still using > MySQL at this point know they are using something inferior to > PostgreSQL, and if they didn't, the new MySQL licensing has made it > abundantly clear. MySQL just isn't in the same league, and probably > will never be. What people want is Informix/Oracle/MS-SQL => PostgreSQL. > I would like to add that there is a good reason why they aren't in the same league. As a rule of thumb one can say that the smaller a software company, the faster some development must turn into revenue. That is why Oracle and Microsoft have the "time" to do things right. They can throw 20 man-years at a project and if it turns out that wasn't enough, double down on that. I include MS on purpose here, because they gain that time from some products, and then use it on others like SQL Server. MySQL on the other hand didn't have that "time" in the past, and look what they do as soon as they have 19.5 million seconds more "time" ... the only thing that is right, replace the whole architecture, or what is that MaxSQL move? I hope 19.5 million seconds are enough, honestly. Because nobody will double down in their case. PostgreSQL does not have that problem because the base project itself does not depend on any company's success. Time is relative. Our time is very patient compared to their time. PostgreSQL gets the time it needs for free. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. 
# # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Speaking from a non-profit whose enterprise data sits inside postgres, we would be willing to invest a few thousand dollars into the pot of synchronous multi-master replication. Postgres-r sounded absolutely marvelous to us back in the day that it was rumored to be one of the possible deliverables of 7.4. Not so much for nine-nines of uptime, but for the case of being able to take a full hit on a DB box in production yet still remain running w/o any data loss. Our application servers are JBoss and will be high-available clustered / fully-mirrored, but even with RAID on the DB box one bad thing could take it down, and the data between the hourly backup would go down with it. We have experimented in-house with C-JDBC [ being 'lucky' enough to have all DB writes to go through JDBC ], but would feel more confident w/o involving another service in-between the application and the DB layers, especially since it is not yet fully high-available -- it currently shifts the single point of failure from the DB layer to the C-JDBC controller. It is reported to have HA via group communication 'soon', but, you never can tell. Read up on it at http://c-jdbc.objectweb.org/ , but the end feel I got from it was not nearly so warm and cozy with the problem being solved at the right place -- the postgres-r way felt much more robust / speedy. We won't ever have parallel oracle dollars, but we would have dollars to bring higher-availability to postgres. 'Cause it's our butt on the line hosting our client's data. ---- James Robinson Socialserve.com
Re: [HACKERS] Proposal for a cascaded master-slave replication system
From: dalgoda@ix.netcom.com (Mike Castle)
In article <200311131624.hADGO7824904@candle.pha.pa.us>, Bruce Momjian <pgman@candle.pha.pa.us> wrote: >Yes, I noticed that we have a much longer view of our software lifecycle >than most other open source projects. I think the only other things comparable are the OSes themselves. The Linux kernel and the releases of the various *BSDs seem to be on similar scales. mrc -- Mike Castle dalgoda@ix.netcom.com www.netcom.com/~dalgoda/ We are all of us living in the shadow of Manhattan. -- Watchmen fatal ("You are in a maze of twisty compiler features, all different"); -- gcc