Thread: Replication
Hi,

I currently have a PostgreSQL 8 system which I want to replicate (ideally in realtime) with a spare host in order to introduce some redundancy - e.g. if the master server dies then I've got a ready-to-go backup. Switchover does not have to be automated.

I've looked into commandprompt.com's Mammoth system, but it only supports up to 1000 tables (the documentation doesn't mention this!) - the database in question has more than 1000 tables, and adds new tables regularly. Slony-I and pgpool apparently don't support dynamic schemas, which I'd obviously need, so they're not quite up to the job either.

I'm currently looking at some sort of hack-job with the WAL archives (see http://www.issociate.de/board/index.php?t=msg&goto=443099), but this looks like a slightly flaky approach - have I missed the obvious solution? Is there any stable software available which can replicate a large and dynamic number of tables?

Cheers,

Barry
On Monday, 12 September 2005 at 13:52, barry@e-rm.co.uk wrote:
> I currently have a PostgreSQL 8 system which I want to replicate

Look at DRBD.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
barry@e-rm.co.uk wrote:
> I've looked into commandprompt.com's Mammoth system, but it only
> supports up to 1000 tables (the documentation doesn't mention this!) -
> the database in question has more than 1000 tables, and adds new tables
> regularly. Slony-I and pgpool apparently don't support dynamic
> schemas, which I'd obviously need, so they're not quite up to the job
> either.

Actually the theoretical limit for the Mammoth system is 10,000 tables; we have only tested up to 1000. This is mentioned quite clearly on the website.

It should be noted that if Slony-I won't do what you need then Mammoth probably would not either, although Mammoth can add new tables on the fly as long as those tables are empty. If they are not empty they will cause a full sync to occur.

Sincerely,

Joshua D. Drake

-- 
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
Barry,

You can use PITR to archive transaction logs to a second server that is kept in standby mode. This will cope with any number of tables and with dynamic changes to tables.

This is fairly straightforward and very low overhead:
Set archive_command to a program that transfers xlog files to the second server.
Then set restore_command on the second server to a program that loops until the next file is available.

Switchover time is low.

Best Regards, Simon Riggs
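As a rough illustration of the setup Simon describes (a minimal sketch only, assuming PostgreSQL 8.x, passwordless scp between the hosts, and hypothetical host names and paths such as "standby" and /wal_archive; adapt and test before relying on it):

    # postgresql.conf on the primary: ship each completed WAL segment.
    # %p is the path of the segment, %f its bare file name.
    archive_command = 'scp %p standby:/wal_archive/%f'

    # recovery.conf on the standby: restore_command points at a script
    # that waits for the next segment instead of failing when it is
    # missing, which keeps the standby in permanent recovery.
    restore_command = '/usr/local/bin/wait_for_wal.sh /wal_archive/%f %p'

    #!/bin/sh
    # wait_for_wal.sh (hypothetical helper): loop until the requested
    # WAL file arrives, then hand it to the recovering server.
    WANTED="$1"    # segment the standby asked for
    TARGET="$2"    # where PostgreSQL wants it copied
    while [ ! -f "$WANTED" ]; do
        sleep 5    # poll until the primary ships the next segment
    done
    cp "$WANTED" "$TARGET"

To switch over, you would arrange for the loop to terminate (for example, have the script also exit when a trigger file appears), letting recovery finish so the standby opens for connections.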
> Slony-I and pgpool apparently don't support dynamic
> schemas, which I'd obviously need, so they're not quite up to the job
> either.

pgpool (without Slony-I) replicates schema changes. And PGCluster does too.

-- 
SRA OSS, Inc. Japan
Tatsuo Ishii
> You can use PITR to archive transaction logs to a second server that is
> kept in standby mode.
>
> This will cope with any number of tables and with dynamic changes
> to tables.
>
> This is fairly straightforward and very low overhead.
> Switchover time is low.

I thought there were some issues with log-based replication (I don't remember the details though). Have they been resolved recently?

-- 
SRA OSS, Inc. Japan
Tatsuo Ishii
> From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
> I thought there were some issues with log-based replication (I don't
> remember the details though). Have they been resolved recently?

Yes; currently log-file based replication is not recommended when the transaction rate is either low or extremely variable. I was regrettably unable to get a patch for that into 8.1.

Best Regards, Simon Riggs
"Simon Riggs" <simon@2ndquadrant.com> writes: >> From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp] >> I thought there are some issues of log based replication (I don't >> remeber details though). Have they been resolved recently? > Yes, currently log-file based replication is not recommended when the > transaction rate is either low or extremely variable. The point being not that it doesn't work, but that it can take a long time (or an unpredictable amount of time) for transactions to propagate to the slave machine. You can kluge up a solution for this, ie, write a cron job to copy the latest WAL segment file across to the slave once a minute (or whatever your requirement for propagation delay is). But we haven't gotten a standard solution into the code itself yet. regards, tom lane
Simon Riggs wrote:
> You can use PITR to archive transaction logs to a second server that is
> kept in standby mode.
>
> This will cope with any number of tables and with dynamic changes
> to tables.
>
> This is fairly straightforward and very low overhead.
> Switchover time is low.

Apologies for going slightly off topic, but isn't this basically how MySQL does replication? I ask because one of the arguments against moving to PostgreSQL in my organisation is 'no replication out of the box'. But if the above is true, it seems that all that is required are a couple of scripts to handle log transfer, and you have a form of replication out of the box right there.

Or am I missing something?

-- 
Russ
Well, AFAICT this kind of replication in Postgres cannot be called "out of the box". Setting up the whole thing, and testing that it really works reliably, is not exactly trivial, and you could have a number of surprises (for instance, when there's little activity the last WAL segment stays on the server, so the standby can be behind by a considerable amount of time, even if in terms of data quantity the difference is not that big). You must be good at scripting to write reliable scripts to handle the process; I think there are some examples around, but you still need to customize them for sure.

What is missing is an easy way to set up the hot standby, without requiring the user to do too much side-work.

I wonder if there is a way to set up a direct link between the standby and the server and, instead of copying files, just send the WAL data over directly by some background process? The overhead of sending the data directly should be similar to copying the files. Then the easiest way to set up a standby would be to start the standby server with some standby options, giving the primary server's connection parameters; the standby would then make a special connection to the server requesting a data dump plus the WAL from the dump's start point onward. Would this be reasonably simple to implement? I'm not at all familiar with the Postgres code or the C language, so this is just wishful thinking.

Cheers,
Csaba.

On Tue, 2005-09-13 at 17:45, Russ Brown wrote:
> Apologies for going slightly off topic, but isn't this basically how
> MySQL does replication? [...]
> Or am I missing something?
Russ Brown wrote:
> Apologies for going slightly off topic, but isn't this basically how
> MySQL does replication?
> Or am I missing something?

In the immortal words of mothers everywhere: "if all your friends jumped off of a bridge, would you do it too?"

There are a lot of things that are "good enough" for MySQL.

richard
On Tue, 2005-09-13 at 10:45, Russ Brown wrote:
> Apologies for going slightly off topic, but isn't this basically how
> MySQL does replication? [...] But if the above is true, it seems that
> all that is required are a couple of scripts to handle log transfer,
> and you have a form of replication out of the box right there.
>
> Or am I missing something?

I don't know, but someone in your organization seems to be.

Let me present it as a simple devil's choice: which would you rather have, proven replication that requires you to set up a secondary bit of software / system scripts (like rsync) but is tested and proven to work, or an "out of the box" solution that is untested, unreliable, and possibly unsafe for your data?

Choosing a database because it has "out of the box" replication, without paying attention to how it is implemented, how well it works, and the ways it can break, is a recipe for (data) disaster.

I've tested Slony, and I know that for what we use it for, it's a good fit and it works well. I've tested MySQL's replication, and it simply can't do what I need from a replication system. It can't be set up on the fly on a live system with no downtime, and it has reliability issues that make it a poor choice for a 24/7 enterprise replication system. That said, it's a great system for content management replication, where downtime is fine while setting up replication.

But I wouldn't choose either because it was easier to implement. Being easy to implement is just sauce on the turkey. I need the meat to be good or the sauce doesn't matter.
Russ Brown wrote:
> Apologies for going slightly off topic, but isn't this basically how
> MySQL does replication?

The PostgreSQL WAL stores information at the level of "write these bytes to disk"; the MySQL replication log stores information at the level of "the user called this statement". So in MySQL, inserting the value of now() into a table might give different results if a slave runs in a different time zone.

So it's sort of the same from the outside, but the internals are different and yield different possibilities and restrictions.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
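A small illustration of the difference, using a hypothetical table: statement-level replication re-executes the SQL on the slave, so anything non-deterministic is evaluated a second time, while WAL shipping replays the bytes the master already wrote.

    -- Hypothetical example. With a plain timestamp column (no time
    -- zone), now() is converted using the session's TimeZone setting:
    CREATE TABLE events (happened_at timestamp);
    INSERT INTO events VALUES (now());
    -- Re-executed as a statement on a slave in another time zone (or
    -- simply at a later moment), the stored value differs; replayed
    -- from WAL, the already-computed tuple is copied byte for byte.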
Scott Marlowe wrote:
> I don't know, but someone in your organization seems to be.

Not just my organisation. I'm afraid this is an opinion I've seen in many places.

> Let me present it as a simple devil's choice: which would you rather
> have, proven replication that requires you to set up a secondary bit of
> software / system scripts (like rsync) but is tested and proven to
> work, or an "out of the box" solution that is untested, unreliable, and
> possibly unsafe for your data?

Why does an "out of the box" solution have to be untested, unreliable and possibly unsafe for my data? How about a third choice: you can also use a proven, reliable and tested replication solution that is included in the core system because the core system basically provides it anyway. It's easy to set up, but (as with all replication solutions) doesn't fit all purposes. Depending on my requirements, I may well choose that one.

> Choosing a database because it has "out of the box" replication,
> without paying attention to how it is implemented, how well it works,
> and the ways it can break, is a recipe for (data) disaster.

Absolutely, but you're preaching to the converted here. I'm not discussing which database I would choose. I was simply asking whether the WAL logs could be utilised to create a simple replication solution "out of the box" with very little additional work on the part of the developers.

> I've tested Slony, and I know that for what we use it for, it's a good
> fit and it works well. I've tested MySQL's replication, and it simply
> can't do what I need from a replication system. It can't be set up on
> the fly on a live system with no downtime, and it has reliability
> issues that make it a poor choice for a 24/7 enterprise replication
> system.

Again, I'm not asking whether MySQL's solution is any good or not. I'm asking whether the WAL log could be used to provide a simple replication solution out of the box. I am fully aware that no single replication solution will fit all circumstances, but if PostgreSQL's core functionality can provide one such solution anyway, why not add a couple of scripts to make it all work out of the box and advertise the fact?

> That said, it's a great system for content management replication,
> where downtime is fine while setting up replication.

Precisely: one situation where this solution will work well, but nobody knows about it because it's not pushed as a feature.

> But I wouldn't choose either because it was easier to implement. Being
> easy to implement is just sauce on the turkey. I need the meat to be
> good or the sauce doesn't matter.

Indeed. As said above, it has to be reliable, tested etc. I personally fear a lot of the things our MySQL databases do to our data, and would very much like to make the switch to PostgreSQL if allowed to do so. There may be problems with MySQL's 'out of the box' replication solution, but that doesn't mean that having one at all is a Bad Thing.

I can't help but think that had I just asked about the possibility of using the WAL log idea for an 'out of the box' replication solution without mentioning the word 'MySQL', the responses received would have been very different.

-- 
Russ
On Tue, 2005-09-13 at 15:51, Russ Brown wrote:
> Why does an "out of the box" solution have to be untested, unreliable
> and possibly unsafe for my data?

It doesn't have to be. My main point was that many people assume that since it's included and works out of the box, it MUST be tested and reliable. MySQL's replication is not particularly reliable. As someone who's done some "pull the plug" testing on both, I can confidently say that PostgreSQL and its replication can survive fairly nasty power / network failure modes, while MySQL's replication and db engine seem much more prone to problems caused by such scenarios. So, to me, it's "tested" but most definitely not proven reliable.

> How about a third choice: you can also use a proven, reliable and
> tested replication solution that is included in the core system because
> the core system basically provides it anyway. It's easy to set up, but
> (as with all replication solutions) doesn't fit all purposes.

If PostgreSQL provided it, I'd use it. But I'd rather pick a replication engine that I know works, having tested it, and wait for it to be included in the back end at some future date.

> Depending on my requirements, I may well choose that one.

Since my first requirement is reliable replication, MySQL's replication is out.

> Absolutely, but you're preaching to the converted here. I'm not
> discussing which database I would choose. I was simply asking whether
> the WAL logs could be utilised to create a simple replication solution
> "out of the box" with very little additional work on the part of the
> developers.

I think it's getting there. But like so many things PostgreSQL, it won't be put into the core until it's considered mature and stable.

> I am fully aware that no single replication solution will fit all
> circumstances, but if PostgreSQL's core functionality can provide one
> such solution anyway, why not add a couple of scripts to make it all
> work out of the box and advertise the fact?

Because it's probably system dependent, i.e. the scripts / C programs need to work on all the platforms supported by PostgreSQL before they would get included. If it only worked on Linux and / or BSD, it's not done.

> Precisely: one situation where this solution will work well, but nobody
> knows about it because it's not pushed as a feature.

Actually, I was speaking of MySQL's replication there, in case that wasn't clear (not sure if it was or not...). Keep in mind, PITR only came out in a release six months ago. It's not really been tested thoroughly, and the ways to set it up are probably varied enough that there's no one way that's considered "standard" just yet.

> Indeed. As said above, it has to be reliable, tested etc. I personally
> fear a lot of the things our MySQL databases do to our data, and would
> very much like to make the switch to PostgreSQL if allowed to do so.
> There may be problems with MySQL's 'out of the box' replication
> solution, but that doesn't mean that having one at all is a Bad Thing.

It's not just what MySQL's db engine is willing to do to your data, it's how its replication engine may fail and you only find out months after the fact, when the primary goes down, that all your data is six months or so out of date.

> I can't help but think that had I just asked about the possibility of
> using the WAL log idea for an 'out of the box' replication solution
> without mentioning the word 'MySQL', the responses received would have
> been very different.

Hmmmm. I would imagine you would have gotten different responses as well. Using WAL logs for replication is one subject, MySQL replication is another. PITR is a new feature that's still being fleshed out, but is already quite useful. MySQL's replication has a reputation for dodgy behaviour. So I can't imagine the two being discussed in the same way either.
On 9/13/05 2:45 PM, "Scott Marlowe" <smarlowe@g2switchworks.com> wrote:
> Let me present it as a simple devil's choice: which would you rather
> have, proven replication that requires you to set up a secondary bit of
> software / system scripts (like rsync) but is tested and proven to
> work, or an "out of the box" solution that is untested, unreliable, and
> possibly unsafe for your data?

When I was putting together a fairly complex log-shipping solution in Oracle (sorry for the O word...), I was presented with that exact choice: use Oracle's built-in log shipping/recovery mechanism, or design an 'in rsync we trust' system of scripts. I chose the scripts, and it's worked without a burp for a looong time now. Easy to test, easy to debug, predictable, small simple parts.

It's really not that hard. Keep track of disk space, and make sure to check the size of the destination file when you move something around, not just its existence. Not much else to it.

> Choosing a database because it has "out of the box" replication,
> without paying attention to how it is implemented, how well it works,
> and the ways it can break, is a recipe for (data) disaster.

We're getting back to the oft-repeated mantra here: replication is hard. Anyone saying it can be effortless doesn't understand the complexity of the problem.
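A sketch of the size check mentioned above, with hypothetical host and path names; the point is that a zero exit status from the copy alone is not enough, so the script compares byte counts before trusting the transfer:

    #!/bin/sh
    # ship_log.sh <file>: copy a log file to the standby and verify
    # that every byte arrived before declaring success.
    SRC="$1"
    NAME=$(basename "$SRC")
    scp "$SRC" "standby:/wal_archive/$NAME" || exit 1
    LOCAL_SIZE=$(wc -c < "$SRC")
    REMOTE_SIZE=$(ssh standby "wc -c < /wal_archive/$NAME")
    # Compare sizes, not just existence or exit codes.
    [ "$LOCAL_SIZE" -eq "$REMOTE_SIZE" ] || exit 1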
> How about a third choice: you can also use a proven, reliable and
> tested replication solution that is included in the core system because
> the core system basically provides it anyway.

Sure, but that one is spelled "Sybase", not "MySQL" ;-)

-- 
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 665-7007 voice
Scott Ribe wrote:
>> How about a third choice: you can also use a proven, reliable and
>> tested replication solution that is included in the core system
>> because the core system basically provides it anyway.
>
> Sure, but that one is spelled "Sybase", not "MySQL" ;-)

It's amazing how misunderstood my post was.

My third choice was a hypothetical future version of PostgreSQL, modified from its current form very slightly to include a form of replication 'out of the box': a couple of scripts to enable WAL log transfer, and also a solution to the problem of WAL log delay mentioned by other posters.

I only mentioned MySQL because their 'out of the box' solution involves transferring the binlogs, which is similar to the method of transferring the PostgreSQL WAL logs, and it just made me think. That's all. I wasn't comparing; I wasn't suggesting MySQL is better than PostgreSQL. I wasn't suggesting that they have the 'ultimate' solution. I wasn't even suggesting that they have a good solution. It just made me think. That's all.

Well, I've learned my lesson. Next time I post I'll be sure not to mention MySQL in any way, shape or form.

-- 
Russ
On Fri, 2005-09-16 at 12:51, Russ Brown wrote:
> It's amazing how misunderstood my post was.
> [...]
> Well, I've learned my lesson. Next time I post I'll be sure not to
> mention MySQL in any way, shape or form.

Actually, I would just suggest not holding it up as an example of how things should be done. That would work for me.

There was a time, 5 to 10 years ago, when MySQL AB spent a LOT of energy demonizing PostgreSQL to make themselves look better. There were pages of misinformation in their documentation about how PostgreSQL was basically crap and MySQL did everything right, and a lot of people spent a lot of time debunking that.

MySQL AB now plays better with others, and hasn't engaged in the kind of character assassination they once did, but there's STILL a sore spot for most PostgreSQL users and developers, because they used to have to spend a lot of energy and time explaining that what was on the MySQL site was lies and misinformation. A LOT of time. And it did hurt PostgreSQL, in terms of keeping people away from it.

So, there's an almost automatic response triggered by someone mentioning how MySQL does things, especially if they're perceived to be holding MySQL up as an example to the PostgreSQL community of how things should be done.

In my original post, my main point wasn't just against MySQL; it was against the philosophy that just because replication is included and part of the core of a database, it must be reliable or well tested. And MySQL is a fine example of that: their replication really does have a lot of issues.

So, feel free to mention MySQL, but know that mostly when it's mentioned here, it's mentioned as an example of how things shouldn't be done: in terms of coding, marketing, testing, or licensing.
Scott Marlowe wrote:
> Actually, I would just suggest not holding it up as an example of how
> things should be done. That would work for me.

I didn't!!!!!!

> There was a time, 5 to 10 years ago, when MySQL AB spent a LOT of
> energy demonizing PostgreSQL to make themselves look better. There were
> pages of misinformation in their documentation about how PostgreSQL was
> basically crap and MySQL did everything right, and a lot of people
> spent a lot of time debunking that.

I remember that time, and I remember being very annoyed about it. I am still frustrated now by people who believe the FUD that was spread at the time and won't even consider PostgreSQL as a result. That is basically why the company I work for uses MySQL, and simply will not consider changing, no matter how hard I try to make it happen.

> MySQL AB now plays better with others [...] And it did hurt PostgreSQL,
> in terms of keeping people away from it.

Indeed. As I say above, that's why my company is staying away from it, despite my best efforts.

> So, there's an almost automatic response triggered by someone
> mentioning how MySQL does things, especially if they're perceived to be
> holding MySQL up as an example to the PostgreSQL community of how
> things should be done.

I've noticed that. I've been reading (and occasionally posting to) this list for a few years now, and it's the one and only thing about it that bugs me. Other lists bug me in a lot of ways due to the attitudes of some of the major contributors, but I've always enjoyed this list greatly primarily *because* of the major contributors (yourself included). It's educational, friendly and very helpful. I learn a lot from this list and enjoy the discussion.

This one thing bugs me because I'm not generally an emotionally reactive person: I prefer to consider things fairly before responding, which is why I frequently don't respond at all if I don't think it would help matters.

I feel that it is an extremely unwise policy to automatically assume that what your competitors are doing is worse than what you are doing, and that there's nothing you can learn from them. That's how you get left behind. That's not to say that there *is* a great deal that PostgreSQL can learn from MySQL, but one should not assume that there is nothing. From my reading of this list, the majority of examples of people using MySQL as an example of how PostgreSQL should be doing things have been misguided/wrong/trolling etc. However, from time to time a legitimate example is raised, and in those situations the responses given have sometimes been disappointing.

> In my original post, my main point wasn't just against MySQL; it was
> against the philosophy that just because replication is included and
> part of the core of a database, it must be reliable or well tested. And
> MySQL is a fine example of that: their replication really does have a
> lot of issues.

Indeed. But just to stress the point, I wasn't stating that the included replication in MySQL was any good (though it's not terrible; we're using it heavily in an extremely high-volume situation with few problems). I was just bringing up the idea of getting a decent replication solution included in PostgreSQL for relatively little effort.

> So, feel free to mention MySQL, but know that mostly when it's
> mentioned here, it's mentioned as an example of how things shouldn't be
> done: in terms of coding, marketing, testing, or licensing.

I think in future I'll just stick to not mentioning it. :)

Regards,

-- 
Russ
On Sep 16, 2005, at 4:30 PM, Russ Brown wrote:
>> just because replication is included and part of the core of a
>> database, it doesn't mean that it's reliable or well tested.

I just jumped into this thread, and this statement triggers one thought. Many Postgres end users rely on experts like those on this list for the best advice on how to use Postgres and what might be the best solution for non-core needs such as replication. Inclusion probably does make those users feel comfort that the experts for that particular piece of software deem it one of, if not the, best solution for the given problem. It's not always true, and may not be right for this situation, since replication needs vary widely. But inclusion does imply some type of endorsement.

For instance, I feel pretty comfortable with Tsearch2 though I don't know much about it. The fact that it comes in contrib is an endorsement. I'd be confused if the consensus was that there is a better, as easy to use, license-compatible full text index available that was for some reason NOT included. Of course, I'd still ask the list. . .
Added to TODO:

	* Allow WAL traffic to be steamed to another server for stand-by
	  replication

---------------------------------------------------------------------------

Csaba Nagy wrote:
> Well, AFAICT this kind of replication in Postgres cannot be called "out
> of the box". [...]
> I wonder if there is a way to set up a direct link between the standby
> and the server and, instead of copying files, just send the WAL data
> over directly by some background process?

-- 
  Bruce Momjian                     |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us            |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
On Sep 19, 2005, at 7:10 PM, Bruce Momjian wrote:
> Added to TODO:
>
> 	* Allow WAL traffic to be steamed to another server for stand-by
> 	  replication

"steamed" or "streamed"? :)

-- 
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)
> Indeed. But just to stress the point, I wasn't stating that the
> included replication in MySQL was any good (though it's not terrible;
> we're using it heavily in an extremely high-volume situation with few
> problems). I was just bringing up the idea of getting a decent
> replication solution included in PostgreSQL for relatively little
> effort.

No, but IIRC you didn't state that it was a substandard solution, and, also IIRC, it really sounded as though you believed it was a good one.

>> So, feel free to mention MySQL, but know that mostly when it's
>> mentioned here, it's mentioned as an example of how things shouldn't
>> be done: in terms of coding, marketing, testing, or licensing.
>
> I think in future I'll just stick to not mentioning it. :)

Probably not necessary; just make it clear whether you're saying "MySQL claims... and it would be good for Postgres to have its own solution..." or "MySQL has an actual working full-blown good solution for... that Postgres would do well to emulate".

-- 
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 665-7007 voice