Thread: why vacuum
hi, i was in a minor flame war with a mysql guy - his major grouse was that 'I wouldnt commit mission critical data to a database that needs to be vacuumed once a week'. So why does pg need vacuum? -- regards kg http://www.livejournal.com/users/lawgon tally ho! http://avsap.org.in ಇಂಡ್ಲಿನಕ್ಸ வாழ்க!
Kenneth Gonsalves <lawgon@thenilgiris.com> writes: > i was in a minor flame war with a mysql guy - his major grouse was that > 'I wouldnt commit mission critical data to a database that needs to be > vacuumed once a week'. This guy is not worth arguing with. > So why does pg need vacuum? Every database needs maintenance operations. PG is designed in a way that exposes the maintenance operations to the control of the DBA a bit more than most other DBMSes do: specifically, you get to decide when some of the overhead work happens. We think this is a feature, because you can schedule the overhead for low-activity periods (nights, weekends, whatever). In other DBMSes the equivalent work happens as part of foreground queries, no matter how time-critical they might be. Now, there's no doubt that for a database run by a non-expert person who can't even spell DBA, exposing this sort of knob isn't very helpful. So there's work afoot to provide automatic maintenance tools (ie, autovacuum). Over time I think autovacuum will get smart enough that even experts will usually use it. But that point will only be reached when autovacuum has some idea about doing more work during low-load periods. Unless MySQL invents some concept equivalent to VACUUM, they won't have any prayer at all of being able to shift maintenance overhead to low-load times. regards, tom lane
On Wed, Oct 26, 2005 at 10:15:17AM +0530, Kenneth Gonsalves wrote: > i was in a minor flame war with a mysql guy - his major grouse was that > 'I wouldnt commit mission critical data to a database that needs to be > vacuumed once a week'. The use of the word "commit" is amusing, considering that MySQL's default table type doesn't support transactions. There's always InnoDB, but it seems like there was something about that in the news recently.... Compare the following lists of gotchas and decide which database *you'd* commit mission-critical data to: http://sql-info.de/mysql/gotchas.html http://sql-info.de/postgresql/postgres-gotchas.html > So why does pg need vacuum? See the documentation: http://www.postgresql.org/docs/8.0/interactive/maintenance.html#ROUTINE-VACUUMING -- Michael Fuhr
On Wed, 26 Oct 2005 15:14, Tom Lane wrote: > Kenneth Gonsalves <lawgon@thenilgiris.com> writes: > > (A MySQueeeel guy said,not Kenneth)... > > 'I wouldnt commit mission critical data to a database that needs to be > > vacuumed once a week'. My two-penneth worth...<flamage> I wouldn't commit mission critical data to a database (or DBA) that doesn't have a concept of vacuuming (or desire to do it regularly). But, less flamingly, I wouldn't commit mission-critical data to something that lacked the ability to have proper constraints, triggers and server-side procedures to ensure the data actually remains sensible. Note that Sybase/MS-SQL's check constraint model asserts the constraint BEFORE the trigger, which discourages you from attempting to check and handle meaning of data!</flamage> > This guy is not worth arguing with. D'Accord! > > So why does pg need vacuum? For (inter alia) the same reason that * Oracle has an ANALYZE_SCHEMA and DBMS_SPACE_ADMIN and (hoist by his own petard) * MySQueeeeaL has myisamchk --stats_method=method_name --analyze <flamage>Oh, well: MySQL bigot and internal consistency? whadya expect?</flamage> Dave Bath (Oracle DBA for health/telcos way back in 1986: honeywrong GCOS and Pr1mos) -- David T. Bath dave.bath@unix.net
"Bath, David" <dave.bath@unix.net> writes: > ... Note that Sybase/MS-SQL's > check constraint model asserts the constraint BEFORE the trigger, which > discourages you from attempting to check and handle meaning of data! Er, doesn't PG do it that way too? regards, tom lane
On Wednesday 26 Oct 2005 11:52 am, Bath, David wrote: > > This guy is not worth arguing with. > D'Accord! thanks all for the clarification. in case anyone is interested in the original conversation it is here: http://ebergen.net/wordpress/?p=83 -- regards kg http://www.livejournal.com/users/lawgon tally ho! http://avsap.org.in ಇಂಡ್ಲಿನಕ್ಸ வாழ்க!
On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: > hi, > i was in a minor flame war with a mysql guy - his major grouse was that > 'I wouldnt commit mission critical data to a database that needs to be > vacuumed once a week'. So why does pg need vacuum? The absolutely funniest thing about what this guy is saying is that he seems rather ignorant of the behaviour of innodb tables. They have another name for the vacuum command there. It's: ALTER TABLE tbl_name ENGINE=INNODB Which rebuilds the whole fraggin's table, with an exclusive lock. and guess what innodb does if you don't run this command every so often? Can you guess yet? Yep, that's right, it just keeps growing and growing and growing. Hell, innodb isn't any better than the original mvcc implementation postgresql had when vacuums were all full and took exclusive locks. But at least with PostgreSQL it was a well documented issue, and was mentioned in the administrative section of the docs, so you knew you had to do it. It's kind of tucked away in the innodb section of the mysql docs, and most mysql folks don't even know they need to do it, since they almost all use myisam table types. If someone is more worried about postgresql's non-blocking, easily scheduled vacuuming, but is using myisam tables, and a database that by default allows numeric overflows to just insert the maximum possible value, I wouldn't trust them with handling my paycheck with their fan-boy database.
On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: > hi, > i was in a minor flame war with a mysql guy - his major grouse was that > 'I wouldnt commit mission critical data to a database that needs to be > vacuumed once a week'. So why does pg need vacuum? Oh man oh man. After reading the article, I realized he was saying that he wouldn't trust PostgreSQL to replace Oracle. He apparently wouldn't trust MySQL to replace oracle either. But, the next time someone says that slony is a toy add on, and MySQL has REAL replication, point them to THIS page on the same blog: http://ebergen.net/wordpress/?p=70 In short, it basically shows that MySQL replication is incredibly fragile, and not fit for production on any real system. The lack of system wide transaction support, like postgresql has, makes the problem he outlines that much worse. The hoops people will jump through to use their favorite toys...
On 10/26/2005 11:19 AM, Scott Marlowe wrote: > On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: >> hi, >> i was in a minor flame war with a mysql guy - his major grouse was that >> 'I wouldnt commit mission critical data to a database that needs to be >> vacuumed once a week'. So why does pg need vacuum? > > Oh man oh man. After reading the article, I realized he was saying that > he wouldn't trust PostgreSQL to replace Oracle. He apparently wouldn't > trust MySQL to replace oracle either. > > But, the next time someone says that slony is a toy add on, and MySQL > has REAL replication, point them to THIS page on the same blog: > > http://ebergen.net/wordpress/?p=70 > You must have missed the FAQ and other side notes about replication in the MySQL manual. Essentially MySQL replication is nothing but a query duplicating system, with the added sugar of taking care of now() and some other non-deterministic things, but not all of them. Non-deterministic user defined procedures, functions and triggers will simply blow MySQL's sophisticated replication apart. Jan > In short, it basically shows that MySQL replication is incredibly > fragile, and not fit for production on any real system. The lack of > system wide transaction support, like postgresql has, makes the problem > he outlines that much worse. > > The hoops people will jump through to use their favorite toys... > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Wed, 2005-10-26 at 11:09, Jan Wieck wrote: > On 10/26/2005 11:19 AM, Scott Marlowe wrote: > > > On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: > >> hi, > >> i was in a minor flame war with a mysql guy - his major grouse was that > >> 'I wouldnt commit mission critical data to a database that needs to be > >> vacuumed once a week'. So why does pg need vacuum? > > > > Oh man oh man. After reading the article, I realized he was saying that > > he wouldn't trust PostgreSQL to replace Oracle. He apparently wouldn't > > trust MySQL to replace oracle either. > > > > But, the next time someone says that slony is a toy add on, and MySQL > > has REAL replication, point them to THIS page on the same blog: > > > > http://ebergen.net/wordpress/?p=70 > > > > You must have missed the FAQ and other side notes about replication in > the MySQL manual. Essentially MySQL replication is nothing but a query > duplicating system, with the added sugar of taking care of now() and > some other non-deterministic things, but not all of them. > > Non-deterministic user defined procedures, functions and triggers will > simply blow MySQL's sophisticated replication apart. True, but I never expected a CTRL-C to the mysql command line to break replication. Even for MySQL's lackadaisical behaviour, that's pretty far out.
Scott Marlowe <smarlowe@g2switchworks.com> writes: > On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: >> i was in a minor flame war with a mysql guy - his major grouse was that >> 'I wouldnt commit mission critical data to a database that needs to be >> vacuumed once a week'. So why does pg need vacuum? > Oh man oh man. After reading the article, I realized he was saying that > he wouldn't trust PostgreSQL to replace Oracle. Well, that's a slightly more respectable point of view, but Oracle has surely got its own set of gotchas ... doesn't it still have issues if you run a transaction that's large enough to overrun the fixed-size rollback areas (or whatever they call them)? regards, tom lane
On Wed, 2005-10-26 at 11:12, Tom Lane wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: > > On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: > >> i was in a minor flame war with a mysql guy - his major grouse was that > >> 'I wouldnt commit mission critical data to a database that needs to be > >> vacuumed once a week'. So why does pg need vacuum? > > > Oh man oh man. After reading the article, I realized he was saying that > > he wouldn't trust PostgreSQL to replace Oracle. > > Well, that's a slightly more respectable point of view, but Oracle has > surely got its own set of gotchas ... doesn't it still have issues if > you run a transaction that's large enough to overrun the fixed-size > rollback areas (or whatever they call them)? Yep, and it also has the "snapshot too old" issue for long running transactions (long running meaning how many other things have happened since it started, not length of time). Imagine starting a backup in postgresql, getting an hour into it and suddenly it ends because 10,000,000 rows have been inserted while it was going. That kind of error. My reply on that forum pointed out that EVERY database has gotchas, and to pretend that your database's gotchas are ok but some other database's gotchas are unacceptable is disingenuous at best.
Yes, pg does triggers first before asserting check constraints! Was Re: why vacuum
From
"Bath, David"
Date:
Tom, After I wrote > > Sybase/MS-SQL's check constraint model asserts the constraint > > BEFORE the trigger, which discourages you from attempting to> > > check and handle meaning of data! you wrote (2005-10-26 17:00) > Er, doesn't PG do it that way too? Well, it works for me! In this case (with examples and caveats below), postgresql (and Oracle) have got it right, which is a BIG reason why you never get an Oracle guy to define Sybase/MS-SQL systems - the other big reason is when you look at the sybase/mssql nestlevel internal parameter - got the scars to prove it! Quite frankly, if pg did not allow you to tidy things during pre-insert and pre-update triggers, I wouldn't be so keen on it, and stick to Oracle. My general approach is: 1) Pre-* triggers are for tidying up the data to cover for what is obviously a typo by user and their intent is clear 2) Post-* triggers are for propagating required data changes, i.e. implications such as updating the "current balance" attribute in a "customer account" record whenever the dollar value in a transaction detail record changes. Let me give a simple example: 1) Define attribute x as a varchar(whatever). 2) Ensure x has no leading/trailing whites ... CHECK ((NOT (X ~ '^[ \t\n\r]')) AND (NOT (X ~ '[ \t\n\r]$'))) 3) During "BEFORE INSERT" and "BEFORE UPDATE" (rowlevel) triggers, include NEW.x := btrim(NEW.x, ' \t\n\r'); 4) INSERT INTO y (x, ...) VALUES ('\tblah blahdy blah\n', ...) 5) SELECT x FROM y WHERE ... Get 'blah blahdy blah' back. Any processing overhead is trivial compared to the time wasted by users, by a dba when the user's complain, or undesired application behaviour when developers make invalid assumptions about the data. Another useful example, based on my opinion/experience that any change of arbitrary primary keys is imnsho wrong-headed and recoverable, I usually do the following in a pre-update row-level trigger, especially when pk is set from a sequence: NEW.pk := OLD.pk ; or are least NEW.pk := coalesce(NEW.pk, OLD.pk) Caveat: This approach DOES NOT WORK if we 1) define a domain (z) as a varchar(whatever), 2) put the constraint on z 3) use domain z as the datatype for x 4) attempt to change x during pre* rowlevel triggers as above as pg barfs as soon as you assign a value to a domain that breaks its constraint. This caveat prompted my recent question on this list about the possibility of a "pre-assert trigger" on a domain. (Version 9? Pretty please with sugar on top?). Wow! pg is even smarter than even YOU thought Tom! For this to be the case, you guys must be excellent designers and coders, and I tips my lid to you. Bugs are common, serendipitous useful features are almost as rare as neonates with teeth. -- dave.bath@unix.net Question for the day: delta(dummies_guide, executive_summary)=?
Re: Yes, pg does triggers first before asserting check constraints! Was Re: why vacuum
From
Tom Lane
Date:
"Bath, David" <dave.bath@unix.net> writes: > you wrote (2005-10-26 17:00) >>> Sybase/MS-SQL's check constraint model asserts the constraint >>> BEFORE the trigger, which discourages you from attempting to> >>> check and handle meaning of data! >> Er, doesn't PG do it that way too? > Well, it works for me! You're right of course. I was confusing this with a related problem that people occasionally complain about, which is that datatype-related errors are thrown before the trigger can do anything about them. For instance, you can't stuff "abc" into a int4 field and hope that the trigger will have a chance to replace it with something valid. (Yes, people have asked for that :-() A less silly example is that domain constraints on a field of a domain type get checked before the trigger can run. regards, tom lane
On Wednesday 26 October 2005 17:44, Scott Marlowe wrote: > On Tue, 2005-10-25 at 23:45, Kenneth Gonsalves wrote: > > hi, > > i was in a minor flame war with a mysql guy - his major grouse was that > > 'I wouldnt commit mission critical data to a database that needs to be > > vacuumed once a week'. So why does pg need vacuum? > > The absolutely funniest thing about what this guy is saying is that he > seems rather ignorant of the behaviour of innodb tables. They have > another name for the vacuum command there. It's: > > ALTER TABLE tbl_name ENGINE=INNODB > > Which rebuilds the whole fraggin's table, with an exclusive lock. > > and guess what innodb does if you don't run this command every so often? > > Can you guess yet? Yep, that's right, it just keeps growing and growing > and growing. Not quite so. I'm running quite a few (>50) mysql/innodb servers with database sizes raging from 500mb to 50gb, and I never had to rebuild any innodb tables this way. InnoDB uses index-based data storage and rollback segments, which makes it harder to add bloat to their databases, as compared to PG (but autovacuum is my saviour). Innodb will actually free space when you do DELETE or TRUNCATE, but still, it's tables, indexes and tablespaces will get fragmented. This gets worse over time, but it had never been a big problem for me. My databases do 50 queries/second on average, 24/7. Note - all of this can be due to my access and data change patterns; YMMV. The "only" cleanup operation I do is CHECK/OPTIMIZE, on monthly basis; it's not much better than old PG's VACUUM, as it brings mysql/innodb's performance down by 5x-10x times; same goes for almost any long-running query. I'm moving those servers to PG, due to this (concurrency) and other reasons. My top 3 reasons are: a much better concurrency (even with bg vacuums running :-), a much better planner, and PG's rich feature set. -- Best Regards, Igor Shevchenko
On Wed, 2005-10-26 at 10:19 -0500, Scott Marlowe wrote: > But, the next time someone says that slony is a toy add on, and MySQL > has REAL replication, point them to THIS page on the same blog: > > http://ebergen.net/wordpress/?p=70 > > In short, it basically shows that MySQL replication is incredibly > fragile, and not fit for production on any real system. The lack of > system wide transaction support, like postgresql has, makes the problem > he outlines that much worse. > > The hoops people will jump through to use their favorite toys... I see no point in blatantly putting 'other' products such shape. Pgsql offers no replication at all, you need to use slony (wich is also a poor replacement for a wannabe replication), or some other commercial products. What about 2PC? What about linking the databases from different servers? I've been using MSSQL for over 6 years now, started with MSSQL7 and went on with MSSQL2000. The replication it offers is superb! It runs smoothly, you have click-me-click interface with wich you can create publications and deploy them to subscribers with ease. Ok, there are gotchas (and they-re ms-style funny), but it's all well documented, and works most of the time as expected. So what? Btw, I 'ported' the merge replication from MSSQL to postgres. It basicaly adds triggers to every table that is 'published' for replication. There is a separate table to store and calculate the change differences from several servers (so you could do update on any of the servers and change will be propagated to the others). I'm missing 2PC badly here, I wrote some stupid python 'thingie' wich should act as 2PC serializer, but that's slow as hell. And triggers slow down postgres quite a bit. So, to end this 'my father has bigger car than yours' debate, when will postgres have two phase commit protocol implemented? I presume that should come after you allow something like SELECT someCol FROM myServer.myDatabase[Schema].myTable... Mike -- Mario Splivalo Mob-Art mario.splivalo@mobart.hr "I can do it quick, I can do it cheap, I can do it well. Pick any two."
On Wed, 2005-10-26 at 12:09 -0400, Jan Wieck wrote: > > > > You must have missed the FAQ and other side notes about replication in > the MySQL manual. Essentially MySQL replication is nothing but a query > duplicating system, with the added sugar of taking care of now() and > some other non-deterministic things, but not all of them. > > Non-deterministic user defined procedures, functions and triggers will > simply blow MySQL's sophisticated replication apart. > That is just not true. I haven't tried MySQL 4 and above, but in 3.5.x replication wos working as expected, and the load that replication posed was insignificant. The only TRUE problem was that replication was unidirectional. That SAME problem has Slony, and other 'replication systems' available for postgres. Mike -- Mario Splivalo Mob-Art mario.splivalo@mobart.hr "I can do it quick, I can do it cheap, I can do it well. Pick any two."
On Thu, Oct 27, 2005 at 10:22:41AM +0200, Mario Splivalo wrote: > offers no replication at all, you need to use slony (wich is also a poor > replacement for a wannabe replication), or some other commercial > products. What about 2PC? What about linking the databases from Slony is in fact a community-supported system; so I don't know why you think that amounts to "no replication at all". And since this is a community-supported system, it'd be nice if you said why it's a "poor replacement for wannabe replication". What's wrong with it? > Btw, I 'ported' the merge replication from MSSQL to postgres. It > basicaly adds triggers to every table that is 'published' for > replication. There is a separate table to store and calculate the change > differences from several servers (so you could do update on any of the > servers and change will be propagated to the others). I'm missing 2PC > badly here, I wrote some stupid python 'thingie' wich should act as 2PC > serializer, but that's slow as hell. And triggers slow down postgres > quite a bit. This is interesting. Care to package it up for others, or write a proof-of-concept outline for the lists or General Bits or something like that? This is a different sort of replication people are asking for. Note that you get 2PC in the next Postgres release. A -- Andrew Sullivan | ajs@crankycanuck.ca When my information changes, I alter my conclusions. What do you do sir? --attr. John Maynard Keynes
On Thu, 2005-10-27 at 06:21 -0400, Andrew Sullivan wrote: > On Thu, Oct 27, 2005 at 10:22:41AM +0200, Mario Splivalo wrote: > > offers no replication at all, you need to use slony (wich is also a poor > > replacement for a wannabe replication), or some other commercial > > products. What about 2PC? What about linking the databases from > > Slony is in fact a community-supported system; so I don't know why > you think that amounts to "no replication at all". And since this is > a community-supported system, it'd be nice if you said why it's a > "poor replacement for wannabe replication". What's wrong with it? Postgres itself offers no replication. You could achive some sort of replication by restoring the parts of WAL files, but that's rather inconvinient. Then, if you want to replicate your data in any way, you need to take slony, or whatever is existant out there, commercial or open-free-source. I appologize here if I insulted the slony developers, I ment nothing like that :) I am sorry. I was just addressing the issue where replication to one means 'just move my data here', and at others it means 'merge my data'. > > Btw, I 'ported' the merge replication from MSSQL to postgres. It > > basicaly adds triggers to every table that is 'published' for > > replication. There is a separate table to store and calculate the change > > differences from several servers (so you could do update on any of the > > servers and change will be propagated to the others). I'm missing 2PC > > badly here, I wrote some stupid python 'thingie' wich should act as 2PC > > serializer, but that's slow as hell. And triggers slow down postgres > > quite a bit. > > This is interesting. Care to package it up for others, or write a > proof-of-concept outline for the lists or General Bits or something > like that? This is a different sort of replication people are asking > for. Note that you get 2PC in the next Postgres release. I'll be glad to, I'm just not that familiar (in fact, i'm not familiar at all) with the 'proof-of-cocept' or 'General Bits' terms, so if you could http-redirect me, I'll be most thankfull. Mentioning the 2PC, is it available in pg8.1beta4? Mike -- Mario Splivalo Mob-Art mario.splivalo@mobart.hr "I can do it quick, I can do it cheap, I can do it well. Pick any two."
Mario Splivalo wrote: > On Wed, 2005-10-26 at 12:09 -0400, Jan Wieck wrote: > >>You must have missed the FAQ and other side notes about replication in >>the MySQL manual. Essentially MySQL replication is nothing but a query >>duplicating system, with the added sugar of taking care of now() and >>some other non-deterministic things, but not all of them. >> >>Non-deterministic user defined procedures, functions and triggers will >>simply blow MySQL's sophisticated replication apart. > > That is just not true. I haven't tried MySQL 4 and above, but in 3.5.x > replication wos working as expected, and the load that replication posed > was insignificant. The only TRUE problem was that replication was > unidirectional. That SAME problem has Slony, and other 'replication > systems' available for postgres. I think you're missing the "non-deterministic" bit. Mysql replication is based on shipping statements AFAICT http://dev.mysql.com/doc/refman/5.0/en/binary-log.html "The binary log contains all statements which updated data or potentially could have updated it (for example, a DELETE which matched no rows). Statements are stored in the form of “events” that describe the modifications." So - if your statement contains something non-deterministic that isn't catered for in Mysql's code then it will break. At it's simplest - if I write a function my_random() and then do: UPDATE foo SET a=1 WHERE b < my_random(); IF my_random() returns different results on different machines, then the replication will be broken. See the manual entry below: http://dev.mysql.com/doc/refman/5.0/en/replication-features.html That's not to say the system is worthless - it works fine for many people. But it does have limitations. -- Richard Huxton Archonet Ltd
On 10/27/2005 4:22 AM, Mario Splivalo wrote: > On Wed, 2005-10-26 at 12:09 -0400, Jan Wieck wrote: >> > >> >> You must have missed the FAQ and other side notes about replication in >> the MySQL manual. Essentially MySQL replication is nothing but a query >> duplicating system, with the added sugar of taking care of now() and >> some other non-deterministic things, but not all of them. >> >> Non-deterministic user defined procedures, functions and triggers will >> simply blow MySQL's sophisticated replication apart. >> > > That is just not true. I haven't tried MySQL 4 and above, but in 3.5.x > replication wos working as expected, and the load that replication posed > was insignificant. The only TRUE problem was that replication was > unidirectional. That SAME problem has Slony, and other 'replication > systems' available for postgres. Read again. "Non-deterministic user defined procedures ...". Please give me *ONE* example of a user defined procedure in 3.5.x at all. You must have missed the point that stored procedures are a new feature in 5.0. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On 10/27/2005 4:22 AM, Mario Splivalo wrote: > I see no point in blatantly putting 'other' products such shape. Pgsql > offers no replication at all, you need to use slony (wich is also a poor > replacement for a wannabe replication), or some other commercial > products. What about 2PC? What about linking the databases from > different servers? I agree that Scott's comment was a bit harsh. But would you please add a few details that explain what makes Slony a "poor replacement" in your opinion? And please don't repeat that stupid "not builtin". Any add-on is as good as its reliability and features. Or would you at the same time say that MySQL has only a poor replacement for wannabe transactions and foreign keys, because their storage engines are in fact add-ons? > Btw, I 'ported' the merge replication from MSSQL to postgres. It > basicaly adds triggers to every table that is 'published' for > replication. There is a separate table to store and calculate the change > differences from several servers (so you could do update on any of the > servers and change will be propagated to the others). I'm missing 2PC > badly here, I wrote some stupid python 'thingie' wich should act as 2PC > serializer, but that's slow as hell. And triggers slow down postgres > quite a bit. Would you consider publishing that code under the BSD license? It sounds very much like one of the "other add-on replication systems" our users keep asking for. If you can't publish the code, do you have any design papers or a technical concept that could be used as a base for a new PostgreSQL community project? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Thu, Oct 27, 2005 at 02:34:13PM +0200, Mario Splivalo wrote: > > Postgres itself offers no replication. You could achive some sort of > replication by restoring the parts of WAL files, but that's rather > inconvinient. Then, if you want to replicate your data in any way, you Well, AFAIK Oracle itself offers no replication, either. If you want it, you have to buy a license for it. Which means it's an add-on. Heck, most Linux distributions' kernels don't offer support for network cards: they're an add-on. We call them modules. (In case it's not clear, I don't buy the "itself/add-on" distinction. The point is that the whole system works together. PostgreSQL most definitely offers replication. In fact, you can get warm-standby with WAL shipping, or read-only capabilities with Slony or some other tools.) > like that :) I am sorry. I was just addressing the issue where > replication to one means 'just move my data here', and at others it > means 'merge my data'. Yes: this multiple-meaning "replication" word causes a great deal of confusion. But just because one person has need A does not mean that need B isn't a real one. My employer, Afilias, sponsors the Slony work, in the direction of our needs. We did not initially have a many-write-nodes scenario in mind, and we had a pressing need for a "single master" system. So that's the itch we scratched. > I'll be glad to, I'm just not that familiar (in fact, i'm not familiar > at all) with the 'proof-of-cocept' or 'General Bits' terms, so if you > could http-redirect me, I'll be most thankfull. Proof of concept is just a description of what you did, how it worked, design and limitations, &c. Post it to the lists (uh, -general or maybe -hackers, I suppose), or put it on a web page or whatever. General Bits is a fairly regular column that Elein Mustain puts out. I bet she'd include a submission on this topic, although you'd have to ask her. You can find GB at <http://varlena.com/varlena/GeneralBits/>. > Mentioning the 2PC, is it available in pg8.1beta4? Should be. A -- Andrew Sullivan | ajs@crankycanuck.ca The whole tendency of modern prose is away from concreteness. --George Orwell
On 10/27/2005 8:34 AM, Mario Splivalo wrote: > Postgres itself offers no replication. Oracle itself offers no replication. IBM DB2 itself offers no replication. Yet most of the products out there for Oracle, DB2 and PostgreSQL are far better than what I read here: http://dev.mysql.com/doc/refman/5.0/en/replication-features.html What a great example of "offering replication itself" ... *cough* I guess it'd be easier to port a well designed add-on like Slony-I to MySQL (now that they have triggers) than fixing their broken design. Not that I volunteer to do it, but Slony-I is released under BSD, anyone feel free to offer a good solution for MySQL. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Thu, Oct 27, 2005 at 02:21:15PM +0100, Richard Huxton wrote: > So - if your statement contains something non-deterministic that isn't > catered for in Mysql's code then it will break. > > At it's simplest - if I write a function my_random() and then do: > UPDATE foo SET a=1 WHERE b < my_random(); > IF my_random() returns different results on different machines, then the > replication will be broken. See the manual entry below: > http://dev.mysql.com/doc/refman/5.0/en/replication-features.html > > That's not to say the system is worthless - it works fine for many > people. But it does have limitations. And you can easily have multi-master syncronous replication in PostgreSQL using the same idea; just see pgCluster. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Thu, 2005-10-27 at 11:51 -0400, Andrew Sullivan wrote: > Well, AFAIK Oracle itself offers no replication, either. If you want > it, you have to buy a license for it. Which means it's an add-on. > Heck, most Linux distributions' kernels don't offer support for > network cards: they're an add-on. We call them modules. (In case > it's not clear, I don't buy the "itself/add-on" distinction. The > point is that the whole system works together. PostgreSQL most > definitely offers replication. In fact, you can get warm-standby with > WAL shipping, or read-only capabilities with Slony or some other > tools.) I wish I was Dumbo now. I could hide myself with me ears :) > > I'll be glad to, I'm just not that familiar (in fact, i'm not familiar > > at all) with the 'proof-of-cocept' or 'General Bits' terms, so if you > > could http-redirect me, I'll be most thankfull. > > Proof of concept is just a description of what you did, how it > worked, design and limitations, &c. Post it to the lists (uh, > -general or maybe -hackers, I suppose), or put it on a web page or > whatever. General Bits is a fairly regular column that Elein Mustain > puts out. I bet she'd include a submission on this topic, although > you'd have to ask her. You can find GB at > <http://varlena.com/varlena/GeneralBits/>. I'll go trough my code, it's been a while since I touched it, I'll write some documentation and I'll inform the comunity. Thnx for the pointouts. Mike -- Mario Splivalo Mob-Art mario.splivalo@mobart.hr "I can do it quick, I can do it cheap, I can do it well. Pick any two."
On Fri, Oct 28, 2005 at 01:09:43PM +0200, Mario Splivalo wrote: > I'll go trough my code, it's been a while since I touched it, I'll write > some documentation and I'll inform the comunity. Thnx for the pointouts. And thank _you_ for proposing to do this. If everyone contributes their discoveries and improvements, we all benefit. A -- Andrew Sullivan | ajs@crankycanuck.ca When my information changes, I alter my conclusions. What do you do sir? --attr. John Maynard Keynes