Thread: PostgreSQL vs. MySQL: fight
After today's flurry of messages on this topic, I found myself wanting a summary of what I'd learned about the current state of things before the details slipped away. Somewhere in the middle of writing that, it morphed into an advocacy piece, so I figured why not put it on the Wiki: http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL I may circle back to filling some of the stub sections myself eventually, you're all encouraged to beat me to it. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
> After today's flurry of messages on this topic, I found myself wanting a > summary of what I'd learned about the current state of things before the > details slipped away. Somewhere in the middle of writing that, it > morphed into an advocacy piece, so I figured why not put it on the Wiki: > > http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL > > > I may circle back to filling some of the stub sections myself > eventually, you're all encouraged to beat me to it. Hi, 2 additions to the mail-to-wiki gateway ;) (I don't have wiki write access). ------------------------------------------------------------------------- In addition, it doesn't support transactions that would allow the database to have [http://en.wikipedia.org/wiki/ACID ACID] properties. -> In addition, it doesn't support transactions or foreign key constraints that would allow the database to have [http://en.wikipedia.org/wiki/ACID ACID] properties. ------------------------------------------------------------------------- ------------------------------------------------------------------------- Also mention my favourite MySQL gotcha ( from the 5.1 manual at http://dev.mysql.com/doc/refman/5.1/en/create-table.html ): "The CHECK clause is parsed but ignored by all storage engines." ------------------------------------------------------------------------- Bye, Chris.
Greg Smith wrote: > After today's flurry of messages on this topic, I found myself wanting a > summary of what I'd learned about the current state of things before the > details slipped away. Somewhere in the middle of writing that, it > morphed into an advocacy piece, so I figured why not put it on the Wiki: > > http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL > > > I may circle back to filling some of the stub sections myself > eventually, you're all encouraged to beat me to it. Greg, This piece is clearly aimed at end users and shouldn't be on the developers Wiki (which is only for internal project documentation/notes). Please move it over to techdocs, or refocus it as a guide for use by booth staff etc. Regards, Dave.
Dave Page wrote: > Greg Smith wrote: >> After today's flurry of messages on this topic, I found myself wanting >> a summary of what I'd learned about the current state of things before >> the details slipped away. Somewhere in the middle of writing that, it >> morphed into an advocacy piece, so I figured why not put it on the Wiki: >> >> http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL >> >> >> I may circle back to filling some of the stub sections myself >> eventually, you're all encouraged to beat me to it. > > Greg, > > This piece is clearly aimed at end users and shouldn't be on the > developers Wiki (which is only for internal project > documentation/notes). Please move it over to techdocs, or refocus it as > a guide for use by booth staff etc. Generally having this document is a slippery slope. Its one of those things that will generate trouble if its not acurate and is not updated frequently enough. As such it should mention version numbers of MySQL where ever possible to make it clear which version of MySQL and PostgreSQL is compared in every section. It gets even trickier though with MySQL due to the fact that there are various different storage engines etc. regards, Lukas
Greg,
adding to Dave: that information is very thoughtfull and formulated carefully. It really deserves attention of people outside the rather limited group of the ones willing to hack on PostgreSQL core.
Best wishes,
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
fx 01212-5-13695179
-
EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!
adding to Dave: that information is very thoughtfull and formulated carefully. It really deserves attention of people outside the rather limited group of the ones willing to hack on PostgreSQL core.
Best wishes,
Harald
On 7/25/07, Dave Page <dpage@postgresql.org> wrote:
Greg Smith wrote:
> After today's flurry of messages on this topic, I found myself wanting a
> summary of what I'd learned about the current state of things before the
> details slipped away. Somewhere in the middle of writing that, it
> morphed into an advocacy piece, so I figured why not put it on the Wiki:
>
> http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL
>
>
> I may circle back to filling some of the stub sections myself
> eventually, you're all encouraged to beat me to it.
Greg,
This piece is clearly aimed at end users and shouldn't be on the
developers Wiki (which is only for internal project
documentation/notes). Please move it over to techdocs, or refocus it as
a guide for use by booth staff etc.
Regards, Dave.
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
fx 01212-5-13695179
-
EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!
On Wed, 25 Jul 2007, Dave Page wrote: > This piece is clearly aimed at end users and shouldn't be on the developers > Wiki (which is only for internal project documentation/notes). Please move it > over to techdocs, or refocus it as a guide for use by booth staff etc. In this document's first six hours of life, I've gotten useful improvements to it from four people, some of which included material I wouldn't have found on my own. I expect that if it's left there, such useful improvements will continue to flood in for a week or so (trailing off after everyone has caught up with their e-mail post-OSCON). The great thing about having a Wiki that many active members of the community have accounts at is you can create a rough document like this one and flesh it out as various people add improvements based on their own expertise. Since the incremental time needed to improve the document is low, even the person who only has one small piece to add can do it easily, and after enough people have passed over it you end up with something no one person would have come up with on their own. I've watched some amazing documents get built this way. I get that this should go somewhere else eventually. If you're telling me the Wiki isn't appropriate even as a staging area for building this sort of document, which intended to benefit the community as a whole but not ready to be "published" more formally to something like techdocs, I'll respect that and nuke it from the developer's area. But understand that if you push me there immediately, the whole thing will likely just die instead. I personally am not in a position to completely flesh this out without some help, and I'm not going to put a partially complete article somewhere else. As a perfect example of this principle, you're telling me that something like this might be refocused as a guide for booth staff. I don't know what you'd want to meet that requirement--I've never been to your booth. But looking at what I've done as sort of a prototype, we could have that discussion now in the context of what would need to be changed to accomplish that goal. I've found that people in general are really quite good at telling you what's wrong with something you've done when you give them a target to criticize, and you get much better feedback from that sort of process than happens with a more traditional brainstorming approach. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Wed, 25 Jul 2007, Lukas Kahwe Smith wrote: > As such it should mention version numbers of MySQL where ever possible > to make it clear which version of MySQL and PostgreSQL is compared in > every section. The majority of the links into the MySQL documentation consistantly point to the current production version (5.0). I wanted to see what general content expansion might happen before going through the exercise of seeing if anything has changed in the upcoming 5.1 so I only have to do that once. For example, the initial edits people have made make it obvious that this document should address MySQL's foreign key and check constraint limitations, which wasn't even on my radar initially. Until the strict SQL mode appeared in MySQL 5.0, there were so many gross problems that I didn't even really consider the software a database, and therefore I haven't been paying attention to its more subtle issues until relatively recently. > Its one of those things that will generate trouble if its not acurate > and is not updated frequently enough. You are absolutely correct that the specific version of each software needs to be nailed down as a defense against updates making the content out of date. As for the accuracy part, my initial approach here was to highlight general statements about the areas people should be aware of, and then point directly into their documentation for the details. The idea is not to do any serious mud-slinging, and instead just point out where the mud is. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith wrote: > On Wed, 25 Jul 2007, Dave Page wrote: > >> This piece is clearly aimed at end users and shouldn't be on the >> developers Wiki (which is only for internal project >> documentation/notes). Please move it over to techdocs, or refocus it >> as a guide for use by booth staff etc. > > I get that this should go somewhere else eventually. If you're telling > me the Wiki isn't appropriate even as a staging area for building this > sort of document, which intended to benefit the community as a whole but > not ready to be "published" more formally to something like techdocs, > I'll respect that and nuke it from the developer's area. But understand > that if you push me there immediately, the whole thing will likely just > die instead. I personally am not in a position to completely flesh this > out without some help, and I'm not going to put a partially complete > article somewhere else. I have no issue with you actively working on it there. There was much discussion when the wiki was born, the upshot of which was that it will be restricted to use as a community resource to aid our collective efforts. It was specifically *not* to become an unofficial documentation site containing any content intended specifically for end user consumption. > As a perfect example of this principle, you're telling me that something > like this might be refocused as a guide for booth staff. I don't know > what you'd want to meet that requirement--I've never been to your booth. > But looking at what I've done as sort of a prototype, we could have that > discussion now in the context of what would need to be changed to > accomplish that goal. I've found that people in general are really > quite good at telling you what's wrong with something you've done when > you give them a target to criticize, and you get much better feedback > from that sort of process than happens with a more traditional > brainstorming approach. The booth guide was just an example of a community resource as opposed to a user resource. I would have seen it maybe as a document to help staff helping out at any show or conference around the world ensure their answers to any questions comparing us to MySQL were fair and accurate. I should add that I've long wanted to see something along these lines produced, and I'm happy to see it now. A feature comparison checklist might also be useful addition (as long as it fairly shows things that MySQL offers that we don't as well as the other way round). Regards, Dave
On Wed, 25 Jul 2007, Dave Page wrote: > I should add that I've long wanted to see something along these lines > produced, and I'm happy to see it now. Me too--that's why I wrote it! So my plan is to keep this on the Wiki for, say, two weeks, then convert what we've got at that point to techdocs. > A feature comparison checklist might also be useful addition (as long as > it fairly shows things that MySQL offers that we don't as well as the > other way round). Was trying to get the low-level stuff that many people miss done first. For the big picture, it would be easy enough to mash up http://monstera.man.poznan.pl/wiki/index.php/Mysql_vs_postgres with updates to current at http://dev.mysql.com/doc/refman/5.1/en/roadmap.html for the MySQL side, while http://developer.postgresql.org/index.php/Feature_Matrix should be useful for updating the PG side. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Wed, 2007-07-25 at 08:30 -0400, Greg Smith wrote: > On Wed, 25 Jul 2007, Dave Page wrote: > > > This piece is clearly aimed at end users and shouldn't be on the developers > > Wiki (which is only for internal project documentation/notes). Please move it > > over to techdocs, or refocus it as a guide for use by booth staff etc. > > In this document's first six hours of life, I've gotten useful > improvements to it from four people, some of which included material I > wouldn't have found on my own. We love what yer doin', just do it on techdocs etc.. Maybe we need an Advocacy wiki? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Dave Page wrote: > Greg Smith wrote: >> After today's flurry of messages on this topic, I found myself wanting >> a summary of what I'd learned about the current state of things before >> the details slipped away. Somewhere in the middle of writing that, it >> morphed into an advocacy piece, so I figured why not put it on the Wiki: >> >> http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL >> >> >> I may circle back to filling some of the stub sections myself >> eventually, you're all encouraged to beat me to it. > > Greg, > > This piece is clearly aimed at end users and shouldn't be on the > developers Wiki (which is only for internal project > documentation/notes). Please move it over to techdocs, or refocus it as > a guide for use by booth staff etc. As the developer wiki is already being used for a bunch of advocacy efforts, I believe that it is more than acceptable. Joshua D. Drake > > Regards, Dave. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Joshua D. Drake wrote: > Dave Page wrote: >> Greg Smith wrote: >>> After today's flurry of messages on this topic, I found myself wanting a >>> summary of what I'd learned about the current state of things before the >>> details slipped away. Somewhere in the middle of writing that, it >>> morphed into an advocacy piece, so I figured why not put it on the Wiki: >>> >>> http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL >>> >>> I may circle back to filling some of the stub sections myself eventually, >>> you're all encouraged to beat me to it. >> Greg, >> This piece is clearly aimed at end users and shouldn't be on the >> developers Wiki (which is only for internal project documentation/notes). >> Please move it over to techdocs, or refocus it as a guide for use by booth >> staff etc. > > As the developer wiki is already being used for a bunch of advocacy > efforts, I believe that it is more than acceptable. On the contrary -- I'm of the opinion that before this gets widespread, we should get advocacy a wiki somewhere else. Don't we have snappy powerful machines lacking services to run? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Joshua D. Drake wrote: > Dave Page wrote: >> This piece is clearly aimed at end users and shouldn't be on the >> developers Wiki (which is only for internal project >> documentation/notes). Please move it over to techdocs, or refocus it >> as a guide for use by booth staff etc. > > As the developer wiki is already being used for a bunch of advocacy > efforts, I believe that it is more than acceptable. Huh? It's being use to *organise* advocacy stuff, but not a place for end users to visit. /D
On Wed, Jul 25, 2007 at 11:27:00PM -0400, Alvaro Herrera wrote: > Joshua D. Drake wrote: > > Dave Page wrote: > >> Greg Smith wrote: > >>> After today's flurry of messages on this topic, I found myself wanting a > >>> summary of what I'd learned about the current state of things before the > >>> details slipped away. Somewhere in the middle of writing that, it > >>> morphed into an advocacy piece, so I figured why not put it on the Wiki: > >>> > >>> http://developer.postgresql.org/index.php?title=Why_PostgreSQL_instead_of_MySQL > >>> > >>> I may circle back to filling some of the stub sections myself eventually, > >>> you're all encouraged to beat me to it. > >> Greg, > >> This piece is clearly aimed at end users and shouldn't be on the > >> developers Wiki (which is only for internal project documentation/notes). > >> Please move it over to techdocs, or refocus it as a guide for use by booth > >> staff etc. > > > > As the developer wiki is already being used for a bunch of advocacy > > efforts, I believe that it is more than acceptable. > > On the contrary -- I'm of the opinion that before this gets widespread, > we should get advocacy a wiki somewhere else. Don't we have snappy > powerful machines lacking services to run? Having a powerful machine is the *easy* part. Maintaining it and maintaining the contents of a wide-open wiki is not. //Magnus
Dave Page wrote: > Joshua D. Drake wrote: >> Dave Page wrote: >>> This piece is clearly aimed at end users and shouldn't be on the >>> developers Wiki (which is only for internal project >>> documentation/notes). Please move it over to techdocs, or refocus it >>> as a guide for use by booth staff etc. >> >> As the developer wiki is already being used for a bunch of advocacy >> efforts, I believe that it is more than acceptable. > > Huh? It's being use to *organise* advocacy stuff, but not a place for > end users to visit. *shrug* apples to apples. I really don't see a problem. Joshua D. Drake > > /D > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
With much appreciated contributions from others including Chris Browne in particular, I've now finished up the original scope of what I wanted to cover on this topic: http://developer.postgresql.org/index.php/Why_PostgreSQL_instead_of_MySQL I tried to keep the text punchy by hitting the important points and leaning heavily on references in lieu of trying to include all the details in the document. I feel like the "Join Complexity" section is still a little lightweight and could use more input. I'm happy with the rest, but they could certainly benefit from some review by others. Josh in particular should take a look at how I covered the recent Sun benchmark results to make sure I'm fairly characterizing those. The "Application Support" section didn't really fit the rest of the document once it shaped up, I moved what had been written onto the discussion page; something may come out of that later. I intend to keep this active on the Wiki for another week for continued proofreading and hopefully some useful feedback; consider August 10th the deadline for updates. At that point, I'll repackage this for techdocs and then pull it out of the developer site. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith wrote: > I tried to keep the text punchy by hitting the important points and > leaning heavily on references in lieu of trying to include all the > details in the document. I feel like the "Join Complexity" section is > still a little lightweight and could use more input. I'm happy with the > rest, but they could certainly benefit from some review by others. Josh > in particular should take a look at how I covered the recent Sun > benchmark results to make sure I'm fairly characterizing those. One major difference of MySQL is that it provides SQL hints. This alleviates the issues with slow complex joins, but is obviously a dangerous tool if not used wisely. That being said, it can often be a very quick fix, whereas in PostgreSQL you do not have such tools if you ever need them. regards, Lukas
Lukas Kahwe Smith wrote: > One major difference of MySQL is that it provides SQL hints. This > alleviates the issues with slow complex joins, but is obviously a > dangerous tool if not used wisely. That being said, it can often be a > very quick fix, whereas in PostgreSQL you do not have such tools if you > ever need them. One more thing .. MySQL currently has horrible performance for sub selects. This will improve slightly in 5.1 and is scheduled to be "solved" with 5.2 regards, Lukas
Lukas Kahwe Smith wrote: > Greg Smith wrote: > >> I tried to keep the text punchy by hitting the important points and >> leaning heavily on references in lieu of trying to include all the details >> in the document. I feel like the "Join Complexity" section is still a >> little lightweight and could use more input. I'm happy with the rest, but >> they could certainly benefit from some review by others. Josh in >> particular should take a look at how I covered the recent Sun benchmark >> results to make sure I'm fairly characterizing those. > > One major difference of MySQL is that it provides SQL hints. This > alleviates the issues with slow complex joins, but is obviously a dangerous > tool if not used wisely. That being said, it can often be a very quick fix, > whereas in PostgreSQL you do not have such tools if you ever need them. Which probably partly explains our optimizer is so much better (the other part being that Tom is so damn smart). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Fri, Aug 03, 2007 at 06:35:16PM +0200, Lukas Kahwe Smith wrote: > Lukas Kahwe Smith wrote: > > >One major difference of MySQL is that it provides SQL hints. This > >alleviates the issues with slow complex joins, but is obviously a > >dangerous tool if not used wisely. That being said, it can often > >be a very quick fix, whereas in PostgreSQL you do not have such > >tools if you ever need them. > > One more thing...MySQL currently has horrible performance for sub > selects. This will improve slightly in 5.1 and is scheduled to be > "solved" with 5.2 Given MySQL AB's track record, I'll believe it when I see independent confirmation. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Alvaro Herrera wrote: > Lukas Kahwe Smith wrote: >> Greg Smith wrote: >> >>> I tried to keep the text punchy by hitting the important points and >>> leaning heavily on references in lieu of trying to include all the details >>> in the document. I feel like the "Join Complexity" section is still a >>> little lightweight and could use more input. I'm happy with the rest, but >>> they could certainly benefit from some review by others. Josh in >>> particular should take a look at how I covered the recent Sun benchmark >>> results to make sure I'm fairly characterizing those. >> One major difference of MySQL is that it provides SQL hints. This >> alleviates the issues with slow complex joins, but is obviously a dangerous >> tool if not used wisely. That being said, it can often be a very quick fix, >> whereas in PostgreSQL you do not have such tools if you ever need them. > > Which probably partly explains our optimizer is so much better (the > other part being that Tom is so damn smart). Heh .. oh btw .. MySQL does not provide any access to the stats tables like PostgreSQL does. regards, Lukas
On 8/3/07, Lukas Kahwe Smith <smith@pooteeweet.org> wrote: > Heh .. oh btw .. MySQL does not provide any access to the stats tables > like PostgreSQL does. In terms of keeping the playing-field honest, I wanted to add that MySQL has a spiffy multi-level SQL profiler that PostgreSQL doesn't. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com Iselin, New Jersey 08830 | http://www.enterprisedb.com/
On 04/08/07, Greg Smith <gsmith@gregsmith.com> wrote: > With much appreciated contributions from others including Chris Browne in > particular, I've now finished up the original scope of what I wanted to > cover on this topic: > > http://developer.postgresql.org/index.php/Why_PostgreSQL_instead_of_MySQL > > I tried to keep the text punchy by hitting the important points and > leaning heavily on references in lieu of trying to include all the details > in the document. A relatively big plus for MySQL in serious environments is replication and the whole MySQL Cluster bit. There are some promising tools such as pgpool-II - http://snipr.com/1p518 - but it would be useful to see something similar come bundled with PostgreSQL that I could install as an option. Unless I am missing something.. Shanx
Shashank Tripathi wrote: > On 04/08/07, Greg Smith <gsmith@gregsmith.com> wrote: >> With much appreciated contributions from others including Chris Browne in >> particular, I've now finished up the original scope of what I wanted to >> cover on this topic: >> >> http://developer.postgresql.org/index.php/Why_PostgreSQL_instead_of_MySQL >> >> I tried to keep the text punchy by hitting the important points and >> leaning heavily on references in lieu of trying to include all the details >> in the document. > > > A relatively big plus for MySQL in serious environments is replication > and the whole MySQL Cluster bit. There are some promising tools such > as pgpool-II - http://snipr.com/1p518 - but it would be useful to see > something similar come bundled with PostgreSQL that I could install as > an option. Unless I am missing something.. While we are on it, MySQL has much better charset support as well. They also have automatic support for updateable views (I know you can achieve the same with manual definition of rules). Oh and their multi threaded nature seems to make their life a lot harder when they add new features. InnoDB has scaling issues with more CPU's and they do not have custom datatype support nor CHECK constraints. Phew, this can go on for a while :) regards, Lukas
Lukas Kahwe Smith escribió: > Oh and their multi threaded nature seems to make their life a lot harder > when they add new features. InnoDB has scaling issues with more CPU's and > they do not have custom datatype support nor CHECK constraints. One thing you can be sure is that their multithreaded nature makes it awfully hard for them to take Postgres code and use it without plenty of modifications. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Sat, 4 Aug 2007, Lukas Kahwe Smith wrote: > Phew, this can go on for a while :) My scope from the beginning was not to do a feature comparison. That's a gigantic mess, and there are pages out there who cover that material fairly well already. All I wanted to hit are the very basic SQL/engine issues that tend to be misrepresented (often because the pages are outdated) when comparing the performance/reliability tradeoffs of the main database core of each system. Clustering, character sets, random features that aren't key parts of the SQL spec--not really topical here. Thanks to everyone who made suggestions about the join complexity section, I expanded that section based on all the messages that went by today and am happy now with how it covers that topic. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
* Greg Smith: > With much appreciated contributions from others including Chris Browne > in particular, I've now finished up the original scope of what I > wanted to cover on this topic: > > http://developer.postgresql.org/index.php/Why_PostgreSQL_instead_of_MySQL I think for fairness, you should mention that PostgreSQL's MVCC locking does not properly implement the SERIALIZABLE semantics. For some table types (InnoDB, IIRC), MySQL implements phantom key logging, so it gets more cases right. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
On Sat, 4 Aug 2007, Florian Weimer wrote: > I think for fairness, you should mention that PostgreSQL's MVCC locking > does not properly implement the SERIALIZABLE semantics. For some table > types (InnoDB, IIRC), MySQL implements phantom key logging, so it gets > more cases right. I assume you mean phantom key locking, not logging. I've been trying to follow up on this, but I still don't understand exactly what you're describing--certainly not well enough to explain it. There's a section in the PostgreSQL documentation describing "Serializable Isolation versus True Serializability" at http://www.postgresql.org/docs/current/static/transaction-iso.html ; does that cover what you describe as "not properly implement the SERIALIZABLE semantics" or is there something else you're alluding to here? I also can't find anything definitive on why MySQL's phantom key implementation is a better solution. The two most relevant documents seem to be http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html http://www.greatlinux.com/mysql/books/mysqlpress/mysql-tutorial/ch10.html but I don't see how that "gets more cases right". Can you comment more about this? As a side-note, it's hard for me to feel too compelled to point out a theoretical advantage for MySQL here when I find stuff like http://bugs.mysql.com/bug.php?id=27197 floating around (that's just the worse of several such bugs I came across when researching this topic). -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith wrote: > http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html > http://www.greatlinux.com/mysql/books/mysqlpress/mysql-tutorial/ch10.html I recently covered a related item (prevting phantom rows) regarding MySQL in my blog: http://pooteeweet.org/blog/745 regards, Lukas
On Fri, Aug 10, 2007 at 09:24:24AM +0200, Lukas Kahwe Smith wrote: > Greg Smith wrote: > > >http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html > >http://www.greatlinux.com/mysql/books/mysqlpress/mysql-tutorial/ch10.html > > I recently covered a related item (prevting phantom rows) regarding > MySQL in my blog: > http://pooteeweet.org/blog/745 Wait... isn't InnoDB an MVCC system? Why do they need gap locking at all? Shouldn't they be able to just pull the right version? -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Attachment
On Fri, 2007-08-10 at 16:11 -0500, Decibel! wrote: > On Fri, Aug 10, 2007 at 09:24:24AM +0200, Lukas Kahwe Smith wrote: > > Greg Smith wrote: > > > > >http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html > > >http://www.greatlinux.com/mysql/books/mysqlpress/mysql-tutorial/ch10.html > > > > I recently covered a related item (prevting phantom rows) regarding > > MySQL in my blog: > > http://pooteeweet.org/blog/745 > > Wait... isn't InnoDB an MVCC system? Why do they need gap locking at > all? Shouldn't they be able to just pull the right version? Is there a document explaining more of the differences between the postgresql MVCC model and something closer to InnoDB or Oracle, where it has rollback segments? I'm interested in the design tradeoffs between the two ideas. Regards, Jeff Davis
Jeff Davis wrote: > On Fri, 2007-08-10 at 16:11 -0500, Decibel! wrote: >> On Fri, Aug 10, 2007 at 09:24:24AM +0200, Lukas Kahwe Smith wrote: >>> Greg Smith wrote: >>> >>>> http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html >>>> http://www.greatlinux.com/mysql/books/mysqlpress/mysql-tutorial/ch10.html >>> I recently covered a related item (prevting phantom rows) regarding >>> MySQL in my blog: >>> http://pooteeweet.org/blog/745 >> Wait... isn't InnoDB an MVCC system? Why do they need gap locking at >> all? Shouldn't they be able to just pull the right version? > > Is there a document explaining more of the differences between the > postgresql MVCC model and something closer to InnoDB or Oracle, where it > has rollback segments? I'm interested in the design tradeoffs between > the two ideas. I cannot give you an exact comparison. But the PostgreSQL docs are pretty good on how things work there and the following article explains how things are in Oracle and the rest: http://www.ibphoenix.com/main.nfs?page=ibp_mvcc_roman regards, Lukas
On 8/10/07, Jeff Davis <pgsql@j-davis.com> wrote: > Is there a document explaining more of the differences between the > postgresql MVCC model and something closer to InnoDB or Oracle, where it > has rollback segments? I'm interested in the design tradeoffs between > the two ideas. Not really, but the best reference is Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control by Weikum & Vossen. PostgreSQL uses multi-version timestamp ordering (MVTO) and Oracle/InnoDB use multi-version read consistency (MVRC). The main difference is that PostgreSQL is with-REDO/no-UNDO because it stores every row version in the main table, and Oracle/InnoDB are with-REDO/with-UNDO and they reconstruct a block and/or row image from the log to provide read consistency. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com Iselin, New Jersey 08830 | http://www.enterprisedb.com/
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Jonah H. Harris wrote: > On 8/10/07, Jeff Davis <pgsql@j-davis.com> wrote: >> Is there a document explaining more of the differences between the >> postgresql MVCC model and something closer to InnoDB or Oracle, where it >> has rollback segments? I'm interested in the design tradeoffs between >> the two ideas. > > Not really, but the best reference is Transactional Information > Systems: Theory, Algorithms, and the Practice of Concurrency Control > by Weikum & Vossen. I have this book. It is a very good book. It can also make your brain hurt. Joshua D. Drake > PostgreSQL uses multi-version timestamp ordering > (MVTO) and Oracle/InnoDB use multi-version read consistency (MVRC). > The main difference is that PostgreSQL is with-REDO/no-UNDO because it > stores every row version in the main table, and Oracle/InnoDB are > with-REDO/with-UNDO and they reconstruct a block and/or row image from > the log to provide read consistency. > - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGvS5gATb/zqfZUUQRAnuzAKClGrTcks1twLjFiwvyQ9vXFKZw0ACgkbWy YXbtLNMkQPM2ZtVxj7MmnKc= =53cN -----END PGP SIGNATURE-----
On Fri, 10 Aug 2007, Jonah H. Harris wrote: > The main difference is that PostgreSQL is with-REDO/no-UNDO because it > stores every row version in the main table, and Oracle/InnoDB are > with-REDO/with-UNDO and they reconstruct a block and/or row image from > the log to provide read consistency. I'm finding it hard to parse this sentence; it's a bit too run-on and I'm not sure exactly what you mean here. Any chance I could get you (or someone else who actually understands the distinction you're making better than I do) to re-write it so it's clearer? Regardless, I just incorporated all of Jonah's comments from that very helpful message into the page, and there's a pile of additional references in there on this topic now. I'm comfortable that Lukas's warning not to oversell the PostgreSQL MVCC has been addressed. As part of the final cleanup I'm doing today (in prep for the conversion to techdocs) I also added a "Credits" section that specicially recognizes all the work Chris and Lukas did by writing parts of the document. I assume you two would want your names and links to your web sites in there; if you want to avoid collateral damage from the seeming inevitable flames publicizing this info will produce you can certainly make yourself anonymous instead. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Sat, 2007-08-11 at 00:06 +0200, Lukas Kahwe Smith wrote: > > Is there a document explaining more of the differences between the > > postgresql MVCC model and something closer to InnoDB or Oracle, where it > > has rollback segments? I'm interested in the design tradeoffs between > > the two ideas. > > I cannot give you an exact comparison. But the PostgreSQL docs are > pretty good on how things work there and the following article explains > how things are in Oracle and the rest: > http://www.ibphoenix.com/main.nfs?page=ibp_mvcc_roman > Thanks for the link. If I understand correctly, the idea is that non-postgres mvcc systems (interbase, etc) write the new version in the old location, and copy the old tuple version to a special undo log area. Is that a reasonable summary? I wonder how they are able to update records when the new version takes up more space than the old version? Also, how do they update indexes that point to a value that has changed? And how do they reclaim storage for deletes? It seems like the approach of interbase, etc, has some advantages by keeping better cluster order and reducing the need for VACUUM, but seems like it might introduce other problems (although they don't explain what those other problems are). Hopefully HOT is the best of all worlds. Regards, Jeff Davis
Jeff Davis wrote: > On Sat, 2007-08-11 at 00:06 +0200, Lukas Kahwe Smith wrote: >>> Is there a document explaining more of the differences between the >>> postgresql MVCC model and something closer to InnoDB or Oracle, where it >>> has rollback segments? I'm interested in the design tradeoffs between >>> the two ideas. >> I cannot give you an exact comparison. But the PostgreSQL docs are >> pretty good on how things work there and the following article explains >> how things are in Oracle and the rest: >> http://www.ibphoenix.com/main.nfs?page=ibp_mvcc_roman >> > > Thanks for the link. > > If I understand correctly, the idea is that non-postgres mvcc systems > (interbase, etc) write the new version in the old location, and copy the > old tuple version to a special undo log area. Is that a reasonable > summary? Correct. > I wonder how they are able to update records when the new version takes > up more space than the old version? Also, how do they update indexes > that point to a value that has changed? And how do they reclaim storage > for deletes? I do not know the perfect answers to all of these, but here is my attempt: 1) they need to shift around pages, which I presume they will try work around as much as possible 2) Not really sure, I presume they maintain some kind of flag to tell transactions that want the old version to traverse some kind of list in the undo log 3) Well they do it immediately and just keep a copy in the undo log around until all transactions that started before the delete have ended. This is their primary advantage over the PostgreSQL style. There is no need for vaccum. > It seems like the approach of interbase, etc, has some advantages by > keeping better cluster order and reducing the need for VACUUM, but seems > like it might introduce other problems (although they don't explain what > those other problems are). Hopefully HOT is the best of all worlds. Well due to 1) and similar effects I would assume that a rollback is more expensive for them. I guess PostgreSQL trades some disk space and the need for a clean up task like vacuum for working around the issues in the Oracle style MVCC, which relies on a rollback log that needs to be allocated independently, that needs to be undone on a rollback etc. regards, Lukas
Hello > > Well due to 1) and similar effects I would assume that a rollback is > more expensive for them. > I did't test MySQL three years, But three years ago, MySQL rollback was really, really slow. Regards Pavel Stehule
Pavel, > I did't test MySQL three years, But three years ago, MySQL rollback > was really, really slow. Based on recent testing I did, rollbacks are very expensive -- like 3x to 4x the cost of the original insert in high-concurrency environments. Also not always reliable (as of 5.0). -- Josh Berkus PostgreSQL @ Sun San Francisco
On 8/13/07, Lukas Kahwe Smith <smith@pooteeweet.org> wrote: > > If I understand correctly, the idea is that non-postgres mvcc systems > > (interbase, etc) write the new version in the old location, and copy the > > old tuple version to a special undo log area. Is that a reasonable > > summary? > > Correct. Well, not in all cases. For Oracle and InnoDB, this is correct. Ingres has update-in-place as-well-as a similar method to HOT. Firebird/Interbase stores a delta row-version in the main table which is cleaned up automagically on later queries. > I do not know the perfect answers to all of these, but here is my attempt: > 1) they need to shift around pages, which I presume they will try work > around as much as possible Depending on the system, their either move the whole row, or a piece of it to another block. > 2) Not really sure, I presume they maintain some kind of flag to tell > transactions that want the old version to traverse some kind of list in > the undo log There is generally a table which keeps track of row versions such that the engine knows where to find the old version. Oracle, for example, has an ITL (interested transaction list) which points to a location of the rollback segment which contains the original (UNDO) data needed to rebuild the correct block/row version. > 3) Well they do it immediately and just keep a copy in the undo log > around until all transactions that started before the delete have ended. > This is their primary advantage over the PostgreSQL style. There is no > need for vaccum. When storing the old version in the log, there is no reclamation needed. In the case of Ingres/Interbase/Firebird, it is generally reclaimed automagically later by the system. > Well due to 1) and similar effects I would assume that a rollback is > more expensive for them. Yes, rollback is more expensive in *most* of the other systems. However, 97% of all transactions commit (statistic, but changes depending on the application)... so those systems have been designed as, "optimized for commit". > I guess PostgreSQL trades some disk space and the need for a clean up > task like vacuum for working around the issues in the Oracle style MVCC, > which relies on a rollback log that needs to be allocated independently, > that needs to be undone on a rollback etc. Basically. Reading the aforementioned books are a good place to start in regards to differing MVCC models. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com Iselin, New Jersey 08830 | http://www.enterprisedb.com/
"Jonah H. Harris" <jonah.harris@gmail.com> writes: > On 8/13/07, Lukas Kahwe Smith <smith@pooteeweet.org> wrote: >> > If I understand correctly, the idea is that non-postgres mvcc systems >> > (interbase, etc) write the new version in the old location, and copy the >> > old tuple version to a special undo log area. Is that a reasonable >> > summary? >> >> Correct. > > Well, not in all cases. For Oracle and InnoDB, this is correct. > Ingres has update-in-place as-well-as a similar method to HOT. > Firebird/Interbase stores a delta row-version in the main table which > is cleaned up automagically on later queries. Note that in the case of Oracle another interesting difference is that it's storing the old version *of the block* not the record. At least when I last was using it. That has advantages and disadvantages of course. >> 3) Well they do it immediately and just keep a copy in the undo log >> around until all transactions that started before the delete have ended. >> This is their primary advantage over the PostgreSQL style. There is no >> need for vaccum. > > When storing the old version in the log, there is no reclamation > needed. In the case of Ingres/Interbase/Firebird, it is generally > reclaimed automagically later by the system. DB2 stores it in the log, but Oracle stores it in a separate area called the rollback segments. Management of this space was Oracle's biggest bugaboo in the past but from what I understand they have it under control now. The problems Oracle had with rollback segments were exactly analogous to the problems we have with vacuum, though the actual failure modes are different. >> Well due to 1) and similar effects I would assume that a rollback is >> more expensive for them. > > Yes, rollback is more expensive in *most* of the other systems. > However, 97% of all transactions commit (statistic, but changes > depending on the application)... so those systems have been designed > as, "optimized for commit". I don't like "optimized for commit" as a shorthand. In both architectures a commit is essentially a noop. The interesting differences are what happens when you do when you look at a record which has been updated. In Postgres you ignore it, in Oracle you have to do a random access I/O to fetch it from the rollback segment. Analyzing how expensive that is is complex as it depends heavily on how much cache you have and the mix of transactions. Also, with rollback segments you have to do twice as much I/O at update time. Again you hope the rollback segments are cached though, so it's unclear how much that costs. And with DB2's approach of using the logs as rollback you don't really have to do any I/O you didn't already have to do anyways. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark wrote: > DB2 stores it in the log, but Oracle stores it in a separate area called the Just so that nobody gets confused on this, IBM has gone out of their way to curse MVCC (see the discussion of the IBM vs. Oracle fight on the firebird article I posted a while back). So no MVCC in DB2. regards, Lukas
On Mon, Aug 13, 2007 at 10:16:15PM -0700, Josh Berkus wrote: > Pavel, > > > I did't test MySQL three years, But three years ago, MySQL rollback > > was really, really slow. > > Based on recent testing I did, rollbacks are very expensive -- like 3x to 4x > the cost of the original insert in high-concurrency environments. Also not > always reliable (as of 5.0). Wow, that's a serious gotcha. If there's a way to reproduce that we should certainly let Ian Barwick of MySQL gotchas fame know about it. -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Attachment
On Tue, Aug 14, 2007 at 05:34:14PM -0500, Decibel! wrote: > On Mon, Aug 13, 2007 at 10:16:15PM -0700, Josh Berkus wrote: > > Pavel, > > > > > I did't test MySQL three years, But three years ago, MySQL rollback > > > was really, really slow. > > > > Based on recent testing I did, rollbacks are very expensive -- like 3x to 4x > > the cost of the original insert in high-concurrency environments. Also not > > always reliable (as of 5.0). > > Wow, that's a serious gotcha. If there's a way to reproduce that we > should certainly let Ian Barwick of MySQL gotchas fame know about it. That's under "Epic, feature-length rollbacks," and yes, that's with InnoDB tables. MyISAM tables have no ROLLBACK capability at all. http://sql-info.de/en/mysql/transaction-innodb-table.html#4_3 Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate