Thread: MySQL+InnoDB vs. PostgreSQL test?
Folks,

I've had requests from a couple of businesses to see results of informal MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have the setup to do full formal benchmarking, but surely someone in our community has gone head-to-head on your own application?

--
-Josh Berkus
Aglio Database Solutions
San Francisco
On Mon, 2004-02-02 at 12:21, Josh Berkus wrote:
> Folks,
>
> I've had requests from a couple of businesses to see results of informal
> MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have the setup to do
> full formal benchmarking, but surely someone in our community has gone
> head-to-head on your own application?

We have the setup to do informal benchmarking via OSDL, but AFAIK MySQL doesn't conform to any of the DBT benchmarks...

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Josh,

I evaluated MySQL + InnoDB briefly for a project once. I didn't get very far because of some severe limitations in MySQL.

I had to import all of the data from an existing database (MS SQL). One of the tables was about 8 million rows, 10 fields, and had 5 indexes. I found it quite impossible to import into MySQL. I would import the data into a table with no indexes, then perform a bunch of manipulation on it (I wasn't just converting from MS SQL, but also needed to alter quite a bit of the structure). After the manipulation, I would drop some columns and build the indexes. It took MySQL over 4 days to do this!

What I found out is that any DDL change to a table in MySQL actually does this: create a new table, copy all of the data over, then drop the old table and rename the new one. Whenever I added a new index, MySQL would go through the process of rebuilding each previous index. Same thing when adding or dropping columns. I could not find a way to import all of the data in a reasonable amount of time. For comparison, it took less than 45 minutes to import all of the data into PostgreSQL (that's ALL of the data, not just that one table). Needless to say (but I'll say it anyway :-), I didn't get any farther in my evaluation; there was no point.

One more thing that annoyed me: if you start a long process, such as a large DDL operation or, heaven forbid, a cartesian join (what? I never do that!), there's no way to cancel it with InnoDB. You have to wait for it to finish. Hitting ctrl+c in their command line tool only kills the command line tool; the process continues. Even if you stop the database and restart it (including with a hard boot), it will pick right up where it left off and continue. That proved to be way too much of a pain for me.

Disclaimer: I'm not a real MySQL expert or anything. There could be ways of getting around this, but after two weeks of trying, I decided to give up.
It only took me a few hours to build the requisite PostgreSQL scripts, and I never looked back.

Adam Ruth

On Feb 2, 2004, at 10:21 AM, Josh Berkus wrote:
> Folks,
>
> I've had requests from a couple of businesses to see results of
> informal MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have
> the setup to do full formal benchmarking, but surely someone in our
> community has gone head-to-head on your own application?
>
> --
> -Josh Berkus
> Aglio Database Solutions
> San Francisco
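[Editor's note: a sketch of the import workaround Adam describes, i.e. loading into an index-free table and deferring all structural changes to a single statement. The table and column names are invented for illustration; the original schema isn't given. The point is that, since each ALTER TABLE in MySQL rebuilds and copies the whole table, bundling every change into one ALTER pays that cost once instead of once per change:]

```sql
-- Load into a bare table first, so no indexes are maintained per row.
CREATE TABLE big_import (
    id   INT,
    a    INT,
    b    VARCHAR(40)
    -- ... remaining fields
);
LOAD DATA INFILE '/tmp/export.txt' INTO TABLE big_import;

-- ... data manipulation happens here ...

-- Then make ALL structural changes and index builds in ONE statement,
-- since each separate ALTER copies the entire 8-million-row table:
ALTER TABLE big_import
    DROP COLUMN b,
    ADD INDEX idx_id (id),
    ADD INDEX idx_a (a);
```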
> One more thing that annoyed me. If you started a process, such as a
> large DDL operation, or heaven forbid, a cartesian join (what? I never
> do that!).

I believe InnoDB also has O(n) rollback time. E.g. if you are rolling back 100 million row changes, it takes a long, long time. In PostgreSQL, rolling back is O(1)...

Chris
On Tue, 3 Feb 2004, Christopher Kings-Lynne wrote:
> > One more thing that annoyed me. If you started a process, such as a
> > large DDL operation, or heaven forbid, a cartesian join (what? I never
> > do that!).
>
> I believe InnoDB also has O(n) rollback time. E.g. if you are rolling
> back 100 million row changes, it takes a long, long time. In PostgreSQL,
> rolling back is O(1)...

Actually, it takes significantly longer to roll back than to roll forward, so to speak: if inserting 10,000 rows took 5 minutes, rolling them back could take upwards of 30 times as long. This is from the docs:

http://www.mysql.com/documentation/mysql/bychapter/manual_Table_types.html#InnoDB_tuning

Point 8:

# Beware of big rollbacks of mass inserts: InnoDB uses the insert buffer to save disk I/O in inserts, but in a corresponding rollback no such mechanism is used. A disk-bound rollback can take 30 times the time of the corresponding insert. Killing the database process will not help because the rollback will start again at the database startup. The only way to get rid of a runaway rollback is to increase the buffer pool so that the rollback becomes CPU-bound and runs fast, or delete the whole InnoDB database.
> Seriously, I am tired of this kind of question. You gotta get bold
> enough to stand up in a "meeting" like that, say "guys, you can ask me
> how this compares to Oracle ... but if you're seriously asking me how
> this compares to MySQL, call me again when you've done your homework".

Hey, at least I noticed that InnoDB has one essential feature we don't:

SELECT ... LOCK IN SHARE MODE;

which takes a shared lock on a row as opposed to a write lock, hence avoiding nasty foreign key deadlocks...

Chris
Chris,

> Hey, at least I noticed that InnoDB has one essential feature we don't:
>
> SELECT ... LOCK IN SHARE MODE;
>
> which takes a shared lock on a row as opposed to a write lock, hence
> avoiding nasty foreign key deadlocks...

Um, wrong. We don't lock rows for SELECT.

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus                Complete information technology
josh@agliodbs.com          and data management solutions
(415) 565-7293             for law firms, small businesses
fax 621-2533               and non-profit organizations.
San Francisco
Josh Berkus <josh@agliodbs.com> writes:
>> Hey, at least I noticed that InnoDB has one essential feature we don't:
>> SELECT ... LOCK IN SHARE MODE;
>>
>> which takes a shared lock on a row as opposed to a write lock, hence
>> avoiding nasty foreign key deadlocks...

> Um, wrong. We don't lock rows for SELECT.

No, but Chris is correct that we could do with having some kind of shared lock facility at the row level.

regards, tom lane
>> Um, wrong. We don't lock rows for SELECT.
>
> No, but Chris is correct that we could do with having some kind of
> shared lock facility at the row level.

Out of interest, what is it about this particular task that's so hard? (Not that I could code it myself.) But surely you can use the same sort of thing as the FOR UPDATE code path?

Chris
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> No, but Chris is correct that we could do with having some kind of
>> shared lock facility at the row level.

> Out of interest, what is it about this particular task that's so hard?

Keeping track of multiple lockers in a fixed amount of disk space.

regards, tom lane
>> Out of interest, what is it about this particular task that's so hard?
>
> Keeping track of multiple lockers in a fixed amount of disk space.

Why not look at how InnoDB does it? Or is that not applicable?
Chris,

> > Which does a shared lock on a row as opposed to a write lock, hence
> > avoiding nasty foreign key deadlocks...
>
> Um, wrong. We don't lock rows for SELECT.

Unless you meant something else? Am I not following you?

--
-Josh Berkus
Aglio Database Solutions
San Francisco
>> Um, wrong. We don't lock rows for SELECT.
>
> Unless you meant something else? Am I not following you?

I mean a row-level shared read lock. E.g. a lock that says: you can read this row, but you cannot delete it. It's what Postgres needs to alleviate its foreign key trigger deadlock problems.

Chris
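[Editor's note: for readers unfamiliar with the deadlock Chris refers to: PostgreSQL's foreign key triggers at the time took an exclusive row lock (SELECT ... FOR UPDATE) on the referenced row when checking an insert. Two transactions touching the same parent rows in opposite order could then deadlock even though neither modifies the parents. A hypothetical schedule, with invented table names and parent rows 10 and 20 assumed to exist:]

```sql
-- Session A                          -- Session B
BEGIN;                                BEGIN;
INSERT INTO child VALUES (1, 10);
-- FK check locks parent row 10
                                      INSERT INTO child VALUES (2, 20);
                                      -- FK check locks parent row 20
INSERT INTO child VALUES (3, 20);
-- blocks: waits for B's lock on 20
                                      INSERT INTO child VALUES (4, 10);
                                      -- blocks: waits for A's lock on 10
-- Deadlock detected; one transaction is aborted.
-- With a SHARED lock on the parent rows, both FK checks could proceed,
-- since neither transaction actually modifies the parents.
```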
Well, when I prepared my PG presentation I did some testing of MySQL (so I could be justified in calling it lousy :). I used the latest release (4.0.something, I think).

I was first bitten by my table type being MyISAM when I thought I had set the default to InnoDB. But I decided that since my test was going to be read-only, MyISAM should be the best possible choice. I loaded up a couple million records and changed my stored procedure into a perl script [I also decided to use this perl script for testing PG, to be fair].

For one client, MySQL simply screamed.

Then I decided to see what happens with 20 clients.

MySQL clocked in at 650 seconds. During this time the machine was VERY unresponsive. To be fair, that could be Linux, not MySQL.

PG (7.3.4) clocked in at 220 seconds. The machine was perfectly fine during the test - nice and responsive.

The hardware wasn't much - dual P2-450 running stock RH8 (2x15k 18GB SCSI drives for the data volume).

Then I decided to try the "beloved" InnoDB.

Well... after it sat for a few hours at 100% CPU loading the data, I killed it off and gave up on InnoDB. I am interested in the numbers; perhaps I'll fire it up again someday and let it finish loading.

Remember - you cannot judge MySQL by single-connection performance - there, you can't beat it. But just add up the concurrency and watch the cookies tumble.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
> script [I also decided to use this perl script for testing PG to be
> fair].
>
> For one client, MySQL simply screamed.

If you already have a test case set up, could you tell us at what point Postgres starts to beat MySQL? Because if it still "screams" with 5 clients, I would give it a try for that kind of requirement.

Rigmor Ukuhe

> Then I decided to see what happens with 20 clients.
>
> MySQL clocked in at 650 seconds. During this time the machine was VERY
> unresponsive. To be fair, that could be Linux, not MySQL.
>
> PG (7.3.4) clocked in at 220 seconds. The machine was perfectly fine
> during the test - nice and responsive.
>
> The hardware wasn't much - dual P2-450 running stock RH8 (2x15k 18GB
> SCSI drives for the data volume).
>
> Then I decided to try the "beloved" InnoDB.
>
> Well... after it sat for a few hours at 100% CPU loading the data, I
> killed it off and gave up on InnoDB. I am interested in the numbers.
> Perhaps I'll fire it up again someday and let it finish loading.
>
> Remember - you cannot judge MySQL by single-connection performance -
> there, you can't beat it. But just add up the concurrency and watch the
> cookies tumble.
>
> --
> Jeff Trout <jeff@jefftrout.com>
> http://www.jefftrout.com/
> http://www.stuarthamm.net/
On Tue, 3 Feb 2004 16:02:00 +0200, "Rigmor Ukuhe" <rigmor.ukuhe@finestmedia.com> wrote:
> If you already have a test case set up, could you tell us at what point
> Postgres starts to beat MySQL? Because if it still "screams" with 5
> clients, I would give it a try for that kind of requirement.

I just checked (to see about restarting the InnoDB test) and it appears that it'll take a bit of work to get the machine up and running, and I don't have time right now to do further testing. However, you could try it out yourself. Not sure at what point it will topple; in my case it didn't matter if it ran well with 5 clients, as I'll always have many more clients than 5.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
On Tue, 03 Feb 2004 11:46:05 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff <threshar@torgo.978.org> writes:
> > Not sure at what point it will topple, in my case it didn't matter
> > if it ran well with 5 clients as I'll always have many more clients
> > than 5.
>
> I did some idle, very unscientific tests the other day that indicated
> that MySQL insert performance starts to suck with just 2 concurrent
> inserters. Given a file containing 10000 INSERT commands, a single
> mysql client ran the file in about a second. So if I feed the file
> simultaneously to two mysqls in two shell windows, it should take
> about two seconds total to do the 20000 inserts, right? The observed
> times were 13 to 15 seconds. (I believe this is with a MyISAM table,
> since I just said CREATE TABLE without any options.)

MyISAM is well known to suck if you update/insert/delete, because it simply acquires a full table lock when you perform those operations! InnoDB is supposed to be better at that. So your results are fairly in line with what you should see.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
Jeff <threshar@torgo.978.org> writes:
> On Tue, 03 Feb 2004 11:46:05 -0500
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I did some idle, very unscientific tests the other day that indicated
>> that MySQL insert performance starts to suck with just 2 concurrent
>> inserters. Given a file containing 10000 INSERT commands, a single
>> mysql client ran the file in about a second. So if I feed the file
>> simultaneously to two mysqls in two shell windows, it should take
>> about two seconds total to do the 20000 inserts, right? The observed
>> times were 13 to 15 seconds. (I believe this is with a MyISAM table,
>> since I just said CREATE TABLE without any options.)

> MyISAM is well known to suck if you update/insert/delete, because it
> simply acquires a full table lock when you perform those operations!

Sure, I wasn't expecting it to actually overlap any operations. (If you try the same test with Postgres, the scaling factor is a little better than linear, because we do get some overlap.) But that shouldn't result in a factor-of-seven slowdown. There's something badly wrong with their low-level locking algorithms, I think.

regards, tom lane
Jeff <threshar@torgo.978.org> writes:
> Not sure at what point it will topple, in my case it didn't matter if it
> ran well with 5 clients as I'll always have many more clients than 5.

I did some idle, very unscientific tests the other day that indicated that MySQL insert performance starts to suck with just 2 concurrent inserters. Given a file containing 10000 INSERT commands, a single mysql client ran the file in about a second. So if I feed the file simultaneously to two mysqls in two shell windows, it should take about two seconds total to do the 20000 inserts, right? The observed times were 13 to 15 seconds. (I believe this is with a MyISAM table, since I just said CREATE TABLE without any options.)

It does scream with only one client though ...

regards, tom lane
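[Editor's note: Tom's setup can be reconstructed with a short shell loop. The table name and columns below are assumptions for illustration; the original schema isn't shown. Generating the input file is the reproducible part; the timing itself of course needs a running MySQL server:]

```shell
# Generate a file of 10000 single-row INSERT commands.
# Table "t" and its columns are invented for illustration.
for i in $(seq 1 10000); do
    echo "INSERT INTO t (id, note) VALUES ($i, 'row $i');"
done > inserts.sql

wc -l inserts.sql    # prints: 10000 inserts.sql

# One client:
#   time mysql testdb < inserts.sql
# Two concurrent clients (run in two shell windows at once); the naive
# expectation is ~2x the single-client time, the observed result ~7x worse:
#   time mysql testdb < inserts.sql    # window 1
#   time mysql testdb < inserts.sql    # window 2
```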
Josh Berkus wrote:
> Folks,
>
> I've had requests from a couple of businesses to see results of informal
> MySQL+InnoDB vs. PostgreSQL tests. I know that we don't have the setup
> to do full formal benchmarking, but surely someone in our community has
> gone head-to-head on your own application?

Josh, how does someone compare an Apache+PHP+MySQL "thing" against something implemented with half the stuff done in stored procedures and the entire business model guarded by referential integrity, custom triggers and whatnot?

Seriously, I am tired of this kind of question. You gotta get bold enough to stand up in a "meeting" like that and say "guys, you can ask me how this compares to Oracle ... but if you're seriously asking me how this compares to MySQL, call me again when you've done your homework".

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Wow, I didn't know that (I didn't get far enough to test any rollback). That's not a good thing. <facetious>But then again, it's MySQL - who needs rollback anyway?</facetious>

On Feb 2, 2004, at 5:44 PM, Christopher Kings-Lynne wrote:
>> One more thing that annoyed me. If you started a process, such as a
>> large DDL operation, or heaven forbid, a cartesian join (what? I
>> never do that!).
>
> I believe InnoDB also has O(n) rollback time. E.g. if you are rolling
> back 100 million row changes, it takes a long, long time. In
> PostgreSQL, rolling back is O(1)...
>
> Chris
> Seriously, I am tired of this kind of question. You gotta get bold
> enough to stand up in a "meeting" like that, say "guys, you can ask me
> how this compares to Oracle ... but if you're seriously asking me how
> this compares to MySQL, call me again when you've done your homework".

Can they call you at the unemployment office?
--
Mike Nolan
Jan Wieck wrote:
> It might not work with the words I used above, but the point I tried to
> make is that the hardest thing you can "sell" is a "no". I mean, not
> just saying "no", but selling it in a way that the customer will not go
> with the next idiot who claims "we can do that".

But you will need some kind of data or reasoning to back up your response, especially if it is deviating from the conventional wisdom or from some familiar system. Especially in this case, it's not a "no" answer that's being sold... it's "solution A is better than solution B, even though you might be more familiar with solution B."

Cheers,
Mark
Mike Nolan wrote:
>> Seriously, I am tired of this kind of question. You gotta get bold
>> enough to stand up in a "meeting" like that, say "guys, you can ask me
>> how this compares to Oracle ... but if you're seriously asking me how
>> this compares to MySQL, call me again when you've done your homework".
>
> Can they call you at the unemployment office?

It might not work with the words I used above, but the point I tried to make is that the hardest thing you can "sell" is a "no". I mean, not just saying "no", but selling it in a way that the customer will not go with the next idiot who claims "we can do that".

If the customer has a stupid idea, like envisioning an enterprise solution based on ImSOL, there is no way you will be able to deliver it. Paying customer or not, you will fail if you bow to their "strategic" decisions while knowing that the stuff they want to use just doesn't fit. That is absolutely not ImSOL specific. If someone comes to me and asks for an HA scenario with zero transaction loss during failover, we can discuss a little whether this is really what he needs or not; but if he does need that, the solution will be Oracle or DB2, and I will certainly not claim that PostgreSQL can do it, because it cannot.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #