Thread: PostgreSQL vs MySQL, and FreeBSD
Hi, I just read this document and thought I should share it with this list: http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf Among other things (FreeBSD advocacy, mostly :) ), it contains a direct comparison between MySQL and PostgreSQL on various platforms, with PostgreSQL winning!
On Nov 9, 2007 7:06 AM, Ivan Voras <ivoras@geri.cc.fer.hr> wrote: > I just read this document and thought I should share it with this list: > > http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf Nice presentation. Thanks for posting it on here. > Among other things (FreeBSD advocacy, mostly :) ), it contains a direct > comparison between MySQL and PostgreSQL on various platforms, with > PostgreSQL winning! :) -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
> > Among other things (FreeBSD advocacy, mostly :) ), it contains a direct > comparison between MySQL and PostgreSQL on various platforms, with > PostgreSQL winning! > Hello, If the queries are complex, this is understable. I had a performance review of a Hibernate project (Java Object Relation Mapping) using MySQL. ORM produces easily "complex" queries with joins and subqueries. MySQL uses nested loops for subqueries which lead to performance issues with growing database size. They state in their documentation that for version 5.2 there are improvements planned regarding this kind of query. Best Regards Sebastian
On Nov 9, 2007, at 6:06 AM, Ivan Voras wrote: > Hi, > > I just read this document and thought I should share it with this > list: > > http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf > > Among other things (FreeBSD advocacy, mostly :) ), it contains a > direct > comparison between MySQL and PostgreSQL on various platforms, with > PostgreSQL winning! Which is typical for those who aren't in on the FUD :) Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
On Nov 9, 2007 9:41 AM, Sebastian Hennebrueder <usenet@laliluna.de> wrote: > If the queries are complex, this is understable. I had a performance > review of a Hibernate project (Java Object Relation Mapping) using > MySQL. ORM produces easily "complex" queries with joins and subqueries. > MySQL uses nested loops for subqueries which lead to performance issues > with growing database size. > > They state in their documentation that for version 5.2 there are > improvements planned regarding this kind of query. So, MySQL 5.2 will be catching up to version 7.1 or 7.2 of PostgreSQL in that regard?
On Fri, 9 Nov 2007, Sebastian Hennebrueder wrote: > If the queries are complex, this is understable. The queries used for this comparison are trivial. There's only one table involved and there are no joins. It's testing very low-level aspects of performance. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Fri, 9 Nov 2007 11:11:18 -0500 (EST) Greg Smith <gsmith@gregsmith.com> wrote: > On Fri, 9 Nov 2007, Sebastian Hennebrueder wrote: > > > If the queries are complex, this is understable. > > The queries used for this comparison are trivial. There's only one table > involved and there are no joins. It's testing very low-level aspects of > performance. Actually, what it's really showing is parallelism, and I've always expected PostgreSQL to come out on top in that arena. -- Bill Moran Potential Technologies http://www.potentialtech.com
Bill Moran wrote: > On Fri, 9 Nov 2007 11:11:18 -0500 (EST) > Greg Smith <gsmith@gregsmith.com> wrote: >> On Fri, 9 Nov 2007, Sebastian Hennebrueder wrote: >>> If the queries are complex, this is understable. >> The queries used for this comparison are trivial. There's only one table >> involved and there are no joins. It's testing very low-level aspects of >> performance. > > Actually, what it's really showing is parallelism, and I've always > expected PostgreSQL to come out on top in that arena. Isn't it showing Postgres winning even without parallelism. At 1 threads, Postgres looks like 800TPS where MysQL comes in at about 600TPS on their Opteron charts.
Seems to me there is more thread model implementation problem on FreeBSD, and databases just reflecting it... Most of the test I done on Solaris show the same performance level on the same short READ-only queries for MySQL and PostgreSQL. And to be honest till the end, thread model should be far faster (context switching between threads is way faster vs processes), but - as I say usually - even a very good idea may be just wasted by a poor implementation... And in case of MySQL they have too much locking to manage concurrency between threads which kills all thread model benefits... Also, to compare apples to apples, they should run this test from remote client rather locally on the same host - however in this case the result for PostgreSQL will mostly depends on client implementation: if client implements reading via CURSOR (quite often), reading will generate 4x times more intensive network traffic than necessary and final PostgreSQL result will be worse... Reading this article I'm just happy for them to see progress done on FreeBSD :-) As well to demonstrate OS parallelism it's not so impressive to see 4CPU server results rather 8CPU or 32threaded Niagara... Don't know why they did not present similar performance graphs for these platform, strange no?... Rgds, -Dimitri On 11/9/07, Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote: > Bill Moran wrote: > > On Fri, 9 Nov 2007 11:11:18 -0500 (EST) > > Greg Smith <gsmith@gregsmith.com> wrote: > >> On Fri, 9 Nov 2007, Sebastian Hennebrueder wrote: > >>> If the queries are complex, this is understable. > >> The queries used for this comparison are trivial. There's only one table > >> involved and there are no joins. It's testing very low-level aspects of > >> performance. > > > > Actually, what it's really showing is parallelism, and I've always > > expected PostgreSQL to come out on top in that arena. > > Isn't it showing Postgres winning even without parallelism. > > At 1 threads, Postgres looks like 800TPS where MysQL comes > in at about 600TPS on their Opteron charts. > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match >
Dimitri wrote: > Seems to me there is more thread model implementation problem on > FreeBSD, and databases just reflecting it... Most of the test I done > on Solaris show the same performance level on the same short READ-only > queries for MySQL and PostgreSQL. > > And to be honest till the end, thread model should be far faster > (context switching between threads is way faster vs processes), but - > as I say usually - even a very good idea may be just wasted by a poor > implementation... And in case of MySQL they have too much locking to > manage concurrency between threads which kills all thread model > benefits... Also, to compare apples to apples, they should run this > test from remote client rather locally on the same host - however in > this case the result for PostgreSQL will mostly depends on client > implementation: if client implements reading via CURSOR (quite often), > reading will generate 4x times more intensive network traffic than > necessary and final PostgreSQL result will be worse... > > Reading this article I'm just happy for them to see progress done on FreeBSD :-) > As well to demonstrate OS parallelism it's not so impressive to see > 4CPU server results rather 8CPU or 32threaded Niagara... Don't know > why they did not present similar performance graphs for these > platform, strange no?... I don't find it strange. I would rather see benchmarks on what the majority of people running on the platform are going to run. Most people don't run 8core machines and they especially don't run 32thread Niagra boxes. Joshua D. Drake > > Rgds, > -Dimitri >
On Sun, Nov 11, 2007 at 08:27:02PM +0100, Dimitri wrote: > As well to demonstrate OS parallelism it's not so impressive to see > 4CPU server results rather 8CPU or 32threaded Niagara... Don't know > why they did not present similar performance graphs for these > platform, strange no?... I guess it's because their Niagara support is still very raw, and besides, it's not a very common platform. /* Steinar */ -- Homepage: http://www.sesse.net/
Steinar H. Gunderson wrote: > On Sun, Nov 11, 2007 at 08:27:02PM +0100, Dimitri wrote: >> As well to demonstrate OS parallelism it's not so impressive to see >> 4CPU server results rather 8CPU or 32threaded Niagara... Don't know >> why they did not present similar performance graphs for these >> platform, strange no?... > > I guess it's because their Niagara support is still very raw, and besides, > it's not a very common platform. > > /* Steinar */ Not sure how much coding would need to be done for Niagra chips but I would think that it is more likely a problem of getting the funds so they can have one to work on. -- Shane Ambler pgSQL@Sheeky.Biz Get Sheeky @ http://Sheeky.Biz
On Nov 11, 2007, at 2:17 PM, Joshua D. Drake wrote: > Dimitri wrote: >> Seems to me there is more thread model implementation problem on >> FreeBSD, and databases just reflecting it... Most of the test I done >> on Solaris show the same performance level on the same short READ- >> only >> queries for MySQL and PostgreSQL. >> And to be honest till the end, thread model should be far faster >> (context switching between threads is way faster vs processes), but - >> as I say usually - even a very good idea may be just wasted by a poor >> implementation... And in case of MySQL they have too much locking to >> manage concurrency between threads which kills all thread model >> benefits... Also, to compare apples to apples, they should run this >> test from remote client rather locally on the same host - however in >> this case the result for PostgreSQL will mostly depends on client >> implementation: if client implements reading via CURSOR (quite >> often), >> reading will generate 4x times more intensive network traffic than >> necessary and final PostgreSQL result will be worse... >> Reading this article I'm just happy for them to see progress done >> on FreeBSD :-) >> As well to demonstrate OS parallelism it's not so impressive to see >> 4CPU server results rather 8CPU or 32threaded Niagara... Don't know >> why they did not present similar performance graphs for these >> platform, strange no?... > > I don't find it strange. I would rather see benchmarks on what the > majority of people running on the platform are going to run. > > Most people don't run 8core machines and they especially don't run > 32thread Niagra boxes. Wait! So, what do you check you're email with? :) Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
On Fri, 2007-11-09 at 16:41 +0100, Sebastian Hennebrueder wrote: > If the queries are complex, this is understable. I had a performance > review of a Hibernate project (Java Object Relation Mapping) using > MySQL. ORM produces easily "complex" queries with joins and subqueries. > MySQL uses nested loops for subqueries which lead to performance issues > with growing database size. Even for Postgresql, nested loops are still evil and hampers performance.
> -----Original Message----- > From: Ow Mun Heng > Subject: Re: [PERFORM] PostgreSQL vs MySQL, and FreeBSD > > Even for Postgresql, nested loops are still evil and hampers > performance. I don't know about that. There are times when it is the right plan: explain analyze select * from table1 t1 inner join table2 t2 on t1.f_id = t2.id where t1.id = 'xyzzy'; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------- Nested Loop (cost=0.00..17.65 rows=1 width=344) (actual time=0.080..0.096 rows=1 loops=1) -> Index Scan using table1_pkey on table1 t (cost=0.00..9.18 rows=1 width=238) (actual time=0.044..0.048 rows=1 loops=1) Index Cond: ((id)::text = 'xyzzy'::text) -> Index Scan using table2_pkey on table2 i (cost=0.00..8.46 rows=1 width=106) (actual time=0.019..0.023 rows=1 loops=1) Index Cond: (t.f_id = i.id) Total runtime: 0.224 ms set enable_nestloop=off; SET explain analyze select * from table1 t1 inner join table2 t2 on t1.f_id = t2.id where t1.id = 'xyzzy'; QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------------ Hash Join (cost=9.18..72250.79 rows=1 width=344) (actual time=13493.572..15583.049 rows=1 loops=1) Hash Cond: (i.id = t.f_id) -> Seq Scan on table2 i (cost=0.00..61297.40 rows=2188840 width=106) (actual time=0.015..8278.347 rows=2188840 loops=1) -> Hash (cost=9.18..9.18 rows=1 width=238) (actual time=0.056..0.056 rows=1 loops=1) -> Index Scan using table1_pkey on table1 t (cost=0.00..9.18 rows=1 width=238) (actual time=0.040..0.045 rows=1 loops=1) Index Cond: ((id)::text = 'xyzzy'::text) Total runtime: 15583.212 ms (I changed the table names, but everything else is real.)
On Nov 16, 2007 10:56 AM, Dave Dutcher <dave@tridecap.com> wrote: > I don't know about that. There are times when it is the right plan: Agreed. IMHO, there's nothing wrong with nested-loop join as long as it's being used properly. -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
Dimitri wrote: > Reading this article I'm just happy for them to see progress done on FreeBSD :-) > As well to demonstrate OS parallelism it's not so impressive to see > 4CPU server results rather 8CPU or 32threaded Niagara... Don't know > why they did not present similar performance graphs for these > platform, strange no?... Well, most of the results in the document (http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf) are for 8-CPU machines, which is about the most you can get with off the shelf hardware (2x4-core CPU, the document has both Xeon and Opteron results). Niagara support is unfinished, so there's nothing to report there. On the other hand, the document does compare between several versions of Linux, FreeBSD, NetBSD and DragonflyBSD, with both MySQL and PostgreSQL, so you can draw your conclusions (if any) from there.
Attachment
On Fri, 16 Nov 2007 11:06:11 -0500 "Jonah H. Harris" <jonah.harris@gmail.com> wrote: > On Nov 16, 2007 10:56 AM, Dave Dutcher <dave@tridecap.com> wrote: > > I don't know about that. There are times when it is the right > > plan: > > Agreed. IMHO, there's nothing wrong with nested-loop join as long > as it's being used properly. Can you explain further please? (I'm not disagreeing with you, just want to know when nested loops are not used properly - does the planner make mistakes that you have to watch out for?) Thx, Josh
On Nov 16, 2007 3:36 PM, Josh Trutwin <josh@trutwins.homeip.net> wrote: > > Agreed. IMHO, there's nothing wrong with nested-loop join as long > > as it's being used properly. > > Can you explain further please? (I'm not disagreeing with you, just > want to know when nested loops are not used properly - does the > planner make mistakes that you have to watch out for?) As long as statistics are updated properly, it's generally not an issue. You just don't want the system using a nested-loop join incorrectly (like when table sizes are equal, the outer table is larger than the inner table, or the inner table itself is overly large). -- Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com Edison, NJ 08837 | http://www.enterprisedb.com/
On Fri, 2007-11-16 at 11:06 -0500, Jonah H. Harris wrote: > On Nov 16, 2007 10:56 AM, Dave Dutcher <dave@tridecap.com> wrote: > > I don't know about that. There are times when it is the right plan: > > Agreed. IMHO, there's nothing wrong with nested-loop join as long as > it's being used properly. I do agree also, but in some other cases, the usage of nested loops (esp when the number of rows estimated to be returned vs the actual number of rows being returned differs by up to 100x (or more) then it becomes a major issue. The example pointed out by Dave D shows the est rows = 1 and actual rows=1, then good performance of course.