Thread: Tuning/performance issue...
We're having a problem with a query during our investigation into Postgres (as an Oracle replacement). This query takes Postgres 20-40 seconds across multiple runs. Tom Lane recommended I post it here, with an EXPLAIN ANALYZE.
Here's the query:
EXPLAIN ANALYZE SELECT company_name, address_1, address_2, address_3, city,
address_list.state_province_id, state_province_short_desc, country_desc, zip_code, address_list.country_id,
contact_info.email, commercial_entity.user_account_id, phone_num_1, phone_num_fax, website, boats_website
FROM commercial_entity, country, user_account,
address_list LEFT JOIN state_province ON address_list.state_province_id = state_province.state_province_id
LEFT JOIN contact_info ON address_list.contact_info_id = contact_info.contact_info_id
WHERE address_list.address_type_id = 101
AND commercial_entity.commercial_entity_id=225528
AND commercial_entity.commercial_entity_id = address_list.commercial_entity_id
AND address_list.country_id = country.country_id
AND commercial_entity.user_account_id = user_account.user_account_id
AND user_account.user_role_id IN (101, 101);
Here's the explain:
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..64570.33 rows=1 width=385) (actual time=42141.08..42152.06 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..64567.30 rows=1 width=361) (actual time=42140.80..42151.77 rows=1 loops=1)
         ->  Nested Loop  (cost=0.00..64563.97 rows=1 width=349) (actual time=42140.31..42151.27 rows=1 loops=1)
               Join Filter: ("outer".commercial_entity_id = "inner".commercial_entity_id)
               ->  Index Scan using commercial_entity_pkey on commercial_entity  (cost=0.00..5.05 rows=1 width=94) (actual time=0.57..0.58 rows=1 loops=1)
                     Index Cond: (commercial_entity_id = 225528::numeric)
               ->  Materialize  (cost=63343.66..63343.66 rows=97221 width=255) (actual time=41741.96..41901.17 rows=90527 loops=1)
                     ->  Merge Join  (cost=0.00..63343.66 rows=97221 width=255) (actual time=1.44..41387.68 rows=90527 loops=1)
                           Merge Cond: ("outer".contact_info_id = "inner".contact_info_id)
                           ->  Nested Loop  (cost=0.00..830457.52 rows=97221 width=222) (actual time=0.95..39178.32 rows=90527 loops=1)
                                 Join Filter: ("outer".state_province_id = "inner".state_province_id)
                                 ->  Index Scan using addr_list_ci_id_i on address_list  (cost=0.00..586676.65 rows=97221 width=205) (actual time=0.49..2159.90 rows=90527 loops=1)
                                       Filter: (address_type_id = 101::numeric)
                                 ->  Seq Scan on state_province  (cost=0.00..1.67 rows=67 width=17) (actual time=0.00..0.21 rows=67 loops=90527)
                           ->  Index Scan using contact_info_pkey on contact_info  (cost=0.00..3366.76 rows=56435 width=33) (actual time=0.44..395.75 rows=55916 loops=1)
         ->  Index Scan using user_account_pkey on user_account  (cost=0.00..3.32 rows=1 width=12) (actual time=0.46..0.46 rows=1 loops=1)
               Index Cond: ("outer".user_account_id = user_account.user_account_id)
               Filter: (user_role_id = 101::numeric)
   ->  Index Scan using country_pkey on country  (cost=0.00..3.01 rows=1 width=24) (actual time=0.25..0.25 rows=1 loops=1)
         Index Cond: ("outer".country_id = country.country_id)
 Total runtime: 42165.44 msec
(21 rows)
I will post the schema in a separate email - the list has rejected one big email 3 times now.
David
David Griffiths <dgriffiths@boats.com> writes:
> ... FROM commercial_entity, country, user_account,
> address_list LEFT JOIN state_province ON address_list.state_province_id
> = state_province.state_province_id
> LEFT JOIN contact_info ON address_list.contact_info_id =
> contact_info.contact_info_id
> WHERE ...

I believe what you're getting burnt by is that PG's planner interprets this as forcing the address_list * state_province * contact_info join to be done before it joins those tables to commercial_entity, country, and user_account --- for discussion see
http://www.postgresql.org/docs/7.3/static/explicit-joins.html

Unfortunately your WHERE-clause restriction conditions are on address_list, commercial_entity, and user_account, and it seems the address_list constraint is very weak. So the plan ends up forming a large fraction of the address_list * state_province * contact_info join, only to throw it away again when there are no matching rows selected from commercial_entity and user_account. The actual runtimes and actual row counts in the EXPLAIN ANALYZE output show that this is what's happening.

The most efficient way to handle this query would probably be to join the three tables with restrictions first, and then join the other tables to those. You could force this with not too much rewriting, using something like (untested, but I think it's right)

    ... FROM commercial_entity CROSS JOIN user_account CROSS JOIN
    address_list LEFT JOIN state_province ON address_list.state_province_id
    = state_province.state_province_id
    LEFT JOIN contact_info ON address_list.contact_info_id =
    contact_info.contact_info_id
    CROSS JOIN country
    WHERE ...

The explicit JOINs associate left-to-right, so this gives the intended join order. (In your original query, explicit JOIN binds more tightly than commas do.)

The reason PG's planner doesn't discover this join order for itself is that it's written not to attempt to re-order outer joins from the syntactically defined ordering. In general, such reordering would change the results. It is possible to analyze the query and prove that certain reorderings are valid (don't change the results), but we don't currently have code to do that.

> As a reference, our production Oracle database (exactly the same
> hardware, but RAID-mirroring) with way more load can handle the query in
> 1-2 seconds. I have MySQL 4.0.14 with InnoDB on the same machine
> (shut down when I am testing Postgres, and vice versa) and it does the
> query in 0.20 seconds.

I'm prepared to believe that Oracle contains code that actually does the analysis about which outer-join reorderings are valid, and is then able to find the right join order by deduction. The last I heard about MySQL, they have no join-order analysis at all; they unconditionally interpret this type of query left-to-right, i.e. as

    ... FROM ((((commercial_entity CROSS JOIN country) CROSS JOIN
    user_account) CROSS JOIN address_list)
    LEFT JOIN state_province ON ...)
    LEFT JOIN contact_info ON ...
    WHERE ...

This is clearly at odds with the SQL spec's syntactically defined join-order semantics. It's possible that it always yields the same results as the spec requires, but I'm not at all sure about that. In any case, this strategy is certainly not "better" than ours; it just performs poorly on a different set of queries. Would I be out of line to speculate that your query was previously tuned to work well in MySQL?

regards, tom lane
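Spelled out against the whole query, Tom's suggested rewrite might look like the sketch below (untested, per his own caveat; the SELECT list and WHERE clause are copied unchanged from David's original):

    SELECT company_name, address_1, address_2, address_3, city,
           address_list.state_province_id, state_province_short_desc,
           country_desc, zip_code, address_list.country_id,
           contact_info.email, commercial_entity.user_account_id,
           phone_num_1, phone_num_fax, website, boats_website
    FROM commercial_entity               -- carries the =225528 restriction
         CROSS JOIN user_account         -- carries the user_role_id restriction
         CROSS JOIN address_list
         LEFT JOIN state_province
              ON address_list.state_province_id = state_province.state_province_id
         LEFT JOIN contact_info
              ON address_list.contact_info_id = contact_info.contact_info_id
         CROSS JOIN country              -- pure lookup table, joined last
    WHERE address_list.address_type_id = 101
      AND commercial_entity.commercial_entity_id = 225528
      AND commercial_entity.commercial_entity_id = address_list.commercial_entity_id
      AND address_list.country_id = country.country_id
      AND commercial_entity.user_account_id = user_account.user_account_id
      AND user_account.user_role_id IN (101, 101);

Because the explicit JOINs associate left-to-right, the two restricted tables are joined before the large address_list * state_province * contact_info product can be formed.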
> The most efficient way to handle this query would probably be to join
> the three tables with restrictions first, and then join the other tables
> to those. You could force this with not too much rewriting using
> something like (untested, but I think it's right)
>
> ... FROM commercial_entity CROSS JOIN user_account CROSS JOIN
> address_list LEFT JOIN state_province ON address_list.state_province_id
> = state_province.state_province_id
> LEFT JOIN contact_info ON address_list.contact_info_id =
> contact_info.contact_info_id
> CROSS JOIN country
> WHERE ...
>
> The explicit JOINs associate left-to-right, so this gives the intended
> join order. (In your original query, explicit JOIN binds more tightly
> than commas do.)

Ok - that's interesting - I'll have to do some reading and more testing.

> The reason PG's planner doesn't discover this join order for itself
> is that it's written to not attempt to re-order outer joins from the
> syntactically defined ordering. In general, such reordering would
> change the results. It is possible to analyze the query and prove that
> certain reorderings are valid (don't change the results), but we don't
> currently have code to do that.

Not sure I follow. Are you saying that, depending on when the outer join is applied to the rows found at the time, you may end up with a different set of rows?

I would have expected the optimizer to do the outer joins last, as the extra data received by the outer joins is not mandatory, and won't affect the rows that were retrieved by joining user_account, address_list, and commercial_entity. An outer join would *never* be the most restrictive join in a query.

I thought (from my readings on Oracle query tuning) that finding the most restrictive table/index was the first task of an optimizer: reduce the result set as quickly as possible. The query has the line "AND commercial_entity.commercial_entity_id=225528", which uses an index (the primary key) and uses an "=". I would have expected that to be done first, then joined with the other inner-join tables, and finally have the outer joins applied to the final result set to fill in the "might be there" data.

Anyway, if the optimizer does the outer joins first (address_list with state_province and contact_info), then it's picking the table with the most rows (address_list has 200K+ rows, where the other 3 big tables have 70K-90K). Would re-ordering the FROM clause (and LEFT JOIN portions) help?

Could you give an example where applying an outer join at a different time could result in different results? I think I can see a situation where you use part of the results of the outer join in the WHERE clause, but I am not sure.

> I'm prepared to believe that Oracle contains code that actually does the
> analysis about which outer-join reorderings are valid, and is then able
> to find the right join order by deduction.

I'm not sure about Oracle (other than what I stated above). In fact, about half the time, updating table stats to try to get the Oracle optimizer to do a better job on a query results in even worse performance.

> ... FROM ((((commercial_entity CROSS JOIN country) CROSS JOIN
> user_account) CROSS JOIN address_list)
> LEFT JOIN state_province ON ...)
> LEFT JOIN contact_info ON ...
> WHERE ...
>
> This is clearly at odds with the SQL spec's syntactically defined join
> order semantics. It's possible that it always yields the same results
> as the spec requires, but I'm not at all sure about that.

Again, I don't know.
On the 3 queries based on these tables, Postgres and MySQL return the exact same data (they use the same data set). Do you have a link to the SQL spec's join-order requirements?

> In any case
> this strategy is certainly not "better" than ours, it just performs
> poorly on a different set of queries. Would I be out of line to
> speculate that your query was previously tuned to work well in MySQL?

The query was pulled from our codebase (written for Oracle). I added a bit to it to make it slower, and then ported it to MySQL and tested there first (just re-wrote the outer-join syntax). I found that re-ordering the tables in the FROM clause on MySQL changed the time by 45-ish% (0.36 seconds to 0.20 seconds), but that's because I had forgotten to re-analyze the tables after refreshing the dataset. Now, table order doesn't make a difference in speed (or results).

If anything, I've done more tuning for Postgres - added some extra indexes to try to help (country.country_id had a composite index with another column, but not an index for just it), etc.

The dataset and schema are pure Oracle. I extracted them out of the database, removed all Oracle-specific extensions, changed the column types, and migrated the indexes and foreign keys to MySQL and Postgres. Nothing more (other than an extra index or two for Postgres - nada for MySQL).

This is all part of a "migrate away from Oracle" project. We are looking at 3 databases - MySQL (InnoDB), Postgres and Matisse (object oriented). We have a lot of queries like this or worse, and I'm worried that many of them would need to be re-written. The developers know SQL, but nothing about tuning, etc.

Thanks for the quick response - I will try explicit joining, and I'm looking forward to your comments on outer joins and the optimizer (and anything else I've written).

David.
On Tue, 30 Sep 2003, David Griffiths wrote:

> This is all part of a "migrate away from Oracle" project. We are looking at
> 3 databases - MySQL (InnoDB), Postgres and Matisse (object oriented). We
> have a lot of queries like this or worse, and I'm worried that many of
> them would need to be re-written. The developers know SQL, but nothing
> about tuning, etc.

There's a movement at my company to ditch several commercial db's in favor of a free one. I'm currently the big pg fan around here and I've actually written a rather lengthy presentation about pg features, why, tuning, etc., but another part was some comparisons to other db's.

So I wouldn't be blindly flaming mysql, I decided to give it a whirl and loaded it up with the same dataset as pg. First thing I hit was lack of stored procedures. But I decided to code around that, giving mysql the benefit of the doubt. What I found was interesting.

For 1-2 concurrent 'beaters' it screamed. Ultra-fast. But if you increase the concurrent beaters up to, say, 20, Mysql comes to a grinding halt. Mysql and the machine itself become fairly unresponsive. And if you do cache-unfriendly queries it becomes even worse. On PG - no problems at all. It scaled fine and dandy, and with 40 concurrent beaters the machine was still responsive. (The numbers for 20 clients were 220 seconds (pg) and 650 seconds (mysql).)

So that is another test to try out - given your configuration I expect you have lots of concurrent activity.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
David Griffiths <dgriffiths@boats.com> writes:
>> The reason PG's planner doesn't discover this join order for itself
>> is that it's written to not attempt to re-order outer joins from the
>> syntactically defined ordering. In general, such reordering would
>> change the results. It is possible to analyze the query and prove that
>> certain reorderings are valid (don't change the results), but we don't
>> currently have code to do that.

> Not sure I follow. Are you saying that, depending on when the outer-join is
> applied to the rows found at the time, you may end up with a different set
> of rows?

Here's an example showing that it's not always safe to rearrange join order in the presence of outer joins:

jtest=# create table a (f1 int);
CREATE TABLE
jtest=# create table b (f1 int, f2 int);
CREATE TABLE
jtest=# create table c (f1 int, f2 int);
CREATE TABLE
jtest=# insert into a values (1);
INSERT 431307 1
jtest=# insert into b values (10,10);
INSERT 431308 1
jtest=# insert into b values (11,11);
INSERT 431309 1
jtest=# insert into c values (1,10);
INSERT 431310 1
jtest=# insert into c values (2,11);
INSERT 431311 1

jtest=# SELECT * FROM a, b LEFT JOIN c ON b.f2 = c.f2 WHERE a.f1 = c.f1;
 f1 | f1 | f2 | f1 | f2
----+----+----+----+----
  1 | 10 | 10 |  1 | 10
(1 row)

Per spec the JOIN operator binds more tightly than comma, so this is equivalent to:

jtest=# SELECT * FROM a JOIN (b LEFT JOIN c ON b.f2 = c.f2) ON a.f1 = c.f1;
 f1 | f1 | f2 | f1 | f2
----+----+----+----+----
  1 | 10 | 10 |  1 | 10
(1 row)

Now suppose we try to join A and C before joining to B:

jtest=# SELECT * FROM b LEFT JOIN (a JOIN c ON a.f1 = c.f1) ON b.f2 = c.f2;
 f1 | f2 | f1 | f1 | f2
----+----+----+----+----
 10 | 10 |  1 |  1 | 10
 11 | 11 |    |    |
(2 rows)

We get a different answer, because some C rows are eliminated before reaching the left join, causing null-extended B rows to be added. (I don't have a MySQL installation here to try, but if they still work the way they used to, they get the wrong answer on the first query.)

The point of this example is just that there are cases where it'd be incorrect for the planner to change the ordering of joins from what is implied by the query syntax. It is always safe to change the join order when only inner joins are involved. There are cases where outer-join order is safe to change too, but you need analysis code that checks the query conditions to prove that a particular rearrangement is safe. Right now we don't have such code, and so we just follow the simple rule "never rearrange any outer joins".

> I would have expected the optimizer to do the outer-joins last, as the
> extra data received by the outer-joins is not mandatory, and won't
> affect the rows that were retrieved by joining user_account,
> address_list, and commercial_entity.

I think your example falls into the category of provably-safe rearrangements ... but as I said, the planner doesn't know that.

> An outer join would *never* be the most restrictive
> join in a query.

Sure it can, if the restriction conditions are mainly on the outer join's tables. But that's not really the issue here. As best I can tell without seeing your data statistics, the most restrictive conditions in your query are the ones on commercial_entity.commercial_entity_id and user_account.user_role_id. The trick is to apply those before joining any other tables.

regards, tom lane
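For contrast, here is a sketch of the provably-safe case David has in mind, reusing Tom's jtest tables (these two queries are an illustration added here, not part of the original exchange). When the WHERE clause restricts only tables on the preserved side of the LEFT JOIN, no c rows can be eliminated before the outer join is applied, so both orderings should return the same single row (1 | 10 | 10 | 1 | 10):

    -- Spec order: b LEFT JOIN c first, then the comma-join to a:
    SELECT * FROM a, b LEFT JOIN c ON b.f2 = c.f2
     WHERE a.f1 = 1 AND b.f1 = 10;

    -- Rearranged: join a and b first, outer-join c last:
    SELECT * FROM (a CROSS JOIN b)
     LEFT JOIN c ON b.f2 = c.f2
     WHERE a.f1 = 1 AND b.f1 = 10;

Unlike Tom's "WHERE a.f1 = c.f1" example, nothing here references the null-extended side c, which is what makes the rearrangement safe.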
Jeff,

I would really appreciate if you could send me that lengthy presentation that you've written on pg/other dbs comparison.

Thanks.

Oleg
On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> Jeff,
> I would really appreciate if you could send me that lengthy presentation
> that you've written on pg/other dbs comparison.
> Thanks.

After I give the presentation at work and collect comments from my coworkers (and remove some information you folks don't need to know :) I will be very willing to post it for people to see.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
That would be great! When do you think this would be ready for us to see? ;)
I have updated the FAQ to be:

    In comparison to MySQL or leaner database systems, we are
    faster for multiple users, complex queries, and a read/write query
    load. MySQL is faster for SELECT queries done by a few users.

Is this accurate? It seems so.

--
Bruce Momjian                     |  http://candle.pha.pa.us
pgman@candle.pha.pa.us            |  (610) 359-1001
+  If your life is a hard drive,  |  13 Roberts Road
+  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
On Fri, 2003-10-03 at 21:39, Bruce Momjian wrote:
> I have updated the FAQ to be:
>
>     In comparison to MySQL or leaner database systems, we are
>     faster for multiple users, complex queries, and a read/write query
>     load. MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate? It seems so.

May wish to say "... for simple SELECT queries ...".

Several left outer joins, subselects and a large number of joins are regularly performed faster in PostgreSQL due to a more mature optimizer. But MySQL can pump out SELECT * FROM table WHERE key = value; queries in a hurry.

I've often wondered if they win on those because they have a lighter-weight parser/optimizer with fewer "let's try simplifying this query" steps, or if the MyISAM storage mechanism is simply quicker at pulling data off the disk.
Rod Taylor wrote:
> May wish to say "... for simple SELECT queries ...".

Updated.

> Several left outer joins, subselects and a large number of joins are
> regularly performed faster in PostgreSQL due to a more mature optimizer.
> But MySQL can pump out SELECT * FROM table WHERE key = value; queries in
> a hurry.
>
> I've often wondered if they win on those because they have a lighter-weight
> parser/optimizer with fewer "let's try simplifying this query"

I think that is part of it.

> steps or if the MyISAM storage mechanism is simply quicker at pulling
> data off the disk.

And their heap is indexed by MyISAM, right. I know with Ingres that ISAM was usually faster than btree because you didn't have all those leaves to traverse to get to the data.

--
Bruce Momjian                     |  http://candle.pha.pa.us
pgman@candle.pha.pa.us            |  (610) 359-1001
+  If your life is a hard drive,  |  13 Roberts Road
+  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
Centuries ago, Nostradamus foresaw when pgman@candle.pha.pa.us (Bruce Momjian) would write:
> I have updated the FAQ to be:
>
>     In comparison to MySQL or leaner database systems, we are
>     faster for multiple users, complex queries, and a read/write query
>     load. MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate? It seems so.

I would think it more accurate if you use the phrase "faster for simple SELECT queries."

MySQL uses a rule-based optimizer which, when the data fits the rules well, can pump queries through lickety-split without any appreciable pause for evaluation (or reflection :-). That's _quite_ a successful strategy when users are doing what loosely amounts to evaluating association tables:

    select * from table where key = value;

which is just like tying a Perl variable to a hash table and doing

    $value = $TABLE{$key};

In web applications where they wanted something a _little_ more structured than hash tables, that may 'hit the spot.' Anything hairier than that gets, of course, hairier. If you want something that's TRULY more structured, you may lose a lot of hair :-).

--
output = reverse("gro.gultn" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/oses.html
"If you want to talk with some experts about something, go to the bar where they hang out, buy a round of beers, and they'll surely talk your ear off, leaving you wiser than before. If you, a stranger, show up at the bar, walk up to the table, and ask them to fax you a position paper, they'll tell you to call their office in the morning and ask for a rate sheet." -- Miguel Cruz
Rod Taylor <rbt@rbt.ca> writes:
> I've often wondered if they win on those because they have a lighter-weight
> parser/optimizer with fewer "let's try simplifying this query" steps,
> or if the MyISAM storage mechanism is simply quicker at pulling
> data off the disk.

Comparing pre-PREPAREd queries would probably tell something about that.

regards, tom lane
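For what it's worth, such a comparison might be set up along the following lines in PostgreSQL (a sketch only; the table and column names are placeholders, not anything from this thread). PREPARE parses and plans the statement once, so timing repeated EXECUTEs against the equivalent un-prepared SELECTs isolates parse/plan overhead from executor and storage cost:

    -- Parsing and planning happen once, here:
    PREPARE lookup(integer) AS
        SELECT * FROM mytable WHERE key = $1;

    -- Only the executor runs on each of these:
    EXECUTE lookup(42);
    EXECUTE lookup(43);

    -- Release the prepared statement when done:
    DEALLOCATE lookup;

If the prepared version closes most of the gap on simple key lookups, the parser/optimizer is the likely culprit; if not, the difference more plausibly lies in the storage layer.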
On Fri, 3 Oct 2003, Bruce Momjian wrote:

> I have updated the FAQ to be:
>
>     In comparison to MySQL or leaner database systems, we are
>     faster for multiple users, complex queries, and a read/write query
>     load. MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate? It seems so.

Another thing I noticed: if you use a dataset that can live in mysql's query cache / os cache it screams, until it has to hit the disk. Then: GRINDING HALT.

It would be nice if someone (I don't have the time now) did a comparison of, say:

    select value where name = XXX;   [where XXX varies]

with 1, 10, 20, 50 connections, then make progressively more complex queries. And occasionally point out mysql's silly omissions:

    select * from myview where id = 1234;

[Oh wait! mysql doesn't have views. Ooopsy!]

Wrapping up - PG is not that slow for simple queries either. It can be rather zippy - and PREPARE can give HUGE gains, even for simple statements. I've often wondered if yacc, etc. is a bottleneck (you can only go as fast as your parser can go).

Hurray for PG! And I'm giving my PG presentation Monday. I hope to post it Tuesday after I update it with comments I receive and remove confidential information.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/