Thread: SQL Server performing much better?!?!
Hi,

I'm having a little bit of trouble figuring out a query and why it takes so long to execute. As a preface, I have the same exact database (or at least a very close duplicate) set up in micro$oft SQL Server, with the same schema, data, indexes, etc. When I run this query on a dinky Windows 2000 machine (my laptop) it runs relatively fast and comes back in 7 seconds. I have PostgreSQL set up on a Sun Solaris box with 1 GB of memory and it takes over 25 seconds! One caveat is that it's a shared managed server run by Verio (VPS), but I still figure it's beefier than my puny laptop! Both databases contain the same data and the same indexes. I've even tried bumping sort_mem up from 512 to 16384 (16 MB?). The result set that comes back is 8604 rows, but even if I put a LIMIT of 5 it still takes a long time to run.

Also, I've observed that when I run this query against the PostgreSQL database it pegs the CPU on the server at 100%; I don't see any iowait, memory usage doesn't jump, and the swap in use stays the same (all monitored from top). From the plan it looks like it's using indexes where appropriate, and when it does do a table scan the cost doesn't seem to be that much. The big operations seem to be the merge joins. I've included both the query and the plan below. Any help would be HUGELY appreciated, as I'm in the embarrassing situation where SQL Server is heavily outperforming PostgreSQL running on a Sun box. Heck, it's even running faster on my laptop running cygwin configured the same way! PLEASE HELP!

Thanks!
Christian

====================================================
Query:
====================================================

SELECT
    a.user_id, b.sample_id
FROM
    user_company a,
    sample_manufacturer b,
    samples c
WHERE
    a.company_id = b.manufacturer_id AND
    b.sample_id = c.sample_id AND
    c.sample_state = 1
UNION
SELECT
    a.user_id, b.sample_id
FROM
    user_company a,
    samples b,
    users c
WHERE
    a.company_id = b.sample_manufacturer_id AND
    b.sample_state = 1 AND
    b.sample_author_id = c.user_id AND
    NOT EXISTS
    (
        SELECT
            p.territory_id
        FROM
            territories p,
            territory_ranges q,
            manufacturer_territories r
        WHERE
            r.manufacturer_id = b.sample_manufacturer_id AND
            r.assignment_flag = 2 AND
            r.territory_id = p.territory_id AND
            p.territory_id = q.territory_id AND
            p.type IN (1, 2) AND
            c.zip BETWEEN q.start_value AND q.end_value
    )
LIMIT 5

====================================================
Plan:
====================================================

Limit  (cost=26137.82..26138.07 rows=5 width=33) (actual time=25895.91..25896.01 rows=5 loops=1)
  ->  Unique  (cost=26137.82..26251.99 rows=2283 width=33) (actual time=25895.90..25895.98 rows=6 loops=1)
        ->  Sort  (cost=26137.82..26137.82 rows=22834 width=33) (actual time=25895.88..25895.90 rows=13 loops=1)
              ->  Append  (cost=98.67..24484.74 rows=22834 width=33) (actual time=82.38..25102.67 rows=19265 loops=1)
                    ->  Subquery Scan *SELECT* 1  (cost=98.67..342.06 rows=16937 width=20) (actual time=82.37..609.69 rows=17098 loops=1)
                          ->  Merge Join  (cost=98.67..342.06 rows=16937 width=20) (actual time=82.35..434.78 rows=17098 loops=1)
                                ->  Index Scan using ix_user_company_company_id on user_company a  (cost=0.00..26.85 rows=1168 width=8) (actual time=0.41..11.83 rows=373 loops=1)
                                ->  Sort  (cost=98.67..98.67 rows=763 width=12) (actual time=79.78..108.86 rows=17098 loops=1)
                                      ->  Hash Join  (cost=33.09..62.16 rows=763 width=12) (actual time=26.18..58.97 rows=769 loops=1)
                                            ->  Seq Scan on sample_manufacturer b  (cost=0.00..15.69 rows=769 width=8) (actual time=0.12..13.54 rows=769 loops=1)
                                            ->  Hash  (cost=31.59..31.59 rows=602 width=4) (actual time=23.93..23.93 rows=0 loops=1)
                                                  ->  Seq Scan on samples c  (cost=0.00..31.59 rows=602 width=4) (actual time=0.17..18.48 rows=602 loops=1)
                    ->  Subquery Scan *SELECT* 2  (cost=24038.44..24142.68 rows=5897 width=33) (actual time=24381.93..24456.97 rows=2167 loops=1)
                          ->  Merge Join  (cost=24038.44..24142.68 rows=5897 width=33) (actual time=24381.92..24433.35 rows=2167 loops=1)
                                ->  Index Scan using ix_user_company_company_id on user_company a  (cost=0.00..26.85 rows=1168 width=8) (actual time=0.06..10.58 rows=373 loops=1)
                                ->  Sort  (cost=24038.44..24038.44 rows=301 width=25) (actual time=24380.15..24383.39 rows=2165 loops=1)
                                      ->  Merge Join  (cost=0.00..24026.05 rows=301 width=25) (actual time=1346.68..24378.12 rows=97 loops=1)
                                            ->  Index Scan using ix_samples_author_id on samples b  (cost=0.00..42.12 rows=602 width=12) (actual time=0.46..32.05 rows=602 loops=1)
                                            ->  Index Scan using users_pkey on users c  (cost=0.00..60.58 rows=1144 width=13) (actual time=0.40..57.70 rows=1419 loops=1)
                                            SubPlan
                                              ->  Nested Loop  (cost=0.00..53.30 rows=1 width=12) (actual time=40.27..40.27 rows=1 loops=602)
                                                    ->  Nested Loop  (cost=0.00..48.43 rows=2 width=8) (actual time=40.18..40.18 rows=1 loops=602)
                                                          ->  Index Scan using ix_manufacturer_terr_combo3 on manufacturer_territories r  (cost=0.00..19.80 rows=5 width=4) (actual time=0.11..0.91 rows=27 loops=602)
                                                          ->  Index Scan using ix_territory_ranges_combo_1 on territory_ranges q  (cost=0.00..5.71 rows=1 width=4) (actual time=1.46..1.46 rows=0 loops=15995)
                                                    ->  Index Scan using territories_pkey on territories p  (cost=0.00..3.02 rows=1 width=4) (actual time=0.08..0.08 rows=1 loops=505)
Total runtime: 25915.75 msec
Good point, here's the info I should've also provided:

- PostgreSQL 7.2, installed from source.
- Solaris 2.6
- The only compilation option changed was --prefix, to install to a different directory.
- All config settings were left at the defaults, though I also experimented with upping sort_mem, but that's it.
- To start it, all I do is run this command:
  postmaster -i -S -D /usr/local/pgsql/data >& /usr/var/log/pgsql.log

That's pretty much it, obviously nothing too fancy. Again, any help would be incredibly appreciated. Right now I've got the small company I work for, which used to be completely a Micro$oft shop, totally sold on unix and more importantly open source, but this performance problem sure isn't helping my little "campaign." ;-)

Thanks,
Christian

-----Original Message-----
From: fred@panda.baobab.home [mailto:fred@panda.baobab.home] On Behalf Of Frederic Saincy
Sent: Tuesday, March 19, 2002 8:37 PM
To: Christian Cabanero
Subject: Re: [GENERAL] SQL Server performing much better?!?!

Hi,

Maybe you should post on the list:
- your PostgreSQL version (there are some bugs in older optimizers for some SQL syntax)
- your Solaris version
- your compilation options
- your config files
- your starting options
...

I can't help right now, maybe someone else can.

Bye.
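For reference, the sort_mem experiment Christian describes above boils down to a couple of commands; a minimal sketch, using the 16384 value from his post (sort_mem is measured in kilobytes, so that is roughly 16 MB):

-- Raise the sort memory limit for the current session only (value in kB);
-- the postgresql.conf equivalent is a "sort_mem = 16384" line followed by
-- restarting the postmaster.
SET sort_mem TO 16384;
SHOW sort_mem;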
On Wed, 2002-03-20 at 01:50, Christian Cabanero wrote:
> Good point, here's the info I should've also provided:
> [snip]

I don't think many people on the list are MS SQL experts -- IMO, the best we can do is concentrate on improving PostgreSQL performance. To that end, can you tell us:

(1) the query that is slow in Postgres but fast in MS SQL
(2) the definition of any tables, indexes, views, etc. used by the query in #1
(3) the output of EXPLAIN for the query in #1

Also, have you run VACUUM ANALYZE?

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
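To make the statistics question concrete, refreshing them is just a handful of commands; a minimal sketch using the table names from the query in the original post (run as the table owner or a superuser, then rerun EXPLAIN ANALYZE on the slow query):

-- Reclaim dead tuples and refresh the planner's statistics for each table
-- touched by the slow query; a bare "VACUUM ANALYZE;" does the whole database.
VACUUM ANALYZE user_company;
VACUUM ANALYZE samples;
VACUUM ANALYZE sample_manufacturer;
VACUUM ANALYZE users;
VACUUM ANALYZE territories;
VACUUM ANALYZE territory_ranges;
VACUUM ANALYZE manufacturer_territories;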
On Tue, 19 Mar 2002 18:06:44 -0800
"Christian Cabanero" <chumpboy@yahoo.com> wrote:

> (all monitored from top). From the plan it looks like it's using indexes
> where appropriate and when it does do a table scan the cost doesn't seem to
> be that much. The big operations seem to be when merge joins are performed.
> I've included both the query and the plan below.

When EXISTS (or NOT EXISTS) clauses are involved, it is sometimes the case that the results you are so eagerly waiting for are not returned quickly.

Regards,
Masaru Sugawara
Hi Christian,

> I'm having a little bit of trouble figuring out a query and why it takes so
> long to execute. As a preface, I have the same exact database (or at least
> very closely duplicated) set up in micro$oft SQL Server, with the same
> schema, data and the same indexes, etc. When I run this query on a dinky
> Windows2000 machine (my laptop) it runs relatively fast and comes back in 7
> seconds. I have postgreSQL set up on a sun solaris box with 1 GB of memory
> and it takes over 25 seconds!
>
> Heck, it's even running faster on my laptop running cygwin configured in the
> same way! PLEASE HELP!

If your laptop runs faster using PostgreSQL under cygwin, the problem should be in your Solaris server configuration and not in PostgreSQL. PostgreSQL on cygwin is not considered production ready, so I guess it is no surprise that it is slower than MS SQL Server, but the Solaris machine should be much faster.

Hint: do you run vacuum frequently?

[]s, Fernando Lozano
MS SQL Server is a very good database. It has hundreds of well-paid programmers working on it. It makes decisions for you about things that PostgreSQL will not.

I migrated from MSSQL Server to PostgreSQL and found that I had done a lot of things 'wrong', or at least inefficiently, and MSSQL Server let me get away with it; PostgreSQL did not.

PostgreSQL will probably never outperform MSSQL in all areas. PostgreSQL will always outperform MSSQL Server in price, stability, portability, and support.

Ian A. Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
(253) 798-3549
mailto: iharding@tpchd.org

>>> "Christian Cabanero" <chumpboy@yahoo.com> 03/19/02 06:06PM >>>
[snip]
On Tue, 19 Mar 2002 18:06:44 -0800
"Christian Cabanero" <chumpboy@yahoo.com> wrote:

> SELECT
>     a.user_id, b.sample_id
> FROM
>     user_company a,
>     samples b,
>     users c
> WHERE
>     a.company_id = b.sample_manufacturer_id AND
>     b.sample_state = 1 AND
>     b.sample_author_id = c.user_id AND
>     NOT EXISTS
>     (
>         SELECT
>             p.territory_id
>         FROM
>             territories p,
>             territory_ranges q,
>             manufacturer_territories r
>         WHERE
>             r.manufacturer_id = b.sample_manufacturer_id AND
>             r.assignment_flag = 2 AND
>             r.territory_id = p.territory_id AND
>             p.territory_id = q.territory_id AND
>             p.type IN (1, 2) AND
>             c.zip BETWEEN q.start_value AND q.end_value
>     )

If queries 1 and 2 below don't return too many rows and query 2 isn't slow, using EXCEPT ALL might work faster than using NOT EXISTS. Could you show us the EXPLAIN ANALYZE results for each of them?

-- query 1.
SELECT
    COUNT(*)  -- a.user_id, b.sample_id
FROM
    user_company a,
    samples b,
    users c
WHERE
    a.company_id = b.sample_manufacturer_id AND
    b.sample_state = 1 AND
    b.sample_author_id = c.user_id
;

-- query 2.
SELECT
    COUNT(*)  -- a.user_id, b.sample_id
FROM
    user_company a,
    samples b,
    users c,
    territories p,
    territory_ranges q,
    manufacturer_territories r
WHERE
    a.company_id = b.sample_manufacturer_id AND
    b.sample_state = 1 AND
    b.sample_author_id = c.user_id AND
    r.manufacturer_id = b.sample_manufacturer_id AND
    r.assignment_flag = 2 AND
    r.territory_id = p.territory_id AND
    p.territory_id = q.territory_id AND
    p.type IN (1, 2) AND
    c.zip BETWEEN q.start_value AND q.end_value
;

Regards,
Masaru Sugawara
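To make the EXCEPT ALL idea concrete, here is roughly what the second arm of the UNION could look like with the NOT EXISTS folded away. This is an untested sketch against the schema implied by the original query: EXCEPT ALL and NOT EXISTS are not strictly identical when the territory subquery matches a (user_id, sample_id) pair more than once, but since the enclosing UNION removes duplicates anyway the results should line up; it is worth verifying against the real data.

-- Second arm of the UNION, rewritten with EXCEPT ALL instead of NOT EXISTS
-- (untested sketch; the column list must match on both sides of the set operation).
SELECT a.user_id, b.sample_id
FROM
    user_company a,
    samples b,
    users c
WHERE
    a.company_id = b.sample_manufacturer_id AND
    b.sample_state = 1 AND
    b.sample_author_id = c.user_id
EXCEPT ALL
SELECT a.user_id, b.sample_id
FROM
    user_company a,
    samples b,
    users c,
    territories p,
    territory_ranges q,
    manufacturer_territories r
WHERE
    a.company_id = b.sample_manufacturer_id AND
    b.sample_state = 1 AND
    b.sample_author_id = c.user_id AND
    r.manufacturer_id = b.sample_manufacturer_id AND
    r.assignment_flag = 2 AND
    r.territory_id = p.territory_id AND
    p.territory_id = q.territory_id AND
    p.type IN (1, 2) AND
    c.zip BETWEEN q.start_value AND q.end_value;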
Hi Christian,

> sold on unix and more importantly open-source, but this performance
> problem sure isn't helping my little "campaign." ;-)

No doubt! Um... it looks like Solaris is the issue here. If you can persuade your server management company to play ball, how about:

1/ Change platform: use an Intel box with Linux or FreeBSD; you can be assured of _vastly_ better performance with PostgreSQL in either case. (or...)

2/ Change OS: run Linux on the SPARC.

Even if your management company will not help you out, option 1/ is pretty inexpensive... and you get a _more_ fully open solution (OS as well).

best of luck

Mark
(I haven't followed the thread closely, but would like to comment anyway ;)

> > sold on unix and more importantly open-source, but this performance
> > problem sure isn't helping my little "campaign." ;-)
>
> um.. looks like Solaris is the issue here - If you can persuade your
> server management company to play ball, how about :
> 1/ Change Platform
> 2/ Change OS

That is a pretty drastic set of alternatives. Solaris can do just fine as a PostgreSQL platform. Certainly 2GHz Intel hotboxes can spin more CPU cycles than a 200MHz Solaris box (I don't know the specifics of your hosting environment), and if you are just flat out asking for more than the box can do, then swapping hosts is an alternative. But unless you have exhausted the possibilities for tuning on your existing box, you likely still have some things you can do to make the most of what you have.

Good luck!

- Thomas
On Sat 23 Mar 2002 10:32, Thomas Lockhart wrote:
> (I haven't followed the thread closely, but would like to comment anyway
> ;)

Neither did I. :-)

> > > sold on unix and more importantly open-source, but this performance
> > > problem sure isn't helping my little "campaign." ;-)
> >
> > um.. looks like Solaris is the issue here - If you can persuade your
> > server management company to play ball, how about :
> > 1/ Change Platform
> > 2/ Change OS
>
> That is a pretty drastic set of alternatives. Solaris can do just fine
> as a PostgreSQL platform. Certainly 2GHz Intel hotboxes can spin more

Yes, PostgreSQL works pretty well on Solaris/SPARC, which is my main DB platform. But at this moment Solaris is an almost-dead platform (release 9 was the last, and there will be no more), and the hardware is too expensive. If you ask your SUN dealer next year what they can offer you, they will probably tell you that they have these nice Intel servers with SUN/Linux inside. :-P

So, to finish, I don't see those alternatives as so drastic, but more like realistic. :-)

Regards... :-)

--
Why use just any relational database when you can use PostgreSQL?
-----------------------------------------------------------------
Martín Marqués                   |  mmarques@unl.edu.ar
Programmer, Administrator, DBA   |  Centro de Telematica
                 Universidad Nacional del Litoral
-----------------------------------------------------------------
On Sun, 2002-03-24 at 01:32, Thomas Lockhart wrote:
> (I haven't followed the thread closely, but would like to comment anyway
> ;)

No problem... that's probably what I did... :-)

> That is a pretty drastic set of alternatives. Solaris can do just fine
> as a PostgreSQL platform. Certainly 2GHz Intel hotboxes can spin more
> cpu cycles than a 200MHz Solaris box (I don't know the specifics of your
> hosting environment), and if you are just flat out asking for more than
> the box can do then swapping hosts is an alternative. But unless you
> have exhausted the possibilities for tuning on your existing box you
> likely still have some things you can do to make the most of what you
> have.
>
> Good luck!
>
> - Thomas

Agreed... I currently work with Solaris and it's generally quite good, but there seems to be a definite performance problem with PostgreSQL on Solaris (see a previous thread with the title "Solaris Performance"). I am planning to profile a test-case query that I have, so the problem can be identified... (soon, hopefully)

Cheers

Mark
On 25 Mar 2002, Mark Kirkwood wrote:
> Agreed... I currently work with Solaris and it's generally quite good,
> but there seems to be a definite performance problem with PostgreSQL on
> Solaris (see a previous thread with the title "Solaris Performance").

Solaris is slow when writing to a UFS filesystem in the default configuration, and I guess that PostgreSQL writes out temporary data. The writes can probably be sped up by writing to tmpfs or by disabling the journal: writing data can probably be sped up with the mount option "nologging", and reading data with the mount option "noatime". Can someone try this?

--
PGP/GPG Key-ID: http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1