Thread: same question, little different test: MSSQL vs. Postgres
I also tried a simple query on both MSSQL and Postgres:

    select * from tblcase where clientnum = 'SAKS'

MSSQL was 3 secs, Postgres was 27 secs.

There is a key on clientnum, but on both systems (identical Dell desktops; Postgres running Linux, MSSQL running XP) it appeared that neither did an indexed search.
Joel Fradkin
Wazagua, Inc.
One of the things you'll want to do regularly is run a "vacuum analyze". You can read up on this in the postgresql docs. This is essential to the indexes being used properly. At a bare minimum, after you import a large amount of data, you'll want to run vacuum analyze.

Dennis Sacks
dennis@illusions.com
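For example, after a bulk import the maintenance step Dennis describes is a single statement; a minimal sketch, using the table from this thread:

    -- Reclaim dead space and refresh the planner's statistics;
    -- without current statistics the planner can't judge whether
    -- an index scan on clientnum is worthwhile.
    VACUUM ANALYZE tblcase;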
"Joel Fradkin" <jfradkin@wazagua.com> writes: > I also tried a simple select * from tblcase where clientum = 'SAKS' Try: explain analyze select * from tblcase where clientum = 'SAKS' Send the output. -- greg
QUERY PLAN "Seq Scan on tblcase (cost=0.00..30066.21 rows=37401 width=996) (actual time=0.344..962.260 rows=22636 loops=1)" " Filter: ((clientnum)::text = 'SAKS'::text)" "Total runtime: 1034.434 ms" Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 jfradkin@wazagua.com www.wazagua.com Powered by Wazagua Providing you with the latest Web-based technology & advanced tools. C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, IncThis email message is for the use of the intended recipient(s) andmay contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments. -----Original Message----- From: gsstark@mit.edu [mailto:gsstark@mit.edu] Sent: Wednesday, January 26, 2005 1:27 AM To: Joel Fradkin Cc: pgsql-sql@postgresql.org Subject: Re: [SQL] same question little different test MSSQL vrs Postgres "Joel Fradkin" <jfradkin@wazagua.com> writes: > I also tried a simple select * from tblcase where clientum = 'SAKS' Try: explain analyze select * from tblcase where clientum = 'SAKS' Send the output. -- greg
Joel Fradkin wrote:
> "Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996)
> (actual time=0.344..962.260 rows=22636 loops=1)"
> "  Filter: ((clientnum)::text = 'SAKS'::text)"
> "Total runtime: 1034.434 ms"

That's only 1 second - to return 22,636 rows. Not 27 seconds, as in the original post.

You'll never persuade PG to use the index when some 75% of your rows match the filter - it just doesn't make sense.

-- Richard Huxton, Archonet Ltd
Well, last evening (did not try it this morning) it was taking the extra time.

I have made some adjustments to the config file per a few web sites you all recommended my looking at. It is now using 137 of 756 meg available, and it is now taking 8 secs to return 22,000 rows (using pgAdmin III in a SQL edit window). The EXPLAIN ANALYSE still shows the same as before, but the table has 344,000 recs, of which only 22,636 are clientnum = 'SAKS'.

I am still doing a seq scan (this applies to the view question, where a small result set used an index scan but a larger return set used a seq scan) in my view, but with the adjustments to the kernel I get a result in 140 secs (MSSQL was 135 secs).

This is not production. I am still very worried that I have to do all this tweaking to use this; MSSQL worked out of the box as it does (not saying it's great, but I never had to adjust a kernel setting, etc.). Since we cannot afford the 70,000 dollars they want to license it, I am not implying I can use MSSQL, but I could look at other DBs like MySQL, Firebird, etc. I have a lot of time now (two weeks) in this conversion and do not wish to give up; I will see if I can learn what is needed to get the maximum performance. I have seen much information available, and this list has been a huge resource. I really appreciate all the help.

Joel Fradkin
Wazagua, Inc.
Joel Fradkin wrote:
> Well last evening (did not try it this morning) it was taking the extra
> time.
>
> I have made some adjustments to the config file per a few web sites that
> you all recommended my looking at.

The crucial one I'd say is the performance guide at:

http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

The first half-dozen settings are the crucial ones.

> It is now using 137 of 756 meg avail.
> it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
> edit window).

That might be too much RAM. Don't forget PG likes to work with your operating system (unlike many other DBs). Make sure Windows is using enough RAM to cache disk space.

I'm curious as to how this takes 8 secs whereas you had 1 second earlier. Are you sure some of this isn't pgadmin's overhead to display the rows?

> The EXPLAIN ANALYSE still shows the same as below, but the table has
> 344,000 recs of which only 22636 are clientnum = 'SAKS'

That sounds like it's about the borderline between using an index and not (depending on cache size, disk speeds, etc.).

> I am still doing a seq search ... but with the adjustments to the kernel
> I get a result in 140 secs (MSSQL was 135 secs).

If you want to check whether the index would help, try issuing the following before running your query:

    SET ENABLE_SEQSCAN=FALSE;

This will force PG to use any index it can regardless of whether it thinks it will help.

> This is not production, I am still very worried that I have to do all this
> tweeking to use this, MSSQL worked out of the box as it does (not saying
> its great, but I never had to adjust a kernel setting etc).

I'm a little curious what kernel settings you are changing on Windows; I wasn't aware there was much to be done there. I'm afraid you do have to change half a dozen settings in postgresql.conf to match your workload, but PG runs on a much wider range of machines than MSSQL, so it's difficult to come up with a "reasonable" default. It takes me about 5 minutes when I set up an installation to make sure the figures are reasonable (rather than the best they can be).

-- Richard Huxton, Archonet Ltd
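A minimal way to run Richard's test against the query from earlier in the thread (enable_seqscan only affects the current session; reset it afterwards so later plans aren't skewed):

    -- Discourage seq scans so the planner reveals the index plan:
    SET enable_seqscan = false;
    EXPLAIN ANALYZE SELECT * FROM tblcase WHERE clientnum = 'SAKS';
    -- Put the planner back to normal:
    RESET enable_seqscan;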
I've tested on a relation of mine, with about 20 attributes, and here are the results:

    test=# select count(*) from gestionestareas;
     count
    --------
     447681
    (1 row)

    test=# explain analyze select * from gestionestareas where agrupable;
                                   QUERY PLAN
    ------------------------------------------------------------------------
     Seq Scan on gestionestareas  (cost=0.00..12334.81 rows=155495 width=372)
       (actual time=0.603..1176.177 rows=153530 loops=1)
       Filter: agrupable
     Total runtime: 1380.113 ms
    (3 rows)

So, doing a seq scan on a 450,000-row table and fetching 150,000 rows takes only 1.3 secs. This is a 900 MHz PIII, 1 GB mem (133 MHz), 7200 RPM IDE disk, running FreeBSD. The machine is also a web server (Apache & Tomcat), mail server, file server (smb & nfs), LDAP server, etc.

I don't use pgAdmin, I use psql (the postgresql console client). It took 2 minutes to display the results of the above query (without the EXPLAIN ANALYZE)... this makes me think: couldn't the problem be that pgAdmin, psql, etc. take too much time to display all the rows? It seems a client software problem, not a server problem.

My advice is: use EXPLAIN ANALYZE to test both servers' performance. If you want to know what the final results will be, test both databases from a self-programmed application (Java, PHP, C++, etc.).

Hope this helped.
The postgres is running on Linux Fedora Core 3 (production will be Red Hat on a Dell 4-proc, 8-gig box). My client pgAdmin III is running on XP. Sorry I was not clearer on this.

I am playing with the settings now; I got it to return in 100 secs (the view, that is, which took 135 on MSSQL). My testing uses identical Dell desktops for the MSSQL and the Linux, with a third machine for the clients.

I do not mind getting up to speed on the proper settings to optimize the hardware. I am worried that, as a production environment can be somewhat dynamic, I will have issues getting an optimized environment that will work for our needs. My whole reason for being here is that our dual-proc production MSSQL server is just no longer keeping up with the demand, so it is important that whatever I implement is going to be up to the challenge. I am still convinced Postgres was the correct choice, especially with all the guidance I have been able to get here. 100 seconds will be fine compared to the 135 of MSSQL; I just was getting worse responses before adjusting. At the moment I think I went too far, as I see it using swap and going slower, but it never used much of the 756 meg (137 max was all I ever saw it use).

I guess the swap, buffers, and cache are the important settings (at least that seems to be what is affecting the memory). Not sure exactly what would cause it to use seq vs. index, but I will try the force and see if it helps the speed.

Joel Fradkin
Wazagua, Inc.
"Joel Fradkin" <jfradkin@wazagua.com> writes: > QUERY PLAN > "Seq Scan on tblcase (cost=0.00..30066.21 rows=37401 width=996) (actual > time=0.344..962.260 rows=22636 loops=1)" > " Filter: ((clientnum)::text = 'SAKS'::text)" > "Total runtime: 1034.434 ms" Well that says it only took 1s. So it seems this is highly dependent on whether the data is in cache. Perhaps it was in cache on MSSQL when you profiled it there and not on postgres? You could put an index on clientnum, but if the data is usually in cache like this it might not even be necessary. -- greg
Joel Fradkin wrote:
> The postgres is running on Linux Fedora Core 3 (production will be Red Hat
> on a Dell 4-proc 8-gig box).
>
> My client pgAdmin III is running on XP.
>
> Sorry I was not clearer on this.

Ah! You're the gent who had the problems with SE-Linux on Fedora 3. Sorry - should have made the connection, but there's so much traffic on the lists it's easy to miss.

> I do not mind getting up to speed on the proper settings to optimize the
> hardware ... it is important that whatever I implement is going to be up
> to the challenge.

You might want to look at the overall design of the database at some point too. Also, don't forget the compromises you made when designing for MSSQL might not be useful (or even harmful) with PG.

> I am still convinced Postgres was the correct choice ... it never used
> much of the 756 meg (137 max was all I ever saw it use).

If you're on Linux then 137MB sounds like too much (for one client, far too much).

> I guess the swap, buffers and cache are the important settings (least that
> seems to be what is affecting the memory).

Try starting with your shared_buffers at say 4000-8000 (32MB to 64MB), sort_mem/work_mem at 8000-32000 (8MB-32MB), and random_page_cost somewhere between 2 and 4. Then judge how much RAM your box is using to cache disk space (free -m) and set effective_cache_size accordingly.

That's it - you may want to play around with the figures slightly, but pick the lowest numbers above, restart PG, and it'll run OK.

-- Richard Huxton, Archonet Ltd
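As a rough sketch of Richard's starting point, the corresponding postgresql.conf lines might look like the following. The figures are the low end of his ranges plus an illustrative effective_cache_size, not tuned recommendations; note that in the 7.4/8.0-era versions discussed here shared_buffers and effective_cache_size are counted in 8kB pages and sort_mem in kB:

    shared_buffers = 4000          # ~32MB of shared buffer cache
    sort_mem = 8000                # ~8MB per sort
    random_page_cost = 3           # between the suggested 2 and 4
    effective_cache_size = 50000   # ~400MB; set from the "cached" figure in free -m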
I tried the SET ENABLE_SEQSCAN=FALSE; and the result took 29 secs instead of 117.

After playing around with the cache and buffers etc. I see I am no longer doing any swapping (not sure how I got the 100 sec response; might have been shared buffers set higher, been goofing around with it all morning).

My worry here is it should obviously use an index scan, so something is not set up correctly yet. I don't want to second-guess the analyzer (or is this a normal thing?).

At least it is blowing the doors off MSSQL (which is what I touted to my boss, and I was pretty upset when I got no result last night). The 117 was before I forced the seq off, so even doing a seq I am getting results now that are better than MSSQL.

Joel Fradkin
Wazagua, Inc.
"Joel Fradkin" <jfradkin@wazagua.com> writes: > I tried the SET ENABLE_SEQSCAN=FALSE; > And the result took 29 secs instead of 117. > > After playing around with the cache and buffers etc I see I am no longer > doing any swapping (not sure how I got the 100 sec response might have been > shared buffers set higher, been goofing around with it all morning). If it's swapping you're definitely going to get bad results. You really want the *majority* of RAM left free for the OS to cache disk data. > My worry here is it should obviously use an index scan so something is not > setup correctly yet. I don't want to second guess the analyzer (or is this a > normal thing?) No that's not obvious. 22k out of 344k is a selectivity of 6.6% which is probably about borderline. The optimizer is estimating even worse at 10.9% which isn't far off but puts it well out of the range for an index scan. If you really want to get postgres using an index scan you'll have to a) improve the estimate using "alter table tblcase alter column clientnum set statistics" to raise the statistics target for that column and reanalyze. And b) lower random_page_cost. random_page_cost tells postgres how much slower indexes are than table scans and at the default setting it accurately represents most disk hardware. If your database fits within RAM and is often cached then you might have to lower it to model that fact. But you shouldn't do it based on a single query like this. -- greg
Thank you, I will look at that info.

I did do an EXPLAIN ANALYSE on the view and could see it was doing the seq scan on 3 fields, so I made an index for the three fields and it then chose an index scan and ran in 27 seconds.

I also adjusted my defaults to much smaller numbers on shared buffers (per the Tidbits page recommendation, like 8 meg for my memory size). I looked at http://www.desknow.com/kb/idx/0/061/article/ which recommended doing a vacuum verbose to determine the exact max_fsm_pages, and I set the cache to use 25% of my available memory per the recommendation on Tidbits.

Joel Fradkin
Wazagua, Inc.
Maybe you should tweak the cpu_index_tuple_cost parameter instead of disabling sequential scans. The default value is 0.001; you should change it to a lower value (0.0005 or something).
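For instance (0.0005 is the poster's own example figure; as with the other planner costs this can be tried per-session before committing it to postgresql.conf):

    -- Halve the planner's per-index-tuple CPU cost, making index
    -- scans look relatively cheaper than sequential scans:
    SET cpu_index_tuple_cost = 0.0005;
    EXPLAIN ANALYZE SELECT * FROM tblcase WHERE clientnum = 'SAKS';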
Joel Fradkin wrote:
> I also did adjust my defaults to much smaller numbers on shared buffers
> (per the Tidbits page recommendation, like 8 meg for my memory size) ...
> and I set the cache to use 25% of my available memory per the
> recommendation on Tidbits.

Note that the effective_cache_size (if I've spelt it right) just tells PG what your cache size is. You should set it based on what "free" tells you about your system's use of memory.

-- Richard Huxton, Archonet Ltd
On Tue, Jan 25, 2005 at 21:21:08 -0700, Dennis Sacks <dennis@illusions.com> wrote:

> One of the things you'll want to do regularly is run a "vacuum analyze".
> You can read up on this in the postgresql docs. This is essential to the
> indexes being used properly. At a bare minimum, after you import a large
> amount of data, you'll want to run vacuum analyze.

Note that there is no need to vacuum after inserts (only updates and deletes), so you can just do an analyze in that case.
Bruno Wolff III wrote:

> Note that there is no need to vacuum after inserts (only updates and
> deletes), so you can just do an analyze in that case.

Good point! Analyze after bulk inserts; vacuum analyze after updates/deletes and inserts. :)

Dennis Sacks
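As a sketch of that rule of thumb, using the table from this thread:

    -- After a bulk load that only added rows, refreshing statistics
    -- is enough:
    ANALYZE tblcase;

    -- After updates or deletes have left dead row versions behind,
    -- reclaim the space too:
    VACUUM ANALYZE tblcase;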
We've done some pretty extensive benchmarking and load testing on a couple of platforms, including the Xeon and Opteron. You may have already bought that Dell box, but I'll say it anyway: Xeon quad processors are a terrible platform for postgres. Trying to use more than 4GB of memory on a 32-bit machine is a waste of money. If you want performance, get a quad Opteron with the same amount of memory. I guarantee you'll see at least an order of magnitude performance improvement, and substantially more under highly concurrent loads. If you decide to go this way, HP sells a very nice box.

I also strongly recommend you investigate SuSE instead of RedHat. Fedora Core is good technology, but SuSE offers equally good technology with better support. Also make sure that your SCSI HBA is actually using the 64-bit PCI bus. There are cards out there which plug into 64-bit PCI but only actually address 32 bits (Qlogic's QLA2340 / 2342, for example).

You make no mention of the disk subsystem you plan to use. This is the most critical part of your system; database performance is almost always bound by IO, usually disk IO. Briefly: put PGDATA on the widest RAID 10 array of disks you can manage. It's not worth spending the extra money to get 15kRPM disks for this; the size of the disks involved is pretty much irrelevant, only the number of them matters. Put the WAL files on a dedicated RAID 1 pair of 15kRPM disks. Put the postgres log files (or syslog) on a separate filesystem.

-- Andrew Hammond, Database Administrator, Afilias Canada Corp.
Now you tell me. We had a fellow working here who kept screaming AMD, but I am a very paranoid person and was not aware Linux and Postgres have been running on the new chips; I don't like to be first. We have bought the Dell, and I can't tell you if the controller uses 64 bits; I just got what they had on their page for their 4-proc rack mount. Part of my reason for going Dell was we already have Dell equipment, and the Linux support is offered from Dell as well, so I have one vendor to worry about. Being a developer and Director of IT, I want the fastest and best, but sometimes I flavor my opinions with safest and easiest.

The RPM delivery is something I understand (it's easy). What is SuSE like? Is there any difference in the performance between the two vendors? I am sure we will be buying more Postgres servers in the near future (one of the big reasons we are taking the time to convert from MSSQL was so we could afford to invest in more servers; MSSQL was cost-prohibitive even for one server). As easy as Fedora was, I still had several issues getting to where I am now, so I am paranoid of something that requires even more knowledge to pull off; that being said, I never minded getting into the details to get a better end result.

As you said, we have made the investment in the Dell (25K). I feel pretty stupid if it is, as you say, a waste of money to get 8 gigs on this platform, as I made that same mistake a year ago when I bought the 2-processor boxes with Standard Edition MSSQL and 4 gigs (it only uses 2 gig). I was under the impression this machine would utilize all 8 gigs. Are you saying only 4 will be available for caching etc., or just that the chipset can't deal with numbers like 8 gig and will be slower to access them? If it is the latter, then I would imagine it would still outperform a similar box with 4 gig, assuming my demand on cache is larger than 4 gig.

Just to confirm: you have these quad Opterons (a 4-processor config, I am assuming?) in a production environment running SuSE and postgres, with hardware support from HP and software from SuSE?

You indicate three separate physical drive sets will give the best performance (one for data at 10k speeds, one for admin, one for WAL at 15k)? I am not too sophisticated at knowing how to order this arrangement and set it up in Linux. Any chance you could detail it (1 card with 2 channels, 4 10k drives on one channel, 2 15k drives on the second; do I need another channel and drive(s) for admin files?), and the drive layout to configure in postgres to utilize it? If need be, maybe we can get you to do this as a consultant, as I do understand how important the hardware and the proper config are. I found out too late with MSSQL that I should have used two separate drive arrays, one for data, one for log (this would have required the split backplane). So, not to plug a specific vendor, but if you have a production environment example with real equipment suggestions I would be very appreciative. I know that's a lot to ask, so if you don't have time that's cool. Thanks so much for bringing this up; on my next purchase I will seriously look at quad Opteron technology if it is a tried and true solution for this OS and Postgres.
Joel Fradkin
Wazagua, Inc.
Quoting Dennis Sacks <dennis@illusions.com>:

> Good point! Analyze after bulk inserts, vacuum analyze after
> updates/deletes and inserts. :)

Hmmm... in performance tests of bulk inserts into a table with six indexes, I found that without vacuum analyze (rather than just analyze), insertion slowed down, albeit something less than linearly. Table of 6M rows, about 3GB (including index files). This is 7.4.1 on SuSE Linux, RAID 5, Xeon (sigh) 2.8GHz, 4GB, nothing else running. The inserts were always done with an existing-record check (LEFT JOIN ... WHERE joinkey IS NULL).

-- "Dreams come true, not free."
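That existing-record check is the usual pattern for inserting only rows not already present. A minimal sketch, where staging_case and the casenum column are hypothetical names for the incoming batch, not objects from the thread:

    -- The LEFT JOIN pairs each incoming row with any existing match;
    -- WHERE ... IS NULL keeps only the unmatched (genuinely new) rows.
    INSERT INTO tblcase (casenum, clientnum)
    SELECT s.casenum, s.clientnum
    FROM staging_case s
    LEFT JOIN tblcase t ON t.casenum = s.casenum
    WHERE t.casenum IS NULL;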