Thread: Re: Opteron vs Xeon (Was: What to do with 6 disks?)

Re: Opteron vs Xeon (Was: What to do with 6 disks?)

From
"Anjan Dave"
Date:
There have been some discussions on this list and others in general about Dell's version of RAID cards, and server
support, mainly linux support.

Before I venture into having another vendor in the shop, I want to know if there are any dos/don'ts about the 4-way
Opteron offerings from Sun and HP.

Don't want to put the topic on a different tangent, but I would be interested in the discussion of AMD vs. Xeon in
terms of actual products available today.

Thanks,
Anjan

-----Original Message-----
From: Christian Sander Røsnes [mailto:christian@aspiro.no]
Sent: Wednesday, April 20, 2005 12:14 PM
To: Bruce Momjian
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

On Wednesday 20 April 2005 17:50, Bruce Momjian wrote:
> Anjan Dave wrote:
> > In terms of vendor specific models -
> >
> > Does anyone have any good/bad experiences/recommendations for a 4-way
> > Opteron from Sun (v40z, 6 internal drives) or HP (DL585 5 internal
> > drives) models?
> >
> > This is in comparison with the new Dell 6850 (it has PCI-Express, faster
> > FSB 667MHz, which doesn't match up with AMD's total IO bandwidth, but
> > much better than previous 6650s).
>
> Dell cuts too many corners to be a good server.

Hi

Which corners do Dell cut compared to the competition?

Thanks

Christian



Re: Opteron vs Xeon (Was: What to do with 6 disks?)

From
"Joel Fradkin"
Date:
There have been some discussions on this list and others in general about
Dell's version of RAID cards, and server support, mainly linux support.

I was pretty impressed with the Dell guy. He spent the day with me remotely
and went through my 6650 system with PowerVault: changed my drives from ext3
to ext2 with no journaling, checked all the drivers, and such.

I did not see any marked improvement, but I don’t think my issues are
related to the hardware.

I am giving up on Postgres, and on three developers' two months of work, and
trying MySQL.

I have posted several items and not gotten a response (not that I expect folks
to drop everything). I want to thank everyone who has been of help, and there
are several.

It is just running way too slow on several of my views. I tried them today in
MySQL and found that MySQL was beating out my MSSQL.

On certain items I could get PG to work ok, but it was never faster than
MSSQL. On certain items it is taking several minutes compared to a few
seconds on MySQL.

I really like the environment and feel I have learned a lot in the past few
months, but the bottom line for me is speed. We bought a $30K Dell 6650 to get
better performance. I chose PG because MSSQL was $70K to license. I believe
MySQL will be $250.00 to license for us, but I may choose the $4K platinum
support just to feel safe about having someone to touch base with in the
event of an issue.

Again, thanks to everyone who has answered my newb questions and helped me
get it on the 3 spindles and tweak the install. Commandprompt.com was a big
help, and if I wanted to budget a bunch more money, and mostly if I were at
liberty to share my database with them, they might have helped me get through
all the issues. I am not sure I am walking away feeling real good about
Postgres, because it just should not take a rocket scientist to get it to
work, and I used to think I was fairly smart and could figure stuff out, and
I hate admitting defeat (especially since we have everything working with
Postgres now).


Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Josh Berkus
Date:
Joel,

> I did not see any marked improvement, but I don’t think my issues are
> related to the hardware.

If you won't believe it, then we certainly can't convince you.  AFAIK your bad
view is a bad query plan made worse by the Dell's hardware problems.

> I am giving up on Postgres, and on three developers' two months of work, and
> trying MySQL.

I'd suggest testing your *whole* application and not just this one query.  And
remember that you need to test InnoDB tables if you want transactions.

>
> I have posted several items and not gotten a response (not that I expect folks
> to drop everything). I want to thank everyone who has been of help, and
> there are several.

Hmmm ... I see about 25 responses to some of your posts on this list,
including ones by some of our head developers.   That's more than you'd get
out of a paid MSSQL support contract; I know from experience.

If you want anything more, then you'll need a "do-or-die" contract with a
support company. If your frustration is because you can't find this kind of
help, then I completely understand ... I have a waiting list for performance
contracts myself.  (and, if you hired me, the first thing I'd tell you is to
junk the Dell)

> I really like the environment and feel I have learned a lot in the past few
> months, but the bottom line for me is speed. We bought a $30K Dell 6650 to get
> better performance.

Would have been smart to ask on this list *before* buying the Dell, hey?  Even
a Google of this mailing list would have been informative.

> I chose PG because MSSQL was $70K to license. I believe
> MySQL will be $250.00 to license for us, but I may choose the $4K
> platinum support just to feel safe about having someone to touch base with
> in the event of an issue.

Hmmm ... you're willing to pay MySQL $4k but expect the PG community to solve
all your problems with free advice and a couple hundred dollars with CMD?   I
sense an apples vs. Barcaloungers comparison here ...

> I am not sure I am walking away feeling real good about
> Postgres, because it just should not take a rocket scientist to get it to
> work, and I used to think I was fairly smart and could figure stuff out, and
> I hate admitting defeat (especially since we have everything working with
> Postgres now).

While I understand your frustration (I've been frustrated more than a few
times with issues that stump me on Linux, for example) it's extremely unfair
to lash out at a community that has provided you a lot of free advice because
the advice hasn't fixed everything yet.  By my reading, you first raised your
query issue 6 days ago.  6 days is not a lot of time for getting *free*
troubleshooting help by e-mail. Certainly it's going to take more than 6 days
to port to MySQL.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
"Joel Fradkin"
Date:
Sorry if you feel I am lashing out at a community.
Just to say it again, I am very appreciative of all the help everyone has
supplied.

I am running on more than just the 4-proc Dell (in fact my tests have been
mostly on desktops).

I have MSSQL running on a 2-proc Dell which, until my load increased
(over approximately 2 years), was just fine. I totally agree that there are better
solutions based on this list's comments, but I have all Dell hardware now and
resist trying different vendors just to suit Postgres. I was under the
impression there were still issues with 64-bit Postgres and Linux (or at
least there were when I purchased). I believed I could make my next
acquisition Opteron-based hardware.

Again, I am not at all trying to criticize anyone, so please accept my
apology if I somehow came across with that attitude. I am very disappointed
at this point. My views may not be that great (although I am not saying that
either), but they run ok on MSSQL and appear to run ok on MySQL.

I wish I understood what I am doing wrong, because I do not wish to
revisit engineering our application for MySQL.

I would have spent more money with Command, but he does need my database to help
me, and I am not able to do that.

I agree testing the whole app is the only way to see, and unfortunately it is
a time-consuming bit. I do not have to spend $4K on MySQL; that is only if I want
their premium support. I can spend $250.00 a server for the
commercial license if I find the whole app does run well. I just loaded the
data last night and only had time to convert one view this morning. I am
sure it is something I do not understand and not a problem with Postgres. I
am also willing to take time to get more knowledgeable, but my time is
running out and I honestly feel stupid.

I have been in the process of converting for over two months and have said
several times these lists are a godsend.

It was never my intention to make you feel like I was flaming anyone
involved. On the contrary, I feel many have taken time to look at my
questions and given excellent advice. I know I check the archives so
hopefully that time will help others after me.

I may yet find that MySQL is not a good fit either. I have my whole app
converted at this point and find PG works well for a lot of my usage.

There are some key reporting views that need to retrieve many rows with many
joins, and they just take too long to pull the data. I told my boss just now that
if I try to de-normalize many of these data sets (like 6 main groups of data),
the reporting may work, but as is, many of my web pages are timing out
(these are pages that still work on MSSQL and the 2-proc machine).

Thanks again for all the help, and know that I truly appreciate the time
everyone has spent on my issues.

I may find that revisiting the datasets is a way to make PG work, or as you
mentioned, maybe I can get someone with more knowledge to step in locally. I
did ask Tom if he knew of anyone; maybe someone else on the list is aware
of a professional in the Tampa, FL area.

Realistically I don't think a $30K Dell is something that needs to be
junked. I am pretty sure if I got MSSQL running on it, it would outperform
my two-proc box. I can agree it may not have been the optimal platform. My
decision is not based solely on the performance on the 4-proc box.

Joel Fradkin




Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Josh Berkus
Date:
Joel,

> I have MSSQL running on a 2-proc Dell which, until my load increased
> (over approximately 2 years), was just fine. I totally agree that there are better
> solutions based on this list's comments, but I have all Dell hardware now
> and resist trying different vendors just to suit Postgres. I was under the
> impression there were still issues with 64-bit Postgres and Linux (or at
> least there were when I purchased). I believed I could make my next
> acquisition Opteron-based hardware.

Yeah, sorry, the Dell stuff is a sore point with me.   You can't imagine the
number of conversations I have that go like this:
"We're having a severe performance problem with PostgreSQL"
"What hardware/OS are you using?"
"Dell *650 with RHAS 3.0 ...."

BTW, which Update version is your RHAS?   If you're on Update3, you can grab
more performance right there by upgrading to Update4.

> Again, I am not at all trying to criticize anyone, so please accept my
> apology if I somehow came across with that attitude. I am very
> disappointed at this point. My views may not be that great (although I am
> not saying that either), but they run ok on MSSQL and appear to run ok on
> MySQL.

Yeah.  I think you'll find a few things that are vice-versa.   For that
matter, I can point to a number of queries we run better than Oracle, and a
number we don't.

Your particular query problem seems to stem from some bad estimates.   Can you
post an EXPLAIN ANALYZE based on all the advice people have given you so far?

> I wish I understood what I am doing wrong, because I do not wish to
> revisit engineering our application for MySQL.

I can imagine.

> I would have spent more money with Command, but he does need my database to help
> me, and I am not able to do that.

Yes.  For that matter, it'll take longer to troubleshoot on this list because
of your security concerns.

> I agree testing the whole app is the only way to see, and unfortunately it
> is a time-consuming bit. I do not have to spend $4K on MySQL; that is only if I
> want their premium support. I can spend $250.00 a server for the
> commercial license if I find the whole app does run well. I just loaded the
> data last night and only had time to convert one view this morning. I am
> sure it is something I do not understand and not a problem with Postgres. I
> am also willing to take time to get more knowledgeable, but my time is
> running out and I honestly feel stupid.

You're not.  You have a real query problem and it will require further
troubleshooting to solve.  Some of us make a pretty handsome living solving
these kinds of problems; it takes a lot of expert knowledge.

> It was never my intention to make you feel like I was flaming anyone
> involved. On the contrary, I feel many have taken time to look at my
> questions and given excellent advice. I know I check the archives so
> hopefully that time will help others after me.

Well, I overreacted too.   Sorry!

> I may find that revisiting the datasets is a way to make PG work, or as you
> mentioned, maybe I can get someone with more knowledge to step in locally.
> I did ask Tom if he knew of anyone; maybe someone else on the list is
> aware of a professional in the Tampa, FL area.

Well, Robert Treat is in Florida but I'm pretty sure he's busy full-time.

> Realistically I don't think a $30K Dell is something that needs to be
> junked. I am pretty sure if I got MSSQL running on it, it would outperform
> my two-proc box. I can agree it may not have been the optimal platform. My
> decision is not based solely on the performance on the 4-proc box.

Oh, certainly it's too late to buy a Sunfire or eServer instead.   You just
could have gotten far more bang for the buck with some expert advice, that's
all.   But don't bother with Dell support any further, they don't really have
the knowledge to help you.

So ... new EXPLAIN ANALYZE?

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
John A Meinel
Date:
Joel Fradkin wrote:
...

>I would have spent more money with Command, but he does need my database to help
>me, and I am not able to do that.
>
>
...

What if someone were to write an anonymization script? Something that
changes any of the "data" of the database, but leaves all of the
relational information. It could turn all strings into some sort of
hashed version, so you don't give out any identifiable information.
It could even modify relational entries, as long as it updated both
ends, and this wouldn't affect the actual performance at all.

I don't think this would be very hard to write. Especially if you can
give a list of the tables, and what columns need to be modified.

Probably this would generally be a useful script to have for cases like
this where databases are confidential, but need to be tuned by someone else.

Would that be reasonable?
I would think that by renaming columns, hashing the data in the columns,
and renaming tables, most of the proprietary information is removed,
without removing the database information.

John
=:->



Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Vivek Khera
Date:
On Apr 20, 2005, at 4:22 PM, Josh Berkus wrote:

>> Realistically I don't think a $30K Dell is something that needs to be
>> junked. I am pretty sure if I got MSSQL running on it, it would
>> outperform my two-proc box. I can agree it may not have been the
>> optimal platform. My decision is not based solely on the performance
>> on the 4-proc box.
>
> Oh, certainly it's too late to buy a Sunfire or eServer instead.   You
> just could have gotten far more bang for the buck with some expert
> advice, that's all.   But don't bother with Dell support any further,
> they don't really have the knowledge to help you.
>

FWIW, I have a $20k Dell box (PE2650 with 14-disk external PowerVault
RAID enclosure) which I'm phasing out for a dual Opteron box because it
can't handle the load.  It will be re-purposed as a backup system.
Damn waste of money, but complaining customers can cost more...

Trust me, it is likely your Dell hardware, as moving to the Opteron
system has improved performance tremendously with fewer disks.  Same
amount of RAM and other basic configurations.  Both have LSI based RAID
cards, even.


Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
"Joel Fradkin"
Date:
I did think of something similar, just loading the data tables with junk
records, and I may visit that idea with Josh.

I did just do some comparisons on timing of a plain select * from tbl where
indexed column = x, and it was considerably slower than both MSSQL and MySQL,
so I am still a bit confused. This still might be a configuration issue (I ran
on my 2-gig desktop and the 8-gig Linux box; the comparisons were all run on the
same machines for MSSQL, MySQL, and Postgres).
I turned off Postgres when running MySQL and turned off MySQL when running
Postgres; MSSQL had one of the two running while I tested it.

For the 360,000 records returned, MySQL did it in 40 seconds first run and 17
seconds second run.

MSSQL did it in 56 seconds first run and 16 seconds second run.

Postgres, on the second run, was:
Total query runtime: 17109 ms.
Data retrieval runtime: 72188 ms.
331640 rows retrieved.

So like 89 seconds on the second run.
The first run was 147 secs all told.

These are all on my 2-gig desktop running XP.
I can post the config. I noticed Postgres was using 70% of the CPU while
MSSQL was using 100%.

Joel Fradkin





Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
John A Meinel
Date:
Joel Fradkin wrote:

>I did think of something similar, just loading the data tables with junk
>records, and I may visit that idea with Josh.
>
>I did just do some comparisons on timing of a plain select * from tbl where
>indexed column = x, and it was considerably slower than both MSSQL and MySQL,
>so I am still a bit confused. This still might be a configuration issue (I ran
>on my 2-gig desktop and the 8-gig Linux box; the comparisons were all run on the
>same machines for MSSQL, MySQL, and Postgres).
>I turned off Postgres when running MySQL and turned off MySQL when running
>Postgres; MSSQL had one of the two running while I tested it.
>
>For the 360,000 records returned, MySQL did it in 40 seconds first run and 17
>seconds second run.
>
>MSSQL did it in 56 seconds first run and 16 seconds second run.
>
>Postgres, on the second run, was:
>Total query runtime: 17109 ms.
>Data retrieval runtime: 72188 ms.
>331640 rows retrieved.
>
>So like 89 seconds on the second run.
>The first run was 147 secs all told.
>
>These are all on my 2-gig desktop running XP.
>I can post the config. I noticed Postgres was using 70% of the CPU while
>MSSQL was using 100%.
>
>Joel Fradkin
>
>
Why is MySQL returning 360,000 rows, while Postgres is only returning
330,000? This may not be important at all, though.
I also assume you are selecting from a plain table, not a view.

I suppose knowing your work_mem and shared_buffers settings would be
useful.

How were you measuring "data retrieval time"? And how does this compare
to what you were measuring on the other machines? It might be possible
that what you are really measuring is just the time it takes psql to
load up all the data into memory, and then print it out. And since psql
defaults to measuring entry lengths for each column, this may not be
truly comparable.
It *looks* like it only takes 18s for postgres to get the data, but then
it is taking 72s to transfer the data to you. That would be network
latency, or something like that, not database latency.
And I would say that 18s is very close to 16 or 17 seconds.

I don't know what commands you were issuing, or how you measured,
though. You might be using some other interface (like ODBC), which I
can't say too much about.

John
=:->




Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Josh Berkus
Date:
Joel,

Ok, please try this:

ALTER TABLE tblresponseheader ALTER COLUMN clientnum SET STATISTICS 1000;
ALTER TABLE tblresponseheader ALTER COLUMN locationid SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN clientnum SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN divisionid SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN regionid SET STATISTICS 1000;
ANALYZE tblresponseheader;
ANALYZE tbllocation;

Then run the EXPLAIN ANALYZE again.   (on Linux)


--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Tom Lane
Date:
John A Meinel <john@arbash-meinel.com> writes:
> Joel Fradkin wrote:
>> Postgres was on the second run
>> Total query runtime: 17109 ms.
>> Data retrieval runtime: 72188 ms.
>> 331640 rows retrieved.

> How were you measuring "data retrieval time"?

I suspect he's using pgadmin.  We've seen reports before suggesting that
pgadmin can be amazingly slow, eg here
http://archives.postgresql.org/pgsql-performance/2004-10/msg00427.php
where the *actual* data retrieval time as shown by EXPLAIN ANALYZE
was under three seconds, but pgadmin claimed the query runtime was 22
sec and data retrieval runtime was 72 sec.

I wouldn't be too surprised if that time was being spent formatting
the data into a table for display inside pgadmin.  It is a GUI after
all, not a tool for pushing vast volumes of data around.

It'd be interesting to check the runtimes for the same query with
LIMIT 3000, ie, see if a tenth as much data takes a tenth as much
processing time or not.  The backend code should be pretty darn
linear in this regard, but maybe pgadmin isn't.

            regards, tom lane

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Andreas Pflug
Date:
Joel Fradkin wrote:
> I did think of something similar, just loading the data tables with junk
> records, and I may visit that idea with Josh.
>
> I did just do some comparisons on timing of a plain select * from tbl where
> indexed column = x, and it was considerably slower than both MSSQL and MySQL,
> so I am still a bit confused. This still might be a configuration issue (I ran
> on my 2-gig desktop and the 8-gig Linux box; the comparisons were all run on the
> same machines for MSSQL, MySQL, and Postgres).
> I turned off Postgres when running MySQL and turned off MySQL when running
> Postgres; MSSQL had one of the two running while I tested it.
>
> For the 360,000 records returned, MySQL did it in 40 seconds first run and 17
> seconds second run.
>
> MSSQL did it in 56 seconds first run and 16 seconds second run.
>
> Postgres, on the second run, was:
> Total query runtime: 17109 ms.
> Data retrieval runtime: 72188 ms.
> 331640 rows retrieved.

Beware!
From the data, I can see that you're probably using pgAdmin3.
The time to execute your query, including transfer of all data to the
client, is 17s in this example, while displaying it (i.e. pure GUI and
memory alloc stuff) takes 72s. Execute to a file to avoid this.

Regards,
Andreas

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
"Joel Fradkin"
Date:
> Why is MySQL returning 360,000 rows, while Postgres is only returning
> 330,000? This may not be important at all, though.
> I also assume you are selecting from a plain table, not a view.

Yes, plain table. The difference in rows is that one of the datasets had Sears data in
it. It (the speed difference found) is much worse on some of my views, which is
what forced me to start looking at other options.

> I suppose knowing your work_mem and shared_buffers settings would be
> useful.

I have posted my configs, but will add the Tampa one to the bottom
again. My desktop has:
# - Memory -

shared_buffers = 8000        # min 16, at least max_connections*2, 8KB each
work_mem = 8000#1024        # min 64, size in KB
maintenance_work_mem = 16384    # min 1024, size in KB
#max_stack_depth = 2048        # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 30000#20000        # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000    # min 100, ~50 bytes each
# - Planner Cost Constants -

effective_cache_size = 80000#1000    # typically 8KB each
random_page_cost = 2        # units are one sequential page fetch cost

> How were you measuring "data retrieval time"? And how does this compare
> to what you were measuring on the other machines? It might be possible
> that what you are really measuring is just the time it takes psql to
> load up all the data into memory, and then print it out. And since psql
> defaults to measuring entry lengths for each column, this may not be
> truly comparable.
> It *looks* like it only takes 18s for postgres to get the data, but then
> it is taking 72s to transfer the data to you. That would be network
> latency, or something like that, not database latency.
> And I would say that 18s is very close to 16 or 17 seconds.
This was run on the machine with the database (as were MySQL and MSSQL).
The PG timing was from PGADMIN, and the 18 secs was the second run; the first run
took the same time to return the data and 70 secs to do the first part, like 147 secs
all told, compared to the 40 seconds first run of MySQL and 56 seconds for
MSSQL. MySQL was done in their query tool, which returns the rows as well, and
MSSQL was done in their query analyzer. All three tools appear to use a
similar approach. Just an FYI: doing an explain analyze of my problem view
took much longer than actually returning the data in MSSQL and MySQL. I have
done extensive testing with MySQL (just this table and two of my problem
views). I am not using the transactional version, because I need the best
speed.


> I don't know what commands you were issuing, or how you measured,
> though. You might be using some other interface (like ODBC), which I
> can't say too much about.
>
> John
> =:->

This is the Linux box config.
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_address, require
# a postmaster shutdown and restart to take effect.


#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.
# data_directory = 'ConfigDir'        # use data in another directory
#data_directory = '/pgdata/data'
# hba_file = 'ConfigDir/pg_hba.conf'    # the host-based authentication file
# ident_file = 'ConfigDir/pg_ident.conf'  # the IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
# external_pid_file = '(none)'        # write an extra pid file


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost'    # what IP interface(s) to listen on;
                # defaults to localhost, '*' = any

listen_addresses = '*'
port = 5432
max_connections = 100
    # note: increasing max_connections costs about 500 bytes of shared
    # memory per connection slot, in addition to costs from shared_buffers
    # and max_locks_per_transaction.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777    # octal
#rendezvous_name = ''        # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 12288 #5000  min 16, at least max_connections*2, 8KB each
#work_mem = 1024        # min 64, size in KB
work_mem = 16384 # 8192
#maintenance_work_mem = 16384    # min 1024, size in KB
#max_stack_depth = 2048        # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 100000    #30000    # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1500 #1000    # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000    # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0        # 0-1000 milliseconds
#vacuum_cost_page_hit = 1    # 0-10000 credits
#vacuum_cost_page_miss = 10    # 0-10000 credits
#vacuum_cost_page_dirty = 20    # 0-10000 credits
#vacuum_cost_limit = 200    # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200        # 10-10000 milliseconds between rounds
#bgwriter_percent = 1        # 0-100% of dirty buffers in each round
#bgwriter_maxpages = 100    # 0-1000 buffers max per round


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -
fsync = true            # turns forced synchronization on or off
wal_sync_method = open_sync # fsync    # the default varies across platforms:
                # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 2048 #8        # min 4, 8KB each
#commit_delay = 0        # range 0-100000, in microseconds
#commit_siblings = 5        # range 1-1000

# - Checkpoints -

checkpoint_segments = 100 #3    # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300    # range 30-3600, in seconds
#checkpoint_warning = 30    # 0 is off, in seconds

# - Archiving -

#archive_command = ''        # command to use to archive a logfile segment


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
enable_mergejoin = false
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

effective_cache_size = 262144    #40000 typically 8KB each
#random_page_cost = 4        # units are one sequential page fetch cost
random_page_cost = 2
#cpu_tuple_cost = 0.01        # (same)
#cpu_index_tuple_cost = 0.001    # (same)
#cpu_operator_cost = 0.0025    # (same)

# - Genetic Query Optimizer -

#geqo = true
#geqo_threshold = 12
#geqo_effort = 5        # range 1-10
#geqo_pool_size = 0        # selects default based on effort
#geqo_generations = 0        # selects default based on effort
#geqo_selection_bias = 2.0    # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 250#10    # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8    # 1 disables collapsing of explicit JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'    # Valid values are combinations of stderr,
                                # syslog and eventlog, depending on
                                # platform.

# This is relevant when logging to stderr:
redirect_stderr = true      # Enable capturing of stderr into log files.
# These are only relevant if redirect_stderr is true:
log_directory = 'pg_log'    # Directory where log files are written.
                            # May be specified absolute or relative to PGDATA
log_filename = 'postgresql-%a.log' # Log file name pattern.
                            # May include strftime() escapes
log_truncate_on_rotation = true  # If true, any existing log file of the
                            # same name as the new log file will be truncated
                            # rather than appended to.  But such truncation
                            # only occurs on time-driven rotation,
                            # not on restarts or size-driven rotation.
                            # Default is false, meaning append to existing
                            # files in all cases.
log_rotation_age = 1440     # Automatic rotation of logfiles will happen after
                            # so many minutes.  0 to disable.
log_rotation_size = 0       # Automatic rotation of logfiles will happen after
                            # so many kilobytes of log output.  0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   log, notice, warning, error

#log_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   info, notice, warning, error, log, fatal,
                #   panic

#log_error_verbosity = default    # terse, default, or verbose messages

#log_min_error_statement = panic # Values in order of increasing severity:
                 #   debug5, debug4, debug3, debug2, debug1,
                 #   info, notice, warning, error, panic(off)

#log_min_duration_statement = -1 # -1 is disabled, in milliseconds.

#silent_mode = false         # DO NOT USE without syslog or redirect_stderr

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_disconnections = false
#log_duration = false
#log_line_prefix = ''        # e.g. '<%u%%%d> '
                # %u=user name %d=database name
                # %r=remote host and port
                # %p=PID %t=timestamp %i=command tag
                # %c=session id %l=session line number
                # %s=session start timestamp %x=transaction id
                # %q=stop here in non-session processes
                # %%='%'
#log_statement = 'none'        # none, mod, ddl, all
#log_hostname = false


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = false
#log_planner_stats = false
#log_executor_stats = false
#log_statement_stats = false

# - Query/Index Statistics Collector -

#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'    # schema names
#default_tablespace = ''    # a tablespace name, or '' for default
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0        # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown        # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0        # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'en_US.UTF-8'        # locale for system error message strings
lc_monetary = 'en_US.UTF-8'        # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'        # locale for number formatting
lc_time = 'en_US.UTF-8'            # locale for time formatting

# - Other Defaults -

#explain_pretty_print = true
#dynamic_library_path = '$libdir'


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000    # in milliseconds
#max_locks_per_transaction = 64    # min 10, ~200*max_connections bytes each


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced    # advanced, extended, or basic
#sql_inheritance = true
#default_with_oids = true

# - Other Platforms & Clients -

#transform_null_equals = false


Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
"Joel Fradkin"
Date:
> I suspect he's using pgadmin.

Yup, I was, but I did try running on the Linux box in psql; it was
running to the screen and took forever because of that.

The real issue is that returning to my app using ODBC is very slow. (I have not
tested the ODBC for MySQL; MSSQL is ok, though the two-proc Dell is running out
of steam, but it had been good until this year, when we about doubled our demand
by adding Sears as a client.)

Using ODBC to Postgres on some of the views (Josh from Command is having me
do some very specific testing) is timing out with a 10-minute time limit.
These are pages that still respond using MSSQL (this is where production is
using the dual proc and the test is using the 4 proc).

I have a tool that hooks to all three databases so I can try it with that
and see if I get different responses.

Joel


Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
"Jim C. Nasby"
Date:
Hrm... I was about to suggest that for timing just the query (and not
output/data transfer time) using explain analyze, but then I remembered
that explain analyze can incur some non-trivial overhead with the timing
calls. Is there a way to run the query but have psql ignore the output?
If so, you could use \timing.

In any case, it's not valid to use pgadmin to time things.
--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Kevin Brown
Date:
Jim C. Nasby wrote:
> Hrm... I was about to suggest that for timing just the query (and not
> output/data transfer time) using explain analyze, but then I remembered
> that explain analyze can incur some non-trivial overhead with the timing
> calls. Is there a way to run the query but have psql ignore the output?
> If so, you could use \timing.

Would timing "SELECT COUNT(*) FROM (query)" work?


--
Kevin Brown                          kevin@sysexperts.com

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Josh Berkus
Date:
Jim, Kevin,

> > Hrm... I was about to suggest that for timing just the query (and not
> > output/data transfer time) using explain analyze, but then I remembered
> > that explain analyze can incur some non-trivial overhead with the timing
> > calls. Is there a way to run the query but have psql ignore the output?
> > If so, you could use \timing.
>
> Would timing "SELECT COUNT(*) FROM (query)" work?

Just \timing would work fine; PostgreSQL doesn't return anything until it has
the whole result set.  That's why MSSQL vs. PostgreSQL timing comparisons are
deceptive unless you're careful:  MSSQL returns the results one block at a
time, and reports execution time as the time required to return the *first*
block, as opposed to Postgres, which reports the time required to return the
whole dataset.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
Kevin Brown
Date:
Josh Berkus wrote:
> Jim, Kevin,
>
> > > Hrm... I was about to suggest that for timing just the query (and not
> > > output/data transfer time) using explain analyze, but then I remembered
> > > that explain analyze can incur some non-trivial overhead with the timing
> > > calls. Is there a way to run the query but have psql ignore the output?
> > > If so, you could use \timing.
> >
> > Would timing "SELECT COUNT(*) FROM (query)" work?
>
> Just \timing would work fine; PostgreSQL doesn't return anything until it has
> the whole result set.

Hmm...does \timing show the amount of elapsed time between query start
and the first results handed to it by the database (even if the
database itself has prepared the entire result set for transmission by
that time), or between query start and the last result handed to it by
the database?

Because if it's the latter, then things like server<->client network
bandwidth are going to affect the results that \timing shows, and it
won't necessarily give you a good indicator of how well the database
backend is performing.  I would expect that timing SELECT COUNT(*)
FROM (query) would give you an idea of how the backend is performing,
because the amount of result set data that has to go over the wire is
trivial.

Each is, of course, useful in its own right, and you want to be able
to measure both (so, for instance, you can get an idea of just how
much your network affects the overall performance of your queries).


> That's why MSSQL vs. PostgreSQL timing comparisons are
> deceptive unless you're careful:  MSSQL returns the results one block at a
> time, and reports execution time as the time required to return the *first*
> block, as opposed to Postgres, which reports the time required to return the
> whole dataset.

Interesting.  I had no idea MSSQL did that, but I can't exactly say
I'm surprised.  :-)


--
Kevin Brown                          kevin@sysexperts.com

Re: Joel's Performance Issues WAS : Opteron vs Xeon

From
"Jim C. Nasby"
Date:
BTW, http://stats.distributed.net/~decibel/base.log is a test I ran;
select count(*) was ~6x faster than explain analyze select *.

On Tue, Apr 26, 2005 at 07:46:52PM -0700, Kevin Brown wrote:
> Josh Berkus wrote:
> > Jim, Kevin,
> >
> > > > Hrm... I was about to suggest that for timing just the query (and not
> > > > output/data transfer time) using explain analyze, but then I remembered
> > > > that explain analyze can incur some non-trivial overhead with the timing
> > > > calls. Is there a way to run the query but have psql ignore the output?
> > > > If so, you could use \timing.
> > >
> > > Would timing "SELECT COUNT(*) FROM (query)" work?
> >
> > Just \timing would work fine; PostgreSQL doesn't return anything until it has
> > the whole result set.
>
> Hmm...does \timing show the amount of elapsed time between query start
> and the first results handed to it by the database (even if the
> database itself has prepared the entire result set for transmission by
> that time), or between query start and the last result handed to it by
> the database?
>
> Because if it's the latter, then things like server<->client network
> bandwidth are going to affect the results that \timing shows, and it
> won't necessarily give you a good indicator of how well the database
> backend is performing.  I would expect that timing SELECT COUNT(*)
> FROM (query) would give you an idea of how the backend is performing,
> because the amount of result set data that has to go over the wire is
> trivial.
>
> Each is, of course, useful in its own right, and you want to be able
> to measure both (so, for instance, you can get an idea of just how
> much your network affects the overall performance of your queries).
>
>
> > That's why MSSQL vs. PostgreSQL timing comparisons are
> > deceptive unless you're careful:  MSSQL returns the results one block at a
> > time, and reports execution time as the time required to return the *first*
> > block, as opposed to Postgres, which reports the time required to return the
> > whole dataset.
>
> Interesting.  I had no idea MSSQL did that, but I can't exactly say
> I'm surprised.  :-)
>
>
> --
> Kevin Brown                          kevin@sysexperts.com
>

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"