Thread: ok you all win what is best opteron (I dont want a hosed system again)

From:
"Joel Fradkin"
Date:

We are up and somewhat happy.

 

I have been following threads (in case you don’t know I bought a 4 proc Dell recently) and the Opteron seems the way to go.

I just called HP for a quote, but don’t want to make any mistakes.

 

Is the battery backed cache good or bad for Postgres?

 

They are telling me I can only get a dual-channel card if I want hardware RAID 10 on the 14 drives.

I can get two cards, but it has to be 7 and 7 (software RAID?), which does not sound like it fixes my single point of failure (one of the listers mentioned my current system has three such single points).

 

Can any of you hardware gurus spell out the optimal machine? (I am hoping to be around 15K; I might be able to go more if it's a huge difference. I spent 30K on the Dell.)

I do not have to go HP, and after seeing the failure ratio one lister reported from Monarch, I am a bit scared of shopping there.

Was there a conclusion on where it's best to get one? (I really want two, one for development too.)

 

 

Joel Fradkin

 

Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305

 


www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may contain confidential and privileged information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments.

 


 

 

From:
"Merlin Moncure"
Date:

Joel wrote:
I have been following threads (in case you don't know I bought a 4 proc
Dell recently) and the Opteron seems the way to go.
I just called HP for a quote, but don't want to make any mistakes.
[snip]

At your level of play it's the DL585.
Have you checked out http://www.swt.com?

Merlin

From:
David Brown
Date:

Joel Fradkin wrote:

> Is the battery backed cache good or bad for Postgres?
>
Battery-backed avoids corruption if you have an unexpected power loss.
It's considered mandatory with large-cache write-back controllers if you
can't afford to lose any data.

> They are telling me I can only get a dual channel card if I want
> hardware raid 10 on the 14 drives.
>
> I can get two cards but it has to be 7 and 7 (software raid?) which
> does not sound like it fixes my single point of failure (one of the
> listers mentioned my current system has 3 such single points).
>
Sounds like you need to try another vendor. Are you aiming for two RAID
10 arrays or one RAID 10 and one RAID 5?

> Any of you hardware gurus spell out the optimal machine (I am hoping
> to be around 15K, might be able to go more if it’s a huge difference,
> I spent 30k on the Dell).
>
> I do not have to go HP, and after seeing the fail ratio from Monarch
> from one lister I am a bit scared shopping there.
>
There's unlikely to be many common components between their workstation
and server offerings. You would expect case, power, graphics,
motherboard, storage controller and drives to all be different. But I'd
challenge that 50% failure stat anyway. Which components exactly? Hard
drives? Power supplies?

> Was there a conclusion on where is best to get one (I really want two
> one for development too).
>
Almost anyone can build a workstation or tower server, and almost anyone
else can service it for you. It gets trickier when you're talking 2U and
especially 1U. But really, these too can be maintained by anyone
competent. So I wonder about some people's obsession with
vendor-provided service.

Realistically, most Opteron solutions will use a Tyan motherboard (no
idea if this includes HP). For 4-way systems, there's currently only the
S4882, which includes an LSI dual channel SCSI controller. Different
vendors get to use different cases and cooling solutions and pick a
different brand/model of hard drive, but that's about it.

Tyan now also sells complete servers - hardly a stretch seeing they
already make the most important bit (after the CPU). Given the level of
interest in this forum, here's their list of US resellers:

http://www.tyan.com/products/html/us_alwa.html

If it's a tower server, build it yourself or pay someone to do it. It
really isn't challenging for anyone knowledgeable about hardware.

From:
"Joel Fradkin"
Date:

Thank you much for the info.
I will take a look. I think the prices I have been seeing may exclude us
getting another 4-proc box this soon. My boss asked me to get something in
the 15K range (I spent 30 on the Dell).
The HP seemed to run around 30, but it had a lot more drives than the Dell
(I spec'd it with 14 10K drives).

I can and will most likely build it myself to try to get a bit more bang
for the buck, and it is a second server, so if it dies it should not be a
catastrophe.

FYI, everyone using our system (after a week of dealing with many bugs) has
been saying how much they like the speed.
I did have to come up with a lot of creative ideas to get it working in a way
that appears faster to the client.
Stuff like: the queries default to LIMIT 50, and as the user hits next I raise
the limit (there is also a flag to just show all records and a count; it used
to default to that). For the two worst queries (our case and audit
applications) I created denormalized tables and maintain them through code.
All reporting comes off those and it is lightning fast.
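The incremental-limit trick described above can be sketched roughly like this (a minimal illustration only; in the real app this would be a SQL query with LIMIT, and `fetch_page`/`PAGE_SIZE` are made-up names, not the actual code):

```python
# Sketch of the "default to LIMIT 50, raise it on Next" pattern.
# A plain list stands in for the query result set here.

PAGE_SIZE = 50

def fetch_page(rows, limit):
    """Return the first `limit` rows, mimicking SELECT ... LIMIT."""
    return rows[:limit]

def next_page(current_limit):
    """User clicked Next: widen the limit by one page."""
    return current_limit + PAGE_SIZE

rows = list(range(120))          # pretend result set of 120 records
limit = PAGE_SIZE                # initial cheap query
first = fetch_page(rows, limit)  # 50 rows shown immediately

limit = next_page(limit)         # user asks for more
second = fetch_page(rows, limit)  # now 100 rows

limit = next_page(limit)
third = fetch_page(rows, limit)  # 120 rows: the whole set
```

The point is that the first, cheap query is what the user perceives; the full scan only happens if they keep paging.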

I just want to say again thanks to everyone who has helped me in the past
few months.

Joel Fradkin







From:
Josh Berkus
Date:

Joel,

> The two worst queries (our case and audit applications) I created
> denormalized files and maintain them through code. All reporting comes off
> those and it is lightning fast.

This can often be called for.  I'm working on a 400GB data warehouse right
now, and almost *all* of our queries run from materialized aggregate tables.
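The materialized-aggregate pattern can be sketched with a toy example (SQLite is used here purely for illustration; the `sales`/`sales_by_region` tables are invented, and a real warehouse would refresh the aggregate on a schedule):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Base fact table: one row per sale.
cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# "Materialize" the aggregate into its own table; reports read this
# instead of scanning the fact table every time.
cur.execute("""
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total, COUNT(*) AS n
    FROM sales GROUP BY region
""")

# Reporting query hits the small, pre-aggregated table.
totals = dict(
    cur.execute("SELECT region, total FROM sales_by_region").fetchall()
)
```

The win is that reporting cost scales with the number of groups, not the number of fact rows.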

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

From:
Mike Nolan
Date:

> This can often be called for.  I'm working on a 400GB data warehouse right
> now, and almost *all* of our queries run from materialized aggregate tables.

I thought that was pretty much the definition of data warehousing!  :-)
--
Mike Nolan

From:
William Yu
Date:

A 4-way SMP Opteron system is actually pretty damn cheap -- if you get
2x dual-core versus 4x single-core. I just ordered a 2x265 (4x1.8GHz) system
and the price was about $1300 more than a 2x244 (2x1.8GHz).

Now you might ask, is a 2xDC comparable to a 4x1? Here are some benchmarks
I've found showing DC versus single-core at the same clock rates and the
same number of cores.

SpecIntRate Windows:
4x846 = 56.7
2x270 = 62.6

SpecFPRate Windows:
4x846 = 52.5
2x270 = 55.3

SpecWeb99SSL:
4x846 = 3399
2x270 = 4100 (2 870s were used)

Specjbb2000 IBM JVM:
4x848 = 146385
4x275 = 157432

What it looks like is that a DC system is about one clock step faster than a
corresponding single-core SMP system. E.g., if you have a 2xDC @ 1.8GHz,
you need a 4x1 @ 2GHz to match its speed. (In some benchmarks, the
difference is two clock steps.)

On the surface, it looks pretty amazing that a 4x1 Opteron with twice
the memory bandwidth is slower than a corresponding 2xDC. (DC Opterons
use the same socket as plain-jane Opterons, so they use the same 2xDDR
memory setup.) It turns out the latency in a 2xDC setup is just so much
lower, and most apps like lower latency better than higher bandwidth. Look
at the diagram of the following Tyan 4-processor motherboard:

ftp://ftp.tyan.com/datasheets/d_s4882_100.pdf

Take particular note of the lack of diagonal lines connecting CPUs. What
this means is that if a process running on CPU0 needs memory attached to
CPU3, it must ask either CPU1 or CPU2 to forward the request for it.
Without NUMA support, we're looking at 25% of memory accesses running @
50ns, 50% @ 110ns, and 25% @ 170ns. (Rough numbers; I'd have to do a lot
of googling to find the exact latencies, but I'm just too lazy now.)

Now consider a 2xDC system. The two cores inside a single package are
connected by an immensely fast internal SRQ connection. As long as
there's no bandwidth limitation, both cores have full-speed access to
memory, while core-to-core snooping on each respective cache is roughly
10ns. So memory access speeds look like so: 50% @ 50ns, 50% @ 110ns.

If the memory locations you need to access happen to be contained in
the L1/L2 cache, this makes the difference even more pronounced. You
then get memory access patterns for 4x1 of 25% @ 5ns, 50% @ 65ns, 25% @
125ns versus 2xDC: 25% @ 5ns, 25% @ 15ns, 50% @ 65ns.




From:
Greg Stark
Date:

William Yu <> writes:

> It turns out the latency in a 2xDC setup is just so much lower and most apps
> like lower latency than higher bandwidth.

You haven't tested anything about "most apps". You tested what the SpecFoo
apps prefer. If you're curious about which Postgres prefers you'll have to
test with Postgres.

I'm not sure whether it will change the conclusion but I expect Postgres will
like bandwidth better than random benchmarks do.


--
greg

From:
William Yu
Date:

I say most apps because it's true. :) I would suggest that pretty much
every app (other than video/audio streaming) that people think is
bandwidth-limited is actually latency-limited. Take the SpecFoo tests.
Sure, I would rather have seen SAP/TPC/etc., which would be more relevant
to Postgres, but there aren't any apples-to-apples comparisons available
yet. But there's something to consider here. What people have believed in
the past is that memory bandwidth is the key to Spec numbers -- SpecFP
isn't a test of floating-point performance, it's a test of memory
bandwidth. Or is it? Numbers for DC Opterons show lower latency/lower
bandwidth beating higher latency/higher bandwidth in what was supposedly
a bandwidth-limited test. What may actually be happening is that the extra
bandwidth isn't used directly by the app itself -- instead, the CPU uses
it for prefetching to hide latency.

Scrounging around for more numbers, I've found benchmarks at Anandtech
that relate better to Postgres. He has an "Order Entry" OLTP app running
on MS-SQL. 1xDC beats 2x1, and 2xDC beats 4x1.

order entry reads
2x248 - 235113
1x175 - 257192
4x848 - 360014
2x275 - 392643

order entry writes
2x248 - 235107
1x175 - 257184
4x848 - 360008
2x275 - 392634

order entry stored procedures
2x248 - 2939
1x175 - 3215
4x848 - 4500
2x275 - 4908
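For a rough sense of the margins, here are the percentage gains of the dual-core configs over their single-core counterparts in the reads figures above (simple arithmetic on the quoted numbers):

```python
# Anandtech "Order Entry" reads figures quoted above
results = {"2x248": 235113, "1x175": 257192, "4x848": 360014, "2x275": 392643}

def pct_gain(new, old):
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100

dc_over_2x1 = pct_gain(results["1x175"], results["2x248"])  # 1xDC vs 2x1
dc_over_4x1 = pct_gain(results["2x275"], results["4x848"])  # 2xDC vs 4x1
print(round(dc_over_2x1, 1))  # 9.4
print(round(dc_over_4x1, 1))  # 9.1
```

So the dual-core configs win by roughly 9% at both the 2-core and 4-core levels in that test.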




