Thread: Large (8M) cache vs. dual-core CPUs
I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD & PostgreSQL. Where I'm stuck is in deciding whether we want to go with dual-core pentiums with 2M cache, or with HT pentiums with 8M cache. Both of these are expensive bits of hardware, and I'm trying to gather as much evidence as possible before making a recommendation. The FreeBSD community seems pretty divided over which is likely to be better, and I have been unable to discover a method for estimating how much of the 2M cache on our existing systems is being used. Does anyone in the PostgreSQL community have any experience with large caches or dual-core pentiums that could make any recommendations? Our current Dell 2850 systems are CPU bound - i.e. they have enough RAM, and fast enough disks that the CPUs seem to be the limiting factor. As a result, this decision on what kind of CPUs to get in the next round of servers is pretty important. Any advice is much appreciated. -- Bill Moran Collaborative Fusion Inc.
On Tue, 2006-04-25 at 13:14, Bill Moran wrote: > I've been given the task of making some hardware recommendations for > the next round of server purchases. The machines to be purchased > will be running FreeBSD & PostgreSQL. > > Where I'm stuck is in deciding whether we want to go with dual-core > pentiums with 2M cache, or with HT pentiums with 8M cache. Given a choice between those two processors, I'd choose the AMD 64 x 2 CPU. It's a significantly better processor than either of the Intel choices. And if you get the HT processor, you might as well turn off HT on a PostgreSQL machine. I've yet to see it make postgresql run faster, but I've certainly seen HT make it run slower. If you can't run AMD in your shop due to bigotry (let's call a spade a spade) then I'd recommend the real dual core CPU with 2M cache. Most of what makes a database slow is memory and disk bandwidth. Few datasets are gonna fit in that 8M cache, and when they do, they'll get flushed right out by the next request anyway. > Does anyone in the PostgreSQL community have any experience with > large caches or dual-core pentiums that could make any recommendations? > Our current Dell 2850 systems are CPU bound - i.e. they have enough > RAM, and fast enough disks that the CPUs seem to be the limiting > factor. As a result, this decision on what kind of CPUs to get in > the next round of servers is pretty important. If the CPUs are running at 100% then you're likely not memory I/O bound, but processing speed bound. The dual core will definitely be the better option in that case. I take it you work at a "Dell Only" place, hence no AMD for you... Sad, cause the AMD is, on a price / performance scale, twice the processor for the same money as the Intel.
On Tue, 25 Apr 2006 14:14:35 -0400 Bill Moran <wmoran@collaborativefusion.com> wrote: > Does anyone in the PostgreSQL community have any experience with > large caches or dual-core pentiums that could make any > recommendations? Heh :) You're in the position I was in about a year ago - we "naturally" replaced our old Dell 2650 with £14k of Dell 6850 Quad Xeon with 8M cache, and TBH the performance is woeful :/ Having gone through Postgres consultancy, been through IBM 8-way POWER4 hardware, and discovered a bit of a shortcoming in PG on N-way hardware (where N is large) [1], I have been able to try out a dual-dual-core Opteron machine, and it flies. In fact, it flies so well that we ordered one that day. So, in short, £3k's worth of dual-Opteron beat the living daylights out of our Xeon monster. I can't praise the Opteron enough, and I've always been a firm Intel pedant - the HyperTransport stuff must really be doing wonders. (I typically see 500ms searches on it instead of 1000-2000ms on the Xeon.) As it stands, I've had to borrow this Opteron so much (and send live searches across the net to the remote box) because otherwise we simply don't have enough CPU power to run the website (!) Cheers, Gavin. [1] Simon Riggs + Tom Lane are currently involved in optimisation work for this - it turns out our extremely read-heavy load pattern reveals some buffer locking issues in PG.
On Tue, 2006-04-25 at 13:14, Bill Moran wrote: > I've been given the task of making some hardware recommendations for > the next round of server purchases. The machines to be purchased > will be running FreeBSD & PostgreSQL. > > Where I'm stuck is in deciding whether we want to go with dual-core > pentiums with 2M cache, or with HT pentiums with 8M cache. BTW: For an interesting article on why the dual core Opterons are so much better than their Intel cousins, read this article: http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=1 Enlightening read.
On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: > Sad, cause the AMD is, on a price / performance scale, twice the > processor for the same money as the Intel. Maybe a year or two ago. Prices are all coming down. Intel more than AMD. AMD still seems better - but not X2, and it depends on the workload. X2 sounds like bigotry against Intel... :-) Cheers, mark -- mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________ . . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/
On Tue, 2006-04-25 at 13:38, mark@mark.mielke.cc wrote: > On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: > > Sad, cause the AMD is, on a price / performance scale, twice the > > processor for the same money as the Intel. > > Maybe a year or two ago. Prices are all coming down. Intel more > than AMD. > > AMD still seems better - but not X2, and it depends on the workload. > > X2 sounds like bigotry against Intel... :-) Actually, that was from an article from this last month that compared the dual core Intel to the AMD. For every dollar spent on the Intel, you got about half the performance of the AMD. Not bigotry. Fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back.
Bill Moran wrote: > I've been given the task of making some hardware recommendations for > the next round of server purchases. The machines to be purchased > will be running FreeBSD & PostgreSQL. > > Where I'm stuck is in deciding whether we want to go with dual-core > pentiums with 2M cache, or with HT pentiums with 8M cache. Dual Core Opterons :) Joshua D. Drake > > Both of these are expensive bits of hardware, and I'm trying to > gather as much evidence as possible before making a recommendation. > The FreeBSD community seems pretty divided over which is likely to > be better, and I have been unable to discover a method for estimating > how much of the 2M cache on our existing systems is being used. > > Does anyone in the PostgreSQL community have any experience with > large caches or dual-core pentiums that could make any recommendations? > Our current Dell 2850 systems are CPU bound - i.e. they have enough > RAM, and fast enough disks that the CPUs seem to be the limiting > factor. As a result, this decision on what kind of CPUs to get in > the next round of servers is pretty important. > > Any advice is much appreciated. > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
> But don't believe me or the other people who've seen the difference. Go > buy the Intel box. No skin off my back. To be more detailed... AMD Opteron has some specific technical advantages in its design over Intel when it comes to performing for a database. Specifically, no front side bus :) Also, it is widely known and documented (just review the archives) that AMD performs better than the equivalent Intel CPU, dollar for dollar. Lastly, it is also known that Dell, frankly, sucks for PostgreSQL. Again, check the archives. Joshua D. Drake
> Actually, that was from an article from this last month that compared the dual core Intel to the AMD. For every dollar spent on the Intel, you got about half the performance of the AMD. Not bigotry. Fact.
> But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back.
I've been doing plenty of performance evaluation on a parallel application
we're developing here: on Dual Core Opterons, P4, P4D. I can say that
the Opterons open up a can of wupass on the Intel processors. Almost 2x
the performance on our application vs. what the SpecCPU numbers would
suggest.
David Boreham wrote: > >> Actually, that was from an article from this last month that compared >> the dual core intel to the amd. for every dollar spent on the intel, >> you got about half the performance of the amd. Not bigotry. fact. >> >> But don't believe me or the other people who've seen the difference. Go >> buy the Intel box. No skin off my back. >> > I've been doing plenty of performance evaluation on a parallel application > we're developing here : on Dual Core Opterons, P4, P4D. I can say that > the Opterons open up a can of wupass on the Intel processors. Almost 2x > the performance on our application vs. what the SpecCPU numbers would > suggest. Because Stone Cold Said So! > > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Joshua D. Drake wrote: > David Boreham wrote: > > > >> Actually, that was from an article from this last month that compared > >> the dual core intel to the amd. for every dollar spent on the intel, > >> you got about half the performance of the amd. Not bigotry. fact. > >> > >> But don't believe me or the other people who've seen the difference. Go > >> buy the Intel box. No skin off my back. > >> > > I've been doing plenty of performance evaluation on a parallel application > > we're developing here : on Dual Core Opterons, P4, P4D. I can say that > > the Opterons open up a can of wupass on the Intel processors. Almost 2x > > the performance on our application vs. what the SpecCPU numbers would > > suggest. > > Because Stone Cold Said So! I'll believe someone who uses 'wupass' in a sentence any day! -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
As others have noted, the current price/performance "sweet spot" for DB servers is 2S 2C AMD CPUs. These CPUs are also the highest performing x86 compatible solution for pg. If you must go Intel for some reason, then wait until the new NGMA CPUs (Conroe, Merom, Woodcrest) come out and see how they bench on DB workloads. Preliminary benches on these chips look good, but I would not recommend making a purchase decision based on just preliminary benches of unreleased products. If you must buy soon, then the decision is clear cut from anything except possibly a political/religious standpoint. The NetBurst based Pentium and Xeon solutions are simply not worth the money spent or the PITA they will put you through compared to the AMD dual cores. The new Intel NGMA CPUs may be different, but all the pertinent evidence is not yet available. My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. Less money than the "comparable" Intel solution and _far_ more performance. ...and even if you do buy Intel, =DON'T= buy Dell unless you like causing trouble for yourself. Bad experiences with Dell in general and their poor PERC RAID controllers in specific are all over this and other DB forums. Ron
-----Original Message----- >From: Bill Moran <wmoran@collaborativefusion.com> >Sent: Apr 25, 2006 2:14 PM >To: pgsql-performance@postgresql.org >Subject: [PERFORM] Large (8M) cache vs. dual-core CPUs > > >I've been given the task of making some hardware recommendations for >the next round of server purchases. The machines to be purchased >will be running FreeBSD & PostgreSQL. > >Where I'm stuck is in deciding whether we want to go with dual-core >pentiums with 2M cache, or with HT pentiums with 8M cache. > >Both of these are expensive bits of hardware, and I'm trying to >gather as much evidence as possible before making a recommendation. >The FreeBSD community seems pretty divided over which is likely to >be better, and I have been unable to discover a method for estimating >how much of the 2M cache on our existing systems is being used. > >Does anyone in the PostgreSQL community have any experience with >large caches or dual-core pentiums that could make any recommendations? >Our current Dell 2850 systems are CPU bound - i.e. they have enough >RAM, and fast enough disks that the CPUs seem to be the limiting >factor. As a result, this decision on what kind of CPUs to get in >the next round of servers is pretty important. > >Any advice is much appreciated. >
>My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. > > Right. We've been buying Tyan bare-bones boxes like this. It's better to go with bare-bones than building boxes from bare metal because the cooling issues are addressed correctly. Note that if you need a large number of machines, then Intel Core Duo may give the best overall price/performance because they're cheaper to run and cool.
I've had intermittent "freeze and reboot" and, worse, just plain freeze problems with the Core Duos I've been testing. I have not been able to narrow it down, so I do not know if it is a platform issue or a CPU issue. It appears to be HW, not SW, related since I have experienced the problem both under M$ and Linux 2.6 based OS's. I have not tested the Core Duos under *BSD. Also, being that they are only 32b, Core Duos have limited utility for a present day DB server. Power and space critical applications where 64b is not required may be a reasonable place for them... ...if the present reliability problems I'm seeing go away. Ron -----Original Message----- >From: David Boreham <david_list@boreham.org> >Sent: Apr 25, 2006 5:15 PM >To: pgsql-performance@postgresql.org >Subject: Re: [PERFORM] Large (8M) cache vs. dual-core CPUs > > >>My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. >> >> >Right. We've been buying Tyan bare-bones boxes like this. >It's better to go with bare-bones than building boxes from bare metal >because the cooling issues are addressed correctly. > >Note that if you need a large number of machines, then Intel >Core Duo may give the best overall price/performance because >they're cheaper to run and cool. >
Ron Peacetree wrote: > As others have noted, the current price/performance "sweet spot" for DB servers is 2S 2C AMD CPUs. These CPUs are also the highest performing x86 compatible solution for pg. > > If you must go Intel for some reason, then wait until the new NGMA CPUs (Conroe, Merom, Woodcrest) come out and see how they bench on DB workloads. Preliminary benches on these chips look good, but I would not recommend making a purchase decision based on just preliminary benches of unreleased products. > > If you must buy soon, then the decision is clear cut from anything except possibly a political/religious standpoint. > The NetBurst based Pentium and Xeon solutions are simply not worth the money spent or the PITA they will put you through compared to the AMD dual cores. The new Intel NGMA CPUs may be different, but all the pertinent evidence is not yet available. > > My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. > > Less money than the "comparable" Intel solution and _far_ more performance. > > ...and even if you do buy Intel, =DON'T= buy Dell unless you like causing trouble for yourself. > Bad experiences with Dell in general and their poor PERC RAID controllers in specific are all over this and other DB forums. > > Ron > To add to this... the HP DL 385 is a pretty nice dual core capable Opteron box. Just don't buy the extra RAM from HP (they like to charge entirely too much). Joshua D. Drake
On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: > On Tue, 2006-04-25 at 13:14, Bill Moran wrote: > > I've been given the task of making some hardware recommendations for > > the next round of server purchases. The machines to be purchased > > will be running FreeBSD & PostgreSQL. > > > > Where I'm stuck is in deciding whether we want to go with dual-core > > pentiums with 2M cache, or with HT pentiums with 8M cache. > > Given a choice between those two processors, I'd choose the AMD 64 x 2 > CPU. It's a significantly better processor than either of the Intel > choices. And if you get the HT processor, you might as well turn off HT > on a PostgreSQL machine. I've yet to see it make postgresql run faster, > but I've certainly seen HT make it run slower. Actually, believe it or not, a coworker just saw HT double the performance of pgbench on his desktop machine. Granted, not really a representative test case, but it still blew my mind. This was with a database that fit in his 1G of memory, and running Windows XP. Both cases were newly minted pgbench databases with a scale of 40. Testing was 40 connections and 100 transactions. With HT he saw 47.6 TPS, without it was 21.1. I actually had IT put w2k3 server on an HT box specifically so I could do more testing. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
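Editor's sketch: the pgbench figures quoted above can be put in perspective with a little arithmetic. This is only an illustration using the two TPS numbers from the post. It assumes the "100 transactions" were per connection (which is how pgbench's per-client transaction count works), and the function and variable names are made up for this example.

```python
# TPS figures reported in the post (pgbench, scale 40, 40 connections).
tps_with_ht = 47.6
tps_without_ht = 21.1


def speedup(new_tps: float, base_tps: float) -> float:
    """Return how many times faster new_tps is than base_tps."""
    return new_tps / base_tps


ratio = speedup(tps_with_ht, tps_without_ht)

# Assumption: 100 transactions per connection, so 40 * 100 in total.
total_transactions = 40 * 100

print(f"HT speedup: {ratio:.2f}x")
print(f"Approx. run time with HT:    {total_transactions / tps_with_ht:.0f} s")
print(f"Approx. run time without HT: {total_transactions / tps_without_ht:.0f} s")
```

Under that assumption each run lasted only a few minutes, which fits the poster's own caveat that this is not a representative test case; longer runs would be needed before drawing conclusions about HT.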
On Tue, Apr 25, 2006 at 01:42:31PM -0500, Scott Marlowe wrote: > On Tue, 2006-04-25 at 13:38, mark@mark.mielke.cc wrote: > > On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: > > > Sad, cause the AMD is, on a price / performance scale, twice the > > > processor for the same money as the Intel. > > Maybe a year or two ago. Prices are all coming down. Intel more > > than AMD. > > AMD still seems better - but not X2, and it depends on the workload. > > X2 sounds like bigotry against Intel... :-) > Actually, that was from an article from this last month that compared > the dual core Intel to the AMD. For every dollar spent on the Intel, > you got about half the performance of the AMD. Not bigotry. Fact. > But don't believe me or the other people who've seen the difference. Go > buy the Intel box. No skin off my back. AMD Opteron vs Intel Xeon is different than AMD X2 vs Pentium D. For AMD X2 vs Pentium D - I have both - in similar price range, and similar speed. I chose to use the AMD X2 as my server, and Pentium D as my Windows desktop. They're both quite fast. I made the choice I describe based on a lot of research. I was going to go both Intel, until I noticed that the Intel prices were dropping fast. 30% price cut in 2 months. AMD didn't drop at all during the same time. There are plenty of reasons to choose one over the other. Generally the AMD comes out on top. It is *not* 2X though. Anybody who claims this is being highly selective about which benchmarks they consider. One article is nothing. There is a lot of hype these days. AMD is winning the elite market, which means that they are able to continue to sell high. Intel, losing this market, is cutting its prices to compete. And they do compete. Quite well. Cheers, mark
On Tue, Apr 25, 2006 at 08:54:40PM -0400, mark@mark.mielke.cc wrote: > I made the choice I describe based on a lot of research. I was going > to go both Intel, until I noticed that the Intel prices were dropping > fast. 30% price cut in 2 months. AMD didn't drop at all during the > same time. Errr.. big mistake. That was going to be - I was going to go both AMD. > There are plenty of reasons to choose one over the other. Generally > the AMD comes out on top. It is *not* 2X though. Anybody who claims > this is being highly selective about which benchmarks they consider. I have an Intel Pentium D 920, and an AMD X2 3800+. These are very close in performance. The retail price difference is: Intel Pentium D 920 is selling for $310 CDN; AMD X2 3800+ is selling for $347 CDN. Another benefit of Pentium D over AMD X2, at least until AMD chooses to switch, is that Pentium D supports DDR2, whereas AMD only supports DDR. There are a lot of technical pros and cons to each - with claims from AMD that DDR2 can be slower than DDR - but one claim that isn't often made, but that helped me make my choice: 1) DDR2 supports higher transfer speeds. I'm using DDR2 5400 on the Intel. I think I'm at 3200 or so on the AMD X2. 2) DDR2 is cheaper. I purchased 1 Gbyte DDR2 5400 for $147 CDN. 1 Gbyte of DDR 3200 starts at around the same price, and stretches into $200 - $300 CDN. Now, granted, the Intel 920 requires more electricity to run. Running 24/7 for a year might make the difference in cost. It doesn't address point 1) though. I like my DDR2 5400. So, unfortunately, I won't be able to do a good test for you to prove that my Windows Pentium D box is not only cheaper to buy, but faster, because the specs aren't exactly equivalent. In the meantime, I'm quite enjoying my 3d games while doing other things at the same time. I imagine my desktop load approaches that of a CPU-bound database load. 3d games require significant I/O and CPU.
Anybody who claims that Intel is 2X more expensive for the same performance isn't considering all factors. No question at all - the Opteron is good, and the Xeon isn't - but the original poster didn't ask about Opteron or Xeon, did he? For the desktop lines - X2 is not double Pentium D. Maybe 10%. Maybe not at all. Especially now that Intel is dropping its prices due to overstock. Cheers, mark
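Editor's sketch: the price argument above can be turned into a small price/performance calculation. Only the two CDN prices come from the post; the 10% performance lead is the poster's rough guess rather than a measurement, and the helper name `perf_per_dollar_ratio` is hypothetical.

```python
# Retail prices quoted in the post, in Canadian dollars.
PRICE_CDN = {"Pentium D 920": 310.0, "X2 3800+": 347.0}


def perf_per_dollar_ratio(perf_ratio: float, price_a: float, price_b: float) -> float:
    """Performance per dollar of chip A relative to chip B.

    perf_ratio is A's performance divided by B's on some benchmark.
    A result above 1.0 means A delivers more performance per dollar.
    """
    return perf_ratio * price_b / price_a


x2 = PRICE_CDN["X2 3800+"]
pd = PRICE_CDN["Pentium D 920"]

# Performance lead the X2 needs just to match the Pentium D per dollar.
break_even = x2 / pd
# Performance lead the X2 would need to be "twice the processor for the money".
needed_for_2x = 2 * break_even

print(f"Break-even X2 performance lead: {break_even:.2f}x")
print(f"Lead needed for 2x per dollar:  {needed_for_2x:.2f}x")
# Hypothetical input: the "maybe 10%" figure from the post, not a benchmark result.
print(f"Perf per dollar at a 10% lead:  {perf_per_dollar_ratio(1.10, x2, pd):.2f}")
```

At these desktop prices, a 2x price/performance claim would require the X2 to be more than twice as fast outright, which is the point the poster is disputing. The server parts (Opteron vs Xeon) have different prices and are not covered by this arithmetic.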
mark@mark.mielke.cc wrote: > Another benefit of Pentium D over AMD X2, at least until AMD chooses > to switch, is that Pentium D supports DDR2, whereas AMD only supports > DDR. There are a lot of technical pros and cons to each - with claims > from AMD that DDR2 can be slower than DDR - but one claim that isn't > often made, but that helped me make my choice: > They're switching quite soon though -- within the next month now it seems, after moving up their earlier plans to launch in June: http://www.dailytech.com/article.aspx?newsid=1854 This Anandtech article shows the kind of performance increase we can expect with DDR2 on AMD's new socket: http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2741 The short version is that it's an improvement, but not an enormous one, and you need to spend quite a bit of cash on 800Mhz (PC6400) DDR2 sticks to see the most benefit. Some brief local (Australian) price comparisons show 1GB PC-3200 DDR sticks starting at just over AU$100, with 1GB PC2-4200 DDR2 sticks around the same price, though Anandtech's tests showed PC2-4200 DDR2 benching generally slower than PC-3200 DDR, probably due to the increased latency in DDR2. Comparing reasonable quality matched pairs of 1GB sticks, PC-3200 DDR still seems generally cheaper than PC2-5300 DDR2, though not by a lot, and I'm sure the DDR2 will start dropping even further as AMD systems start using it in the next month or so. One thing's for sure though -- Intel's Pentium D prices are remarkably low, and at the lower end of the price range AMD has nothing that's even remotely competitive in terms of price/performance. The Pentium D 805, for instance, with its dual 2.67Ghz cores, costs just AU$180. The X2 3800+ is a far better chip, but it's also two-and-a-half times the price. 
None of this really matters much in the server space though, where Opteron's real advantage over Xeon is not its greater raw CPU power, or its better dual-core implementation (though both would be hard to dispute), but the improved system bandwidth provided by Hypertransport. Even with Intel's next-gen CPUs, which look set to address the first two points quite well, they still won't have an interconnect technology that can really compete with AMD's. Thanks Leigh
>Another benefit of Pentium D over AMD X2, at least until AMD chooses >to switch, is that Pentium D supports DDR2, whereas AMD only supports >DDR. There are a lot of technical pros and cons to each - with claims >from AMD that DDR2 can be slower than DDR - but one claim that isn't >often made, but that helped me make my choice: > > 1) DDR2 supports higher transfer speeds. I'm using DDR2 5400 on > the Intel. I think I'm at 3200 or so on the AMD X2. > > 2) DDR2 is cheaper. I purchased 1 Gbyte DDR2 5400 for $147 CDN. > 1 Gbyte of DDR 3200 starts at around the same price, and > stretches into $200 - $300 CDN. > There's a logical fallacy here that needs to be noted. THROUGHPUT is better with DDR2 if and only if there is enough data to be fetched in a serial fashion from memory. LATENCY however is dependent on the base clock rate of the RAM involved. So PC3200, 200MHz x2, is going to actually perform better than PC2-5400, 166MHz x4, for almost any memory access pattern except those that are highly sequential. In fact, even PC2-6400, 200MHz x4, has a disadvantage compared to 200MHz x2 memory. The minimum latency of the two types of memory in clock cycles is always going to be higher for the memory type that multiplies its base clock rate by the most. For the mostly random memory access patterns that comprise many DB applications, the base latency of the RAM involved is going to matter more than the peak throughput AKA the bandwidth of that RAM. The big message here is that despite engineering tricks and marketing claims, the base clock rate of the RAM you use matters. A minor point to be noted in addition here is that most DB servers under load are limited by their physical IO subsystem, their HDs, and not the speed of their RAM. All of the above comments about the relative performance of different RAM types become insignificant when performance is gated by the HD subsystem.
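Editor's sketch: the distinction drawn above between peak throughput and base clock rate can be made concrete. The base clocks and multipliers are the ones given in the post; the 8-byte (64-bit) module width is an assumption added here, and the function names are made up for the example. It says nothing about real-world latency, which also depends on timings the post does not give.

```python
# (base clock in MHz, transfers per base clock cycle), as described in the post.
MODULES = {
    "PC3200 (DDR)": (200.0, 2),
    "PC2-5400 (DDR2)": (500.0 / 3.0, 4),  # the post's "166MHz"
    "PC2-6400 (DDR2)": (200.0, 4),
}

BUS_BYTES = 8  # assumption: a standard 64-bit wide memory module


def peak_bandwidth_mb_s(base_clock_mhz: float, transfers_per_clock: int,
                        bus_bytes: int = BUS_BYTES) -> float:
    """Theoretical peak bandwidth in MB/s for fully sequential transfers."""
    return base_clock_mhz * transfers_per_clock * bus_bytes


def base_cycle_ns(base_clock_mhz: float) -> float:
    """Length of one base clock cycle in nanoseconds."""
    return 1000.0 / base_clock_mhz


for name, (clock, transfers) in MODULES.items():
    print(f"{name:16s} peak {peak_bandwidth_mb_s(clock, transfers):6.0f} MB/s, "
          f"base cycle {base_cycle_ns(clock):.2f} ns")
```

The table this prints shows the trade-off being argued: the 166MHz x4 part has the higher peak figure but the longer base cycle, so which one wins depends on how sequential the access pattern is.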
On Tue, Apr 25, 2006 at 11:07:17PM -0400, Ron Peacetree wrote: > THROUGHPUT is better with DDR2 if and only if there is enough data > to be fetched in a serial fashion from memory. > LATENCY however is dependent on the base clock rate of the RAM > involved. So PC3200, 200MHz x2, is going to actually perform better > than PC2-5400, 166MHz x4, for almost any memory access pattern > except those that are highly sequential. I had forgotten about this. Still, it's not quite as simple as you say. DDR2 has increased latency, however, it has a greater upper limit, and when run at the same clock speed (200 Mhz for 200 Mhz), it is not going to perform worse. Add in double the pre-fetching capability, and what you get is that most benchmarks show DDR2 5400 as being slightly faster than DDR 3200. AMD is switching to DDR2, and I believe that, even after making such a big deal about latency, and why they wouldn't switch to DDR2, they are now saying that their on-chip memory controller will be able to access DDR2 memory (when they support it soon) faster than Intel can, not having an on-chip memory controller. You said that DB accesses are random. I'm not so sure. In PostgreSQL, are not the individual pages often scanned sequentially, especially because all records are variable length? You don't think PostgreSQL will regularly read 32 bytes (8 bytes x 4) at a time, in sequence? Whether for table pages, or index pages - I'm not seeing why the accesses wouldn't be sequential. You believe PostgreSQL will access the table pages and index pages randomly on a per-byte basis? What is the minimum PostgreSQL record size again? Isn't it 32 bytes or over? :-) I wish my systems were running the same OS, and I'd run a test for you. Alas, I don't think comparing Windows to Linux would be valuable. > A minor point to be noted in addition here is that most DB servers > under load are limited by their physical IO subsystem, their HDs, > and not the speed of their RAM. 
It seems like a pretty major point to me. :-) It's why Opteron with RAID kicks ass over HyperTransport. > All of the above comments about the relative performance of > different RAM types become insignificant when performance is gated > by the HD subsystem. Yes. Luckily - we don't all have Terabyte databases... :-) Cheers, mark
I'm posting this to the entire performance list in the hopes that it will be generally useful. =r -----Original Message----- >From: mark@mark.mielke.cc >Sent: Apr 26, 2006 3:25 AM >To: Ron Peacetree <rjpeace@earthlink.net> >Subject: Re: [PERFORM] Large (8M) cache vs. dual-core CPUs > >Hi Ron: > >As a result of your post on the matter, I've been redoing some of my >online research on this subject, to see whether I do have one or more >things wrong. > I'm always in favor of independent investigation to find the truth. :-) >You say: > >> THROUGHPUT is better with DDR2 if and only if there is enough data >> to be fetched in a serial fashion from memory. >... >> So PC3200, 200MHz x2, is going to actually perform better than >> PC2-5400, 166MHz x4, for almost any memory access pattern except >> those that are highly sequential. >... >> For the mostly random memory access patterns that comprise many DB >> applications, the base latency of the RAM involved is going to >> matter more than the peak throughput AKA the bandwidth of that RAM. > >I'm trying to understand right now - why does DDR2 require data to be >fetched in a serial fashion, in order for it to maximize bandwidth? > SDR transfers data on either the rising or falling edge of its clock cycle. DDR transfers data on both the rising and falling edge of the base clock signal. If there is a contiguous chunk of 2+ datumsto be transferred. DDR2 basically has a second clock that cycles at 2x the rate of the base clock and thus we get 4 data transfers per baseclock cycle. If there is a contiguous chunk of 4+ datums to be transferred. Note also what happens when transferring the first datum after a lull period. For purposes of example, let's pretend that we are talking about a base clock rate of 200MHz= 5ns. The SDR still transfers data every 5ns no matter what. 
The DDR transfers the 1st datum in 10ns and then assuming there are at least 2 sequential datums to be transferred will transfer the 2nd and subsequent sequential pieces of data every 2.5ns. The DDR2 transfers the 1st datum in 20ns and then assuming there are at least 4 sequential datums to be transferred will transfer the 2nd and subsequent sequential pieces of data every 1.25ns. Thus we can see that randomly accessing RAM degrades performance significantly for DDR and DDR2. We can also see that the conditions for optimal RAM performance become more restrictive as we go from SDR to DDR to DDR2. The reason DDR2 with a low base clock rate excelled at tasks like streaming multimedia and stank at things like small transaction OLTP DB applications is now apparent. Factors like CPU prefetching and victim buffers can muddy this picture a bit. Also, if the CPU's off die IO is slower than the RAM it is talking to, how fast that RAM is becomes unimportant. The reasons AMD has held off from supporting DDR2 until now are: 1. DDR is EOL. JEDEC is not ratifying any DDR faster than 200x2 while DDR2 standards as fast as 333x4 are likely to be ratified (note that Intel pretty much avoided DDR, leaving it to AMD, while DDR2 is Intel's main RAM technology. Guess who has more pull with JEDEC?) 2. DDR and DDR2 RAM with equal base clock rates are finally available, removing the biggest performance difference between DDR and DDR2. 3. Due to the larger demand for DDR2, more of it is produced. That in turn has resulted in larger supplies of DDR2 than DDR. Which in turn, especially when combined with the factors above, has resulted in lower prices for DDR2 than for DDR of the same or faster base clock rate by now. Hope this is helpful, Ron
> >The reasons AMD has held off from supporting DDR2 until now are: >1. DDR is EOL. JEDEC is not ratifying any DDR faster than 200x2 while DDR2 standards as fast as 333x4 are likely to be ratified (note that Intel pretty much avoided DDR, leaving it to AMD, while DDR2 is Intel's main RAM technology. Guess who has more pull with JEDEC?) > > > DDR2 is to RDRAM as C# is to Java ;)
mark@mark.mielke.cc wrote: > > I have an Intel Pentium D 920, and an AMD X2 3800+. These are very > close in performance. The retail price difference is: > > Intel Pentium D 920 is selling for $310 CDN > AMD X2 3800+ is selling for $347 CDN > > Anybody who claims that Intel is 2X more expensive for the same > performance, isn't considering all factors. No question at all - the > Opteron is good, and the Xeon isn't - but the original poster didn't > ask about Opteron or Xeon, did he? For the desktop lines - X2 is not > double Pentium D. Maybe 10%. Maybe not at all. Especially now that > Intel is dropping its prices due to overstock. There's part of the equation you are missing here. This is a PostgreSQL mailing list which means we're usually talking about performance of just this specific server app. While in general there may not be that much of a % difference between the 2 chips, there's a huge gap in Postgres. For whatever reason, Postgres likes Opterons. Way more than Intel P4-architecture chips. (And it appears way more than IBM Power4 chips and a host of other chips also.) Here's one of the many discussions we had about this issue last year: http://qaix.com/postgresql-database-development/337-670-re-opteron-vs-xeon-was-what-to-do-with-6-disks-read.shtml The exact reasons why Opteron runs PostgreSQL so much better than P4s, we're not 100% sure of. We have guesses -- lower memory latency, lack of shared FSB, better 64-bit, 64-bit IOMMU, context-switch storms on P4, better dual-core implementation and so on. Perhaps it's a combination of all the above factors but somehow, the general experience people have had is that equivalently priced Opteron servers run PostgreSQL 2X faster than P4 servers as the baseline and the gap increases as you add more sockets and more cores.
>While in general there may not be that much of a % difference between the 2 chips, >there's a huge gap in Postgres. For whatever reason, Postgres likes Opterons. >Way more than Intel P4-architecture chips. It isn't only Postgres. I work on a number of other server applications that also run much faster on Opterons than the published benchmark figures would suggest they should. They're all compiled with gcc4, so possibly there's a compiler issue. I don't run Windows on any of our Opteron boxes so I can't easily compare using the MS compiler.
Mea Culpa. There is a mistake in my example for SDR vs DDR vs DDR2. This is what I get for posting before my morning coffee. The base latency for all of the memory types is that of the base clock rate; 200MHz = 5ns in my given examples. I double factored, making DDR and DDR2 worse than they actually are. Again, my apologies. Ron -----Original Message----- >From: Ron Peacetree <rjpeace@earthlink.net> >Sent: Apr 26, 2006 8:40 AM >To: mark@mark.mielke.cc, pgsql-performance@postgresql.org >Subject: Re: [PERFORM] Large (8M) cache vs. dual-core CPUs > >I'm posting this to the entire performance list in the hopes that it will be generally useful. >=r <snip> > >Note also what happens when transferring the first datum after a lull period. >For purposes of example, let's pretend that we are talking about a base clock rate of 200MHz = 5ns. > >The SDR still transfers data every 5ns no matter what. >The DDR transfers the 1st datum in 10ns and then assuming there are at least 2 sequential datums to be >transferred will transfer the 2nd and subsequent sequential pieces of data every 2.5ns. >The DDR2 transfers the 1st datum in 20ns and then assuming there are at least 4 sequential datums to be >transferred will transfer the 2nd and subsequent sequential pieces of data every 1.25ns. > =5= ns to first transfer in all 3 cases. Bad Ron. No Biscuit! > >Thus we can see that randomly accessing RAM degrades performance significantly for DDR and DDR2. We can >also see that the conditions for optimal RAM performance become more restrictive as we go from SDR to DDR to >DDR2. >The reason DDR2 with a low base clock rate excelled at tasks like streaming multimedia and stank at things like >small transaction OLTP DB applications is now apparent. > >Factors like CPU prefetching and victim buffers can muddy this picture a bit. >Also, if the CPU's off die IO is slower than the RAM it is talking to, how fast that RAM is becomes unimportant. > These statements, and everything else I posted, are accurate.
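For anyone who wants to play with the corrected numbers, here is a minimal Python sketch of the arithmetic. It uses the figures from the correction above (5ns to the first datum for all three memory types, then 5ns, 2.5ns and 1.25ns per subsequent sequential datum). The function names, and the simplification that every random access starts a fresh burst and pays the full first-transfer latency, are assumptions made for illustration only; prefetching and the other factors Ron mentions are ignored.

```python
# Toy model of the corrected example: every memory type pays the same
# base-clock latency for the first datum; they differ only in how fast
# the following *sequential* datums arrive.
BASE_CLOCK_NS = 5.0  # 200MHz base clock, as in the example

# Data transfers per base clock cycle for each memory type.
TRANSFERS_PER_CYCLE = {"SDR": 1, "DDR": 2, "DDR2": 4}


def burst_time_ns(kind, n_datums):
    """Time to fetch n sequential datums in a single burst."""
    per_datum = BASE_CLOCK_NS / TRANSFERS_PER_CYCLE[kind]
    return BASE_CLOCK_NS + (n_datums - 1) * per_datum


def random_time_ns(kind, n_datums):
    """Time to fetch n datums when each access starts a new burst
    (assumption: no two consecutive accesses are sequential)."""
    return n_datums * burst_time_ns(kind, 1)


if __name__ == "__main__":
    for kind in TRANSFERS_PER_CYCLE:
        print(kind, burst_time_ns(kind, 8), random_time_ns(kind, 8))
```

Under these assumptions the three memory types only separate on the sequential case; for fully random access the model gives the same time for all of them, which is the point being made about base latency versus peak bandwidth.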
Have a look at this Wikipedia page which outlines some differences between the AMD and Intel versions of 64-bit : http://en.wikipedia.org/wiki/EM64T > It isn't only Postgres. I work on a number of other server applications > that also run much faster on Opterons than the published benchmark > figures would suggest they should. They're all compiled with gcc4, > so possibly there's a compiler issue. I don't run Windows on any > of our Opteron boxes so I can't easily compare using the MS compiler.
On Tue, 2006-04-25 at 18:55, Jim C. Nasby wrote: > On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: > > On Tue, 2006-04-25 at 13:14, Bill Moran wrote: > > > I've been given the task of making some hardware recommendations for > > > the next round of server purchases. The machines to be purchased > > > will be running FreeBSD & PostgreSQL. > > > > > > Where I'm stuck is in deciding whether we want to go with dual-core > > > pentiums with 2M cache, or with HT pentiums with 8M cache. > > > > Given a choice between those two processors, I'd choose the AMD 64 x 2 > > CPU. It's a significantly better processor than either of the Intel > > choices. And if you get the HT processor, you might as well turn off HT > > on a PostgreSQL machine. I've yet to see it make postgresql run faster, > > but I've certainly seen HT make it run slower. > > Actually, believe it or not, a coworker just saw HT double the > performance of pgbench on his desktop machine. Granted, not really a > representative test case, but it still blew my mind. This was with a > database that fit in his 1G of memory, and running windows XP. Both > cases were newly minted pgbench databases with a scale of 40. Testing > was 40 connections and 100 transactions. With HT he saw 47.6 TPS, > without it was 21.1. > > I actually had IT build put w2k3 server on a HT box specifically so I > could do more testing. Just to clarify, this is PostgreSQL on Windows, right? I wonder if the latest Linux kernel can do that well... I'm guessing that the kernel scheduler in Windows has had a lot more work put into making it good at scheduling on a HT architecture than the Linux kernel has.
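For anyone wanting to repeat the pgbench comparison quoted above, a rough Python sketch of the two invocations and the resulting ratio follows. The pgbench flags used (`-i` to initialize, `-s` for scale, `-c` for clients, `-t` for transactions per client) are standard; the database name `bench` and the helper names are hypothetical, and reading "100 transactions" as 100 per client is an assumption, since the original message does not say.

```python
def pgbench_commands(dbname, scale=40, clients=40, transactions=100):
    """Build argv lists for initializing and running pgbench.

    Assumption: "100 transactions" in the thread means -t 100,
    i.e. 100 transactions per client.
    """
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    run = ["pgbench", "-c", str(clients), "-t", str(transactions), dbname]
    return init, run


def speedup(tps_with, tps_without):
    """Ratio of the two reported throughput figures."""
    return tps_with / tps_without


if __name__ == "__main__":
    init, run = pgbench_commands("bench")  # "bench" is a placeholder name
    print(" ".join(init))
    print(" ".join(run))
    # Figures reported in the thread: 47.6 TPS with HT, 21.1 without.
    print(round(speedup(47.6, 21.1), 2))
```

The argv lists can be handed to `subprocess.run` on a machine that has pgbench installed; running each configuration several times would help show whether a roughly 2x difference is repeatable.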
David Boreham wrote: > It isn't only Postgres. I work on a number of other server applications > that also run much faster on Opterons than the published benchmark > figures would suggest they should. They're all compiled with gcc4, > so possibly there's a compiler issue. I don't run Windows on any > of our Opteron boxes so I can't easily compare using the MS compiler. Maybe it's just a fact that the majority of x86 64-bit development for open source software happens on Opteron/A64 machines. 64-bit AMD machines were selling a good year before 64-bit Intel machines were available. And even after Intel EM64T machines were available, anybody in their right mind would have picked AMD machines over Intel due to cost/heat/performance. So you end up with 64-bit OSS being developed/optimized for Opterons and the 10% running Intel EM64T handle compatibility issues. Would be interesting to see a survey of what machines OSS developers use to write/test/optimize their code.
On Tue, 2006-04-25 at 20:17, mark@mark.mielke.cc wrote: > On Tue, Apr 25, 2006 at 08:54:40PM -0400, mark@mark.mielke.cc wrote: > > I made the choice I describe based on a lot of research. I was going > > to go both Intel, until I noticed that the Intel prices were dropping > > fast. 30% price cut in 2 months. AMD didn't drop at all during the > > same time. > > Errr.. big mistake. That was going to be - I was going to go both AMD. > > > There are plenty of reasons to choose one over the other. Generally > > the AMD comes out on top. It is *not* 2X though. Anybody who claims > > this is being highly selective about which benchmarks they consider. > > I have an Intel Pentium D 920, and an AMD X2 3800+. These are very > close in performance. The retail price difference is: > > Intel Pentium D 920 is selling for $310 CDN > AMD X2 3800+ is selling for $347 CDN Let me be clear. The performance difference between those boxes running the latest first person shooter is not what I was alluding to in my first post. While the price of the Intels may have dropped, there's a huge difference (often 2x or more) in performance when running PostgreSQL on otherwise similar chips from Intel and AMD. Note that my workstation at work, my workstation at home, and my laptop are all Intel based machines. They work fine for that. But if I needed to build a big fast Oracle or PostgreSQL server, I'd almost certainly go with the AMD, especially so if I needed >2 cores, where the performance difference becomes greater and greater. You'd likely find that for PostgreSQL, the slowest dual core AMDs out would still beat the fastest Intel dual cores, because of the issue we've seen on the list with context switching storms. If you haven't actually run a heavy benchmark of postgresql on the two architectures, please don't make your decision based on other benchmarks. Since you've got both a D920 and an X2 3800, that'd be a great place to start. 
Mock up some benchmark with a couple dozen threads hitting the server at once and see if the Intel can keep up. It should do OK, but not great. If you can get your hands on a dual dual-core setup for either, you should really start to see the advantage going to AMD, and by the time you get to a quad dual core setup, it won't even be a contest.
On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote: > If you haven't actually run a heavy benchmark of postgresql on the two > architectures, please don't make your decision based on other > benchmarks. Since you've got both a D920 and an X2 3800, that'd be a > great place to start. Mock up some benchmark with a couple dozen > threads hitting the server at once and see if the Intel can keep up. It Or better yet, use dbt* or even pgbench so others can reproduce... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Wed, Apr 26, 2006 at 10:17:58AM -0500, Scott Marlowe wrote: > On Tue, 2006-04-25 at 18:55, Jim C. Nasby wrote: > > On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: > > > On Tue, 2006-04-25 at 13:14, Bill Moran wrote: > > > > I've been given the task of making some hardware recommendations for > > > > the next round of server purchases. The machines to be purchased > > > > will be running FreeBSD & PostgreSQL. > > > > > > > > Where I'm stuck is in deciding whether we want to go with dual-core > > > > pentiums with 2M cache, or with HT pentiums with 8M cache. > > > > > > Given a choice between those two processors, I'd choose the AMD 64 x 2 > > > CPU. It's a significantly better processor than either of the Intel > > > choices. And if you get the HT processor, you might as well turn off HT > > > on a PostgreSQL machine. I've yet to see it make postgresql run faster, > > > but I've certainly seen HT make it run slower. > > > > Actually, believe it or not, a coworker just saw HT double the > > performance of pgbench on his desktop machine. Granted, not really a > > representative test case, but it still blew my mind. This was with a > > database that fit in his 1G of memory, and running windows XP. Both > > cases were newly minted pgbench databases with a scale of 40. Testing > > was 40 connections and 100 transactions. With HT he saw 47.6 TPS, > > without it was 21.1. > > > > I actually had IT build put w2k3 server on a HT box specifically so I > > could do more testing. > > Just to clarify, this is PostgreSQL on Windows, right? > > I wonder if the latest Linux kernel can do that well... I'm guessing > that the kernel scheduler in Windows has had a lot more work put into > making it good at scheduling on a HT architecture than the Linux kernel has. Yes, this is on Windows XP. Larry might also have a HT box with some other OS on it we can check with (though I suspect that maybe that's been beaten to death...)
On Tue, Apr 25, 2006 at 11:07:17PM -0400, Ron Peacetree wrote: > A minor point to be noted in addition here is that most DB servers under load are limited by their physical IO subsystem, their HDs, and not the speed of their RAM. I think if that were the only consideration we wouldn't be seeing such a dramatic difference between AMD and Intel though. Even in a disk-bound server, caching is going to have a tremendous impact, and that's essentially entirely bound by memory bandwidth and latency.
Jim C. Nasby wrote: > On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote: > > If you haven't actually run a heavy benchmark of postgresql on the two > > architectures, please don't make your decision based on other > > benchmarks. Since you've got both a D920 and an X2 3800, that'd be a > > great place to start. Mock up some benchmark with a couple dozen > > threads hitting the server at once and see if the Intel can keep up. It > > Or better yet, use dbt* or even pgbench so others can reproduce... For why Opterons are superior to Intel for PostgreSQL, see: http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=2 Section "MESI-MESI-MOESI Banana-fana...". Specifically, this part about the Intel implementation: The processor with the Invalid data in its cache (CPU 0, let's say) might then wish to modify that chunk of data, but it could not do so while the only valid copy of the data is in the cache of the other processor (CPU 1). Instead, CPU 0 would have to wait until CPU 1 wrote the modified data back to main memory before proceeding, and that takes time, bus bandwidth, and memory bandwidth. This is the great drawback of MESI. AMD transfers the dirty cache line directly from cpu to cpu. I can imagine that helping our test-and-set shared memory usage quite a bit. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
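To make the difference described in the quoted article concrete, here is a toy Python model of a single cache line (think of a spinlock word) being written alternately by two CPUs. It only encodes what the quote says: under MESI the modified line must be written back to main memory before the other CPU can proceed, while under MOESI the dirty line is handed over cache to cache. The function name, the two-CPU restriction and the counting scheme are all assumptions for illustration; real coherence traffic is considerably more involved.

```python
def simulate(protocol, writers):
    """Count hand-offs of one contended cache line between CPU 0 and CPU 1.

    Returns (memory_writebacks, cache_to_cache_transfers).
    'writers' is the sequence of CPUs (0 or 1) that write the line.
    """
    if protocol not in ("MESI", "MOESI"):
        raise ValueError("unknown protocol: %r" % protocol)

    state = {0: "I", 1: "I"}  # per-CPU state of the line; I = Invalid
    writebacks = 0
    transfers = 0
    for cpu in writers:
        other = 1 - cpu
        if state[other] == "M":
            # The other CPU holds the only valid (Modified) copy.
            if protocol == "MESI":
                writebacks += 1   # written back to main memory first
            else:
                transfers += 1    # dirty line passed CPU to CPU
            state[other] = "I"
        state[cpu] = "M"
    return writebacks, transfers


if __name__ == "__main__":
    ping_pong = [0, 1] * 4  # two CPUs fighting over one lock word
    print("MESI ", simulate("MESI", ping_pong))
    print("MOESI", simulate("MOESI", ping_pong))
```

In this model every hand-off of the contended line costs a trip through main memory under MESI and none under MOESI, which is the effect being suggested for test-and-set on shared memory.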
On Wed, Apr 26, 2006 at 02:48:53AM -0400, mark@mark.mielke.cc wrote: > You said that DB accesses are random. I'm not so sure. In PostgreSQL, > are not the individual pages often scanned sequentially, especially > because all records are variable length? You don't think PostgreSQL > will regularly read 32 bytes (8 bytes x 4) at a time, in sequence? > Whether for table pages, or index pages - I'm not seeing why the > accesses wouldn't be sequential. You believe PostgreSQL will access > the table pages and index pages randomly on a per-byte basis? What > is the minimum PostgreSQL record size again? Isn't it 32 bytes or > over? :-) Data within a page can absolutely be accessed randomly; it would be horribly inefficient to slog through 8K of data every time you needed to find a single row. The header size of tuples is ~23 bytes, depending on your version of PostgreSQL, and data fields have to start on the proper alignment (generally 4 bytes). So essentially the smallest row you can get is 28 bytes. I know that tuple headers are dealt with as a C structure, but I don't know if that means accessing any of the header costs the same as accessing the whole thing. I don't know if PostgreSQL can access fields within tuples without having to scan through at least the first part of preceding fields, though I suspect that it can access fixed-width fields that sit before any varlena fields directly (without scanning through the other fields). If we ever got to the point of divorcing the in-memory tuple layout from the table layout it'd be interesting to experiment with having all varlena length info stored immediately after all fixed-width fields; that could potentially make accessing varlenas randomly faster. Note that null fields are indicated as such in the null bitmap, so I'm pretty sure that their in-tuple position doesn't matter much. Of course if you want the definitive answer, Use The Source.
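The 28-byte figure above can be reproduced with a little alignment arithmetic. This Python sketch takes the ~23-byte header and 4-byte alignment from the message; the assumption that the smallest row carries a single 4-byte field (an integer, say) is mine, and the helper names are made up for illustration.

```python
HEADER_BYTES = 23  # approximate tuple header size, version-dependent
ALIGNMENT = 4      # "generally 4 bytes" per the message above


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def min_row_bytes(data_bytes=4):
    """Size of a row with one small fixed-width field.

    Assumption: the smallest useful row holds a single 4-byte value.
    """
    data_start = align_up(HEADER_BYTES, ALIGNMENT)  # header padded to 24
    return align_up(data_start + data_bytes, ALIGNMENT)


if __name__ == "__main__":
    print(align_up(HEADER_BYTES, ALIGNMENT), min_row_bytes())
```

The header is padded from 23 to 24 bytes so the first field starts on a 4-byte boundary, and the 4-byte field brings the total to 28.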
On Wed, Apr 26, 2006 at 06:16:46PM -0400, Bruce Momjian wrote: > Jim C. Nasby wrote: > > On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote: > > > If you haven't actually run a heavy benchmark of postgresql on the two > > > architectures, please don't make your decision based on other > > > benchmarks. Since you've got both a D920 and an X2 3800, that'd be a > > > great place to start. Mock up some benchmark with a couple dozen > > > threads hitting the server at once and see if the Intel can keep up. It > > > > Or better yet, use dbt* or even pgbench so others can reproduce... > > For why Opterons are superior to Intel for PostgreSQL, see: > > http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=2 > > Section "MESI-MESI-MOESI Banana-fana...". Specifically, this part about > the Intel implementation: > > The processor with the Invalid data in its cache (CPU 0, let's say) > might then wish to modify that chunk of data, but it could not do so > while the only valid copy of the data is in the cache of the other > processor (CPU 1). Instead, CPU 0 would have to wait until CPU 1 wrote > the modified data back to main memory before proceeding, and that takes > time, bus bandwidth, and memory bandwidth. This is the great drawback of > MESI. > > AMD transfers the dirty cache line directly from cpu to cpu. I can > imagine that helping our test-and-set shared memory usage quite a bit. Wasn't the whole point of test-and-set that it's the recommended way to do lightweight spinlocks according to AMD/Intel? You'd think they'd have a way to make that performant on multiple CPUs (though if it's relying on possibly modifying an underlying data page I can't really think of how to do that without snaking through the cache...)
On Wed, Apr 26, 2006 at 05:37:31PM -0500, Jim C. Nasby wrote: > On Wed, Apr 26, 2006 at 06:16:46PM -0400, Bruce Momjian wrote: > > AMD transfers the dirty cache line directly from cpu to cpu. I can > > imagine that helping our test-and-set shared memory usage quite a bit. > Wasn't the whole point of test-and-set that it's the recommended way to > do lightweight spinlocks according to AMD/Intel? You'd think they'd have > a way to make that performant on multiple CPUs (though if it's relying > on possibly modifying an underlying data page I can't really think of > how to do that without snaking through the cache...) It's expensive no matter what. One method might be less expensive than another. :-) AMD definitely seems to have things right for lowest absolute latency. 2X still sounds like an extreme case - but until I've actually tried a very large, or thread intensive PostgreSQL db on both, I probably shouldn't doubt the work of others too much. :-) Cheers, mark
On Apr 25, 2006, at 2:14 PM, Bill Moran wrote: > Where I'm stuck is in deciding whether we want to go with dual-core > pentiums with 2M cache, or with HT pentiums with 8M cache. In order of preference: 1. Opterons (dual core or single core) 2. Xeon with HT *disabled* at the BIOS level (dual or single core) Notice Xeon with HT is not on my list :-)
On Apr 25, 2006, at 5:09 PM, Ron Peacetree wrote: > ...and even if you do buy Intel, =DON'T= buy Dell unless you like > causing trouble for yourself. > Bad experiences with Dell in general and their poor PERC RAID > controllers in specific are all over this and other DB forums. I don't think that their current controllers suck like their older ones did. That's what you'll read about in the archives -- the old stuff. E.g., the 1850's embedded RAID controller really flies, but it only works with the internal disks. I can't comment on the external array controller for the 1850, but I cannot imagine it being any slower. And personally, I've not experienced any major problems aside from two bad PE1550's 4 years ago. And I currently have about 15 Dell servers running 24x7x365 doing various tasks, including postgres. However, my *big* databases always go on dual Opteron boxes. My current favorite is the SunFire X4100 with an external RAID.
Hi all, Vivek Khera wrote: > On Apr 25, 2006, at 2:14 PM, Bill Moran wrote: >> Where I'm stuck is in deciding whether we want to go with dual-core >> pentiums with 2M cache, or with HT pentiums with 8M cache. > > In order of preference: > > Opterons (dual core or single core) > Xeon with HT *disabled* at the BIOS level (dual or single core) > > > Notice Xeon with HT is not on my list :-) > I support Vivek's order of preference. I have been going through a nightmare of performance issues with different x86 hardware. At the end of the day I can say the Opterons are faster because of their memory bandwidth. I also had to disable HT on all our customers' servers which were still using Xeons with HT. There is a paper from HP which describes the advantage of the memory architecture of the Opterons. To me this is the best explanation of why the Opteron 875 is faster than a Xeon MP 3 GHz, which I compared last year. I remember a thread on the postgresql devel list around HT in 2004, where you can find the reason why you should disable HT. This thread refers to Intel Developer Manual Volume 4 (Architecture Optimisation) where there is some advice regarding spin-wait loops. This is related to the code of src/include/storage/s_lock.h. Cheers Sven. ====== From Intel Developer Manual Volume 4 Synchronization for Short Periods The frequency and duration that a thread needs to synchronize with other threads depends on application characteristics. When a synchronization loop needs very fast response, applications may use a spin-wait loop. A spin-wait loop is typically used when one thread needs to wait a short amount of time for another thread to reach a point of synchronization. A spin-wait loop consists of a loop that compares a synchronization variable with some pre-defined value [see Example 7-1(a)]. On a modern microprocessor with a superscalar speculative execution engine, a loop like this results in the issue of multiple simultaneous read requests from the spinning thread. 
These requests usually execute out-of-order with each read request being allocated a buffer resource. On detection of a write by a worker thread to a load that is in progress, the processor must guarantee no violations of memory order occur. The necessity of maintaining the order of outstanding memory operations inevitably costs the processor a severe penalty that impacts all threads. This penalty occurs on the Pentium Pro processor, the Pentium II processor and the Pentium III processor. However, the penalty on these processors is small compared with penalties suffered on the Pentium 4 and Intel Xeon processors. There the performance penalty for exiting the loop is about 25 times more severe. On a processor supporting Hyper-Threading Technology, spin-wait loops can consume a significant portion of the execution bandwidth of the processor. One logical processor executing a spin-wait loop can severely impact the performance of the other logical processor. ====
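For readers who have not met the term, the shape of a spin-wait loop as described in the excerpt (a loop comparing a synchronization variable with a pre-defined value) can be sketched in Python. This only illustrates the control flow; it is not what s_lock.h does, and Python cannot reproduce the hardware penalty being discussed. The `time.sleep(0)` call is a rough stand-in, chosen here as an assumption, for the yield or PAUSE-style hint a native spin loop would use; all names are made up for illustration.

```python
import threading
import time


def spin_wait_demo(expected=1):
    """One thread spins until a worker sets a shared flag to 'expected'.

    Returns (final flag value, number of loop iterations spent spinning).
    """
    flag = [0]   # the synchronization variable shared by both threads
    spins = 0

    def worker():
        time.sleep(0.01)     # stand-in for a short piece of work
        flag[0] = expected   # reach the point of synchronization

    t = threading.Thread(target=worker)
    t.start()

    # Spin-wait: repeatedly compare the variable with the expected value.
    while flag[0] != expected:
        spins += 1
        time.sleep(0)        # yield so the worker thread can make progress

    t.join()
    return flag[0], spins


if __name__ == "__main__":
    value, spins = spin_wait_demo()
    print("flag =", value, "after", spins, "spins")
```

The spin count varies from run to run; on a HyperThreaded CPU the concern raised in the excerpt is that the spinning logical processor eats execution bandwidth that the other logical processor needs.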