Thread: Opteron vs. Xeon performance differences

Opteron vs. Xeon performance differences

From
Bart Grantham
Date:

Forgive me if this has been beaten into the ground, but my team and I couldn’t find much conclusive study or posts on this issue.  To make a long story short: we’re experiencing Xeons as 50% slower than Opterons, even when the Xeon has twice as much cache and a slight clock speed advantage.

The full story: we have an older production server with 2G of RAM, 2.4GHz Opterons w/ 1M of cache.  The database is not large, only around 7M or 8M rows altogether, 2.5G on disk.  Most queries are reads, probably on a 10:1 proportion with writes.  In the process of upgrading this server to a pair of DRBD-mirrored (more on this below) servers we discovered that the new servers were actually slower than the older one.  The newer servers have 4G of RAM, 3.0GHz Xeons with 2M of cache.  And not just a little slower, but queries (simple, complex, and disgusting recursive stored procedures) routinely run in 50-100% more time than they did on the older server.  After many troubleshooting techniques (downgrading the kernel to that of the older machine, verifying version parity, copying the binary from the older server, building a 32bit binary on the new servers, running the entire database out of a ramdisk, and of course much tweaking of postgresql.conf) and seeing virtually no benefit from any of these tests I finally took the final leap: just pull the disks and throw them in a newer Opteron chassis (2.8GHz, 1M cache).  And whaddya know?  It’s got a 20% speed edge on the older Opteron, and blows away the performance of the newer Xeons.

One of my guys did some testing and it appears that LWLockAcquire and LWLockRelease are the culprits, but we’re not entirely confident of our conclusion.  Any thoughts on why this might be so different between the two architectures?  We’re a hosting provider so we’ve got some spare equipment to work with and I’m going to request that we keep these two boxes up for a week or so.  Are there any other tests that you guys can suggest that would help get to the bottom of this?  I figure that not everyone has access to as much gear as we do so it might be a good opportunity to get some A/B testing on a production database on identical OS/server installs on different hardware.  I’m content to just say “Well, we use Opterons then!”, but I imagine that if we could help bring equal performance to Xeon users that it would be worth the effort of volunteering.  To be clear, I have two machines sitting on the network ready for tweaking: one is a Xeon, the other is an Opteron; neither is in production and both can be fully mangled in the interest of figuring this out.

Speaking of being a hosting provider, I may as well take a moment to point out that we are working with DRBD for mirroring and have found it works beautifully with PG (MySQL as well).  Also, while our “Managed Database Service” product is geared around MySQL, Oracle, and MSSQL, we’re pretty familiar with PG and would be happy to talk to anyone about hosting needs they may have.

Thanks for listening, and again please let me know if there is further testing we can do to help get to the bottom of this Opteron/Xeon performance discrepancy.

Bart Grantham

VP of R&D

Logicworks, Inc.

www.logicworks.net

Re: Opteron vs. Xeon performance differences

From
"Scott Marlowe"
Date:
On Thu, Oct 9, 2008 at 3:34 PM, Bart Grantham <bg@logicworks.net> wrote:
> Forgive me if this has been beaten into the ground, but my team and I
> couldn't find much conclusive study or posts on this issue.  To make a long
> story short: we're experiencing Xeons as 50% slower than Opterons, even when
> the Xeon has twice as much cache and a slight clock speed advantage.

I'm not sure what causes this issue either, although I suspect it's
the inter-CPU / CPU to memory communication speeds that make the
difference.  It seems that as the number of CPUs increase, the opteron
lead increases over the xeon.

Re: Opteron vs. Xeon performance differences

From
"postgres Emanuel CALVO FRANCO"
Date:
How do you manage the WAL on both servers?
Is the kernel version the same on both?
Do they run the same services?
Have you run a test with only PostgreSQL running on both servers?
If the problem is inter-CPU communication, I know you can specify the
number of processors you want to dedicate to one process.
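
On Linux, one way to try this is to restrict a process to a single CPU so
that inter-CPU traffic is taken out of the picture. The sketch below is only
an illustration: it assumes a Linux host, uses Python's os.sched_setaffinity,
and the helper name pin_to_one_cpu is made up for this example.

```python
import os


def pin_to_one_cpu(pid: int = 0) -> int:
    """Restrict a process (0 means the caller) to one CPU; return that CPU."""
    allowed = os.sched_getaffinity(pid)   # CPUs the process may currently use
    cpu = min(allowed)                    # pick the lowest-numbered one
    os.sched_setaffinity(pid, {cpu})      # Linux-only call
    return cpu


if __name__ == "__main__":
    chosen = pin_to_one_cpu()
    print(f"now restricted to CPU {chosen}: {os.sched_getaffinity(0)}")
```

Processes started afterwards from the pinned process inherit the same CPU
mask, so a benchmark client launched from it stays on that core. Pinning
the PostgreSQL backends themselves would mean passing their PIDs instead
of 0.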


2008/10/10 Scott Marlowe <scott.marlowe@gmail.com>:
> On Thu, Oct 9, 2008 at 3:34 PM, Bart Grantham <bg@logicworks.net> wrote:
>> Forgive me if this has been beaten into the ground, but my team and I
>> couldn't find much conclusive study or posts on this issue.  To make a long
>> story short: we're experiencing Xeons as 50% slower than Opterons, even when
>> the Xeon has twice as much cache and a slight clock speed advantage.
>
> I'm not sure what causes this issue either, although I suspect it's
> the inter-CPU / CPU to memory communication speeds that make the
> difference.  It seems that as the number of CPUs increase, the opteron
> lead increases over the xeon.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: Opteron vs. Xeon performance differences

From
"Matthew T. O'Connor"
Date:
Bart Grantham wrote:
> Forgive me if this has been beaten into the ground, but my team and I
> couldn’t find much conclusive study or posts on this issue.  To make a
> long story short: we’re experiencing Xeons as 50% slower than Opterons,
> even when the Xeon has twice as much cache and a slight clock speed
> advantage.

Simple question: do you know that the query plans are the same?  And I
don't think you said conclusively that it's the same version of PostgreSQL
on both servers?
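
A rough way to check the plans, assuming the EXPLAIN output for the same
query has been saved as text from each server: strip the cost annotations
(which can differ with statistics) and diff what is left. The function
names and the sample plans below are hypothetical, not output from the
machines in this thread.

```python
import difflib
import re
import textwrap

# Matches annotations such as "(cost=0.00..35.50 rows=2550 width=4)".
_ANNOTATION = re.compile(r"\s*\((?:cost|actual time)=[^)]*\)")


def plan_shape(explain_text: str) -> list[str]:
    """Drop cost annotations so only the plan structure remains."""
    text = textwrap.dedent(explain_text).strip()
    return [_ANNOTATION.sub("", line).rstrip() for line in text.splitlines()]


def diff_plans(plan_a: str, plan_b: str) -> list[str]:
    """Unified diff of the two plan shapes; an empty list means identical."""
    return list(
        difflib.unified_diff(
            plan_shape(plan_a), plan_shape(plan_b),
            fromfile="opteron", tofile="xeon", lineterm="",
        )
    )


if __name__ == "__main__":
    # Made-up plans for illustration only.
    opteron = """
    Seq Scan on orders  (cost=0.00..35.50 rows=2550 width=4)
      Filter: (customer_id = 42)
    """
    xeon = """
    Index Scan using orders_customer_idx on orders  (cost=0.00..8.27 rows=1 width=4)
      Index Cond: (customer_id = 42)
    """
    for line in diff_plans(opteron, xeon):
        print(line)
```

An empty diff means the planner chose the same strategy on both machines,
which would point away from configuration or statistics differences.
Running SELECT version(); on each server settles the version question.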

Re: Opteron vs. Xeon performance differences

From
Greg Smith
Date:
On Thu, 9 Oct 2008, Bart Grantham wrote:

> The full story: we have an older production server with 2G of RAM,
> 2.4GHz Opterons w/ 1M of cache...The newer servers have 4G of RAM,
> 3.0GHz Xeons with 2M of cache.

Model numbers please?  I can probably guess for the Opterons, but there are
a lot of different implementations lumped under the Xeon brand name.

Have you compared how fast the RAM is in the two systems?  We were just
talking about a similar unexpected performance difference yesterday on
another list:
http://archives.postgresql.org/pgsql-performance/2008-10/msg00051.php

I'd be curious what memtest86+ and the simple hdparm -T benchmark say
about the two servers.  If those numbers correlate with the performance
difference you're seeing, the PostgreSQL code might have nothing to do
with it.  I've seen a 60% performance difference just between the best and
worst RAM I tried on a single motherboard recently.
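
For a quick cross-check that needs nothing beyond a Python interpreter, a
large in-memory copy can be timed on both machines. This is a minimal
sketch, not a substitute for memtest86+ or hdparm -T: the function name and
buffer size are arbitrary choices, and the absolute figure depends on the
Python build, so only the ratio between the two servers is meaningful.

```python
import time


def copy_bandwidth_mb_s(size_mb: int = 256, repeats: int = 5) -> float:
    """Return the best observed memory-copy rate in MB/s.

    The buffer is far larger than a 1M or 2M CPU cache, so the copy has to
    go through main memory rather than being served from cache.
    """
    src = bytearray(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)              # reads all of src, writes a full copy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)     # keep the fastest run to reduce noise
        del dst
    return size_mb / best


if __name__ == "__main__":
    print(f"{copy_bandwidth_mb_s():.0f} MB/s copied")
```

If the Opteron-to-Xeon ratio from this script lands near the ratio seen in
query times, that supports the idea that memory speed, not PostgreSQL, is
the limiting factor.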

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Opteron vs. Xeon performance differences

From
Shane Ambler
Date:
Bart Grantham wrote:
> a long story short: we're experiencing Xeons as 50% slower than
> Opterons, even when the Xeon has twice as much cache and a slight
> clock speed advantage.

> tests I finally took the final leap: just pull the disks and throw
> them in a newer Opteron chassis (2.8GHz, 1M cache).  And whaddya
> know?  It's got a 20% speed edge on the older Opteron, and blows away
> the performance of the newer Xeons.

But is the difference in cpu or disk?

Do the two machines get a similar disk transfer rate?

Same raid card and disks in both machines, do they get the same MB/Sec?
(as opposed to on-board controllers)



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: Opteron vs. Xeon performance differences

From
"postgres Emanuel CALVO FRANCO"
Date:
When I asked about the WAL, I meant whether the WAL is on a separate drive.

You need to run a more CPU-intensive benchmark to draw a conclusion.
Write a query that takes more than 8 seconds; then you can really see
whether there is a difference.

On the other hand, I think you aren't using the same image from the old
server on the new one.
In that case it could be a kernel configuration issue.

Have you tested the hardware itself rather than Postgres? If the hardware
gives you better numbers, then Postgres has the problem.

2008/10/10 Shane Ambler <pgsql@sheeky.biz>:
> Bart Grantham wrote:
>>
>> a long story short: we're experiencing Xeons as 50% slower than
>> Opterons, even when the Xeon has twice as much cache and a slight
>> clock speed advantage.
>
>> tests I finally took the final leap: just pull the disks and throw
>> them in a newer Opteron chassis (2.8GHz, 1M cache).  And whaddya
>> know?  It's got a 20% speed edge on the older Opteron, and blows away
>> the performance of the newer Xeons.
>
> But is the difference in cpu or disk?
>
> Do the two machines get a similar disk transfer rate?
>
> Same raid card and disks in both machines, do they get the same MB/Sec?
> (as opposed to on-board controllers)
>
>
>
> --
>
> Shane Ambler
> pgSQL (at) Sheeky (dot) Biz
>
> Get Sheeky @ http://Sheeky.Biz

Re: Opteron vs. Xeon performance differences

From
Greg Smith
Date:
On Fri, 10 Oct 2008, Bart Grantham wrote:

> The Opterons are 2220 SE's, the Xeons are 5450's I think (family "15", model "6").
>
> Xeon - 3056 MB in  2.00 seconds = 1527.85 MB/sec
> Opteron - 4944 MB in  2.00 seconds = 2472.50 MB/sec

There's something wrong with that Xeon system.  That number should be
twice that, and your Xeon should be smoking those Opterons by 25% or so on
benchmarks.  My Q6600 system at home has a slower bus and clock speed than
your Xeon, but hits 3891MB/s on cached hdparm even with the slowest of the
RAM I have here.  Now that I got the first round right, can I make a
double or nothing bet that your Xeon system is either a) not running your
RAM in dual-channel mode or b) getting throttled by power management?
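
The ratios behind those figures can be worked out directly from the numbers
quoted above. The last line rests on an assumption that is not established
in this thread: that query time scales inversely with the cached-read rate.

```python
# Cached-read rates quoted in this thread (hdparm -T), in MB/s.
xeon = 1527.85
opteron = 2472.50
q6600 = 3891.0   # the Q6600 desktop mentioned for comparison

opteron_vs_xeon = opteron / xeon
q6600_vs_xeon = q6600 / xeon
doubled_xeon = 2 * xeon   # "should be twice that"

print(f"Opteron / Xeon: {opteron_vs_xeon:.2f}x")   # prints 1.62x
print(f"Q6600 / Xeon:   {q6600_vs_xeon:.2f}x")     # prints 2.55x
print(f"Twice the Xeon figure: {doubled_xeon:.0f} MB/s")

# Assumption: a purely memory-bound query takes time inversely
# proportional to this rate.
extra_time_pct = (opteron_vs_xeon - 1) * 100
print(f"Extra time on Xeon if memory-bound: {extra_time_pct:.0f}%")  # prints 62%
```

Under that assumption the Xeon would need roughly 62% more time, which
falls inside the 50-100% slowdown reported for the queries.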

> Should I cross post to pgsql-performance?  Or are most of the people on
> that list here, too?

That would have been a better place to start at, but don't bother
switching now--there's a lot of overlap.  Cross-posting to the lists here
is bad, partly because then replies by people who only belong to one of
the two end up bugging the list admins.  One of these days I'm going to
summarize the main lore on this topic into a Wiki article anyway, which
will pull the good stuff out of here regardless of the originating list.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD