Re: Xeon twice the performance of opteron - Mailing list pgsql-performance

From mark
Subject Re: Xeon twice the performance of opteron
Date
Msg-id 005e01cbe50b$2f6fad00$8e4f0700$@com
Whole thread Raw
In response to Xeon twice the performance of opteron  (Jeff <threshar@torgo.978.org>)
List pgsql-performance

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-
> owner@postgresql.org] On Behalf Of Jeff
> Sent: Thursday, March 17, 2011 9:14 AM
> To: pgsql-performance@postgresql.org
> Cc: Brian Ristuccia
> Subject: [PERFORM] Xeon twice the performance of opteron
>
> hey folks,
>
> Running into some odd performance issues between a few of our db
> boxes.  While trying to speed up a query I ran it on another box and
> it was twice as fast.  The plans are identical and various portions of
> the query run in the same amount of time - it all boils down to most
> of the time being spent in a join filter.  The plan is as good as it
> is going to get but the thing that is concerning me, which hopefully
> some folks here may have some insight on, is the very large difference
> in runtime.
>
> three boxes:
>     A: Intel(R) Xeon(R) CPU           E5345  @ 2.33GHz  (Runs query
> fastest)
>         4MB cache
>     B: Quad-Core AMD Opteron(tm) Processor 2352 (2.1GHZ) (Main
> production
> box, currently, middle speed)
>         512k cache
>     C: Quad-Core AMD Opteron(tm) Processor 2378 (2.4GHZ)
>         512k cache
>
> A & B are running PG 8.4.2 (yes, I know it desperately need to be
> upgraded). C was also on 8.4.2 and since it was not in production I
> upgraded it to 8.4.7 and got the same performance as 8.4.2.  Dataset
> on A & B is the same C is mostly the same, but is missing a couple
> weeks of data (but since this query runs over 3 years of data, it is
> negligable - plus C runs the slowest!)
>
> All three running FC10 with kernel Linux db06
> 2.6.27.19-170.2.35.fc10.x86_64 #1 SMP Mon Feb 23 13:00:23 EST 2009
> x86_64 x86_64 x86_64 GNU/Linux
>
> Load is very low on each box. The query is running from shared_buffers
> - no real IO is occuring.
>
> The average timing for the query in question is 90ms on A,  180ms on B
> and 190ms on C.
>
> Now here's where some odd stuff starts piling up: explain analyze
> overhead on said queries:
> 20ms on A, 50ms on B and 85ms on C(!!)
>
> We had one thought about potential NUMA issues, but doing a series
> (100) of connect, query, disconnect and looking at the timings reveals
> them all to be solid... but even still we wouldn't expect it to be
> that awful.  The smaller cache of the opterons is also a valid
> argument.
>
> I know we're running an old kernel, I'm tempted to upgrade to see what
> will happen, but at the same time I'm afraid it'll upgrade to a kernel
> with a broken [insert major subsystem here] which has happened before.
>
> Anybody have some insight into this or run into this before?
>
> btw, little more background on the query:
>
>     ->  Nested Loop  (cost=5.87..2763.69 rows=9943 width=0) (actual
> time=0.571..2
> 74.750 rows=766 loops=1)
>           Join Filter: (ce.eventdate >= (md.date - '6 days'::interval))
>           ->  Nested Loop  (cost=5.87..1717.98 rows=27 width=8)
> (actual time=0.53
> 3..8.301 rows=159 loops=1)
>         [stuff removed here]
>          ->  Index Scan using  xxxxxxx_date_idx on xxxxxx md
> (cost=0.00..19.50 rows=1099 width=8) (actual time=0.023..0.729
> rows=951 loops=15
> 9)
>                 Index Cond: (ce.eventdate <= md.date)
>
>
> On all three boxes that inner nestloop completes in about the same
> amount of time - it is that join filter that is causing the pain and
> agony. (If you are noticing the timing differences, that is because
> the numbers above are the actual numbers, not explain analyze).  The
> query is pulling up a rolling window of events that occured on a
> specific date. This query pulls up al the data for a period of time.
> ce.eventdate is indexed, and is used in the outer nestloop.  Thinking
> more about what is going on cache thrashing is certainly a possibility.
>
> the amazing explain analyze overhead is also very curious - we all
> know it adds overhead, but 85ms?  Yow.
>
> --
> Jeff Trout <jeff@jefftrout.com>
> http://www.stuarthamm.net/
> http://www.dellsmartexitin.com/

I am sure you might have already checked for this, but just incase...
Did you verify that no power savings stuff is turned on in the BIOS or at
the kernel ?

I have to set ours to something HP calls static high performance or
something like that if I want boxes that are normally pretty idle to execute
in a predictable fashion for sub second queries.

I assume you checked with a steam benchmark results on the AMD machines to
make sure they are getting in the ballpark of where they are supposed to ?







pgsql-performance by date:

Previous
From: Andy Colson
Date:
Subject: Re: Fastest pq_restore?
Next
From: Scott Marlowe
Date:
Subject: Re: Xeon twice the performance of opteron