Sunfire X4500 recommendations - Mailing list pgsql-performance

From Matt Smiley
Subject Sunfire X4500 recommendations
Responses Re: Sunfire X4500 recommendations
List pgsql-performance
My company is purchasing a Sunfire X4500 to run our most I/O-bound databases, and I'd like to get some advice on
configuration and tuning.  We're currently looking at:
 - Solaris 10 + zfs + RAID Z
 - CentOS 4 + xfs + RAID 10
 - CentOS 4 + ext3 + RAID 10
but we're open to other suggestions.

From previous message threads, it looks like some of you have achieved stellar performance under both Solaris 10 U2/U3
with zfs and CentOS 4.4 with xfs.  Would those of you who posted such results please describe how you tuned the OS/fs to
yield those figures (e.g. patches, special drivers, read-ahead, checksumming, write-through cache settings, etc.)?

Most of our servers currently run CentOS/RedHat, and we have little experience with Solaris, but we're not opposed to
Solaris if there's a compelling reason to switch.  For example, it sounds like zfs snapshots may have a lighter
performance penalty than LVM snapshots.  We've heard that just using LVM (even without active snapshots) imposes a
maximum sequential I/O rate of around 600 MB/s (although we haven't yet reached this limit experimentally).

By the way, we've also heard that Solaris is "more stable" under heavy I/O load than Linux.  Have any of you
experienced this?  It's hard to put much stock in such a blanket statement, but naturally we don't want to introduce

Thanks in advance for your thoughts!

For reference:

Our database cluster will be 3-6 TB in size.  The Postgres installation will be 8.1 (at least initially), compiled to
use 32 KB blocks (rather than 8 KB).  The workload will be predominantly OLAP.  The Sunfire X4500 has 2 dual-core
Opterons, 16 GB RAM, 48 SATA disks (500 GB/disk * 48 = 24 TB raw -> 12 TB usable under RAID 10).
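
The capacity figures above are simple arithmetic.  A minimal sketch in shell, assuming RAID 10 here means two-way
mirroring (so usable space is half of raw) and ignoring filesystem overhead and hot spares; the function name is made up
for illustration:

```shell
#!/bin/sh
# Hypothetical helper: raw and RAID 10 usable capacity, in GB.
# Assumes two-way mirrors, so usable space is half of raw; ignores
# filesystem overhead and any disks held back as hot spares.
raid10_capacity() {
    disks=$1
    gb_per_disk=$2
    raw=$((disks * gb_per_disk))
    usable=$((raw / 2))
    echo "raw=${raw}GB usable=${usable}GB"
}

raid10_capacity 48 500   # prints: raw=24000GB usable=12000GB
```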

So far, we've seen the X4500 deliver impressive but suboptimal results using the out-of-the-box installation of Solaris
+ zfs.  The Linux testing is in the early stages (no xfs, yet), but so far it yields comparatively modest write rates
and very poor read and rewrite rates.

Results under Solaris with zfs:

Four concurrent writers:
% time dd if=/dev/zero of=/zpool1/test/50GB-zero1 bs=1024k count=51200 ; time sync
% time dd if=/dev/zero of=/zpool1/test/50GB-zero2 bs=1024k count=51200 ; time sync
% time dd if=/dev/zero of=/zpool1/test/50GB-zero3 bs=1024k count=51200 ; time sync
% time dd if=/dev/zero of=/zpool1/test/50GB-zero4 bs=1024k count=51200 ; time sync

Seq Write (bs = 1 MB):  128 + 122 + 131 + 124 = 505 MB/s

Four concurrent readers:
% time dd if=/zpool1/test/50GB-zero1 of=/dev/null bs=1024k
% time dd if=/zpool1/test/50GB-zero2 of=/dev/null bs=1024k
% time dd if=/zpool1/test/50GB-zero3 of=/dev/null bs=1024k
% time dd if=/zpool1/test/50GB-zero4 of=/dev/null bs=1024k

Seq Read (bs = 1 MB):   181 + 177 + 180 + 178 = 716 MB/s
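
For anyone reproducing the dd figures, here is a rough sketch of how the per-stream and combined rates can be derived.
It assumes each stream's rate is megabytes transferred divided by elapsed wall-clock seconds, and that the combined
figure is the plain sum of the per-stream rates.  Both function names are made up for illustration, and whether the
sync time should be counted in the elapsed time is left to the reader:

```shell
#!/bin/sh
# Hypothetical helpers for turning dd timings into throughput figures.

# MB/s for one stream: megabytes transferred divided by elapsed seconds,
# rounded to the nearest integer.
stream_rate() {
    awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", mb / secs }'
}

# Aggregate throughput: sum of the per-stream rates passed as arguments.
aggregate_rate() {
    total=0
    for r in "$@"; do
        total=$((total + r))
    done
    echo "$total"
}

aggregate_rate 128 122 131 124   # four writers: prints 505
aggregate_rate 181 177 180 178   # four readers: prints 716
```

With bs=1024k and count=51200, each writer transfers 51200 MB, so stream_rate 51200 <elapsed seconds> gives the
per-stream number.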

One bonnie++ process:
% bonnie++ -r 16384 -s 32g:32k -f -n0 -d /zpool1/test/bonnie_scratch

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine   Size:chnk K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
thumper1    32G:32k           604173  98 268893  43           543389  59 519.2   3

4 concurrent synchronized bonnie++ processes:
% bonnie++ -p4
% bonnie++ -r 16384 -s 32g:32k -y -f -n0 -d /zpool1/test/bonnie_scratch
% bonnie++ -r 16384 -s 32g:32k -y -f -n0 -d /zpool1/test/bonnie_scratch
% bonnie++ -r 16384 -s 32g:32k -y -f -n0 -d /zpool1/test/bonnie_scratch
% bonnie++ -r 16384 -s 32g:32k -y -f -n0 -d /zpool1/test/bonnie_scratch
% bonnie++ -p-1

Combined results of 4 sessions:
Seq Output:   124 + 124 + 124 + 140 = 512 MB/s
Rewrite:       93 +  94 +  93 +  96 = 376 MB/s
Seq Input:    192 + 194 + 193 + 197 = 776 MB/s
Random Seek:  327 + 327 + 335 + 332 = 1321 seeks/s

Results under CentOS 4 with ext3 and LVM:

% bonnie++ -s 32g:32k -f -n0 -d /large_lvm_stripe/test/bonnie_scratch
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine   Size:chnk K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
thumper1.rt 32G:32k           346595  94  59448  11           132471  12 479.4   2

Summary of bonnie++ results:

                           sequential  sequential    sequential  scattered
Test case                  write MB/s  rewrite MB/s  read MB/s   seeks/s
-------------------------  ----------  ------------  ----------  ---------
Sol10+zfs, 1 process              604           269         543        519
Sol10+zfs, 4 processes            512           376         776       1321
Cent4+ext3+LVM, 1 process         347            59         132        479
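
The MB/s columns in this summary come from the K/sec columns of the bonnie++ output.  A small sketch of the conversion,
assuming 1 MB = 1000 K and rounding to the nearest integer (that convention reproduces the table, but it is an
assumption; the function name is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: convert a bonnie++ K/sec figure to MB/s.
# Assumes 1 MB = 1000 K and rounds to the nearest integer.
kps_to_mbps() {
    awk -v k="$1" 'BEGIN { printf "%.0f\n", k / 1000 }'
}

kps_to_mbps 604173   # Sol10+zfs, 1 process, write:   prints 604
kps_to_mbps 268893   # Sol10+zfs, 1 process, rewrite: prints 269
kps_to_mbps 346595   # Cent4+ext3+LVM, write:         prints 347
```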
