Thread: raid array seek performance

raid array seek performance

From: Samuel Gendler
Date:
I'm just beginning the process of benchmarking and tuning a new server, something I haven't really done before.  I'm using Greg's book as a guide.  I started with bonnie++ (1.96) and immediately got anomalous results (I think).

Hardware is as follows:

2x quad core xeon 5504 2.0Ghz, 2x4MB cache
192GB DDR3 1066 RAM
24x600GB 15K rpm SAS drives
adaptec 52445 controller

The default config, being tested at the moment, has 2 volumes, one 100GB and one 3.2TB, both are built from a stripe across all 24 disks, rather than splitting some spindles out for one volume and another set for the other volume.  At the moment, I'm only testing against the single 3.2TB volume.

The smaller volume is partitioned into /boot (ext2 and tiny) and / (ext4 and 91GB).  The larger volume is mounted as xfs with the following options (cribbed from an email to the list earlier this week, I think): logbufs=8,noatime,nodiratime,nobarrier,inode64,allocsize=16m
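
For reference, the fstab entry for the big volume looks roughly like this (the device path and mount point below are just placeholders, not my actual ones):

# /etc/fstab entry for the large XFS volume (device and mount point are examples)
/dev/sdb1  /data  xfs  logbufs=8,noatime,nodiratime,nobarrier,inode64,allocsize=16m  0  0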

Bonnie++ delivered the expected huge throughput for sequential read and write.  It seems in line with other benchmarks I found online.  However, we are only seeing 180 seeks/sec, which seems quite low.  I'm hoping someone can confirm that and, hopefully, suggest how to track down the problem if there is one.
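
For reference, the run was along these lines (the directory and user below are placeholders; bonnie++ defaults to a test file of 2x RAM, which is presumably where the 379G comes from):

# bonnie++ run against the big XFS volume; -u is required when running as root
bonnie++ -d /data/bonnie -u postgres -m newbox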

Results are as follows:

1.96,1.96,newbox,1,1315935572,379G,,1561,99,552277,46,363872,34,3005,90,981924,49,179.1,56,16,,,,,19107,69,+++++,+++,20006,69,19571,72,+++++,+++,20336,63,7111us,10666ms,14067ms,65528us,592ms,170ms,949us,107us,160us,383us,31us,130us


Version      1.96   ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
newzonedb.z1.p 379G  1561  99 552277  46 363872  34  3005  90 981924  49 179.1  56
Latency              7111us   10666ms   14067ms   65528us     592ms     170ms
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
newbox            16 19107  69 +++++ +++ 20006  69 19571  72 +++++ +++ 20336  63
Latency               949us     107us     160us     383us      31us     130us

Also, my inclination is to default to the following volume layout:

2 disks in RAID 1 for system
4 disks in RAID 10 for WAL (xfs)
18 disks in RAID 10 for data (xfs)

Use case is minimal OLTP traffic plus a fair amount of data-warehouse-style traffic: low connection count, queries over sizeable fact tables (100s of millions of rows) partitioned over time, and insert-only data loading via COPY, plus some tables populated via aggregation queries over other tables.  Based on the performance of our current hardware, I'm not concerned about handling the data-loading load with the 4-drive RAID 10 volume, so the emphasis is on warehouse query speed.  I'm not best pleased by the 2.0GHz CPUs in that context, but I wasn't given a choice on the hardware.

Any comments on that proposal are welcome.  I've got only a week to settle on a config and ready the box for production, so the number of iterations I can go through is limited.


Re: raid array seek performance

From: Samuel Gendler
Date:


On Tue, Sep 13, 2011 at 12:13 PM, Samuel Gendler <sgendler@ideasculptor.com> wrote:

My seek times increase when I reduce the size of the file, which isn't surprising, since once everything fits into cache, seeks aren't dependent on mechanical movement.  However, I am seeing lots of bonnie++ results on Google which appear to be for a file size of 2x RAM and which show numbers closer to 1000 seeks/sec (compared to my 180).  Usually I see a 16GB file on 8GB hosts.  So what is an acceptable random seeks/sec number for a file that is 2x memory?  And does file size make a difference independent of available RAM, such that the enormous 379GB file created on my host is skewing the results to the low end?
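
For anyone wanting to compare apples to apples with those results, forcing a particular file-size-to-RAM ratio looks something like this (the sizes below are just the 16GB/8GB example, given in MB; -s sets the test file size and -r tells bonnie++ how much RAM to assume):

# pretend the machine has 8GB of RAM and use a 16GB test file (values in MB)
bonnie++ -d /data/bonnie -u postgres -r 8192 -s 16384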


Re: raid array seek performance

From: Greg Smith
Date:
On 09/13/2011 03:13 PM, Samuel Gendler wrote:
> Bonnie++ delivered the expected huge throughput for sequential read
> and write.  It seems in line with other benchmarks I found online.
>  However, we are only seeing 180 seeks/sec, which seems quite low.

I wouldn't worry about that if the sequential rates are good.  The
bonnie++ seeks test has been giving me increasingly useless results
recently on modern hardware.  And bonnie++ 1.96 continues to give me
enough weird values that I'm still using 1.03e as my standard version.

If you want to get a useful measurement of seeks/second, set up
pgbench-tools with a SELECT-only test, and create a database that's 2 to
4X as big as RAM.  The TPS result you get from that is a much more
useful number for real-world seeks than this.
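
If you want a quick sanity check of that number without the full pgbench-tools setup, a raw pgbench run of roughly the same shape would look like this (the scale math is approximate: one scale unit is about 15MB of data, so a scale around 26000 gives you close to 2x your 192GB of RAM):

createdb seektest
# scale 26000 is roughly 380GB of pgbench data, about 2x RAM on this box
pgbench -i -s 26000 seektest
# SELECT-only workload; with the cache cold, TPS here roughly tracks random reads/second
pgbench -S -c 16 -T 300 seektest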

I'm working on a tool to directly benchmark seek performance in a way
that's useful for what people really want to know nowadays.  That's
going live to the world at the end of the month, at #PgWest:
http://pgwest2011.sched.org/event/875b87d8d237bef3a53ab27ac9c8057c

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Re: raid array seek performance

From: Merlin Moncure
Date:
On Wed, Sep 14, 2011 at 2:44 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> If you want to get a useful measurement of seeks/second, set up pgbench-tools
> with a SELECT-only test, and create a database that's 2 to 4X as big as RAM.
>  The TPS result you get from that is a much more useful number for
> real-world seeks than this.

A thought on that note: it sure would be nice if you could define the
scaling factor in terms of data size instead of in linear multiples of
100,000 rows, something like:

pgbench -i -x 64gb
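
In the meantime the conversion is easy enough to do by hand or in a wrapper script; one scale unit is 100,000 accounts rows, which comes out to roughly 15MB on disk, so something like this gets close (the 15MB-per-unit figure is approximate, and the database name is just a placeholder):

# rough conversion from a target size in GB to a pgbench scale factor (~15MB per unit)
SIZE_GB=64
SCALE=$(( SIZE_GB * 1024 / 15 ))
pgbench -i -s $SCALE bench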

merlin