Re: Disk Benchmarking Question - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Disk Benchmarking Question
Msg-id CAOR=d=0=rSYFqjH0fKofpquzm1L8t3oaQnDZ2fGj4YJmN0mYdg@mail.gmail.com
In response to Disk Benchmarking Question  (Dave Stibrany <dstibrany@gmail.com>)
Responses Re: Disk Benchmarking Question
List pgsql-performance
On Thu, Mar 17, 2016 at 2:45 PM, Dave Stibrany <dstibrany@gmail.com> wrote:
> I'm pretty new to benchmarking hard disks and I'm looking for some advice on
> interpreting the results of some basic tests.
>
> The server is:
> - Dell PowerEdge R430
> - 1 x Intel Xeon E5-2620 2.4GHz
> - 32 GB RAM
> - 4 x 600GB 10k SAS Seagate ST600MM0088 in RAID 10
> - PERC H730P Raid Controller with 2GB cache in write back mode.
>
> The OS is Ubuntu 14.04, I'm using LVM and I have an ext4 volume for /, and
> an xfs volume for PGDATA.
>
> I ran some dd and bonnie++ tests and I'm a bit confused by the numbers. I
> ran 'bonnie++ -n0 -f' on the root volume.
>
> Here's a link to the bonnie test results
> https://www.dropbox.com/s/pwe2g5ht9fpjl2j/bonnie.today.html?dl=0
>
> The vendor stats say sustained throughput of 215 to 108 MBps, so I guess I'd
> expect around 400-800 MBps read and 200-400 MBps write. In any case, I'm
> pretty confused as to why the read and write sequential speeds are almost
> identical. Does this look wrong?

For future reference, it's good to include the data you linked to in
your post, as in 2, 5 or 10 years the postgresql discussion archives
will still be here but your dropbox may or may not, and then people
won't know what numbers you are referring to.

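Given the size of your bonnie test set and the fact that you're using
RAID-10, the cache should make little or no difference. The RAID
controller may or may not interleave reads across all four drives.
Some do, some don't. It looks to me like yours doesn't, i.e. when
reading it's not reading all 4 disks at once, but just 2, one from each
mirrored pair. Writes also only get two disks' worth of throughput
(each write goes to both halves of a pair), which would explain why
your sequential read and write speeds come out almost identical.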
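
As a rough back-of-the-envelope check, here is a sketch of what the
two read behaviors would predict for a 4-disk RAID-10. It assumes
ideal linear scaling and uses the 108 to 215 MB/s per-disk vendor
range quoted above; the function name and parameters are made up for
illustration and real arrays will typically land below these figures.

```python
def raid10_sequential_estimate(per_disk_mb_s, n_disks=4, interleaved_reads=False):
    """Idealized sequential throughput (MB/s) for a RAID-10 array.

    Assumption: throughput scales linearly with the number of disks
    that serve distinct data at the same time.
    """
    pairs = n_disks // 2
    # Writes go to both disks of a pair, so only `pairs` disks' worth
    # of unique data is written at once.
    write = per_disk_mb_s * pairs
    # Reads use every disk only if the controller interleaves across
    # both halves of each mirror; otherwise one disk per pair.
    read_disks = n_disks if interleaved_reads else pairs
    read = per_disk_mb_s * read_disks
    return {"read": read, "write": write}


for rate in (108, 215):  # vendor's inner-track and outer-track figures
    print(rate,
          raid10_sequential_estimate(rate),
          raid10_sequential_estimate(rate, interleaved_reads=True))
```

With one disk per pair serving reads, the read and write estimates
collapse to the same 216 to 430 MB/s range; only an interleaving
controller would get you toward the 400 to 800 MB/s you expected.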

But the important question here is what kind of workload are you
looking at throwing at this server? If it's going to be a reporting
database you may get as good or better read performance from RAID-5 as
RAID-10, especially if you add more drives. If you're looking at
transactional use then, as Mike suggested, SSDs might be your best
choice.

We run some big transactional dbs at work that are 4 to 6 TB, and for
those we use ten 800GB SSDs in RAID-5 with the RAID controller cache
turned off. We can hit ~18k tps in pgbench on ~100GB test sets. With
the cache on we drop to 3 to 5k tps. With a 512MB cache we overwrite
the whole cache every couple of seconds, so it just gets in the way.
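
To put a rough number on the cache turnover, here is a minimal sketch.
The 256 MB/s sustained write rate is an assumed figure, picked only
because it makes a 512MB cache turn over in about two seconds; it is
not a measured value, and the function name is hypothetical.

```python
def cache_turnover_seconds(cache_mb, write_mb_s):
    """Seconds for sustained writes to overwrite a cache of cache_mb."""
    if write_mb_s <= 0:
        raise ValueError("write rate must be positive")
    return cache_mb / write_mb_s


# Assumed sustained write rate of 256 MB/s against a 512MB cache.
print(cache_turnover_seconds(512, 256))
```

Once the SSDs can absorb writes faster than the cache can usefully
buffer them, the cache is just another hop in the write path.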

SSDs win hands down if you need random access speed. It's like a
Stanley Steamer (spinners) versus a Bugatti Veyron (SSDs).

For sequential throughput, like on a reporting server, spinners often
do alright, as long as there are only one or two processes accessing
your data at a time. As soon as you have more concurrent accesses than
you have RAID-10 pairs, your performance will drop off noticeably.

