Re: With 4 disks should I go for RAID 5 or RAID 10 - Mailing list pgsql-performance

From Shane Ambler
Subject Re: With 4 disks should I go for RAID 5 or RAID 10
Date
Msg-id 47731D32.3040403@Sheeky.Biz
In response to Re: With 4 disks should I go for RAID 5 or RAID 10  ("Fernando Hevia" <fhevia@ip-tel.com.ar>)
Responses Re: With 4 disks should I go for RAID 5 or RAID 10  (Greg Smith <gsmith@gregsmith.com>)
Re: With 4 disks should I go for RAID 5 or RAID 10  (Mark Mielke <mark@mark.mielke.cc>)
List pgsql-performance
Fernando Hevia wrote:

I'll start a little ways back first -

> Well, here rises another doubt. Should I go for a single RAID 1+0 storing OS
> + Data + WAL files or will I be better off with two RAID 1 separating data
> from OS + Wal files?

earlier you wrote -
> Database will be about 30 GB in size initially and growing 10 GB per year.
> Data is inserted overnight in two big tables and during the day mostly
> read-only queries are run. Parallelism is rare.

Now if the data is added overnight while no-one is using the server, then
read performance is what you want to optimise, provided any slowdown in
writing doesn't stretch the overnight load so far that it can't finish
before people start using the server again.

So in theory the only time you will gain from having WAL on a separate
disk from the data is at night while the data is loading (I am assuming
this is an automated step).
But *some* gains (how much, I'm not sure) can still be made from having
the OS separate from the data.




(This is for a theoretical discussion challenging the info/rumors that
abound about RAID setups, not to start a bitch fight or flame war.)


So for the guys who know the intricacies of RAID implementation -

I don't have any real world performance measures here.

For a setup that is only reading from disk (Santa sprinkles the data
down the air vent while we are all snug in our beds) -

It has been mentioned that raid drivers/controllers can balance the
workload across the different disks - as Mark mentioned from the FreeBSD
6 man pages - the balance option can be set to
load|prefer|round-robin|split

So in theory a modern RAID 1 setup can be configured to get read speeds
similar to RAID 0, but would still drop to single disk speeds (or
similar) when writing, whereas RAID 0 keeps the faster write performance.
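
To make that comparison concrete, here is a rough sketch of the best-case
model I have in mind. The function name and the scaling rules are my own
simplification for illustration, not taken from any driver or controller
documentation: reads are assumed to balance perfectly across every disk,
and a mirrored write costs one disk's worth of bandwidth per mirror set.

```python
def ideal_throughput(level, disks, per_disk_mb_s):
    """Return (read, write) best-case sequential MB/s for a simple array.

    Hypothetical model: perfect read balancing, no controller or bus limits.
    """
    if level == "raid0":
        # Striping: every disk contributes to both reads and writes.
        return disks * per_disk_mb_s, disks * per_disk_mb_s
    if level == "raid1":
        # Mirroring: reads can be spread over all copies,
        # but every write has to land on every copy.
        return disks * per_disk_mb_s, per_disk_mb_s
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("this model expects an even number of disks, 4 or more")
        # Striped mirror pairs: reads use all disks, writes scale per pair.
        return disks * per_disk_mb_s, (disks // 2) * per_disk_mb_s
    raise ValueError("unknown RAID level: " + level)


for level, disks in (("raid0", 2), ("raid1", 2), ("raid10", 4)):
    read, write = ideal_throughput(level, disks, 300)
    print(level, disks, "disks: read", read, "MB/s, write", write, "MB/s")
```

Real arrays will fall short of these figures; the point is only the
relative shape of read versus write scaling.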

So in a perfect setup (probably 1+0) 4x 300MB/s SATA drives could
deliver 1200MB/s of data to RAM, assuming as well that each of the 4
channels has its own data path to RAM and they aren't sharing one.
(anyone know how segregated on board controllers such as these are?)
(do some PCI controllers offer better throughput?)

We all know that doesn't happen in the real world ;-) Let's say we are
restricted to 80% - roughly 1000MB/s - and some of that (10%) gets used
by the system - so we end up with about 900MB/s delivered off disk to
postgres - that would still be more than the 600MB/s that 2x 300MB/s
drives can deliver at their perfect rate.
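
The back-of-envelope sums above can be checked with a few lines of
Python. This is only a sketch of my guesses: the 80% and 10% figures are
assumptions, not measurements, and the function name is made up for the
example. Worked through without rounding, the result is 960MB/s and then
864MB/s, a little under the round 1000 and 900 I quoted, but the
conclusion is the same.

```python
def delivered_mb_s(drives, per_drive_mb_s, efficiency_pct=80, system_pct=10):
    """Estimate MB/s reaching postgres from an array of identical drives.

    efficiency_pct and system_pct are guessed figures, not measured ones.
    Integer arithmetic keeps the result exact.
    """
    raw = drives * per_drive_mb_s                 # perfect-world total
    usable = raw * efficiency_pct // 100          # real-world restriction
    return usable * (100 - system_pct) // 100     # minus what the system uses


four_disk_estimate = delivered_mb_s(4, 300)
two_disk_perfect = 2 * 300

print("4 disks, degraded estimate:", four_disk_estimate, "MB/s")
print("2 disks, perfect rate:", two_disk_perfect, "MB/s")
print("4-disk array still ahead:", four_disk_estimate > two_disk_perfect)
```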

So in this situation - if configured correctly with a good controller
(driver for software RAID etc) a single 4 disk RAID 1+0 could outperform
two 2 disk RAID 1 setups with data/OS+WAL split between the two.

Are the real world speeds so different that this theory is pure fantasy,
or has hardware reached a point performance-wise where this is close to
fact?



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz
