Re: With 4 disks should I go for RAID 5 or RAID 10 - Mailing list pgsql-performance
From | Mark Mielke |
---|---|
Subject | Re: With 4 disks should I go for RAID 5 or RAID 10 |
Date | |
Msg-id | 47732BCB.2090302@mark.mielke.cc |
In response to | Re: With 4 disks should I go for RAID 5 or RAID 10 (Shane Ambler <pgsql@Sheeky.Biz>) |
Responses | Re: With 4 disks should I go for RAID 5 or RAID 10 |
List | pgsql-performance |
Shane Ambler wrote:

> So in theory a modern RAID 1 setup can be configured to get similar
> read speeds as RAID 0 but would still drop to single disk speeds (or
> similar) when writing, but RAID 0 can get the faster write performance.

Unfortunately, it's a bit more complicated than that. RAID 1 has a sequential read problem, as read-ahead is wasted, and you may as well read from one disk and ignore the others. RAID 1 does, however, allow for much greater concurrency. 4 processes on a 4 disk RAID 1 system can, theoretically, each do whatever they want, without impacting each other.

Database loads involving a single active read user will see greater performance with RAID 0. Database loads involving multiple concurrent active read users will see greater performance with RAID 1.

All of these assume writes are not being performed to any significant degree. Even single writes cause all disks in a RAID 1 system to synchronize, temporarily eliminating the read benefit. RAID 0 allows some degree of concurrent reads and writes occurring at the same time (assuming even distribution of the data across the devices). Of course, RAID 0 systems have an expected life that decreases as the number of disks in the system increases.

So, this is where we get to RAID 1+0. Redundancy, good read performance, good write performance, relatively simple implementation. For the mere cost of double the amount of disk storage, you can get around the problems of RAID 1 and the problems of RAID 0. :-)

> So in a perfect setup (probably 1+0) 4x 300MB/s SATA drives could
> deliver 1200MB/s of data to RAM, which is also assuming that all 4
> channels have their own data path to RAM and aren't sharing.
> (anyone know how segregated the on board controllers such as these are?)
> (do some pci controllers offer better throughput?)
> We all know that doesn't happen in the real world ;-) Let's say we are
> restricted to 80% - 1000MB/s - and some of that (10%) gets used by the
> system - so we end up with 900MB/s delivered off disk to postgres -
> that would still be more than the perfect rate at which 2x 300MB/s
> drives can deliver.

I expect you would have to have good hardware, and a well-tuned system, to see 80%+ of theoretical for common workloads. But then, this isn't unique to RAID. Even in a single disk system, one has trouble achieving 80%+ of theoretical. :-)

I achieve something closer to +20% to +60% over the theoretical performance of a single disk with my four disk RAID 1+0 partitions. Lots of compromises in my system though that I won't get into. For me, I value the redundancy, allowing for a single disk to fail and giving me time to easily recover, but for the cost of two more disks, I am able to counter the performance cost of redundancy, and actually see a positive performance effect instead.

> So in this situation - if configured correctly with a good controller
> (driver for software RAID etc) a single 4 disk RAID 1+0 could
> outperform two 2 disk RAID 1 setups with data/OS+WAL split between the
> two.
> Is the real world speeds so different that this theory is real fantasy
> or has hardware reached a point performance wise where this is close
> to fact??

I think it depends on the balance. If every second operation requires a WAL write, having separate arrays might make sense. However, if the balance is less than even, one would end up with one of the 2 disk RAID 1 setups being more idle than the other. It's not an exact science when it comes to the various compromises being made. :-)

If you can only put 4 disks into the system (either because of cost, or because of the system size), I would suggest RAID 1+0 on all four as a sensible compromise. If you can put more in, start to consider breaking it up. :-)

Cheers,
mark

--
Mark Mielke <mark@mielke.cc>
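
For anyone who wants to replay the arithmetic in the quoted estimate, here is a minimal sketch. The 300MB/s per-drive rate, the 80% real-world efficiency and the 10% system overhead are the figures from the quoted text; the function and constant names are only illustrative. It uses integer percentages to avoid floating-point rounding, and it does not round 960 up to 1000 as the quoted text does, so it lands on 864MB/s rather than 900MB/s. This is back-of-envelope arithmetic, not a measurement of any real array.

```python
# Back-of-envelope read throughput, using the figures quoted in the thread.
# All names here are illustrative; nothing below measures real hardware.

PER_DISK_MB_S = 300  # per-drive rate assumed in the quoted estimate


def deliverable_mb_s(disks, efficiency_pct=80, overhead_pct=10):
    """Ideal aggregate rate, discounted for real-world efficiency and
    then for the share consumed by the rest of the system."""
    ideal = disks * PER_DISK_MB_S                   # e.g. 4 * 300 = 1200
    after_efficiency = ideal * efficiency_pct // 100
    return after_efficiency * (100 - overhead_pct) // 100


four_disk_discounted = deliverable_mb_s(4)  # 1200 -> 960 -> 864
two_disk_ideal = 2 * PER_DISK_MB_S          # 600, with no discount at all

print(four_disk_discounted, two_disk_ideal)
```

Even with both discounts applied, the four disk figure stays above the undiscounted two disk figure, which is the comparison the quoted text is making. Whether a real system gets anywhere near 80% is a separate question, as noted above.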