On Mon, Mar 3, 2008 at 8:48 AM, Mark Mielke <mark@mark.mielke.cc> wrote:
> Matthew wrote:
> > On Sat, 1 Mar 2008, Craig James wrote:
> >> Right, I do understand that, but reliability is not a top priority in
> >> this system. The database will be replicated, and can be reproduced
> >> from the raw data.
> >
> > So what you're saying is:
> >
> > 1. Reliability is not important.
> > 2. There's zero write traffic once the database is set up.
> >
> > If this is true, then RAID-0 is the way to go. I think Greg's options
> > are good. Either:
> >
> > 2 discs RAID 1: OS
> > 6 discs RAID 0: database + WAL
> >
> > which is what we're using here (except with more discs), or:
> >
> > 8 discs RAID 10: everything
>
> Has anybody been able to prove to themselves that RAID 0 vs RAID 1+0 is
> faster for these sorts of loads? My understanding is that RAID 1+0 *can*
> reduce latency for reads, but that it relies on random access, whereas
> RAID 0 performs best for sequential scans? Does PostgreSQL ever do
> enough random access to make RAID 1+0 shine?
RAID 1+0 has certain theoretical advantages in parallel access
scenarios that straight RAID-0 wouldn't have. I.e., if you used n>2
disks in each mirror and built a RAID-0 out of those mirrors, then you
could theoretically have n users reading data on the same "drive"
(the RAID-1 underneath the RAID-0) at the same time, where plain
RAID-0 would only have the one disk to read from. The effects of this
advantage are dulled by caching, depending on how much of the data set
you can cache. With a system that can cache its whole data set in
memory (not uncommon for transactional systems), or at least a large
percentage of it, the n>2 RAID-1 sets aren't that big of an advantage.
RAID-0 of n drives should behave pretty similarly to RAID-10 with 2n
drives for most types of access, i.e. no better or worse for
sequential or random access, as long as the number of stripe members
(n) is the same.
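
To put rough numbers on the caching argument, here is a toy model, not a
benchmark. The function name, the even spread of reads over the stripe
members, and the "one read in flight per disk" rule are all my own
simplifying assumptions; real controllers and the OS scheduler will
behave differently.

```python
def mirror_benefit(users, stripes, mirrors, cache_hit_ratio):
    """Toy estimate of how much faster concurrent reads get when each
    stripe member is a `mirrors`-way RAID-1 instead of a single disk.

    Assumptions (hypothetical, for illustration only):
      - every user has one read outstanding at a time,
      - reads that miss the cache spread evenly over the stripe members,
      - each physical disk serves one read at a time.
    Returns the speed-up factor relative to plain RAID-0 (1.0 = no gain).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")

    # Only cache misses ever reach the disks.
    disk_reads_per_member = users * (1.0 - cache_hit_ratio) / stripes

    # A mirrored member can serve one read per disk in the mirror.
    served_mirrored = min(disk_reads_per_member, float(mirrors))
    # A plain RAID-0 member is a single disk, so at most one read.
    served_plain = min(disk_reads_per_member, 1.0)

    if served_plain == 0.0:
        return 1.0  # everything came from cache, the disks are idle
    return served_mirrored / served_plain


# 12 readers, 3 stripe members, 2-way mirrors, nothing cached:
print(mirror_benefit(12, 3, 2, 0.0))   # 2.0
# Same setup with 90% of reads served from cache:
print(mirror_benefit(12, 3, 2, 0.9))   # 1.0
```

With nothing cached, the mirrors double the reads each stripe member
can serve. Once most reads are cache hits, fewer than one read per
member reaches the disks at a time and the extra mirror sits idle,
which is the "dulled by caching" effect described above.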