Re: PowerEdge 2950 questions - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: PowerEdge 2950 questions
Date
Msg-id 1156450941.7223.83.camel@state.g2switchworks.com
In response to Re: PowerEdge 2950 questions  ("Merlin Moncure" <mmoncure@gmail.com>)
List pgsql-performance
On Thu, 2006-08-24 at 15:03, Merlin Moncure wrote:
> On 8/24/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> > On Thu, 2006-08-24 at 13:57, Merlin Moncure wrote:
> > > On 8/24/06, Jeff Davis <pgsql@j-davis.com> wrote:
> > > > On Thu, 2006-08-24 at 09:21 -0400, Merlin Moncure wrote:
> > > > > On 8/22/06, Jeff Davis <pgsql@j-davis.com> wrote:
> > > > > > On Tue, 2006-08-22 at 17:56 -0400, Bucky Jordan wrote:
> > > > > it's not the parity, it's the seeking.  Raid 5 gives you great
> > > > > sequential i/o but random is often not much better than a single
> > > > > drive.  Actually it's the '1' in raid 10 that plays the biggest role
> > > > > in optimizing seeks on an ideal raid controller.  Calculating parity
> > > > > was boring 20 years ago as it involves one of the fastest operations in
> > > > > computing, namely xor. :)
> > > >
> > > > Here's the explanation I got: If you do a write on RAID 5 to something
> > > > that is not in the RAID controllers cache, it needs to do a read first
> > > > in order to properly recalculate the parity for the write.
> > >
> > > it's worse than that.  if you need to read something that is not in
> > > the o/s cache, all the disks except for one need to be sent to a
> > > physical location in order to get the data.
> >
> > Ummmm.  No.  Not in my experience.  If you need to read something that's
> > significantly larger than your stripe size, then yes, you'd need to do
> > that.  With typical RAID 5 stripe sizes of 64k to 256k, you could read 8
> > to 32 PostgreSQL 8k blocks from a single disk before having to move the
> > heads on the next disk to get the next part of data.  A RAID 5, being
> > read, acts much like a RAID 0 with n-1 disks.
>
> i just don't see raid 5 benchmarks backing that up. i know how it is
> supposed to work on paper, but all of the raid 5 systems I work with
> deliver lousy seek performance.  here is an example from the mysql
> folks:
> http://peter-zaitsev.livejournal.com/14415.html
> and another:
> http://storageadvisors.adaptec.com/2005/10/13/raid-5-pining-for-the-fjords/

Well, I've seen VERY good numbers out of RAID 5 arrays.  As long as I
wasn't writing to them.  :)

Trust me though, I'm no huge fan of RAID 5.
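
The stripe-size arithmetic quoted above can be checked with a short
sketch. It assumes the quoted "stripe size" means the per-disk chunk,
uses the default 8k PostgreSQL block, and ignores parity rotation, so
it is an illustration and not a model of any particular controller.
The function names are made up for this example.

```python
# Back-of-the-envelope sketch using the figures quoted in the thread.
PG_BLOCK_KB = 8  # default PostgreSQL block size, in KB


def blocks_per_chunk(chunk_kb, block_kb=PG_BLOCK_KB):
    """Number of database blocks that fit in one per-disk chunk."""
    return chunk_kb // block_kb


def disk_for_block(block_no, chunk_kb, data_disks, block_kb=PG_BLOCK_KB):
    """0-based data-disk slot holding a logical block, RAID 0 style.

    Simplification (assumption): parity rotation is ignored, so this only
    shows how many consecutive blocks stay on one spindle.
    """
    return (block_no // blocks_per_chunk(chunk_kb, block_kb)) % data_disks


if __name__ == "__main__":
    for chunk_kb in (64, 128, 256):
        print(chunk_kb, "KB chunk holds", blocks_per_chunk(chunk_kb), "blocks")
```

With a 64k chunk, logical blocks 0 through 7 map to the same disk, so a
single 8k random read should only need to move one set of heads.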

> > It's the writes that kill performance, since you've got to read two
> > disks and write two disks for every write, at a minimum.  This is why
> > small RAID 5 arrays bottleneck so quickly.  a 4 disk RAID 5 with two
> > writing threads is likely already starting to thrash.
> >
> > Or did you mean something else by that?
>
> well, that's correct, my point was that a 4 disk raid 1 can deliver
> more seeks, not necessarily that it is better.  as you say writes
> would kill performance. raid 10 seems to be a good compromise.  so is
> raid 6 possibly, although i don't see a lot of performance data on that.

Yeah, I think RAID 10, in this modern day of large, inexpensive hard
drives, is the way to go for most transactional / heavily written
systems.
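
The "read two disks, write two disks" penalty quoted above is the
usual read-modify-write path for a small RAID 5 update. Here is a
minimal sketch, assuming XOR parity over equal-sized blocks. The
helper names are hypothetical, and a real controller may take a
different path (for example, a full-stripe write needs no reads).

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write(old_data, old_parity, new_data):
    """Read-modify-write update of one data block.

    Returns (new_parity, reads, writes). The two reads are the old data
    and old parity; the two writes are the new data and new parity.
    """
    new_parity = xor_blocks(old_parity, old_data, new_data)
    return new_parity, 2, 2


# Three data blocks plus one parity block, i.e. a 4 disk RAID 5 stripe.
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(*stripe)

new_block = b"\xff\x00"
new_parity, reads, writes = small_write(stripe[1], parity, new_block)
stripe[1] = new_block

# The shortcut parity matches a full recomputation over the stripe.
assert new_parity == xor_blocks(*stripe)
```

The XOR itself is cheap. The cost is the four disk operations behind
one logical write.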

I'm not sure RAID-6 is worth the effort.  For smaller arrays (4 to 6 drives),
you've got about as many "extra" drives as in RAID 1+0.  And that old
read twice write twice penalty becomes read twice (or is that thrice???)
and write thrice.  So, you'd chew up your iface bandwidth quicker.
Although in SAS / SATA I guess that part's not a big deal, the data has
to be moved around somewhere on the card / in the controller chips, so
it's still a problem somewhere waiting to happen in terms of bandwidth.
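
To put rough numbers on the comparison above, here is a small sketch
of redundancy overhead and small-write I/O counts. The counts assume
the read-modify-write path (for RAID 6: read old data, P and Q, then
write all three), which is the commonly cited figure. Controllers
with large caches or full-stripe writes can do better, so treat the
table as an assumption, not a measurement.

```python
# (reads, writes) per small logical write, read-modify-write path assumed.
SMALL_WRITE_IO = {
    "raid10": (0, 2),  # write both mirrors, no reads needed
    "raid5": (2, 2),   # read old data + parity, write new data + parity
    "raid6": (3, 3),   # read old data + P + Q, write new data + P + Q
}


def redundancy_drives(level, n_drives):
    """Drives' worth of capacity spent on redundancy for a given level."""
    if level == "raid10":
        return n_drives // 2
    if level == "raid5":
        return 1
    if level == "raid6":
        return 2
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for n in (4, 6):
        for level, (reads, writes) in SMALL_WRITE_IO.items():
            extra = redundancy_drives(level, n)
            print(f"{n} drives {level}: {extra} extra, "
                  f"{reads} reads + {writes} writes per small write")
```

At 4 drives RAID 6 and RAID 10 both give up two drives to redundancy;
at 6 drives RAID 10 gives up three to RAID 6's two, but RAID 6 pays
six disk operations per small write against two for RAID 10.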
