Re: Opinions on Raid

From: mark@mark.mielke.cc
Subject: Re: Opinions on Raid
Date:
Msg-id: 20070227172523.GA16230@mark.mielke.cc
In response to: Re: Opinions on Raid  (Ron)
List: pgsql-performance

Hope you don't mind, Ron. This might be splitting hairs.

On Tue, Feb 27, 2007 at 11:05:39AM -0500, Ron wrote:
> The real CPU overhead when using SW RAID is when using any form of SW
> RAID that does XOR operations as part of writes (RAID 5, 6, 50, ...,
> etc).  At that point, you are essentially hammering on the CPU just
> as hard as you would on a dedicated RAID controller... ...and the
> dedicated RAID controller probably has custom HW helping it do this
> sort of thing more efficiently.
> That being said, SW RAID 5 in this sort of scenario can be reasonable
> if you =dedicate= a CPU core to it.  So in such a system, your "n"
> core box is essentially a "n-1" core box because you have to lock a
> core to doing nothing but RAID management.

I have an issue with the above explanation. XOR is cheap. It's one of
the cheapest CPU instructions available. Even with high bandwidth, the
CPU should always be able to XOR very fast.
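For what it's worth, the parity "math" really is nothing more than a byte-wise XOR across the stripe. A minimal sketch (the three-disk layout, block contents, and function name here are invented for illustration):

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks -- the entirety of RAID 5 parity."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Hypothetical 3-disk stripe: two data blocks plus their parity.
d0 = b"hello, raid five"
d1 = b"xor parity block"
parity = xor_blocks(d0, d1)

# Lose d0; the surviving block and the parity rebuild it.
assert xor_blocks(parity, d1) == d0
```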

This leads me to believe that the RAID 5 problem is getting the data
ready to XOR. Under regular load, the L1/L2 cache is never large
enough to hold multiple stripes of data, and the system may not have
the blocks in RAM at all. Reading the missing blocks from RAM shows up
as CPU load; reading them from disk shows up as system load.
Dedicating a core to RAID 5 focuses on the CPU - which I believe is
mostly idle, waiting on memory reads. Dedicating a core reduces the
impact but can't eliminate it, and the cost of a whole core sitting
mostly idle waiting for memory reads is high. Also, any reads
scheduled by this core will affect the memory bandwidth/latency seen
by the other cores.
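The read-modify-write cycle behind a RAID 5 small write makes that cost structure concrete: updating a single data block needs the old data and the old parity in hand before the cheap XOR can run, and fetching those two blocks is exactly the memory or disk traffic described above. A sketch, with made-up single-byte "blocks" for brevity:

```python
# Hypothetical 3-disk stripe: data blocks d0, d1 and their parity.
d0, d1 = 0x5A, 0xC3
parity = d0 ^ d1

# Overwrite d0. The XOR is one instruction; the expensive part is that
# the old d0 and the old parity must first be read from RAM or disk.
new_d0 = 0x7E
parity = parity ^ d0 ^ new_d0   # read-modify-write parity update
d0 = new_d0

# The updated parity still protects the stripe.
assert parity == d0 ^ d1
```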

Hardware RAID 5 solves this by using its own memory modules - like a
video card using its own memory modules. The hardware RAID can read
from its own memory or disk all day and not affect system performance.
Hopefully it has plenty of memory dedicated to holding the most
frequently required blocks.

> SW RAID 5 etc in usage scenarios involving far more reads than writes
> and light write loads can work quite well even if you don't dedicate
> a core to RAID management, but you must be careful about workloads
> that are, or that contain parts that are, examples of the first
> scenario I gave.  If you have any doubts about whether you are doing
> too many writes, dedicate a core to RAID stuff as in the first scenario.

I found software RAID 5 to suck badly enough that I only use it for
backups now. Linux didn't seem to bother with read-ahead, or to hold
blocks in memory for long, preferring to read and then write. It was
awful. RAID 5 doesn't seem like a good option even in hardware; the
controller just masks the issues behind a black box (dedicated
hardware). The issues still exist.

Most of my system is RAID 1+0 now. I have it broken up: rarely read or
written files (long-term storage) on RAID 5, the main system data on
RAID 1+0, the main system on RAID 1, and a larger build partition on
RAID 0. For a crappy server in my basement, I'm very happy with my
software RAID performance now. :-)

Cheers,
mark

--
 /  /      __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


