Bill Moran wrote:
> In order to recalculate the parity, it has to have data from all disks. Thus,
> if you have 4 disks, it has to read 2 (the unknown data blocks included in
> the parity calculation) then write 2 (the new data block and the new
> parity data). Caching can help some, but if your data ends up being any
> size at all, the cache misses become more frequent than the hits. Even
> when caching helps, your max speed is still only the speed of a single
> disk.
>
If you have 4 disks, it can do either:
1) Read the old block, read the parity block, XOR the old block with
the parity block and the new block resulting in the new parity block,
write both the new parity block and the new block.
2) Read the two unknown blocks, XOR with the new block resulting in
the new parity block, write both the new parity block and the new block.
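To make the two options concrete, here is a minimal Python sketch, assuming byte-wise XOR parity over equal-sized blocks. The function names and block values are made up for illustration and are not taken from any real RAID implementation. It only shows that both options arrive at the same new parity block:

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_read_modify_write(old_block: bytes, old_parity: bytes,
                             new_block: bytes) -> bytes:
    # Option 1: read the old block and the old parity block (two reads,
    # regardless of how many disks are in the stripe).
    return xor_blocks(old_block, old_parity, new_block)


def parity_reconstruct_write(other_blocks: list, new_block: bytes) -> bytes:
    # Option 2: read every other data block in the stripe and XOR them
    # with the new block.
    return xor_blocks(new_block, *other_blocks)


# Hypothetical 4-disk stripe: three data blocks plus one parity block.
d0, d1, d2 = b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"
parity = xor_blocks(d0, d1, d2)
new_d0 = b"\xc3\x3c"

p1 = parity_read_modify_write(d0, parity, new_d0)
p2 = parity_reconstruct_write([d1, d2], new_d0)
assert p1 == p2 == xor_blocks(new_d0, d1, d2)
```

Option 1 works because XORing the old block into the old parity cancels that block's contribution, leaving the XOR of the untouched blocks.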
You are emphasizing option 2, but the scenario is also overly simplistic.
Imagine you had 10 drives on RAID 5. Would it make more sense to read 8
blocks and then write two (option 2, the one you describe), or read
two blocks and then write two (option 1)? Obviously, if either option
can be satisfied from cache, it is better to not read at all.
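The read counts can be tallied with a rough sketch like the one below. It assumes a single-block write to a stripe with one parity block and nothing in cache, and the function name is hypothetical:

```python
def reads_needed(n_disks: int) -> dict:
    """Disk reads needed to update one data block in a RAID 5 stripe.

    Assumes one parity block per stripe, a single-block write, and no
    cache hits. Both options then write two blocks (new data, new parity).
    """
    if n_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return {
        "option_1": 2,            # old data block + old parity block
        "option_2": n_disks - 2,  # every other data block in the stripe
    }


for n in (4, 10):
    print(n, reads_needed(n))
```

With 4 disks both options cost two reads, which is why the 4-disk example hides the difference. With 10 disks, option 2 costs eight reads against two for option 1.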
I note that you also disagree with Dave, in that you are not claiming it
performs consistency checks on read. No system does this, as performance
would go to the crapper.
Cheers,
mark
--
Mark Mielke <mark@mielke.cc>