Re: Hardware vs Software Raid - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Hardware vs Software Raid
Date
Msg-id b42b73150806261301h6a30ca58y1bbbfac727c9719d@mail.gmail.com
In response to Re: Hardware vs Software Raid  ("Peter T. Breuer" <ptb@inv.it.uc3m.es>)
List pgsql-performance
On Thu, Jun 26, 2008 at 1:03 AM, Peter T. Breuer <ptb@inv.it.uc3m.es> wrote:
> "Also sprach Merlin Moncure:"
>> write back: raid controller can lie to host o/s. when o/s asks
>
> This is not what the linux software raid controller does, then. It
> does not queue requests internally at all, nor ack requests that have
> not already been acked by the components (modulo the fact that one can
> deliberately choose to have a slow component not be sync by allowing
> "write-behind" on it, in which case the "controller" will ack the
> incoming request after one of the components has been serviced,
> without waiting for both).
>
>> integrity and performance.  'write back' caching provides insane burst
>> IOPS (because you are writing to controller cache) and somewhat
>> improved sustained IOPS because the controller is reorganizing writes
>> on the fly in (hopefully) optimal fashion.
>
> This is what is provided by Linux file system and (ordinary) block
> device driver subsystem. It is deliberately eschewed by the soft raid
> driver, because any caching will already have been done above and below
> the driver, either in the FS or in the components.
>
>> > However the lack of extra buffering is really deliberate (double
>> > buffering is a horrible thing in many ways, not least because of the
>>
>> <snip>
>> completely unconvincing.
>
> But true.  Therefore the problem in attaining conviction must be at your
> end.  Double buffering just doubles the resources dedicated to a single
> request, without doing anything for it!  It doubles the frequency with
> which one runs out of resources, it doubles the frequency of the burst
> limit being reached.  It's deadly (deadlockly :) in the situation where

Only if those resources are drawn from the same pool.  You are
oversimplifying a calculation that has many variables, such as cost.
CPUs, for example, keep adding cache levels (L1, L2, L3, etc.).
Also, the different levels of cache have different capabilities.
Only the hardware controller cache is (optionally) allowed to
acknowledge a sync before the data is actually on disk.  In postgresql
terms, we get roughly the same effect with the computer's entire
working memory with fsync disabled...so that we are trusting, rightly
or wrongly, that all writes will eventually make it to disk.  In this
case, the raid controller cache is redundant and only marginally useful.
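
As a rough illustration of what is being traded away (a sketch only, not a
benchmark; the function name, block size, and write count are my own
assumptions), the following Python compares writes that stop at the OS cache
with writes that are each forced out with fsync:

```python
import os
import tempfile
import time


def time_writes(n_writes=200, block=8192, sync_each=False):
    """Write n_writes blocks to a temp file and return elapsed seconds.

    With sync_each=True every write is followed by os.fsync(), which is
    roughly what a database does when it must know the data is durable.
    With sync_each=False the writes only have to reach the OS page cache.
    """
    payload = b"\0" * block
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, payload)
            if sync_each:
                os.fsync(fd)  # ask the OS to push the data to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed


if __name__ == "__main__":
    print("cached writes :", time_writes(sync_each=False))
    print("fsync'd writes:", time_writes(sync_each=True))
```

On a plain disk the fsync'd run is typically far slower.  Behind a write-back
controller cache, or on a RAM-backed temp directory, the gap can shrink, so
treat the numbers as indicative only.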

> the receiving component device also needs resources in order to service
> the request, such as when the transport is network tcp (and I have my
> suspicions about scsi too).
>
>> the overhead of various cache layers is
>> completely minute compared to a full fault to disk that requires a
>> seek which is several orders of magnitude slower.
>
> That's absolutely true when by "overhead" you mean "computation cycles"
> and absolutely false when by overhead you mean "memory resources", as I
> do.  Double buffering is a killer.

Double buffering is most certainly _not_ a killer (or at least, _the_
killer) in practical terms.  Most database systems that do any amount
of writing (that is, interesting databases) are bound by the ability
to randomly read and write to the storage medium, and only that.
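
A back-of-the-envelope calculation shows the scale involved.  The figures
below (8.5 ms average seek, 7200 rpm, 5 GB/s memory copy bandwidth, 8 kB
pages) are assumed round numbers for illustration, not measurements:

```python
def random_io_ms(avg_seek_ms, rpm):
    """Approximate time for one random disk access, in milliseconds.

    Modeled as average seek time plus half a platter rotation.
    """
    half_rotation_ms = 0.5 * 60_000.0 / rpm
    return avg_seek_ms + half_rotation_ms


def disk_iops(avg_seek_ms, rpm):
    """Random I/O operations per second for a single spindle."""
    return 1000.0 / random_io_ms(avg_seek_ms, rpm)


def copy_us(nbytes, bytes_per_sec):
    """Time to copy nbytes through one extra buffer, in microseconds."""
    return nbytes / bytes_per_sec * 1e6


if __name__ == "__main__":
    io_ms = random_io_ms(8.5, 7200)   # assumed drive characteristics
    cp_us = copy_us(8192, 5e9)        # assumed 8 kB page, 5 GB/s copy rate
    print("random access (ms):", io_ms)
    print("IOPS per spindle  :", disk_iops(8.5, 7200))
    print("extra copy (us)   :", cp_us)
    print("ratio             :", io_ms * 1000.0 / cp_us)
```

Under these assumptions one spindle manages well under a hundred random
operations per second, while an extra in-memory copy of a page costs a few
microseconds.  Note this only covers time, not the memory footprint Peter is
talking about.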

This is why raid controllers come with a relatively small amount of
cache...there are diminishing returns from reorganizing writes.  This
is also why up-and-coming storage technologies (like flash) are so
interesting.  Disk drives have made only marginal improvements in
speed since the early '80s.
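
To see why a bigger reordering window buys less and less, here is a toy model
(my own simplification: block addresses are uniform on [0, 1), the head visits
each batch in sorted order, and cost is just head travel).  None of this
reflects any particular controller's firmware:

```python
import random


def head_travel(requests, window):
    """Total head movement when requests are reordered in batches.

    Each consecutive group of `window` requests is sorted by position
    and serviced in that order, starting from wherever the head was
    left by the previous batch.  window=1 means no reordering at all.
    """
    head = 0.0
    travel = 0.0
    for i in range(0, len(requests), window):
        for pos in sorted(requests[i:i + window]):
            travel += abs(pos - head)
            head = pos
    return travel


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    reqs = [rng.random() for _ in range(4096)]
    for window in (1, 4, 16, 64, 256):
        per_request = head_travel(reqs, window) / len(reqs)
        print(window, round(per_request, 4))
```

In this model the per-request travel falls roughly in proportion to
1/window, so each doubling of the cache saves less absolute head movement
than the one before.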

>> The linux software raid algorithms are highly optimized, and run on a
>
> I can confidently tell you that that's balderdash both as a Linux author

I'm just saying here that there is little to no CPU overhead for using
software raid on modern hardware.

> believe that on ARM mere copying takes so much cpu time that one wants
> to avoid it at all costs, whereas on intel it's a forgettable trivium).

This is a database list.  The main area of interest is in dealing with
server class hardware.

merlin
