Re: Hardware vs Software Raid - Mailing list pgsql-performance
From | Peter T. Breuer |
---|---|
Subject | Re: Hardware vs Software Raid |
Date | |
Msg-id | 200806260503.m5Q53cA24979@inv.it.uc3m.es |
In response to | Re: Hardware vs Software RAID ("Merlin Moncure" <mmoncure@gmail.com>) |
Responses | Re: Hardware vs Software Raid, Re: Hardware vs Software Raid, Re: Hardware vs Software Raid |
List | pgsql-performance |
"Also sprach Merlin Moncure:" > write back: raid controller can lie to host o/s. when o/s asks This is not what the linux software raid controller does, then. It does not queue requests internally at all, nor ack requests that have not already been acked by the components (modulo the fact that one can deliberately choose to have a slow component not be sync by allowing "write-behind" on it, in which case the "controller" will ack the incoming request after one of the compionents has been serviced, without waiting for both). > integrity and performance. 'write back' caching provides insane burst > IOPS (because you are writing to controller cache) and somewhat > improved sustained IOPS because the controller is reorganizing writes > on the fly in (hopefully) optimal fashion. This is what is provided by Linux file system and (ordinary) block device driver subsystem. It is deliberately eschewed by the soft raid driver, because any caching will already have been done above and below the driver, either in the FS or in the components. > > However the lack of extra buffering is really deliberate (double > > buffering is a horrible thing in many ways, not least because of the > > <snip> > completely unconvincing. But true. Therefore the problem in attaining conviction must be at your end. Double buffering just doubles the resources dedicated to a single request, without doing anything for it! It doubles the frequency with which one runs out of resources, it doubles the frequency of the burst limit being reached. It's deadly (deadlockly :) in the situation where the receiving component device also needs resources in order to service the request, such as when the transport is network tcp (and I have my suspicions about scsi too). > the overhead of various cache layers is > completely minute compared to a full fault to disk that requires a > seek which is several orders of magnitude slower. That's aboslutely true when by "overhead" you mean "computation cycles" and absolutely false when by overhead you mean "memory resources", as I do. Double buffering is a killer. > The linux software raid algorithms are highly optimized, and run on a I can confidently tell you that that's balderdash both as a Linux author and as a software RAID linux author (check the attributions in the kernel source, or look up something like "Raiding the Noosphere" on google). > presumably (much faster) cpu than what the controller supports. > However, there is still some extra oomph you can get out of letting > the raid controller do what the software raid can't...namely delay > sync for a time. There are several design problems left in software raid in the linux kernel. One of them is the need for extra memory to dispatch requests with and as (i.e. buffer heads and buffers, both). bhs should be OK since the small cache per device won't be exceeded while the raid driver itself serialises requests, which is essentially the case (it does not do any buffering, queuing, whatever .. and tries hard to avoid doing so). The need for extra buffers for the data is a problem. On different platforms different aspects of that problem are important (would you believe that on ARM mere copying takes so much cpu time that one wants to avoid it at all costs, whereas on intel it's a forgettable trivium). I also wouldn't aboslutely swear that request ordering is maintained under ordinary circumstances. But of course we try. Peter