Thread: RAID vs. Single Big SCSI Disk
We have three databases for our scientific research and are getting close to filling our 12 Gig partition. My boss thinks that just getting a really big (i.e. > 30 Gig) SCSI drive will be cheaper and should do nicely. Currently, we only have 4 people accessing the database and usually only have 1-2 jobs (e.g. selects, updates, etc.) going at any one time (probably a high estimate). The db sits on a Pentium II/400 MHz with RedHat 6.0.

Other than mirroring, are there any other advantages (e.g. speed, cost) of just getting a RAID controller over, say, a 73 Gig Ultra SCSI Cheetah drive (which costs in the neighborhood of $1300)?

Also, can Postgres handle being spread over several disks? I'd think that the RAID must control disk spanning, but just want to make sure that Postgres would be compatible.

Thanks
-Tony Reina
"G. Anthony Reina" wrote:
> We have three databases for our scientific research and are getting
> close to filling our 12 Gig partition. My boss thinks that just getting
> a really big (i.e. > 30 Gig) SCSI drive will be cheaper and should do
> nicely. Currently, we only have 4 people accessing the database and
> usually only have 1-2 jobs (e.g. selects, updates, etc.) going at any
> one time (probably a high estimate). The db sits on a Pentium II/400 MHz
> with RedHat 6.0.
>
> Other than mirroring, are there any other advantages (e.g. speed, cost)
> of just getting a RAID controller over, say, a 73 Gig Ultra SCSI Cheetah
> drive (which costs in the neighborhood of $1300)?

It sounds like you would be much better off with an Ultra ATA 66 software or hardware RAID solution. Maxtor 40 GB ATA100 disks can be had for $100 each. Alone they operate near 20 MB/sec, and in a striped two-disk RAID they can do 30-40 MB/sec, probably faster than your Cheetah configuration for a fraction of the cost. 3ware makes a hardware RAID controller that would get you to 40 MB/sec with two, or 70 MB/sec with four, of these disks in RAID 0. With four disks in RAID 01 you can mirror and still get near 40 MB/sec. The 3ware solution also relieves your CPU of the usual ATA overhead.

> Also, can Postgres handle being spread over several disks? I'd think
> that the RAID must control disk spanning, but just want to make sure
> that Postgres would be compatible.

That is transparent: Postgres just sees the array as a single disk.
On Thu, Dec 07, 2000 at 06:24:20PM -0800, G. Anthony Reina wrote:
> We have three databases for our scientific research and are getting
> close to filling our 12 Gig partition. My boss thinks that just getting
> a really big (i.e. > 30 Gig) SCSI drive will be cheaper and should do
> nicely. Currently, we only have 4 people accessing the database and
> usually only have 1-2 jobs (e.g. selects, updates, etc.) going at any
> one time (probably a high estimate). The db sits on a Pentium II/400 MHz
> with RedHat 6.0.
>
> Other than mirroring, are there any other advantages (e.g. speed, cost)
> of just getting a RAID controller over, say, a 73 Gig Ultra SCSI Cheetah
> drive (which costs in the neighborhood of $1300)?

A RAID can be both faster and more reliable than a single disk. I say "can" because not all RAID configurations will be.

> Also, can Postgres handle being spread over several disks? I'd think
> that the RAID must control disk spanning, but just want to make sure
> that Postgres would be compatible.

You can spread the data over several disks by moving some of the files and creating symlinks, or by using striping (software or hardware) or concatenation (software or hardware). Your alternatives for software concatenation/striping will depend on your OS.

Software RAID will be neither as fast (because there is no cache and because it uses your main CPU instead of a dedicated one) nor as reliable (because there is no battery backup for updates) as hardware RAID.

--
Ragnar Kjørstad
Big Storage
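The symlink approach mentioned above can be sketched in a few lines. This is only an illustration under assumptions: the function name and all paths are hypothetical, it knows nothing about how Postgres lays out its data directory, and the database server would have to be shut down before any of its files were moved.

```python
import os
import shutil


def relocate_with_symlink(src, dest_dir):
    """Move the file at src into dest_dir and leave a symlink at the old path.

    Hypothetical helper: dest_dir would sit on the second disk. Programs
    that open the old path follow the symlink to the new location.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(src))
    if os.path.lexists(dest):
        raise FileExistsError(dest)  # never overwrite an existing file
    shutil.move(src, dest)           # copies and deletes if dest is on another filesystem
    os.symlink(os.path.abspath(dest), src)
    return dest
```

A call such as `relocate_with_symlink("/data/big_table_file", "/mnt/disk2/pgdata")` (both paths invented for the example) would leave `/data/big_table_file` as a symlink pointing at the copy on the second disk.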
On Tue, Dec 12, 2000 at 11:20:08AM -0500, bob@bob.usuhs.mil wrote:
> > We have three databases for our scientific research and are getting
> > close to filling our 12 Gig partition. My boss thinks that just getting
> > a really big (i.e. > 30 Gig) SCSI drive will be cheaper and should do
> > nicely. Currently, we only have 4 people accessing the database and
> > usually only have 1-2 jobs (e.g. selects, updates, etc.) going at any
> > one time (probably a high estimate). The db sits on a Pentium II/400 MHz
> > with RedHat 6.0.
> >
> > Other than mirroring, are there any other advantages (e.g. speed, cost)
> > of just getting a RAID controller over, say, a 73 Gig Ultra SCSI Cheetah
> > drive (which costs in the neighborhood of $1300)?
>
> It sounds like you would be much better off with an Ultra ATA 66
> software or hardware RAID solution. Maxtor 40 GB ATA100 disks
> can be had for $100 each. Alone they operate near 20 MB/sec
> and in a striped 2 disk RAID they can do 30-40 MB/sec, probably
> faster than your Cheetah configuration for a fraction of the cost.
> 3ware makes a hardware RAID controller that would get you to
> 40 MB/sec with two, or 70 MB/sec with four of these disks in RAID 0.
> With four disks in RAID 01 you can mirror and still get near 40 MB/sec.
> The 3ware solution also relieves your CPU of the usual ATA overhead.

There is more to a disk than just theoretical throughput:

* SCSI disks support tagged command queueing (TCQ). This means a disk can work on multiple requests at the same time, which means less pause between requests and optimal request ordering.
* Disk cache: Seagate Cheetah disks come with up to 16 MB of cache, which makes a big difference for performance. (Hardware RAID controllers come with several hundred MB of cache.)
* Seek time
* Remapping bad blocks

These are not strictly IDE vs SCSI issues, but SCSI disks are usually intended for the higher-end market, so they usually score better than the average IDE disk.
Some IDE RAID controllers "fix" some of these problems by having internal cache on the controller and using a SCSI interface with TCQ to connect to the host, but as far as I know, 3ware does not.

Another thing is that some IDE disks (Maxtor is one of them, I think) come with write-back cache enabled, without any battery backup. This means that when you write to disk, the disk only puts the data in its cache and writes it to the platters later. This improves performance, but it will kill your data if your application relies on write ordering. In other words, if your system crashes while writing to disk, it's likely that your database will be corrupt when it boots up again!

OK, no attempt to start another SCSI vs IDE flamewar :-)

--
Ragnar Kjørstad
Big Storage
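The write-ordering problem described above can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of any real drive or of how Postgres writes its files: `ToyDisk`, `commit` and `is_consistent` are invented names, and the "database" is reduced to one data block plus one commit marker that must never reach the platter before the data does.

```python
class ToyDisk:
    """Toy disk: 'platter' holds durable blocks, 'cache' holds pending writes."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.platter = {}
        self.cache = {}

    def write(self, block, data):
        if self.write_back:
            self.cache[block] = data    # acknowledged before it is durable
        else:
            self.platter[block] = data  # write-through: durable on return

    def destage(self, blocks):
        """The drive moves the given cached blocks to the platter, in its own order."""
        for block in blocks:
            if block in self.cache:
                self.platter[block] = self.cache.pop(block)

    def crash(self):
        """Power loss: whatever is still in the volatile cache is gone."""
        self.cache.clear()


def commit(disk):
    """Write the data block first, then the commit marker that refers to it."""
    disk.write("data", "new row")
    disk.write("commit", "data is valid")


def is_consistent(disk):
    """A commit marker on the platter without its data block means corruption."""
    return "commit" not in disk.platter or "data" in disk.platter


if __name__ == "__main__":
    risky = ToyDisk(write_back=True)
    commit(risky)
    risky.destage(["commit"])    # drive happens to write the marker first
    risky.crash()
    print(is_consistent(risky))  # False

    safe = ToyDisk(write_back=False)
    commit(safe)
    safe.crash()
    print(is_consistent(safe))   # True
```

The application issued its writes in the right order both times; only the disk that acknowledged writes from a volatile cache ended up with a commit marker pointing at data that never reached the platter.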