Thread: best arrangement of 3 disks for (insert) performance
Hi all,

I have some new hardware on the way and would like some advice on how to get the most out of it.

It's a dual Xeon 2.4, 4GB RAM and 3x identical 15k rpm SCSI disks.

Should I mirror 2 of the disks for postgres data and use the 3rd disk for the o/s and the pg logs, or raid5 the 3 disks, or even stripe 2 disks for pg and use the 3rd for o/s, logs, backups?

The machine will be dealing with lots of inserts, basically as many as we can throw at it.

Thanks,
Richard
> the machine will be dealing with lots of inserts, basically as many as we can
> throw at it

If you mean lots of _transactions_ with few inserts per transaction, you should get a RAID controller w/ battery-backed write-back cache. Nothing else will improve your write performance by nearly as much. You could sell the RAM and one of the CPUs to pay for it ;-)

If you have lots of inserts but all in a few transactions then it's not quite so critical.

M
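To put rough numbers on the point above: without a write-back cache, each commit has to wait for its WAL record to reach the platter, so a single connection committing one insert at a time is limited to roughly one commit per disk rotation. The sketch below is back-of-the-envelope arithmetic only. The function names and the batch size of 100 are illustrative assumptions, not measurements from this thread, and real throughput will be lower once other limits kick in.

```python
def max_commits_per_second(rpm: int) -> float:
    """Rough ceiling on synchronous commits per second for one writer.

    Assumes no write-back cache and that every commit forces a flush,
    so at best one commit completes per platter rotation.
    """
    return rpm / 60.0


def max_inserts_per_second(rpm: int, inserts_per_transaction: int) -> float:
    """Insert ceiling when several inserts share a single commit."""
    return max_commits_per_second(rpm) * inserts_per_transaction


if __name__ == "__main__":
    rpm = 15_000  # the 15k rpm SCSI disks from the original post
    # One insert per transaction: bounded by rotations per second.
    print(max_commits_per_second(rpm))
    # 100 inserts per transaction (assumed batch size): same commit rate,
    # but each commit carries 100 rows.
    print(max_inserts_per_second(rpm, 100))
```

For a 15k rpm disk this works out to about 250 commits per second for a single writer, which is why batching inserts into fewer transactions (or letting a battery-backed cache absorb the flushes) matters so much.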
rj@last.fm (Richard Jones) writes:
> I have some new hardware on the way and would like some advice on
> how to get the most out of it..
>
> its a dual xeon 2.4, 4gb ram and 3x identical 15k rpm scsi disks
>
> should i mirror 2 of the disks for postgres data, and use the 3rd
> disk for the o/s and the pg logs or raid5 the 3 disks or even stripe
> 2 disks for pg and use the 3rd for o/s,logs,backups ?
>
> the machine will be dealing with lots of inserts, basically as many
> as we can throw at it

Having WAL on a separate drive from the database would be something of a win. I'd buy that 1 disk for OS+WAL, and then RAID [something] across the other two drives for the database, would be pretty helpful.

After doing some [loose] benchmarking, I found that the VERY best way to improve performance involves a RAID controller with battery-backed cache.

On a box with similar configuration to yours, it took ~3h for a particular set of data to load; on another one with battery-backed cache (and a dozen fast SCSI drives :-)), the same data took as little as 6 minutes to load. The BIG effect seemed to come from the controller.

--
(reverse (concatenate 'string "ofni.smrytrebil" "@" "enworbbc"))
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)
Richard,

> its a dual xeon 2.4, 4gb ram and 3x identical 15k rpm scsi disks
>
> should i mirror 2 of the disks for postgres data, and use the 3rd disk for the
> o/s and the pg logs or raid5 the 3 disks or even stripe 2 disks for pg and
> use the 3rd for o/s,logs,backups ?

I'd mirror 2. Stripey RAID with few disks imposes a heavy performance penalty on data writes (particularly updates), sometimes as much as 50% for a RAID5-3disk config.

I am a little curious why you've got a dual-xeon, but could only afford 3 disks ....

--
-Josh Berkus
Aglio Database Solutions
San Francisco
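As a rough way to compare the three layouts from the original question, the sketch below tabulates textbook figures: usable capacity in units of one disk, the number of physical I/Os behind one small random write, and whether the array survives a single disk failure. The `Layout` class and its field names are hypothetical, and the I/O counts are the standard read-modify-write accounting, not numbers measured by anyone in this thread; real controllers with caching can behave quite differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    usable_disks: float        # database capacity, in units of one disk
    ios_per_small_write: int   # physical I/Os for one small random write
    survives_disk_loss: bool   # does the database array tolerate one failure?


def layouts_for_three_disks() -> list[Layout]:
    """Textbook characteristics of the three candidate layouts."""
    return [
        # Mirror: every write goes to both disks.
        Layout("RAID1 on 2 disks, 3rd disk for o/s + logs", 1.0, 2, True),
        # RAID5: read old data, read old parity, write data, write parity.
        Layout("RAID5 on 3 disks", 2.0, 4, True),
        # Stripe: one write per block, but no redundancy at all.
        Layout("RAID0 on 2 disks, 3rd disk for o/s + logs", 2.0, 1, False),
    ]


if __name__ == "__main__":
    for layout in layouts_for_three_disks():
        print(
            f"{layout.name:45} capacity={layout.usable_disks} "
            f"ios/write={layout.ios_per_small_write} "
            f"redundant={layout.survives_disk_loss}"
        )
```

The table makes the trade-off explicit: the mirror gives up capacity for cheap redundant writes, RAID5 keeps capacity but pays extra I/Os on every small write, and the stripe is fastest per write but loses the database if either disk dies.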
The machine is coming from Dell, and I have the option of a PERC 3/SC RAID Controller (32MB) or software raid.

Does anyone have any experience of this controller? It's an additional £345 for this controller, and I'd be interested to know what people think - my other option is to buy the raid controller separately, which appeals to me, but I wouldn't know what to look for in a raid controller.

That raid controller review site sounds like a good idea :)

Richard.

On Friday 12 September 2003 4:24 pm, Christopher Browne wrote:
> rj@last.fm (Richard Jones) writes:
> > I have some new hardware on the way and would like some advice on
> > how to get the most out of it..
> >
> > its a dual xeon 2.4, 4gb ram and 3x identical 15k rpm scsi disks
> >
> > should i mirror 2 of the disks for postgres data, and use the 3rd
> > disk for the o/s and the pg logs or raid5 the 3 disks or even stripe
> > 2 disks for pg and use the 3rd for o/s,logs,backups ?
> >
> > the machine will be dealing with lots of inserts, basically as many
> > as we can throw at it
>
> Having WAL on a separate drive from the database would be something of
> a win. I'd buy that 1 disk for OS+WAL and then RAID [something]
> across the other two drives for the database would be pretty helpful.
>
> After doing some [loose] benchmarking, the VERY best way to improve
> performance would involve a RAID controller with battery-backed cache.
>
> On a box with similar configuration to yours, it took ~3h for a
> particular set of data to load; on another one with battery-backed
> cache (and a dozen fast SCSI drives :-)), the same data took as little
> as 6 minutes to load. The BIG effect seemed to come from the
> controller.
The dual xeon arrangement is because the machine will also have to do some collaborative filtering, which is very cpu intensive and very disk un-intensive after loading the data into ram.

On Friday 12 September 2003 5:49 pm, you wrote:
> RIchard,
>
> > its a dual xeon 2.4, 4gb ram and 3x identical 15k rpm scsi disks
> >
> > should i mirror 2 of the disks for postgres data, and use the 3rd disk
> > for the o/s and the pg logs or raid5 the 3 disks or even stripe 2 disks
> > for pg and use the 3rd for o/s,logs,backups ?
>
> I'd mirror 2. Stripey RAID with few disks imposes a heavy performance
> penalty on data writes (particularly updates), sometimes as much as 50% for
> a RAID5-3disk config.
>
> I am a little curious why you've got a dual-xeon, but could only afford 3
> disks ....
On Fri, 2003-09-12 at 12:55, Richard Jones wrote:
> The machine is coming from dell, and i have the option of a
> PERC 3/SC RAID Controller (32MB)
> or software raid.
>
> does anyone have any experience of this controller?
> its an additional £345 for this controller, i'd be interested to know what
> people think - my other option is to buy the raid controller separately,
> which appeals to me but i wouldnt know what to look for in a raid controller.

Hardware raid with the write cache, and sell a CPU if necessary to buy it (don't sell the ram though!).
> Having WAL on a separate drive from the database would be something of
> a win. I'd buy that 1 disk for OS+WAL and then RAID [something]
> across the other two drives for the database would be pretty helpful.

Just my .02,

I did a lot of testing before I deployed our ~50GB postgresql databases with various combinations of 6 15k SCSI drives. I did custom benchmarks to simulate our applications, I downloaded several benchmarks, etc. It might be a fluke, but I never got better performance with WALs on a different disk than I did with all 6 disks in a 0+1 configuration. Obviously that's not an option with 3 disks. =) I ended up going with that for easier space maintenance.

Obviously YMMV, benchmark for your own situation. :)
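In the spirit of "benchmark for your own situation", a minimal insert-timing harness might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not PostgreSQL, and the table `t`, the 100-byte payload and the batch sizes are all hypothetical. To test a real setup you would swap in a PostgreSQL driver and run it against each candidate disk layout. The in-memory database used by default never touches a disk, so pass a file path on the array under test to get meaningful numbers.

```python
import sqlite3
import time


def make_db(path=":memory:"):
    """Open a database and create the (hypothetical) test table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.commit()
    return conn


def load_rows(conn, n_rows, rows_per_commit):
    """Insert n_rows, committing after every rows_per_commit rows.

    Returns the elapsed wall-clock time in seconds.
    """
    cur = conn.cursor()
    payload = "x" * 100
    start = time.perf_counter()
    for i in range(n_rows):
        cur.execute("INSERT INTO t (id, payload) VALUES (?, ?)", (i, payload))
        if (i + 1) % rows_per_commit == 0:
            conn.commit()
    conn.commit()  # commit any trailing partial batch
    return time.perf_counter() - start


if __name__ == "__main__":
    n_rows = 10_000
    for batch in (1, 100, 10_000):
        conn = make_db()  # fresh database per run so ids never collide
        elapsed = load_rows(conn, n_rows, batch)
        print(f"{batch:>6} rows/commit: {n_rows / elapsed:,.0f} rows/s")
        conn.close()
```

Varying `rows_per_commit` separates the "many small transactions" case from the "few large transactions" case, which is the distinction that decides how much a write-back cache or a separate WAL disk is likely to help.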