Thread: best arrangement of 3 disks for (insert) performance

From
Richard Jones
Date:
Hi all,
I have some new hardware on the way and would like some advice on how to get
the most out of it..

its a dual xeon 2.4,  4gb ram and 3x identical 15k rpm scsi disks

should i mirror 2 of the disks for postgres data and use the 3rd disk for the
o/s and the pg logs, or raid5 the 3 disks, or even stripe 2 disks for pg and
use the 3rd for o/s, logs and backups?

the machine will be dealing with lots of inserts, basically as many as we can
throw at it

thanks,
Richard

Re: best arrangement of 3 disks for (insert) performance

From
"Matt Clark"
Date:
> the machine will be dealing with lots of inserts, basically as many as we can
> throw at it

If you mean lots of _transactions_ with few inserts per transaction, you
should get a RAID controller with a battery-backed write-back cache.  Nothing
else will improve your write performance by nearly as much.  You could sell
the RAM and one of the CPUs to pay for it ;-)

If you have lots of inserts but all in a few transactions then it's not quite so critical.
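
A rough back-of-the-envelope sketch of why the transactions-versus-inserts
distinction matters. It assumes (this is not stated in the thread) that every
commit waits for one WAL flush to reach the platter and that, without a
write-back cache, a disk completes at most about one such flush per rotation.
The function names and the 100-rows-per-transaction figure are illustrative
only; real numbers depend on the controller, filesystem and PostgreSQL
settings.

```python
def max_commits_per_second(rpm: int) -> float:
    """Rough ceiling on synchronous commits/sec: one flush per rotation.

    Assumption: no write-back cache, so each commit waits for the platter.
    """
    return rpm / 60.0


def max_inserts_per_second(rpm: int, rows_per_transaction: int) -> float:
    """Insert ceiling when each transaction carries rows_per_transaction rows."""
    return max_commits_per_second(rpm) * rows_per_transaction


if __name__ == "__main__":
    # 15000 rpm matches the disks in the original question; the batch
    # sizes below are hypothetical.
    for rows in (1, 100):
        rate = max_inserts_per_second(15000, rows)
        print(f"{rows:>3} row(s) per transaction: ~{rate:.0f} inserts/sec ceiling")
```

Under these assumptions a 15k rpm disk tops out near 250 commits per second,
so one-row transactions are capped at roughly 250 inserts per second, while
batching rows into fewer transactions raises the ceiling proportionally.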

M



Re: best arrangement of 3 disks for (insert) performance

From
Christopher Browne
Date:
rj@last.fm (Richard Jones) writes:
> I have some new hardware on the way and would like some advice on
> how to get the most out of it..
>
> its a dual xeon 2.4,  4gb ram and 3x identical 15k rpm scsi disks
>
> should i mirror 2 of the disks for postgres data, and use the 3rd
> disk for the o/s and the pg logs or raid5 the 3 disks or even stripe
> 2 disks for pg and use the 3rd for o/s,logs,backups ?
>
> the machine will be dealing with lots of inserts, basically as many
> as we can throw at it

Having WAL on a separate drive from the database would be something of
a win.  I'd buy that 1 disk for OS+WAL and then RAID [something]
across the other two drives for the database would be pretty helpful.

After doing some [loose] benchmarking, the VERY best way to improve
performance would involve a RAID controller with battery-backed cache.

On a box with similar configuration to yours, it took ~3h for a
particular set of data to load; on another one with battery-backed
cache (and a dozen fast SCSI drives :-)), the same data took as little
as 6 minutes to load.  The BIG effect seemed to come from the
controller.
--
(reverse (concatenate 'string "ofni.smrytrebil" "@" "enworbbc"))
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Re: best arrangement of 3 disks for (insert) performance

From
Josh Berkus
Date:
Richard,

> its a dual xeon 2.4,  4gb ram and 3x identical 15k rpm scsi disks
>
> should i mirror 2 of the disks for postgres data, and use the 3rd disk for
> the o/s and the pg logs or raid5 the 3 disks or even stripe 2 disks for pg
> and use the 3rd for o/s,logs,backups ?

I'd mirror 2.   Stripey RAID with few disks imposes a heavy performance
penalty on data writes (particularly updates), sometimes as much as 50% for a
RAID5-3disk config.
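
One way to see where a penalty like this can come from is the textbook
small-write cost of each layout: a mirror writes every block twice, while
RAID5 has to read the old data and old parity and then write both back. The
sketch below uses those textbook penalties (2 for RAID1, 1 for RAID0, 4 for
RAID5) and a hypothetical 100 random write I/Os per second per disk. It is an
idealised model, not a measurement, and it is not meant to reproduce the 50%
figure above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    disks: int
    usable_disks: float  # capacity, in units of one disk
    write_penalty: int   # physical I/Os per small logical write (textbook value)


# The three options from the original question.
LAYOUTS = [
    Layout("RAID1 mirror (2 disks)", 2, 1.0, 2),
    Layout("RAID0 stripe (2 disks)", 2, 2.0, 1),
    Layout("RAID5 (3 disks)", 3, 2.0, 4),
]


def write_iops(layout: Layout, iops_per_disk: float) -> float:
    """Idealised small-write throughput of the whole array."""
    return layout.disks * iops_per_disk / layout.write_penalty


if __name__ == "__main__":
    per_disk = 100.0  # hypothetical per-disk figure, for comparison only
    for layout in LAYOUTS:
        print(f"{layout.name:<24} capacity={layout.usable_disks:.0f} disk(s) "
              f"writes/sec~{write_iops(layout, per_disk):.0f}")
```

In this model the 3-disk RAID5 manages fewer small writes per second than the
2-disk mirror despite using an extra spindle, and the stripe is fastest but
has no redundancy.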

I am a little curious why you've got a dual-xeon, but could only afford 3
disks ....

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: best arrangement of 3 disks for (insert) performance

From
Richard Jones
Date:
The machine is coming from dell, and i have the option of a
PERC 3/SC RAID Controller (32MB)
or software raid.

does anyone have any experience of this controller?
it's an additional £345 for this controller, i'd be interested to know what
people think - my other option is to buy the raid controller separately,
which appeals to me but i wouldn't know what to look for in a raid controller.

that raid controller review site sounds like a good idea :)

Richard.

On Friday 12 September 2003 4:24 pm, Christopher Browne wrote:
> rj@last.fm (Richard Jones) writes:
> > I have some new hardware on the way and would like some advice on
> > how to get the most out of it..
> >
> > its a dual xeon 2.4,  4gb ram and 3x identical 15k rpm scsi disks
> >
> > should i mirror 2 of the disks for postgres data, and use the 3rd
> > disk for the o/s and the pg logs or raid5 the 3 disks or even stripe
> > 2 disks for pg and use the 3rd for o/s,logs,backups ?
> >
> > the machine will be dealing with lots of inserts, basically as many
> > as we can throw at it
>
> Having WAL on a separate drive from the database would be something of
> a win.  I'd buy that 1 disk for OS+WAL and then RAID [something]
> across the other two drives for the database would be pretty helpful.
>
> After doing some [loose] benchmarking, the VERY best way to improve
> performance would involve a RAID controller with battery-backed cache.
>
> On a box with similar configuration to yours, it took ~3h for a
> particular set of data to load; on another one with battery-backed
> cache (and a dozen fast SCSI drives :-)), the same data took as little
> as 6 minutes to load.  The BIG effect seemed to come from the
> controller.


Re: best arrangement of 3 disks for (insert) performance

From
Richard Jones
Date:
The dual xeon arrangement is because the machine will also have to do some
collaborative filtering which is very cpu intensive and very disk
un-intensive, after loading the data into ram.

On Friday 12 September 2003 5:49 pm, you wrote:
> Richard,
>
> > its a dual xeon 2.4,  4gb ram and 3x identical 15k rpm scsi disks
> >
> > should i mirror 2 of the disks for postgres data, and use the 3rd disk
> > for the o/s and the pg logs or raid5 the 3 disks or even stripe 2 disks
> > for pg and use the 3rd for o/s,logs,backups ?
>
> I'd mirror 2.   Stripey RAID with few disks imposes a heavy performance
> penalty on data writes (particularly updates), sometimes as much as 50% for
> a RAID5-3disk config.
>
> I am a little curious why you've got a dual-xeon, but could only afford 3
> disks ....


Re: best arrangement of 3 disks for (insert) performance

From
Rod Taylor
Date:
On Fri, 2003-09-12 at 12:55, Richard Jones wrote:
> The machine is coming from dell, and i have the option of a
> PERC 3/SC RAID Controller (32MB)
> or software raid.
>
> does anyone have any experience of this controller?
> its an additional £345 for this controller, i'd be interested to know what
> people think - my other option is to buy the raid controller separately,
> which appeals to me but i wouldnt know what to look for in a raid controller.

Hardware raid with the write cache, and sell a CPU if necessary to buy
it (don't sell the ram though!).

Re: best arrangement of 3 disks for (insert) performance

From
Cott Lang
Date:
> Having WAL on a separate drive from the database would be something of
> a win.  I'd buy that 1 disk for OS+WAL and then RAID [something]
> across the other two drives for the database would be pretty helpful.

Just my .02,

I did a lot of testing before I deployed our ~50GB postgresql databases
with various combinations of 6 15k SCSI drives. I did custom benchmarks
to simulate our applications, I downloaded several benchmarks, etc.

It might be a fluke, but I never got better performance with WALs on a
different disk than I did with all 6 disks in a 0+1 configuration.
Obviously that's not an option with 3 disks. =)

I ended up going with that for easier space maintenance.

Obviously YMMV, benchmark for your own situation. :)
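
For anyone wanting a quick first probe before running real database
benchmarks, here is a minimal sketch that times small synchronous writes on a
given volume. It is not a PostgreSQL benchmark and does not simulate any
application; it only compares calling fsync after every record with calling
it once per batch. The function name, record size and batch sizes are
assumptions chosen for illustration.

```python
import os
import tempfile
import time


def timed_writes(directory: str, records: int, sync_every: int) -> float:
    """Append `records` small records, calling fsync every `sync_every` records.

    Returns the elapsed wall-clock time in seconds.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    payload = b"x" * 128  # arbitrary small record
    start = time.perf_counter()
    try:
        for i in range(1, records + 1):
            os.write(fd, payload)
            if i % sync_every == 0:
                os.fsync(fd)
        os.fsync(fd)  # make sure the tail is flushed as well
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Assumption: replace this with a directory on the volume under test.
    target = tempfile.gettempdir()
    for sync_every in (1, 100):
        elapsed = timed_writes(target, 1000, sync_every)
        print(f"fsync every {sync_every:>3} record(s): {elapsed:.3f}s")
```

A large gap between the two timings suggests the volume pays heavily for each
synchronous flush. Results on a controller with write-back cache, or on a
drive that reports writes complete before they are durable, may look very
different.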