Re: How to setup disk spindles for best performance - Mailing list pgsql-performance

From Christiaan Willemsen
Subject Re: How to setup disk spindles for best performance
Msg-id 48AD8465.7090604@technocon.com
In response to Re: How to setup disk spindles for best performance  ("Scott Carey" <scott@richrelevance.com>)
Responses Re: How to setup disk spindles for best performance  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-performance
Hi Scott,

Great info! Our RAID card is at the moment an ICP vortex (Adaptec) ICP5165BR, and I'll be using it with Ubuntu Server 8.04. I tried OpenSolaris, but it yielded even worse performance, especially using ZFS. I guess that was just a mismatch. Anyway, I'm going to return the controller, because it does not scale very well with more than 4 disks in raid 10. Bandwidth is limited to 350MB/sec, and IOPS scale badly with extra disks...
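
To illustrate why a bandwidth cap like that stops an array from scaling, here is a minimal Python sketch. The 350MB/sec cap is the figure above; the per-disk sequential rate of 90MB/sec is purely an assumption for illustration, and the function name is made up. It models idealized sequential reads only, not IOPS.

```python
def array_throughput_mb_s(disks, per_disk_mb_s, controller_cap_mb_s):
    """Idealized sequential read throughput of an array behind a capped controller."""
    raw = disks * per_disk_mb_s  # what the spindles could deliver together
    return min(raw, controller_cap_mb_s)  # the controller cap wins once it is reached


CONTROLLER_CAP_MB_S = 350  # figure quoted in the message
PER_DISK_MB_S = 90         # ASSUMPTION: illustrative per-disk sequential rate

for disks in (2, 4, 6, 8, 12):
    total = array_throughput_mb_s(disks, PER_DISK_MB_S, CONTROLLER_CAP_MB_S)
    print(f"{disks:2d} disks -> {total} MB/sec")
```

Under these assumed numbers the array is already controller-bound at 4 disks, so adding spindles buys no extra sequential bandwidth.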

So I guess I'll be waiting for another controller first. The idea of xlog + OS on a 4 disk raid 10 and the rest for the data sounds good :) I hope it will turn out that way too. First, another controller.

Regards,

Christiaan

Scott Carey wrote:
Indexes will be a random write workload, but these won't be synchronous writes and will be buffered by the raid controller's cache. That assumes you're using a hardware raid controller, and one that doesn't have major performance problems on your platform. Which brings up the questions: what is your RAID card and OS?

For reads, if your shared_buffers is large enough, your heavily used indexes won't likely go to disk much at all.

A good raid controller will typically help distribute the workload effectively on a large array.

You probably want a simple 2 disk mirror or 4 disks in raid 10 for your OS + xlog, and the rest for data + indexes -- with hot spares IF your card supports them.
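
A rough Python sketch of that split, for illustration only. The function name `plan_layout`, the minimum of 4 data disks, and the rule that an odd leftover disk becomes an extra hot spare are all assumptions of this sketch, not something a particular controller requires.

```python
def plan_layout(total_disks, xlog_disks=4, hot_spares=0):
    """Split disks into an OS + xlog array, a raid 10 data array, and hot spares."""
    if xlog_disks not in (2, 4):
        raise ValueError("OS + xlog array should be a 2 disk mirror or 4 disk raid 10")
    data_disks = total_disks - xlog_disks - hot_spares
    if data_disks < 4:
        raise ValueError("not enough disks left for a raid 10 data array")
    if data_disks % 2:
        # raid 10 needs an even disk count; ASSUMPTION: the odd disk becomes a spare
        data_disks -= 1
        hot_spares += 1
    return {
        "os_xlog": ("raid 1" if xlog_disks == 2 else "raid 10", xlog_disks),
        "data_indexes": ("raid 10", data_disks),
        "hot_spares": hot_spares,
    }


print(plan_layout(16, xlog_disks=4, hot_spares=1))
```

Data and indexes deliberately share one array here, so the controller can spread both workloads over all remaining spindles.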

The biggest risk to splitting up data and indexes is that you don't know how much I/O each needs relative to each other, and if this isn't a relatively constant ratio you will have one subset busy while the other subset is idle.
Unless you have extensively profiled your disk activity into index and data subsets and know roughly what the optimal ratio is, it's probably going to cause more problems than it fixes.
Furthermore, if this ratio changes at all, it's a maintenance nightmare.  How much each would need in a perfect world is application dependent, so there can be no general recommendation other than:  don't do it.
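
One rough way to see whether that ratio holds steady is to compare snapshots of the `heap_blks_read` and `idx_blks_read` counters from `pg_statio_user_tables`. The Python sketch below assumes the counters have already been summed over all tables and collected into dicts; the function name and the sample numbers are hypothetical. These counters only cover block reads requested by PostgreSQL (some of which the OS cache may serve), so this approximates the read side only and says nothing about writes.

```python
def index_read_share(before, after):
    """Fraction of block reads between two snapshots that went to indexes."""
    heap = after["heap_blks_read"] - before["heap_blks_read"]
    idx = after["idx_blks_read"] - before["idx_blks_read"]
    total = heap + idx
    return idx / total if total else None  # None when nothing was read


# HYPOTHETICAL counter snapshots, taken at three points in time
snapshots = [
    {"heap_blks_read": 1000, "idx_blks_read": 500},
    {"heap_blks_read": 1600, "idx_blks_read": 900},
    {"heap_blks_read": 1700, "idx_blks_read": 1800},
]

for before, after in zip(snapshots, snapshots[1:]):
    print(f"index share of reads: {index_read_share(before, after):.2f}")
```

If the share swings widely between intervals, as it does in the made-up numbers here, a fixed split of spindles between data and indexes will leave one subset idle part of the time.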

On Thu, Aug 21, 2008 at 1:34 AM, Christiaan Willemsen <cwillemsen@technocon.com> wrote:
Thanks Joshua,

So what about putting the indexes on a separate array? Since we do a lot of inserts, the indexes are going to be worked on a lot of the time.

Regards,

Christiaan


Joshua D. Drake wrote:
Christiaan Willemsen wrote:
So what you are basically saying is that a single mirror is in general more than enough to handle the transaction log.

http://www.commandprompt.com/blogs/joshua_drake/2008/04/is_that_performance_i_smell_ext2_vs_ext3_on_50_spindles_testing_for_postgresql/
http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide

And to answer your question, yes. Transaction logs are written sequentially. You do not need a journaled file system, and raid 1 is plenty for most if not all workloads.

Sincerely,

Joshua D. Drake


