Re: how to partition disks - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: how to partition disks
Date
Msg-id 20060614151639.GA34196@pervasive.com
In response to Re: how to partition disks  (Sven Geisler <sgeisler@aeccom.com>)
List pgsql-performance
On Wed, Jun 14, 2006 at 04:32:23PM +0200, Sven Geisler wrote:
> Hi Richard,
>
> Richard Broersma Jr schrieb:
> >>This depends on your application. Do you have a lot of disc reads?
> >>Anyhow, I would put the xlog always to a RAID 10 volume because most of
> >>the I/O for update and inserts is going to the xlog.
> >>
> >>4 discs xlog
> >>6 discs tables
> >>4 discs tables2
> >
> >I have a question in regards to I/O bandwidths of various raid
> >configurations. Primarily, do the above suggested raid partitions
> >imply that multiple (smaller) disk arrays have a potential for
> >more I/O bandwidth than a larger raid 10 array?
>
> Yes.
> Because the disc arms don't need to reposition as much as they would
> with one large volume.
>
> For example, you run two queries with two clients and each query needs
> to read some indices from disk. In this case it is more efficient to read
> from different volumes than to read from one large volume where the disc
> arms have to jump.

But keep in mind that all of that is only true if you have very good
knowledge of how your data will be accessed. If you don't know that,
you'll almost certainly be better off just piling everything into one
RAID array and letting the controller deal with it.

Also, if you have a good RAID controller that's battery-backed,
separating pg_xlog onto its own array is much less likely to be a win.
The reason you normally put pg_xlog on its own partition is that the
database has to fsync pg_xlog *at every single commit*. This means you
absolutely want that fsync to be as fast as possible. But with a good,
battery-backed controller, this no longer matters. The fsync is only
going to push the data into the controller's write cache, and the
controller will take things from there. That means it's far less
important to put pg_xlog on its own array. I actually asked about this
recently and one person did reply that they'd done testing and found it
was better to just put all their drives into one array so they weren't
wasting bandwidth on the pg_xlog drives.
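
If you want a rough feel for what an fsync costs on a given volume, here
is a minimal sketch in Python. It is not how PostgreSQL writes WAL; the
function name time_fsyncs, the 8 kB record size and the sample count are
all assumptions I picked for illustration. It appends small records to a
scratch file and times each fsync call.

```python
import os
import tempfile
import time


def time_fsyncs(directory, count=200, record_size=8192):
    """Append `count` records to a scratch file in `directory`, calling
    fsync after each one, and return the per-fsync latencies in seconds.

    Hypothetical helper for illustration; record_size and count are
    arbitrary assumptions, not values taken from PostgreSQL.
    """
    payload = b"\0" * record_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the data out of the OS cache
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    import sys

    # Pass a directory on the volume you want to probe; defaults to ".".
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    samples = sorted(time_fsyncs(target))
    print("median fsync: %.3f ms" % (samples[len(samples) // 2] * 1000))
```

Run it once against a directory on each candidate volume. If the numbers
on the shared array are already close to what you see on a dedicated
one, that is a hint the controller is absorbing the fsyncs, though it is
no substitute for testing with the real application.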

Even if you do decide to keep pg_xlog separate, a 4 drive RAID10 for
that is overkill. It will be next to impossible for you to generate
enough WAL traffic to warrant it.

Your best bet is to perform testing with your application. That's the
only way you'll truly find out what's going to work best. Short of
that, your best bet is to just pile all the drives together. If you do
testing, I'd start first with the effect of a separate pg_xlog. Only
after you have those results would I consider trying to do things like
split indexes from tables, etc.
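
To keep that kind of testing honest, something like the following
Python sketch may help. It only times whatever callables you hand it, so
the names median_runtime and compare_layouts, and the idea of one
callable per disk layout, are assumptions of mine; each callable would
have to run your application's test load against a server configured
with that layout.

```python
import statistics
import time


def median_runtime(workload, runs=5):
    """Run `workload()` `runs` times; return the median wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compare_layouts(workloads, runs=5):
    """`workloads` maps a layout name (hypothetical labels such as
    "one array" or "separate pg_xlog") to a callable that drives your
    test load against that layout.
    Returns (name, median seconds) pairs, fastest first."""
    results = {name: median_runtime(fn, runs) for name, fn in workloads.items()}
    return sorted(results.items(), key=lambda item: item[1])
```

Taking the median of several runs rather than a single timing is a
design choice to damp out cache warm-up and other noise; change one
thing at a time between layouts so the difference can be attributed.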

BTW, you should consider reserving some of the drives in the array as
hot spares.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
