Re: Peformance Tuning Opterons/ Hard Disk Layout - Mailing list pgsql-performance

From John Arbash Meinel
Subject Re: Peformance Tuning Opterons/ Hard Disk Layout
Date
Msg-id 421E2DC8.8040701@arbash-meinel.com
Whole thread Raw
In response to Peformance Tuning Opterons/ Hard Disk Layout  (John Allgood <john@turbocorp.com>)
List pgsql-performance
John Allgood wrote:

> Hello Again
>
> In the below statement you mention putting each database on its own
> raid mirror.
>
> "However, sticking with your arrangement, it would seem that you might be
> able to get some extra performance if each database is on it's own raid,
> since you are fairly likely to have 2 transactions occuring at the same
> time, that don't affect eachother (since you wouldn't have any foreign
> keys, etc on 2 separate databases.)"
>
> That would take alot of disk drives to accomplish. I was thinking
> maybe putting three or four databases on each raid and dividing the
> heaviest used databases on each mirrored set. And for each of these
> sets have its own mirror for pg_xlog. My question is what is the best
> way to setup postgres databases on different disks. I have setup
> multiple postmasters on this system as a test. The only problem was
> configuring each databases "ie postgresql.conf, pg_hba.conf".  Is
> there anyway in postgres to have everything in one cluster and have it
> seperated onto multiple drives. Here is a example of what is was
> thinking about.
>
I think this is something that you would have to try and see what works.
My first feeling is that 8-disks in RAID10 is better than 4 sets of RAID1.

> MIRROR1 - Database Group 1
> MIRROR2 - pg_xlog for database group 1
> MIRROR3 - Database Group 2
> MIRROR4 - pg_xlog for database group 2
> MIRROR5 - Database Group 3
> MIRROR6 - pg_xlog for database group 3
>
> This will take about 12 disk drives. I have a 14 bay Storage Bay I can
> use two of the drives for hotspare's.
>
I would have all of them in 1 database cluster, which means they are all
served by the same postgres daemon. Which I believe means that they all
use the same pg_xlog. That means you only need 1 raid for pg_xlog,
though I would make it a 4-drive RAID10. (RAID1 is redundant, but
actually slower on writes, you need the 0 to speed up reading/writing, I
could be wrong).

I believe you can still split each database onto it's own raid later on
if you find that you need to.

So this is my proposal 1:
OS RAID (sounds like this is not in the Storage Bay).
4-drives RAID10 pg_xlog
8-drives RAID10 database cluster
2-drives Hot spares / RAID1

If you feel like you want to partition your databases, you could also do
proposal 2:
4-drives RAID10 pg_xlog
4-drives RAID10 databases master + 1-4
4-drives RAID10 databases 5-9
2-drives hotspare / RAID1

If you think partitioning is better than striping, you could do proposal 3:
4-drives RAID10 pg_xlog
2-drives RAID1 master database
2-drives RAID1 databases 1,2,3
2-drives RAID1 databases 4,5
2-drives RAID1 databases 6,7
2-drives RAID1 databases 8,9

There are certainly a lot of potential arrangements here, and it's not
like I've tried a lot of them. pg_xlog seems like a big enough
bottleneck that it would be good to put it on it's own RAID10, to make
it as fast as possible.

It also depends a lot on whether you will be write heavy/read heavy,
etc. RAID5 works quite well for reading, very poor for writing. But if
the only reason to have the master database is to perform read heavy
queries, and all the writing is done at night in bulk fashion with
careful tuning to avoid saturation, then maybe you would want to put the
master database on a RAID5 so that you can get extra disk space.
You could do proposal 4:
4-drive RAID10 pg_xlog
4-drive RAID5 master db
2-drive RAID1 dbs 1-3
2-drive RAID1 dbs 4-6
2-drive RAID1 dbs 7-9

You might also do some testing and find that pg_xlog doesn't deserve
it's own 4 disks, and they would be better off in the bulk tables.

Unfortunately a lot of this would come down to performance testing on
your dataset, with a real data load. Which isn't very easy to do.
I personally like the simplicity of proposal 1.

John
=:->

>
> Thanks
>
> John Allgood - ESC
> Systems Administrator



Attachment

pgsql-performance by date:

Previous
From: "Joel Fradkin"
Date:
Subject: Re: Peformance Tuning Opterons/ Hard Disk Layout
Next
From: Bruce Momjian
Date:
Subject: Re: PostgreSQL is extremely slow on Windows