Re: Quad processor options - summary - Mailing list pgsql-performance

From Bjoern Metzdorf
Subject Re: Quad processor options - summary
Msg-id 40A3EE5B.1050104@turtle-entertainment.de
In response to Re: Quad processor options - summary  (James Thornton <james@jamesthornton.com>)
Responses Re: Quad processor options - summary  (James Thornton <james@jamesthornton.com>)
Re: Quad processor options - summary  (Paul Tuckfield <paul@tuckfield.com>)
Re: Quad processor options - summary  (Mark Kirkwood <markir@paradise.net.nz>)
List pgsql-performance
James Thornton wrote:

>> This is what I am considering the ultimate platform for postgresql:
>>
>> Hardware:
>> Tyan Thunder K8QS board
>> 2-4 x Opteron 848 in NUMA mode
>> 4-8 GB RAM (DDR400 ECC Registered 1 GB modules, 2 for each processor)
>> LSI Megaraid 320-2 with 256 MB cache ram and battery backup
>> 6 x 36GB SCSI 10K drives + 1 spare running in RAID 10, split over both
>> channels (3 + 4) for pgdata including indexes and wal.
>
> You might also consider configuring the Postgres data drives for a RAID
> 10 SAME configuration as described in the Oracle paper "Optimal Storage
> Configuration Made Easy"
> (http://otn.oracle.com/deploy/availability/pdf/oow2000_same.pdf). Has
> anyone delved into this before?

Ok, if I understand it correctly, the paper recommends the following:

1. Get many drives and stripe them into a RAID0 with a stripe width of
1MB. I am not quite sure whether this stripe width is meant to be set at
the application level (does postgres support this?) or whether it refers
to e.g. the "chunk size" of the Linux software RAID driver. Normally a
chunk size of 4KB is recommended, so 1MB sounds fairly large.

2. Mirror your RAID0 and get a RAID10.

3. Use primarily the fast, outer regions of your disks. In practice this
might be achieved by putting only half of the disk (the outer half) into
your stripe set. E.g. put only the outer 18GB of your 36GB disks into
the stripe set. Btw, is it common for all drives that the outer region
is on the higher block numbers? Or is it sometimes on the lower block
numbers?

4. Subset data by partition, not disk. If you have 8 disks, then don't
take one 4 disk RAID10 for data and another one for log or indexes, but
make a global 8 drive RAID10 and partition it in such a way that data
and log + indexes are located on all drives.

Interestingly, and contrary to what is normally recommended, they say
that one big stripe set over all available disks is as good as or better
than putting log + indexes on a separate stripe set. With one big stripe
set, the combined speed of all disks is available to all data. According
to the paper, in practice this setup is as fast as or even faster than
the "old" approach.
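
To illustrate why one big stripe set makes the speed of all disks
available to all data, here is a minimal sketch of plain round-robin
striping with a 1MB stripe unit. The function names (locate,
disks_touched) and the simple round-robin layout are my own assumptions
for illustration. A real controller or software RAID driver may lay out
data differently, and mirroring is ignored here.

```python
STRIPE_UNIT = 1024 * 1024  # 1MB stripe unit, the value mentioned in point 1


def locate(offset, n_disks, stripe_unit=STRIPE_UNIT):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes simple round-robin striping over n_disks (hypothetical layout).
    """
    unit = offset // stripe_unit      # which stripe unit the offset falls in
    disk = unit % n_disks             # units are dealt out to disks in turn
    row = unit // n_disks             # how many full stripes precede this unit
    return disk, row * stripe_unit + offset % stripe_unit


def disks_touched(start, length, n_disks, stripe_unit=STRIPE_UNIT):
    """Set of disks involved in reading `length` (> 0) bytes from `start`."""
    first = start // stripe_unit
    last = (start + length - 1) // stripe_unit
    return {unit % n_disks for unit in range(first, last + 1)}


# Where does the byte at logical offset 5MB + 100 live in a 4 disk stripe set?
print(locate(5 * STRIPE_UNIT + 100, 4))

# A 4MB sequential read spreads over the whole stripe set, while a small
# 8KB read stays on a single disk.
print(sorted(disks_touched(0, 4 * STRIPE_UNIT, 4)))
print(sorted(disks_touched(0, 8192, 4)))
```

With this layout, any partition placed on the stripe set is spread over
all disks, no matter whether it holds data or log + indexes.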

----------------------------------------------------------------

Bottom line for a normal, less than 10 disk setup:

Get many disks (8 + spare), create a RAID0 with 4 disks and mirror it to
the other 4 disks for a RAID10. Make sure to create the RAID on the
outer half of the disks (setup may depend on the disk model and raid
controller used), leaving the inner half empty.
Use a logical volume manager (LVM), which always helps when adding disk
space, and create 2 partitions on your RAID10: one for data and one for
log + indexes. The layout should look like this:

----- ----- ----- -----
| 1 | | 1 | | 1 | | 1 |
----- ----- ----- -----  <- outer, faster half of the disk
| 2 | | 2 | | 2 | | 2 |     part of the RAID10
----- ----- ----- -----
|   | |   | |   | |   |
|   | |   | |   | |   |  <- inner, slower half of the disk
|   | |   | |   | |   |     not used at all
----- ----- ----- -----

Partition 1 for data, partition 2 for log + indexes. All mirrored to the
other 4 disks not shown.

If you take 36GB disks, this should end up like this:

RAID10 has a size of 36GB / 2 * 4 = 72GB
Partition 1 is 36 GB
Partition 2 is 36 GB

If 36GB is not enough for your pgdata set, you might consider moving to
72GB disks, or (even better) making a 16 drive RAID10 out of 36GB disks.
Both options end up with 72GB for your data (but the 16 drive version
will be faster).
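
The sizes above can be double-checked with a small calculation. This is
only a sketch of the arithmetic: the helper name raid10_usable_gb and
the used_fraction parameter are made up for illustration, and it assumes
that exactly the outer half of each disk is used and that the RAID10 is
split into two equal partitions.

```python
def raid10_usable_gb(disk_gb, n_disks, used_fraction=0.5):
    """Usable size in GB of a RAID10 built from part of each disk.

    used_fraction is the share of each disk put into the array
    (0.5 = outer half only). Half of the disks hold the mirror copy.
    """
    if n_disks < 2 or n_disks % 2 != 0:
        raise ValueError("RAID10 needs an even number of disks")
    per_disk_gb = disk_gb * used_fraction
    return per_disk_gb * (n_disks // 2)


# The three setups discussed: 8 x 36GB, 8 x 72GB and 16 x 36GB.
for disk_gb, n_disks in [(36, 8), (72, 8), (36, 16)]:
    total = raid10_usable_gb(disk_gb, n_disks)
    print(f"{n_disks} x {disk_gb}GB: RAID10 {total:.0f}GB, "
          f"{total / 2:.0f}GB per partition")
```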

Any comments?

Regards,
Bjoern
