Thread: configuring RAID10 for data in Amazon EC2 cloud?

configuring RAID10 for data in Amazon EC2 cloud?

From: "Welty, Richard"
Date: Tue, 27 Mar 2012 11:25:53 -0400

does anyone have any tips on this? Linux Software Raid doesn't seem to be doing a very good job here, but i may well have missed something.

i did a fairly naive setup using linux software raid on an amazon linux instance,
10 volumes (8G each), (WAL on a separate EBS volume) with the following setup:

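# build a 10-device software RAID10 array from the EBS volumes
mdadm -v --create /dev/md1 --level=raid10 --raid-devices=10 /dev/xvdg /dev/xvdh /dev/xvdi /dev/xvdj /dev/xvdk /dev/xvdl /dev/xvdm /dev/xvdn /dev/xvdo /dev/xvdp

# put LVM on top of the array
pvcreate /dev/md1
vgcreate vg-pgdata /dev/md1
vgdisplay vg-pgdata

# carve out one logical volume for the postgres data
lvcreate -L39.98g -nlv-pgdata vg-pgdata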
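
As a sanity check on the sizing in the commands above, here is a minimal sketch of the capacity arithmetic. It assumes the md RAID10 default of two copies of every block (no --layout was given) and ignores md and LVM metadata overhead, which is presumably why the lvcreate size is slightly under 40G. The function name is made up for illustration.

```python
# Rough sizing check: RAID10 with two copies of every block
# stores half of the raw capacity (metadata overhead ignored).
def raid10_usable_gib(num_devices: int, device_gib: float, copies: int = 2) -> float:
    """Usable capacity of an md RAID10 array, ignoring metadata overhead."""
    if num_devices < copies:
        raise ValueError("need at least as many devices as copies")
    return num_devices * device_gib / copies


# 10 EBS volumes of 8G each, as in the mdadm command above.
usable = raid10_usable_gib(num_devices=10, device_gib=8)
print(f"usable capacity: ~{usable:.0f} GiB")
```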


this particular instance is running about a factor of two slower than a simple single-disk instance (roughly 436 vs. 892 tps in the pgbench runs below). both instances, the single-disk one and the one with RAID10 for ~postgres/data/base, started from amazon m1.xlarge instances.

postgresql version is 8.4.9, using a simple pgbench test for 600 seconds; the single disk instance shows this:

dbDev, single disk, shared_buffers=4GB, effective_cache_size=8GB
       disk mounted noatime, readahead 4096, other stuff default

-bash-4.1$ pgbench -T 600 bench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 1
duration: 600 s
number of transactions actually processed: 535018
tps = 891.696072 (including connections establishing)
tps = 891.704512 (excluding connections establishing)


and the RAID10 instance shows this:

dbQA, wal+raid10 setup, ext3 for WAL, ext4 for raid10,
      shared_buffers=2GB, effective_cache_size=3GB
      readahead 10240, wal&raid mount noatime, journal=ordered
      vm.swappiness=0, vm.overcommit_memory=2, vm.dirty_ratio=2,
      vm.dirty_background_ratio=1

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 1
duration: 600 s
number of transactions actually processed: 261513
tps = 435.854738 (including connections establishing)
tps = 435.858853 (excluding connections establishing)
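
To put a number on "a factor of two", the following small sketch recomputes the rates from the transaction counts reported in the two runs above. It only restates the posted figures. Note that the two setups differ in more than the RAID layout (shared_buffers, filesystems and vm settings, per the notes above each run), so the ratio should not be read as a measurement of RAID10 alone.

```python
# Figures copied from the two pgbench runs above.
DURATION_S = 600
runs = {
    "single disk (dbDev)": 535018,
    "wal+raid10 (dbQA)": 261513,
}


def tps(transactions: int, duration_s: float) -> float:
    """Transactions per second over the nominal run duration."""
    return transactions / duration_s


rates = {name: tps(count, DURATION_S) for name, count in runs.items()}
slowdown = rates["single disk (dbDev)"] / rates["wal+raid10 (dbQA)"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} tps")
print(f"slowdown: {slowdown:.2f}x")
```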

Re: configuring RAID10 for data in Amazon EC2 cloud?

From: Ben Chobot
Date:
On Mar 27, 2012, at 8:25 AM, Welty, Richard wrote:

> does anyone have any tips on this? Linux Software Raid doesn't seem to be doing a very good job here, but i may well have missed something.

iostat -x 5 is your friend. We've been struggling with a similar setup recently, and the TL;DR summary is that EBS has unreliable performance and isn't acceptable to use when your performance matters. When it's rocking, a single EBS volume can get you ~1200 IOPS, but far too often a volume will drop to less than 100 IOPS. And then there are the occasional times when volumes almost lock up, but not entirely, so they stall your RAID but don't get dropped from it automatically (though you could drop them yourself).
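
If you want the same kind of per-volume view in a script, here is a rough sketch that samples /proc/diskstats twice and derives IOPS and average time per I/O for each device, which is approximately what iostat -x shows as r/s + w/s and await. The function names, the 50 ms threshold and the "xvd" device prefix filter are illustrative assumptions, not something from this thread's setup beyond the /dev/xvd* device names.

```python
import time


def parse_diskstats(text: str) -> dict:
    """Map device name -> (ios_completed, ms_spent_on_io) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip short or malformed lines
        reads, ms_reading = int(fields[3]), int(fields[6])
        writes, ms_writing = int(fields[7]), int(fields[10])
        stats[fields[2]] = (reads + writes, ms_reading + ms_writing)
    return stats


def io_rates(before: dict, after: dict, interval_s: float) -> dict:
    """Map device name -> (iops, avg_ms_per_io) between two snapshots."""
    rates = {}
    for dev, (ios1, ms1) in after.items():
        if dev not in before:
            continue
        ios0, ms0 = before[dev]
        d_ios, d_ms = ios1 - ios0, ms1 - ms0
        avg_ms = d_ms / d_ios if d_ios else 0.0
        rates[dev] = (d_ios / interval_s, avg_ms)
    return rates


def report(interval_s: float = 5.0, slow_ms: float = 50.0, prefix: str = "xvd") -> None:
    """Sample the live system once and print per-device figures.

    slow_ms is a hypothetical threshold; tune it for your own volumes.
    """
    with open("/proc/diskstats") as fh:
        first = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open("/proc/diskstats") as fh:
        second = parse_diskstats(fh.read())
    for dev, (iops, avg_ms) in sorted(io_rates(first, second, interval_s).items()):
        if dev.startswith(prefix):
            flag = "  <-- slow?" if avg_ms > slow_ms else ""
            print(f"{dev}: {iops:8.1f} IOPS, {avg_ms:6.1f} ms/io{flag}")
```

Calling report() on the instance prints one line per matching device. A low IOPS figure alone can simply mean the volume is idle, so the time per I/O is the more telling column when hunting for a misbehaving array member.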

When you have an 8-volume RAID, you have 8x the exposure to these problems. We're coming to the realization that AWS has no real way to run a normal, non-memory-resident database, and are looking to host our databases outside AWS using Direct Connect, or something similar. (And not to hijack this thread, but if anybody has experience with that, I'd love to hear it.)
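
The exposure argument can be made concrete with a little probability, as a sketch under loud assumptions: that each volume misbehaves independently of the others, and with a made-up probability of 1% (the thread gives no measured figure). Under those assumptions the chance that at least one member is slow at any given moment is 1 - (1 - p)^n, which for small p is close to n times p.

```python
def p_any_degraded(p_single: float, volumes: int) -> float:
    """Chance that at least one of `volumes` independent volumes is degraded."""
    return 1.0 - (1.0 - p_single) ** volumes


# Hypothetical figure: each volume is slow 1% of the time.
P_SINGLE = 0.01

for n in (1, 8, 10):
    share = p_any_degraded(P_SINGLE, n)
    print(f"{n:2d} volumes: at least one is slow {share:.1%} of the time")
```

Since every write to a mirrored, striped array has to complete on more than one member, a single slow volume tends to hold back the whole array, which is why the per-volume risk adds up rather than averaging out.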

Re: configuring RAID10 for data in Amazon EC2 cloud?

From: Frank Lanitz
Date:
On Tue, 27 Mar 2012 11:25:53 -0400
"Welty, Richard" <rwelty@ltionline.com> wrote:

> does anyone have any tips on this? Linux Software Raid doesn't seem
> to be doing a very good job here, but i may well have missed
> something.
>
> i did a fairly naive setup using linux software raid on an amazon
> linux instance, 10 volumes (8G each), (WAL on a separate EBS volume)
> with the following setup:
>

You might want to check with Amazon here.

Cheers,
Frank

--
Frank Lanitz <frank@frank.uvena.de>
