Re: Amazon High I/O instances - Mailing list pgsql-general

From: Sébastien Lorion
Subject: Re: Amazon High I/O instances
Msg-id: CAGa5y0NS=2SpoHDG0COP+O2RpKGkFs5gGRNjX8CqKp86ARVDrQ@mail.gmail.com
In response to: Re: Amazon High I/O instances (John R Pierce <pierce@hogranch.com>)
List: pgsql-general

Finally, I got time to set up an instance and run some tests.

Instance:

High-Memory quadruple extra large (m2.4xlarge: 8 cores, 68 GB RAM)
EBS-Optimized flag set (allows up to 1000 Mbit/s of dedicated EBS throughput)
10 GB standard EBS volume for the OS
8x100 GB in RAID10 for data (max 1000 IOPS)
2x100 GB in RAID0 for WAL (max 1000 IOPS)

FreeBSD 9.0
PostgreSQL 9.1.5
OS is on UFS
data is on ZFS with noatime, recordsize=8K, logbias=throughput
WAL is on ZFS with noatime, recordsize=8K, logbias=latency, primarycache=metadata
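
In case it helps, the pools and datasets were created roughly like this (device names are placeholders for the actual EBS devices; "noatime" corresponds to the ZFS atime=off property):

  # data pool: four mirrored pairs striped together (RAID10)
  zpool create data mirror xbd1 xbd2 mirror xbd3 xbd4 mirror xbd5 xbd6 mirror xbd7 xbd8
  zfs set atime=off data
  zfs set recordsize=8K data
  zfs set logbias=throughput data

  # wal pool: plain two-disk stripe (RAID0)
  zpool create wal xbd9 xbd10
  zfs set atime=off wal
  zfs set recordsize=8K wal
  zfs set logbias=latency wal
  zfs set primarycache=metadata wal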

I have attached the config files I used.

Results:

I ran the same pgbench command (sketched below) 5 times in a row, with a 5-minute sleep between runs.
autovacuum was off
scale is 10000 and -t = 10000
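
The invocations were essentially the following (the database name is just a placeholder):

  pgbench -i -s 10000 bench                # one-time initialization
  pgbench -c 8 -j 1 -t 10000 bench         # read-write run, repeated for each clients/threads pair
  pgbench -S -c 8 -j 1 -t 10000 bench      # read-only (-S) variant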

read-write (tps per run)
clients/threads: run 1, run 2, run 3, run 4, run 5
8/1: 3210, 2394, 2668, 1943, 2894
8/8: 3285, 2487, 2423, 2839, 3380
32/8: 4862, 3116, 2933, 3053, 3013
64/8: 2988, 1197, 867, 1159, 828

read-only (-S) (tps per run)
clients/threads: run 1, run 2, run 3, run 4, run 5
8/1: 5978, 5983, 6110, 6091, 6158
8/8: 6081, 6022, 6109, 6027, 5479
32/8: 5169, 5144, 4762, 5293, 4936
64/8: 3916, 4203, 4199, 4261, 4070

I also let `pgbench -c 32 -j 8 -T 10800` run last night, with autovacuum turned on this time, and the result was 694 tps.

As you can see, I am nowhere near the results John mentioned for a 10,000 scale (about 8000 tps), and I am not sure why. My instance setup and configuration should be fine, but I am far from an expert (a startup founder has to wear many hats...); I simply followed the advice in Greg Smith's book and what I have read on the net. So if anyone can offer insight into why the performance is not as good as expected, please let me know.

I have not terminated the instance yet, so I can do more testing and/or try suggestions to improve the results. I will also try running the benchmarks again on a pure RAID1 configuration with fsync off, which I will use for read-only databases.
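
For that fsync-off run, the plan is simply to relax the crash-safety settings in postgresql.conf, roughly like this (synchronous_commit and full_page_writes are extras I would try on top of fsync itself):

  fsync = off                  # no crash safety; acceptable only because the data can be reloaded
  synchronous_commit = off
  full_page_writes = off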

Many thanks!

Sébastien

On Thu, Aug 23, 2012 at 2:41 PM, John R Pierce <pierce@hogranch.com> wrote:
On 08/23/12 11:24 AM, Sébastien Lorion wrote:
I think both kinds of tests (general and app-specific) are complementary and useful in their own way. At a minimum, if the general ones fail, why go to the expense of doing the specific ones? Setting up a meaningful application test can take a lot of time, and it can be hard to pinpoint exactly where in the stack the performance drops occur. The way I see it, synthetic benchmarks make it possible to isolate the layers somewhat and serve as a baseline to validate the application tests done later on. It surprises me that asking for the general performance behavior of a platform is controversial.

I don't use AWS at all. But it shouldn't take more than a couple of hours to spin up an instance, populate a pgbench database, run a series of pgbench runs against it, and do the same against any other sort of system you wish to use as your reference.

I like to test with a database about twice the size of the available memory if I'm testing IO. I've found that pgbench -i -s ####, with ####=10000, generates a 1-billion-row table and uses about 150GB (and takes an hour or so to initialize on fast IO hardware). I then run pgbench with -c of about 2-4X the cpu/thread count, -j of about -c/16, and -t of at least 10000 (so each client connection runs 10000 transactions).
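
For an 8-core / 8-thread box like yours, that recipe works out to something like this (the database name is just whatever you initialized):

  pgbench -i -s 10000 bench
  pgbench -c 32 -j 2 -t 10000 bench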

On a modest but decent 2U-class 2-socket dedicated server with a decent RAID card and RAID10 across enough spindles, I can see numbers as high as 5000 transactions/second with 15krpm rust, and 7000-8000 with a couple of MLC SSDs striped. Trying to RAID10 a bunch of SATA 7200 disks gives numbers more like 1000. Using host-based RAID, without a write-back cache in the RAID card, gives numbers about half the above. The IOPS during these tests hit around 12000 to 15000 small writes/second.

Doing this level of IO on a midsized SAN can often cause the SAN CPU to run at 80%+, so if there's other activity on the SAN from other hosts, good luck.

In a heavily virtualized, shared-everything environment, I'm guessing your numbers will be all over the place and consistency will be difficult to achieve.


--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast





