Re: Hosting PG on AWS in 2013 - Mailing list pgsql-general

From David Boreham
Subject Re: Hosting PG on AWS in 2013
Msg-id 5161B05B.1090405@boreham.org
In response to Re: Hosting PG on AWS in 2013  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: Hosting PG on AWS in 2013
List pgsql-general
Thanks very much for your detailed response. A few answers inline below:

On 4/7/2013 9:38 AM, Tomas Vondra wrote:
> As for the performance, AFAIK the EBS volumes always had, and probably
> will have, a 32 MB/s limit. Thanks to caching, built into the EBS, the
> performance may seem much better initially (say twice as good), but
> after a sustained write workload (say 15-30 minutes), you're back at
> the 32 MB/s per volume. The main problem with regular EBS is the
> variability - the numbers above are for cases where everything
> operates fine. When something goes wrong, you can get 1 MB/s for a
> period of time. And when you create 10 volumes, each will have a bit
> different performance. There are ways to handle this, though - the
> "old way" is to build a RAID10 array on top of regular EBS volumes,
> the "new way" is to use EBS with Provisioned IOPS (possibly with RAID0).
>> Just to set the scene -- the application is a very high traffic web
>> service where any down time is very costly, processing a few hundred
>> transactions/s.
> What does "high traffic" mean for the database? Does that mean a lot of
> reads or writes, or something else?

I should have been clearer: the transactions/s figure above is all writes.
The read load is effectively cached. My assessment is that the load is
high enough that careful attention must be paid to I/O performance, but
not so high that sharding/partitioning is required (yet).
Part of the site is already using RDS with PIOPS, and runs at a constant
500 w/s, as viewed in CloudWatch. I don't know for sure how the load on
the PG-based elements relates to this -- they back different functional
areas of the site.
>
>> Scanning through the latest list of AWS instance types, I can see two
>> plausible approaches:
>>
>> 1. High I/O Instances:  (regular AWS instance but with SSD local
>> storage) + some form of replication. Replication would be needed because
>> (as I understand it) any AWS instance can be "vanished" at any time due
>> to Amazon screwing something up, maintenance on the host, etc (I believe
>> the term of art is "ephemeral").
> Yes. You'll get great I/O performance with these SSD-based instances
> (easily ~1GB/s in), so you'll probably hit CPU bottlenecks instead.
>
> You're right that to handle the instance / ephemeral failures, you'll
> have to use some sort of replication - might be your custom
> application-specific application, or some sort of built-in (async/sync
> streaming, log shipping, Slony, Londiste, whatever suits your needs ...).
>
> If you really value the availability, you should deploy the replica in
> different availability zone or data center.
>
>> 2. EBS-Optimized Instances: these allow the use of EBS storage (SAN-type
>> service) from regular AWS instances. Assuming that EBS is maintained to
>> a high level of availability and performance (it doesn't, afaik, feature
>> the vanishing property of AWS machines), this should in theory work out
>> much the same as a traditional cluster of physical machines using a
>> shared SAN, with the appropriate voodoo to fail over between nodes.
> No, that's not what "EBS Optimized" instances are for. All AWS instance
> types can use EBS, using a SHARED network link. That means that e.g.
> HTTP or SSH traffic influences EBS performance, because they use the
> same ethernet link. The "EBS Optimized" says that the instance has a
> network link dedicated for EBS traffic, with guaranteed throughput.

Ah, thanks for clarifying that. I knew about PIOPS, but hadn't realized
that EBS Optimized meant a dedicated SAN cable. Makes sense...

>
> That is not going to fix the variability of EBS performance, though ...
>
> What you're looking for is called "Provisioned IOPS" (PIOPS) which
> guarantees the EBS volume performance, in terms of IOPS with 16kB block.
> For example you may create an EBS volume with 2000 IOPS, which is
> ~32MB/s (with 16kB blocks). It's not much, but it's much easier to build
> RAID0 array on top of those volumes. We're using this for some of our
> databases and are very happy with it.
>
> Obviously, you want to use PIOPS with EBS Optimized instances. I don't
> see much point in using only one of them.
>
> But still, depends on the required I/O performance - you can't really
> get above 125MB/s (m2.4xlarge) or 250MB/s (cc2.8xlarge).

I don't foresee this application being limited by bulk data throughput
(MB/s). It will be limited more by writes/s, due to the small-transaction,
OLTP-type workload.
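
For what it's worth, the arithmetic behind that can be sketched in a few
lines of Python. This is only a rough illustration, not a sizing tool: the
function names are made up for this example, it assumes 1 MB = 1000 kB,
treats every write as one full 16kB I/O (the block size PIOPS is quoted
against above), and assumes IOPS add up linearly across RAID0 stripes,
which real volumes will not do exactly.

```python
import math

BLOCK_KB = 16  # block size the PIOPS figures in this thread are quoted against


def throughput_mb_s(iops, block_kb=BLOCK_KB):
    """Bulk throughput implied by an IOPS figure (assumes 1 MB = 1000 kB)."""
    return iops * block_kb / 1000


def iops_for_throughput(mb_s, block_kb=BLOCK_KB):
    """IOPS needed to sustain a given MB/s at the given block size."""
    return mb_s * 1000 / block_kb


def volumes_for_iops(target_iops, per_volume_iops):
    """Volumes to stripe (RAID0), assuming ideal linear scaling of IOPS."""
    return math.ceil(target_iops / per_volume_iops)


# A 2000-IOPS volume, as in the example quoted above.
print(throughput_mb_s(2000))            # 32.0

# 500 writes/s (the CloudWatch figure), each counted as one 16kB I/O.
print(throughput_mb_s(500))             # 8.0

# IOPS needed to saturate the 125MB/s figure quoted for m2.4xlarge,
# and the number of 2000-IOPS volumes that would take.
print(iops_for_throughput(125))         # 7812.5
print(volumes_for_iops(7812.5, 2000))   # 4
```

Under those assumptions, 500 writes/s comes to about 8 MB/s, well below
the per-volume and per-instance MB/s figures quoted above, which is why I
expect writes/s rather than MB/s to be the constraint.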

>
> And you can't really rely on this if you need quick failover to a
> different availability zone or data center, because it's quite likely
> the EBS is going to be hit by the issue (read the analysis of AWS outage
> from April 2011: http://aws.amazon.com/message/65648/).

Right, assume that there can be cascading and correlated failures. I'm
not sure I could ever convince myself that a cloud-hosted solution is
really safe, because honestly I don't trust Amazon to design out their
single failure points and thermal-runaway problems. However in the
industry now there seems to be wide acceptance of the view that if
you're shafted by Amazon, that's ok (you don't get fired). I'm looking
at this project from that perspective. "Netflix-reliable", something
like that ;)

>
>> Any thoughts, wisdom, and especially from-the-trenches experience, would
>> be appreciated.
> My recommendation is to plan for zone/datacenter failures first. That
> means build a failover replica in a different zone/datacenter.
>
> You might be able to handle isolated EBS failures e.g. using snapshots
> and/or backups and similar recovery procedures, but it may require
> unpredictable downtimes (e.g. while we don't see failed EBS volumes very
> frequently, we do see EBS volumes stuck in "attaching" much more
> frequently).
>
> To handle I/O performance, you may either use EBS with PIOPS (which will
> also give you more reliability) or SSD instances (but you'll have to
> either setup a local replica to handle the instance failures or do the
> failover to the other cluster).
>

Thanks again for your thoughts -- most helpful.



