Re: Hosting PG on AWS in 2013 - Mailing list pgsql-general

From Tomas Vondra
Subject Re: Hosting PG on AWS in 2013
Msg-id 516206D1.5080900@fuzzy.cz
In response to Re: Hosting PG on AWS in 2013  (David Boreham <david_list@boreham.org>)
List pgsql-general
On 7.4.2013 19:43, David Boreham wrote:
>
> Thanks very much for your detailed response. A few answers below inline:
>
> On 4/7/2013 9:38 AM, Tomas Vondra wrote:
>> As for the performance, AFAIK the EBS volumes always had, and probably
>> will have, a 32 MB/s limit. Thanks to caching, built into the EBS, the
>> performance may seem much better initially (say twice as good), but
>> after a sustained write workload (say 15-30 minutes), you're back at
>> the 32 MB/s per volume. The main problem with regular EBS is the
>> variability - the numbers above are for cases where everything
>> operates fine. When something goes wrong, you can get 1 MB/s for a
>> period of time. And when you create 10 volumes, each will have slightly
>> different performance. There are ways to handle this, though - the
>> "old way" is to build a RAID10 array on top of regular EBS volumes,
>> the "new way" is to use EBS with Provisioned IOPS (possibly with RAID0).
>>> Just to set the scene -- the application is a very high traffic web
>>> service where any down time is very costly, processing a few hundred
>>> transactions/s.
>> What does "high traffic" mean for the database? Does that mean a lot of
>> reads or writes, or something else?
>
> I should have been clearer: the transactions/s above is all writes.
> The read load is effectively cached. My assessment is that the load is
> high enough that careful attention must be paid to I/O performance, but
> not so high that sharding/partitioning is required (yet).
> Part of the site is already using RDS with PIOPS, and runs at a constant
> 500 w/s, as viewed in CloudWatch. I don't know for sure how the PG-based
> elements relate to this on load -- they back different functional areas
> of the site.

That's 500 * 16kB of writes, i.e. ~8MB/s. Not a big deal, IMHO,
especially if only part of these are writes from PostgreSQL.
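
For reference, the arithmetic behind that estimate can be sketched as below. This is only a rough illustration: the function name and the `EBS_LIMIT_MB_S` constant are made up for the example, and it assumes every write reported by CloudWatch is one full 16kB I/O, which is the worst case.

```python
# Rough estimate of write throughput from a writes/s figure.
# Assumption: each write counted by CloudWatch is one full 16kB I/O.

EBS_LIMIT_MB_S = 32.0  # sustained per-volume limit mentioned earlier in the thread


def write_throughput_mb_s(writes_per_s: float, io_size_kb: float = 16.0) -> float:
    """Convert a write rate into MB/s, assuming io_size_kb per write."""
    return writes_per_s * io_size_kb / 1024.0


if __name__ == "__main__":
    rate = 500  # writes/s, as seen in CloudWatch
    mb_s = write_throughput_mb_s(rate)
    share = mb_s / EBS_LIMIT_MB_S
    print(f"{rate} w/s is about {mb_s:.1f} MB/s "
          f"({share:.0%} of a {EBS_LIMIT_MB_S:.0f} MB/s volume)")
```

If the writes are smaller than 16kB, the real throughput is lower than this figure.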

>> But still, depends on the required I/O performance - you can't really
>> get above 125MB/s (m2.4xlarge) or 250MB/s (cc2.8xlarge).
>
> I don't foresee this application being limited by bulk data throughput
> (MB/s). It will be limited more by writes/s due to the small-transaction,
> OLTP-type workload.

There's not much difference between random and sequential I/O on EBS.
In theory you may get slightly better sequential performance thanks to
coalescing of smaller requests (the PIOPS work with 16kB blocks, while
PostgreSQL uses 8kB pages), but we don't see that in practice.

And the writes to the WAL are sequential anyway.
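
To put a number on the theoretical coalescing gain, here is a back-of-the-envelope sketch. The function is purely illustrative and rests on an idealized assumption: adjacent sequential 8kB pages always merge into full 16kB requests, and random pages never merge. As noted above, real EBS volumes don't show this benefit in practice.

```python
import math

# Best-case count of provisioned I/Os needed for a batch of page writes.
# Assumption (idealized): sequential pages merge perfectly into io_kb-sized
# requests; random pages are never merged.


def ios_needed(pages: int, page_kb: int = 8, io_kb: int = 16,
               sequential: bool = True) -> int:
    """Return the number of storage I/Os for `pages` page writes."""
    if not sequential:
        return pages  # one I/O per page, no merging
    pages_per_io = max(1, io_kb // page_kb)
    return math.ceil(pages / pages_per_io)


if __name__ == "__main__":
    n = 1000
    print("sequential (ideal):", ios_needed(n))
    print("random:", ios_needed(n, sequential=False))
```

So the upper bound on the gain is a factor of two, and only for perfectly sequential writes.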

>> And you can't really rely on this if you need quick failover to a
>> different availability zone or data center, because it's quite likely
>> the EBS is going to be hit by the issue (read the analysis of the AWS outage
>> from April 2011: http://aws.amazon.com/message/65648/).
>
> Right, assume that there can be cascading and correlated failures. I'm
> not sure I could ever convince myself that a cloud-hosted solution is
> really safe, because honestly I don't trust Amazon to design out their
> single failure points and thermal-runaway problems. However in the
> industry now there seems to be wide acceptance of the view that if
> you're shafted by Amazon, that's ok (you don't get fired). I'm looking
> at this project from that perspective. "Netflix-reliable", something
> like that ;)

Well, even if you could prevent all those failures, there's still the
possibility of human error (as in 2011) or Godzilla eating the data
center (and it's not going to eat just a single availability zone).

I believe Amazon is working hard on this and I trust their engineers,
but this simply is not a matter of trust. Mistakes and unexpected
failures do happen all the time. Anyone who believes that moving to
Amazon somehow magically makes them disappear is naive.

The only good thing is that when such a crash happens, half of the
internet goes down so no one really notices the smaller sites. If you
can't watch funny cat pictures on reddit, it's all futile anyway.

Tomas