Thread: amazon ec2
What are the best practices for setting up PG 9.x on Amazon EC2 to get the best performance? Thanks in advance, Joel -------------------------------------------------------------------------- - for hire: mac osx device driver ninja, kernel extensions and usb drivers ---------------------+------------+--------------------------------------- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont ---------------------+------------+---------------------------------------
On May 3, 2011 11:48:35 am Joel Reymont wrote: > What are the best practices for setting up PG 9.x on Amazon EC2 to get the > best performance? > I am also interested in tips for this. EBS seems to suck pretty bad.
On May 3, 2011, at 8:41 PM, Alan Hodgson wrote: > I am also interested in tips for this. EBS seems to suck pretty bad. Alan, can you elaborate? Are you using PG on top of EBS? -------------------------------------------------------------------------- - for hire: mac osx device driver ninja, kernel extensions and usb drivers ---------------------+------------+--------------------------------------- http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont ---------------------+------------+---------------------------------------
On May 3, 2011 12:43:13 pm you wrote: > On May 3, 2011, at 8:41 PM, Alan Hodgson wrote: > > I am also interested in tips for this. EBS seems to suck pretty bad. > > Alan, can you elaborate? Are you using PG on top of EBS? > Trying to, yes. Let's see ... EBS volumes seem to vary in speed. Some are relatively fast. Some are really slow. Some fast ones become slow randomly. Some are fast attached to one instance, but really slow attached to another. Fast being a relative term, though. The fast ones seem to be able to do maybe 400 random IOPS. And of course you can only get about 80MB/sec sequential access to them on a good day. Which is why I'm interested in how other people are doing it. So far EC2 doesn't seem well suited to running databases at all.
On Tue, May 3, 2011 at 2:09 PM, Alan Hodgson <ahodgson@simkin.ca> wrote:
On May 3, 2011 12:43:13 pm you wrote:
> On May 3, 2011, at 8:41 PM, Alan Hodgson wrote:
> > I am also interested in tips for this. EBS seems to suck pretty bad.
>
> Alan, can you elaborate? Are you using PG on top of EBS?
>
Trying to, yes.
Let's see ...
EBS volumes seem to vary in speed. Some are relatively fast. Some are really
slow. Some fast ones become slow randomly. Some are fast attached to one
instance, but really slow attached to another.
I ran pgbench tests late last year comparing EC2, GoGrid, a 5-year-old lab server and a new server. Whether I used a stock postgresql.conf or a tweaked one, the then-current 8.4 or 9.0, or varied the EC2 instance size, EC2 was always at the bottom, ranging from 409.834 to 693.100 tps. GoGrid's pgbench TPS numbers in similar tests were, on average, 3X those of EC2 (1,399.550 to 1,631.887 tps). The tests I conducted were small, with 10 connections and 5,000 total transactions. The single variable that helped pgbench tests in EC2 was selecting an instance size where the number of cores was equal to or greater than the number of connections I used in the tests; however, this only improved things slightly (715.931 tps).
For comparison purposes, I ran the same tests on a 24-way X5650 with 12 GB and SAS RAID 10. This server typically ranged from 2,188.348 to 2,216.377 tps.
I attributed GoGrid's superior performance over EC2 to EC2 simply being over-allocated, but that's just speculation on my part. To test my theory, I had wanted to put the database on a ramdisk, or a similar device, in EC2 and GoGrid, but never got around to it.
Fast being a relative term, though. The fast ones seem to be able to do maybe
400 random IOPS. And of course you can only get about 80MB/sec sequential
access to them on a good day.
Which is why I'm interested in how other people are doing it. So far EC2
doesn't seem well suited to running databases at all.
I ran these tests partly to convince management to give me some time to validate our software (PG-backed) on some of the cloud providers, but with those abysmal numbers I didn't even bother at the time. I may revisit at some point because I know Amazon, at least, has been making architecture adjustments and updates.
Greg
iowait is a problem on any platform that relies on spinning media, compared to RAM.
no matter how fast a disk is, and no matter how intelligent the controller is, you are still dealing with an access speed differential of 10^6 (speed of disk access compared to memory access).
i have had good results by avoiding it.
if you can do this, ec2 is not too shabby, but beware - it doesn't come free.
this is achievable under the following circumstances (and maybe there are other ways to do this).
i use a technique of pro-actively querying enough of my anticipated result set with a daemon procedure.
as long as the frequency of your query daemon execution is greater than that of the competitor processes (eg ETL and other update activity), AND a substantial part of the result set will fit in available RAM, then the result set will be served from file system cache at the time you want it.
i have found that it doesn't take much to get this to happen, once you have identified your critical result set.
like - you can get away with running it once/hour, and i'm still reducing the frequency and getting good results.
this approach basically assumes a 90/10 rule - at any point in time, you only want to access 10% of your data. if you can work out what the 10% is, and it will fit into RAM, then you can set it up to cache it.
it also imposes no additional cost in ec2, because Amazon doesn't bill you for CPU activity, although the large-RAM machines do cost more. Depends on how big your critical result set is, and how much speed you need.
don't know if this helps - the success/failure of it depends on your typical query activity, the size of your critical result set, and whether you are able to get enough RAM to make this work.
as i said it doesn't come for free, but you can make it work.
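To make that concrete, here is a minimal Python sketch of the pro-active querying idea (this is not Mark's actual daemon; psycopg2, the connection string, the queries, and the hourly interval are all illustrative assumptions):

    # Minimal sketch of a cache-warming daemon: periodically re-run the queries
    # that cover the "critical 10%" so their blocks stay in the filesystem cache.
    # Table names, queries, DSN and interval below are hypothetical.
    import time
    import psycopg2

    WARM_QUERIES = [
        "SELECT count(*) FROM recent_orders WHERE created_at > now() - interval '1 day'",
        "SELECT * FROM hot_customers ORDER BY last_seen DESC LIMIT 10000",
    ]

    def warm_cache(dsn, interval_seconds=3600):
        """Re-read the anticipated working set so it stays cached in RAM."""
        while True:
            conn = psycopg2.connect(dsn)
            try:
                with conn.cursor() as cur:
                    for q in WARM_QUERIES:
                        cur.execute(q)
                        cur.fetchall()  # force the rows (and their blocks) to be read
            finally:
                conn.close()
            time.sleep(interval_seconds)

    if __name__ == "__main__":
        warm_cache("dbname=mydb user=postgres host=localhost")

The only thing that matters is that these queries touch the same blocks your real workload will need, and that they run more often than the competing ETL/update activity.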
as a further point, try also checking out greenplum - it is an excellent postgres derivative with a very powerful free version. the reason why i bring it up is because it offers block-level compression (with caveats - it also doesn't come for free, so do due diligence and rtfm carefully). The compression enabled me to improve the cache hit rate, and so you further reduce the iowait problem.
greenplum is also a better parallel machine than postgres, so combining the cache technique above with greenplum compression and parallel query, i have been able to get 20:1 reduction in response times for some of our queries.
obviously introducing new database technology is a big deal, but we needed the speed, and it kinda worked.
mr
Phoronix did some benchmarks of the EC2 machines and they show pretty poor numbers, especially on the I/O side of things http://www.phoronix.com/scan.php?page=article&item=amazon_ec2_round1&num=1 http://www.phoronix.com/scan.php?page=article&item=amazon_ec2_micro&num=1 David Lang On Tue, 3 May 2011, Alan Hodgson wrote: > On May 3, 2011 12:43:13 pm you wrote: >> On May 3, 2011, at 8:41 PM, Alan Hodgson wrote: >>> I am also interested in tips for this. EBS seems to suck pretty bad. >> >> Alan, can you elaborate? Are you using PG on top of EBS? >> > Trying to, yes. > Let's see ... > EBS volumes seem to vary in speed. Some are relatively fast. Some are really slow. Some fast ones become slow randomly. Some are fast attached to one instance, but really slow attached to another. > Fast being a relative term, though. The fast ones seem to be able to do maybe 400 random IOPS. And of course you can only get about 80MB/sec sequential access to them on a good day. > Which is why I'm interested in how other people are doing it. So far EC2 doesn't seem well suited to running databases at all.
On 5/3/11 11:48 AM, Joel Reymont wrote: > What are the best practices for setting up PG 9.x on Amazon EC2 to get the best performance? Yes. Don't use EC2. There is no "best" performance on EC2. There's not even "good performance". Basically, EC2 is the platform for when performance doesn't matter. Use a dedicated server, or use a better cloud host. http://it.toolbox.com/blogs/database-soup/how-to-make-your-database-perform-well-on-amazon-ec2-45725 -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Mark Rostron wrote: > the success/failure of it depends on your typical query activity, the > size of your critical result set, and whether you are able to get > enough RAM to make this work. Basically, it all comes down to "does the working set of data I access frequently fit in RAM?" If it does, it's possible to get reasonable performance out of an EC2 instance. The EBS disks are so slow, both on average and particularly in cases where you have contention with other users slowing you down, that any situation where you have to use them is never going to work well. If most of the data fits in RAM, and the CPU resources available to your instance are sufficient to service your queries, you might see acceptable performance. > greenplum is also a better parallel machine than postgres, so > combining the cache technique above with greenplum compression and > parallel query, i have been able to get 20:1 reduction in response > times for some of our queries. I've also seen over a 20:1 speedup over PostgreSQL by using Greenplum's free Community Edition server, in situations where its column store + compression features work well on the data set. That's easiest with an append-only workload, and the data set needs to fit within the constraints where indexes on compressed data are useful. But if you fit the use profile it's good at, you end up with considerable ability to trade-off using more CPU resources to speed up queries. It effectively increases the amount of data that can be cached in RAM by a large multiple, and in the EC2 context (where any access to disk is very slow) it can be quite valuable. My colleague Gabrielle wrote something about setting this up on an earlier version of Greenplum's software at http://blog.2ndquadrant.com/en/2010/03/installing-greenplum-sne-ec2.html that gives an idea how that was setup. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Greg Spiegelberg wrote: > I ran pgbench tests late last year comparing EC2, GoGrid, a 5 year-old > lab server and a new server. Whether I used a stock postgresql.conf > or tweaked, the current 8.4 or 9.0, or varied the EC2 instance size > EC2 was always at the bottom ranging from 409.834 to 693.100 tps. > GoGrid's pgbench TPS numbers in similar tests were, on average, 3X > that of EC2 (1,399.550 to 1,631.887 tps). The tests I conducted were > small with 10 connections and total 5,000 transactions. The single > variable that helped pgbench tests in EC2 was to select an instance > size where the number of cores was equal to or greater than the number > of connections I used in the tests however this only improved things > slightly (715.931 tps). The standard pgbench test is extremely sensitive to how fast transactions can be committed to disk. That doesn't reflect what performance looks like on most real-world workloads, which tend toward more reads. The fact that GoGrid is much faster at doing commits than EC2 is interesting, but that's only one of many parameters that impact performance on more normal workloads. The one parameter that can change how the test runs is turning off synchronous_commit, which pulls the commit time out of the results to some extent. And if you'd switched pgbench to a more read-oriented test, you'd discover it becomes extremely sensitive to the size of the database, as set by pgbench's scale parameter during setup. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
On May 3, 2011, at 5:39 PM, Greg Smith wrote: > I've also seen over a 20:1 speedup over PostgreSQL by using Greenplum's free Community Edition server, in situations where its column store + compression features work well on the data set. That's easiest with an append-only workload, and the data set needs to fit within the constraints where indexes on compressed data are useful. But if you fit the use profile it's good at, you end up with considerable ability to trade-off using more CPU resources to speed up queries. It effectively increases the amount of data that can be cached in RAM by a large multiple, and in the EC2 context (where any access to disk is very slow) it can be quite valuable. FWIW, EnterpriseDB's "InfiniCache" provides the same caching benefit. The way that works is when PG goes to evict a page from shared buffers that page gets compressed and stuffed into a memcache cluster. When PG determines that a given page isn't in shared buffers it will then check that memcache cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a physical server. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
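A toy model of that read/evict path may make the idea easier to picture (purely illustrative, not InfiniCache code; a plain dict stands in for the memcached cluster, and the buffer pool and disk reader are hypothetical):

    # Illustration of a compressed second-level page cache between a small
    # buffer pool and slow disk. Not EnterpriseDB code -- just the concept.
    import zlib

    class SecondLevelCache:
        """Stand-in for a memcached cluster: stores zlib-compressed pages by key."""
        def __init__(self):
            self.store = {}

        def put(self, key, page_bytes):
            self.store[key] = zlib.compress(page_bytes)

        def get(self, key):
            blob = self.store.get(key)
            return zlib.decompress(blob) if blob is not None else None

    def read_page(key, buffer_pool, l2, read_from_disk):
        """Check the buffer pool, then the compressed cache, then disk."""
        if key in buffer_pool:
            return buffer_pool[key]
        page = l2.get(key)
        if page is None:
            page = read_from_disk(key)  # slow path: EBS, in this thread
        buffer_pool[key] = page
        return page

    def evict_page(key, buffer_pool, l2):
        """On eviction, push the compressed page into the second-level cache."""
        l2.put(key, buffer_pool.pop(key))

Because each page is compressed, the second level can hold a multiple of what the same RAM could hold uncompressed, which is where the large effective cache size comes from.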
On 05/03/2011 01:48 PM, Joel Reymont wrote: > What are the best practices for setting up PG 9.x on Amazon EC2 to > get the best performance? Use EC2 and other Amazon hosting for cloud-based client access only. Their shared disk services are universally despised by basically everyone who has tried to use them for database hosting. The recommended pattern is to have the scalable cloud clients access (and memcache) a remote DB at a colo or managed services host. EBS is a nice idea, and probably fine for things like image or video hosting, but database access, especially for OLTP databases, will just result in wailing and gnashing of teeth. Just ask anyone who got bit by the recent EBS failure that spanned *several* availability zones. For all those clients who thought they were safe by deploying across multiple AZs, it was a rather rude awakening. http://aws.amazon.com/message/65648/ The consensus seems to be that Amazon's cloud is fine... so long as you stay far, far away from EBS. Apparently that needs a little more work. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 312-676-8870 sthomas@peak6.com ______________________________________________ See http://www.peak6.com/email_disclaimer.php for terms and conditions related to this email
> FWIW, EnterpriseDB's "InfiniCache" provides the same caching benefit. The way that works is when PG goes to evict a page from shared buffers that page gets compressed and stuffed into a memcache cluster. When PG determines that a given page isn't in shared buffers it will then check that memcache cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a physical server. So memcached basically replaces the filesystem? That sounds cool, but I'm wondering if it's actually a performance speedup. Seems like it would only be a benefit for single-row lookups; any large reads would be a mess. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, 4 May 2011, Josh Berkus wrote: >> FWIW, EnterpriseDB's "InfiniCache" provides the same caching benefit. The way that works is when PG goes to evict a page from shared buffers that page gets compressed and stuffed into a memcache cluster. When PG determines that a given page isn't in shared buffers it will then check that memcache cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a physical server. > So memcached basically replaces the filesystem? > That sounds cool, but I'm wondering if it's actually a performance speedup. Seems like it would only be a benefit for single-row lookups; any large reads would be a mess. I think it would depend a lot on how well the page compresses. If the network I/O plus decompression time is faster than the seek and read from disk, it should be a win. I don't see why the benefit would be limited to single-row lookups; anything within that page should be the same. For multipage actions, the disk would have less of a disadvantage, as readahead may be able to hide some of the work while the memcache approach would need to do a separate transaction for each page. This does bring up an interesting variation for hierarchical storage: using compressed pages in memcache rather than dedicated resources on a local system. Thanks for the thought; I'll keep it in mind. I can think of lots of cases where the database stores a relatively small set of values that would compress well. David Lang
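A rough back-of-the-envelope comparison, using the ~400 random IOPS figure Alan quoted earlier for a fast EBS volume (the memcache round-trip and decompression times below are assumptions, not measurements):

    # Hypothetical comparison of a random page read from EBS vs. a compressed
    # page fetched from memcached. Only the IOPS figure comes from the thread.
    ebs_random_read_ms = 1000.0 / 400   # ~2.5 ms per page at ~400 random IOPS
    memcache_rtt_ms    = 0.3            # assumed LAN round trip to memcached
    decompress_ms      = 0.05           # assumed zlib decompression of one 8KB page

    cached_path_ms = memcache_rtt_ms + decompress_ms
    print(f"EBS random read:       ~{ebs_random_read_ms:.2f} ms/page")
    print(f"memcache + decompress: ~{cached_path_ms:.2f} ms/page")
    print(f"rough advantage when the page is cached: ~{ebs_random_read_ms / cached_path_ms:.0f}x")

On those assumptions the cached path wins by roughly an order of magnitude per page, which is the intuition behind David's point above.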
----- Original Message ----- > From: Josh Berkus <josh@agliodbs.com> > So memcached basically replaces the filesystem? > That sounds cool, but I'm wondering if it's actually a performance speedup. Seems like it would only be a benefit for single-row lookups; any large reads would be a mess. I've never tested with pgsql, but with mysql it makes a *huge* difference when you're pulling data repeatedly. Multi-row lookups can be cached too:

    $rows = $cache->get(md5($query . '--' . serialize($args)));
    if (!$rows) {
        // query and cache for a few hours...
    }

This is true even with mysql's caching features turned on. You spare the DB from doing identical queries that get repeated over and over. Memcache lets you pull those straight from memory, allowing the DB server to handle new queries exclusively.
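For comparison, the same pattern sketched in Python (purely illustrative; the cache object is assumed to expose memcached-style get/set with a TTL, and run_query is a hypothetical helper that actually executes the SQL):

    # Cache whole query results keyed by a hash of the SQL text plus its
    # parameters, so identical repeated queries never reach the database.
    import hashlib
    import json

    def cached_query(cache, run_query, sql, args, ttl_seconds=3600):
        key = hashlib.md5((sql + "--" + json.dumps(args, sort_keys=True)).encode()).hexdigest()
        rows = cache.get(key)
        if rows is None:
            rows = run_query(sql, args)       # hit the database only on a miss
            cache.set(key, rows, ttl_seconds)
        return rows

Anything that changes the underlying tables obviously has to tolerate results that are up to ttl_seconds stale.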
On Thu, May 5, 2011 at 1:02 AM, Josh Berkus <josh@agliodbs.com> wrote: > >> FWIW, EnterpriseDB's "InfiniCache" provides the same caching benefit. The way that works is when PG goes to evict a page from shared buffers that page gets compressed and stuffed into a memcache cluster. When PG determines that a given page isn't in shared buffers it will then check that memcache cluster before reading the page from disk. This allows you to cache amounts of data that far exceed the amount of memory you could put in a physical server. > > So memcached basically replaces the filesystem? No, it sits in between shared buffers and the filesystem, effectively providing an additional layer of extremely large, compressed cache. Even on a single server there can be benefits over larger shared buffers due to the compression. > That sounds cool, but I'm wondering if it's actually a performance > speedup. Seems like it would only be a benefit for single-row lookups; > any large reads would be a mess. Depends on the database and the workload - if you can fit your entire 100GB database in cache, and your workload is read intensive then the speedups are potentially huge (I've seen benchmarks showing 20x+). Write intensive workloads, less so, similarly if the working set is far larger than your cache size. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company