Thread: Migrating to Postgresql and new hardware
Hi, We are in the process of moving a web based application from a MySql to Postgresql database. Our main reason for moving to Postgresql is problems with MySql (MyISAM) table locking. We will buy a new set of servers to run the Postgresql databases. The current setup is five Dell PowerEdge 2950 with 2 * XEON E5410, 4GB RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 inRAID 5 + 1 spare). One server is used for shared data. Four servers are used for sharded data. A user in the system only has data in one of the shards. There is another server to which all data is replicated but I'll leave that one out of this discussion. These are dedicated database servers. There are more or less no stored procedures. The shared database size is about 20GBand each shard database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size will grow 10%-15% thisyear. Server load might increase with 15%-30% this year. This setup is disk I/O bound. The overwhelming majority of sqlstatements are fast (typically single row selects, updates, inserts and deletes on primary key) but there are some slowlong running (10min) queries. As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb RAM, H700 512MB NV Cache. Dell has offered two alternative SSDs: Samsung model SS805 (100GB Solid State Disk SATA 2.5"). (http://www.plianttechnology.com/lightning_lb.php) Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5"). (http://www.samsung.com/global/business/semiconductor/products/SSD/Products_Enterprise_SSD.html) Both are SLC drives. The price of the Pliant is about 2,3 times the price of the Samsung (does it have twice the performance?). One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung drives (4 in RAID 10 + 1 spare). Another alternative would be 3 servers (1 shared and 2 shards) with 5 Pliant drives (4 in RAID 10 + 1 spare). This wouldbe slightly more expensive than the first alternative but would be easier to upgrade with two new shard servers whenit's needed. Anyone have experience using the Samsung or the Pliant SSD? Any information about degraded performance over time? Any comments on the setups? How would an alternative with 15K disks (6 RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare? How would these alternatives compare in I/O performance compared to the old setup? Anyone care to guess how the two alternatives would compare in performance running Postgresql? How would the hardware usage of Postgresql compare to MySqls? Regards /Lars
On 1/18/2011 4:56 AM, Lars wrote: > Hi, > > We are in the process of moving a web based application from a MySql > to Postgresql database. Our main reason for moving to Postgresql is > problems with MySql (MyISAM) table locking. We will buy a new set of > servers to run the Postgresql databases. > > The current setup is five Dell PowerEdge 2950 with 2 * XEON E5410, > 4GB RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 > spare). > > One server is used for shared data. Four servers are used for sharded > data. A user in the system only has data in one of the shards. There > is another server to which all data is replicated but I'll leave that > one out of this discussion. These are dedicated database servers. > There are more or less no stored procedures. The shared database size > is about 20GB and each shard database is about 40GB (total of 20 + 40 > * 4 = 180GB). I would expect the size will grow 10%-15% this year. > Server load might increase with 15%-30% this year. This setup is disk > I/O bound. The overwhelming majority of sql statements are fast > (typically single row selects, updates, inserts and deletes on > primary key) but there are some slow long running (10min) queries. > No idea what mysql thinks a shard is, but in PG we have read-only hot standby's. The standby database is exactly the same as the master (save a bit of data that has not been synced yet.) I assume you know this... but I'd really recommend trying out PG's hot-standby and make sure it works the way you need (because I bet its different than mysql's). Assuming the "shared" and the "sharded" databases are totally different (lets call them database a and c), with the PG setup you'd have database a on one computer, then one master with database b on it (where all writes go), then several hot-standby's mirroring database b (that support read-only queries). As for the hardware, you'd better test it. Got any old servers you could put a real-world workload on? Or just buy one new server for testing? Its pretty hard to guess what your usage pattern is (70% read, small columns, no big blobs (like photos), etc)... and even then we'd still have to guess. I can tell you, however, having your readers and writers not block each other is really nice. Not only will I not compare apples to oranges, but I really wont compare apples in Canada to oranges in Japan. :-) -Andy
oops, call them database 'a' and database 'b'.
Are you going to RAID the SSD drives at all? You would likely be better off with a couple of things. - increasing ram to 256GB on each server to cache most of the databases. (easy, and cheaper than SSD) - move to fusionIO - move to SLC based SSD, warning not many raid controllers will get the performance out of the SSD's at this time. Of the three I would suggest #1, and #2, the cost of a SLC SSD raid will cost more than the fusionIO drive and still notmatch the fusionIO drive performance. Of course this is based on my experience, and I have my fireproof suit since I mentioned the word fusionIO :) - John -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Lars Sent: Tuesday, January 18, 2011 4:57 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] Migrating to Postgresql and new hardware Hi, We are in the process of moving a web based application from a MySql to Postgresql database. Our main reason for moving to Postgresql is problems with MySql (MyISAM) table locking. We will buy a new set of servers to run the Postgresql databases. The current setup is five Dell PowerEdge 2950 with 2 * XEON E5410, 4GB RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 inRAID 5 + 1 spare). One server is used for shared data. Four servers are used for sharded data. A user in the system only has data in one of the shards. There is another server to which all data is replicated but I'll leave that one out of this discussion. These are dedicated database servers. There are more or less no stored procedures. The shared database size is about 20GBand each shard database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size will grow 10%-15% thisyear. Server load might increase with 15%-30% this year. This setup is disk I/O bound. The overwhelming majority of sqlstatements are fast (typically single row selects, updates, inserts and deletes on primary key) but there are some slowlong running (10min) queries. As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb RAM, H700 512MB NV Cache. Dell has offered two alternative SSDs: Samsung model SS805 (100GB Solid State Disk SATA 2.5"). (http://www.plianttechnology.com/lightning_lb.php) Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5"). (http://www.samsung.com/global/business/semiconductor/products/SSD/Products_Enterprise_SSD.html) Both are SLC drives. The price of the Pliant is about 2,3 times the price of the Samsung (does it have twice the performance?). One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung drives (4 in RAID 10 + 1 spare). Another alternative would be 3 servers (1 shared and 2 shards) with 5 Pliant drives (4 in RAID 10 + 1 spare). This wouldbe slightly more expensive than the first alternative but would be easier to upgrade with two new shard servers whenit's needed. Anyone have experience using the Samsung or the Pliant SSD? Any information about degraded performance over time? Any comments on the setups? How would an alternative with 15K disks (6 RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare? How would these alternatives compare in I/O performance compared to the old setup? Anyone care to guess how the two alternatives would compare in performance running Postgresql? How would the hardware usage of Postgresql compare to MySqls? Regards /Lars -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance This communication is for informational purposes only. It is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction. All market prices, data and other information are not warranted as to completeness or accuracy and are subject to change without notice. Any comments or statements made herein do not necessarily reflect those of JPMorgan Chase & Co., its subsidiaries and affiliates. This transmission may contain information that is privileged, confidential, legally privileged, and/or exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution, or use of the information contained herein (including any reliance thereon) is STRICTLY PROHIBITED. Although this transmission and any attachments are believed to be free of any virus or other defect that might affect any computer system into which it is received and opened, it is the responsibility of the recipient to ensure that it is virus free and no responsibility is accepted by JPMorgan Chase & Co., its subsidiaries and affiliates, as applicable, for any loss or damage arising in any way from its use. If you received this transmission in error, please immediately contact the sender and destroy the material in its entirety, whether in electronic or hard copy format. Thank you. Please refer to http://www.jpmorgan.com/pages/disclosures for disclosures relating to European legal entities.
On Tue, 18 Jan 2011 16:06:17 -0600, Strange, John W <john.w.strange@jpmchase.com> wrote: > Of course this is based on my experience, and I have my fireproof suit > since I mentioned the word fusionIO :) I'll throw a fire blanket up as well. We have a customer who has been running Fusion IO with Postgres for about 2 years. They get amazing performance, but also aren't running fsync. They haven't had corruption with OS crashes (they're very abusive to their CentOS install), but did with a power outage (a UPS of ours went up in smoke; they weren't paying for N+1 power). Now their data is mostly static; they run analytics once a day and if they have a problem they can reload yesterday's data and run the analytics again to get back up to speed. If this is the type of stuff you're doing and you can easily get your data back to a sane state by all means give FusionIO a whirl! This customer did discuss this with me in length last time they stopped in and also pointed out that FusionIO was announced as being a major part of some trading company or bank firm's database performance junk. I don't know the details, but I think he said they were out of Chicago. If anyone knows what I'm talking about please share the link. Either way, it seems that people are actually doing money transactions on FusionIO, so you can either take that as comforting reassurance or you can start getting really nervous about the stock market :-) Regards, Mark PS, don't turn off fsync unless you know what you're doing.
Comments in line, take em for what you paid for em. > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of Lars > Sent: Tuesday, January 18, 2011 3:57 AM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] Migrating to Postgresql and new hardware > > Hi, > > We are in the process of moving a web based application from a MySql to > Postgresql database. > Our main reason for moving to Postgresql is problems with MySql > (MyISAM) table locking. I would never try and talk someone out of switching but.... MyISAM? What version of MySQL and did you pick MyISAM for a good reason or just happened to end up there? > We will buy a new set of servers to run the Postgresql databases. > > The current setup is five Dell PowerEdge 2950 with 2 * XEON E5410, 4GB > RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare). > > One server is used for shared data. > Four servers are used for sharded data. A user in the system only has > data in one of the shards. > There is another server to which all data is replicated but I'll leave > that one out of this discussion. > These are dedicated database servers. There are more or less no stored > procedures. The shared database size is about 20GB and each shard > database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect > the size will grow 10%-15% this year. Server load might increase with > 15%-30% this year. This setup is disk I/O bound. The overwhelming > majority of sql statements are fast (typically single row selects, > updates, inserts and deletes on primary key) but there are some slow > long running (10min) queries. > > As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb > RAM, H700 512MB NV Cache. One would think you should notice a nice speed improvement, ceteris paribus, since the X5650 will have ->significantly<- more memory bandwidth than the 5410s you are used to, and you are going to have a heck of a lot more ram for things to cache in. I think the H700 is a step up in raid cards as well but with only 4 disks your probably not maxing out there. > Dell has offered two alternative SSDs: > Samsung model SS805 (100GB Solid State Disk SATA 2.5"). > (http://www.plianttechnology.com/lightning_lb.php) > Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5"). > (http://www.samsung.com/global/business/semiconductor/products/SSD/Prod > ucts_Enterprise_SSD.html) The Samsung ones seems to indicate that they have protection in the event of a power failure, and the pliant does not mention it. Granted I haven't done or seen any pull the plug under max load tests on either family, so I got nothing beyond that it is the first thing I have looked at with every SSD that crosses my path. > > Both are SLC drives. The price of the Pliant is about 2,3 times the > price of the Samsung (does it have twice the performance?). > > One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung > drives (4 in RAID 10 + 1 spare). > Another alternative would be 3 servers (1 shared and 2 shards) with 5 > Pliant drives (4 in RAID 10 + 1 spare). This would be slightly more > expensive than the first alternative but would be easier to upgrade > with two new shard servers when it's needed. As others have mentioned, how are you going to be doing your "shards"? > > Anyone have experience using the Samsung or the Pliant SSD? Any > information about degraded performance over time? > Any comments on the setups? How would an alternative with 15K disks (6 > RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare? You still may find that breaking xlog out to its own logical drive (2 drives in raid 1) gives a speed improvement to the overall. YMMV - so tinker and find out before you go deploying. > How would these alternatives compare in I/O performance compared to the > old setup? > Anyone care to guess how the two alternatives would compare in > performance running Postgresql? > How would the hardware usage of Postgresql compare to MySqls? I won't hazard a guess on the performance difference between PG w/ Fsync ON and MySQL running with MyISAM. If you can get your OS and PG tuned you should be able to have a database that can have pretty decent throughput for an OLTP workload. Since that seems to be the majority of your intended workload. -Mark > > > > Regards > /Lars > > -- > Sent via pgsql-performance mailing list (pgsql- > performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
On 18/01/11 18:56, Lars wrote: > Hi, > > We are in the process of moving a web based application from a MySql to Postgresql database. > Our main reason for moving to Postgresql is problems with MySql (MyISAM) table locking. > We will buy a new set of servers to run the Postgresql databases. Most people seem to simply move over to InnoDB when facing these issues, saving themselves LOTS of pain over MyISAM while minimizing transition costs. I assume you've rejected that, but I'm interested in why. > The current setup is five Dell PowerEdge 2950 with 2 * XEON E5410, 4GB RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3in RAID 5 + 1 spare). > > One server is used for shared data. > Four servers are used for sharded data. A user in the system only has data in one of the shards. > There is another server to which all data is replicated but I'll leave that one out of this discussion. Don't, if you want to have a similar thing going in your Pg deployment later. Replication in Pg remains ... interesting. An n-to-m (or n-to-1) replication setup can't be achieved with the built-in replication in 9.0; you need to use things like Slony-I, Bucardo, etc each of which have their own limitations and quirks. > These are dedicated database servers. There are more or less no stored procedures. The shared database size is about 20GBand each shard database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size will grow 10%-15% thisyear. Server load might increase with 15%-30% this year. This setup is disk I/O bound. The overwhelming majority of sqlstatements are fast (typically single row selects, updates, inserts and deletes on primary key) but there are some slowlong running (10min) queries. Since you're sharding (and thus clearly don't need strong cluster-wide ACID) have you considered looking into relaxed semi-ACID / eventually consistent database systems? If you're doing lots of simple queries and few of the kind of heavy lifting reporting queries RDBMSs are great for, it may be worth considering. If your app uses a data acesss layer, it should be pretty easy to prototype implementations on other databases and try them out. Even if you do go for PostgreSQL, if you're not using memcached yet you're wasting money and effort. You might get lots more life out of your hardware with a bit of memcached love. -- System & Network Administrator POST Newspapers
On Tue, Jan 18, 2011 at 3:56 AM, Lars <la@unifaun.com> wrote: > Any comments on the setups? How would an alternative with 15K disks (6 RAID 10 + 1 spare, or even 10 RAID10 + 1 spare)compare? RAID-10 is going to trounce RAID-5 for writes, which is where you usually have the most issues. > How would these alternatives compare in I/O performance compared to the old setup? Only testing can tell, but I've seen 4 SATA drives in RAID-10 with no caching or fancy controller beat a 4 disk RAID-5 with BBU controller more than once.
> No idea what mysql thinks a shard is, but in PG we have read-only hot standby's. I used sharding as an expression for partitioning data into several databases. Each user in the system is unaware of any other user. The user never accesses the private data of another user. Each usercould in theory be assigned their own database server. This makes it easy to split the 40000 users over a number of databaseservers. There are some shared data that is stored in a special "shared" database. > The standby database is exactly the same as the master (save a bit of > data that has not been synced yet.) I assume you know this... but I'd > really recommend trying out PG's hot-standby and make sure it works the > way you need (because I bet its different than mysql's). > Assuming the "shared" and the "sharded" databases are totally different > (lets call them database a and c), with the PG setup you'd have database > a on one computer, then one master with database b on it (where all > writes go), then several hot-standby's mirroring database b (that > support read-only queries). As our data is easily partitioned into any number of servers we do not plan to use replication for load balancing. We dohowever plan to use it to set up a backup site. > Its pretty hard to guess what your usage pattern is (70% read, > small columns, no big blobs (like photos), etc)... and even then we'd > still have to guess. It's more like 40% read 60% write. > Not only will I not compare apples to oranges, but I really wont compare > apples in Canada to oranges in Japan. :-) Hehe /Lars
> Are you going to RAID the SSD drives at all? Yes, I was thinking four drives in RAID 10 and a (hot) spare drive... > Of course this is based on my experience, and I have my fireproof suit since > I mentioned the word fusionIO :) Hehe FusionIO has some impressive stats! SSD in RAID10 provides redundancy in case of disc failure. How do you handle this with fusionIO? Two mirrored cards? /Lars
Thanks for the reply! MyISAM was chosen back in 2000. I'm not aware of the reasoning behind this choice... Dell claims both the Samsung and the Pliant are safe to use. Below is a quote from the Pliant datasheet: "No Write Cache: Pliant EFDs deliver outstanding write performance without any dependence on write cache and thus does not use battery/supercap." > As others have mentioned, how are you going to be doing your "shards"? Hmm... shards might not have been a good word to describe it. I'll paste what I wrote in another reply: I used sharding as an expression for partitioning data into several databases. Each user in the system is unaware of any other user. The user never accesses the private data of another user. Each usercould in theory be assigned their own database server. This makes it easy to split the 40000 users over a number of databaseservers. There are some shared data that is stored in a special "shared" database. /Lars -----Ursprungligt meddelande----- Från: mark [mailto:dvlhntr@gmail.com] Skickat: den 19 januari 2011 05:10 Till: Lars Kopia: pgsql-performance@postgresql.org Ämne: RE: [PERFORM] Migrating to Postgresql and new hardware Comments in line, take em for what you paid for em. > -----Original Message----- > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance- > owner@postgresql.org] On Behalf Of Lars > Sent: Tuesday, January 18, 2011 3:57 AM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] Migrating to Postgresql and new hardware > > Hi, > > We are in the process of moving a web based application from a MySql to > Postgresql database. > Our main reason for moving to Postgresql is problems with MySql > (MyISAM) table locking. I would never try and talk someone out of switching but.... MyISAM? What version of MySQL and did you pick MyISAM for a good reason or just happened to end up there? > We will buy a new set of servers to run the Postgresql databases. > > The current setup is five Dell PowerEdge 2950 with 2 * XEON E5410, 4GB > RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare). > > One server is used for shared data. > Four servers are used for sharded data. A user in the system only has > data in one of the shards. > There is another server to which all data is replicated but I'll leave > that one out of this discussion. > These are dedicated database servers. There are more or less no stored > procedures. The shared database size is about 20GB and each shard > database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect > the size will grow 10%-15% this year. Server load might increase with > 15%-30% this year. This setup is disk I/O bound. The overwhelming > majority of sql statements are fast (typically single row selects, > updates, inserts and deletes on primary key) but there are some slow > long running (10min) queries. > > As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb > RAM, H700 512MB NV Cache. One would think you should notice a nice speed improvement, ceteris paribus, since the X5650 will have ->significantly<- more memory bandwidth than the 5410s you are used to, and you are going to have a heck of a lot more ram for things to cache in. I think the H700 is a step up in raid cards as well but with only 4 disks your probably not maxing out there. > Dell has offered two alternative SSDs: > Samsung model SS805 (100GB Solid State Disk SATA 2.5"). > (http://www.plianttechnology.com/lightning_lb.php) > Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5"). > (http://www.samsung.com/global/business/semiconductor/products/SSD/Prod > ucts_Enterprise_SSD.html) The Samsung ones seems to indicate that they have protection in the event of a power failure, and the pliant does not mention it. Granted I haven't done or seen any pull the plug under max load tests on either family, so I got nothing beyond that it is the first thing I have looked at with every SSD that crosses my path. > > Both are SLC drives. The price of the Pliant is about 2,3 times the > price of the Samsung (does it have twice the performance?). > > One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung > drives (4 in RAID 10 + 1 spare). > Another alternative would be 3 servers (1 shared and 2 shards) with 5 > Pliant drives (4 in RAID 10 + 1 spare). This would be slightly more > expensive than the first alternative but would be easier to upgrade > with two new shard servers when it's needed. As others have mentioned, how are you going to be doing your "shards"? > > Anyone have experience using the Samsung or the Pliant SSD? Any > information about degraded performance over time? > Any comments on the setups? How would an alternative with 15K disks (6 > RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare? You still may find that breaking xlog out to its own logical drive (2 drives in raid 1) gives a speed improvement to the overall. YMMV - so tinker and find out before you go deploying. > How would these alternatives compare in I/O performance compared to the > old setup? > Anyone care to guess how the two alternatives would compare in > performance running Postgresql? > How would the hardware usage of Postgresql compare to MySqls? I won't hazard a guess on the performance difference between PG w/ Fsync ON and MySQL running with MyISAM. If you can get your OS and PG tuned you should be able to have a database that can have pretty decent throughput for an OLTP workload. Since that seems to be the majority of your intended workload. -Mark > > > > Regards > /Lars > > -- > Sent via pgsql-performance mailing list (pgsql- > performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
On 01/19/2011 05:09 PM, Lars wrote: > Thanks for the reply! > > MyISAM was chosen back in 2000. I'm not aware of the reasoning behind this choice... > > Dell claims both the Samsung and the Pliant are safe to use. > Below is a quote from the Pliant datasheet: > "No Write Cache: > Pliant EFDs deliver outstanding > write performance > without any dependence on > write cache and thus does > not use battery/supercap." Er ... magic? I wouldn't trust them without details on *how* it achieves good performance, and what "good" is. Is there *any* device on the market that efficiently handles lots of small writes? > >> As others have mentioned, how are you going to be doing your "shards"? > Hmm... shards might not have been a good word to describe it. I'll paste what I wrote in another reply: > I used sharding as an expression for partitioning data into several databases. "sharding" or "shards" is pretty much the standard way that setup is described. It doesn't come up on the Pg list a lot as most people doing web-oriented horizontally scaled apps use MySQL or fashionable non-SQL databases, but it's pretty well known in wider circles. -- Craig Ringer
On 1/19/2011 6:42 PM, Craig Ringer wrote: > On 01/19/2011 05:09 PM, Lars wrote: >> Thanks for the reply! >> >> >>> As others have mentioned, how are you going to be doing your "shards"? >> Hmm... shards might not have been a good word to describe it. I'll >> paste what I wrote in another reply: >> I used sharding as an expression for partitioning data into several >> databases. > > "sharding" or "shards" is pretty much the standard way that setup is > described. It doesn't come up on the Pg list a lot as most people doing > web-oriented horizontally scaled apps use MySQL or fashionable non-SQL > databases, but it's pretty well known in wider circles. > > -- > Craig Ringer > Or... PG is just so good we've never had to use more than one database server! :-) -Andy
On Thu, Jan 20, 2011 at 7:48 AM, Andy Colson <andy@squeakycode.net> wrote: > Or... PG is just so good we've never had to use more than one database > server! :-) Hehe, while you can do a lot with one server, there are some scenarios where sharding is the answer. I have a horror story about not sharding when we should have I can tell you over a beer sometime.
Lars wrote: > Below is a quote from the Pliant datasheet: > "No Write Cache: > Pliant EFDs deliver outstanding > write performance > without any dependence on > write cache and thus does > not use battery/supercap." > I liked the article The Register wrote about them, with the headline "Pliant's SSDs are awesome, says Pliant". Of course they do. Check out the write benchmark figures in the information review at http://oliveraaltonen.com/2010/09/29/preliminary-benchmark-results-of-the-pliant-ssd-drives/ to see how badly performance suffers on their design from those decisions. The Fusion I/O devices get nearly an order of magnitude more write IOPS in those tests. As far as I've been able to tell, what Pliant does is just push writes out all the time without waiting for them to be aligned with block sizes, followed by cleaning up the wreckage later via their internal automatic maintenance ASICs (it's sort of an always on TRIM implementation if I'm guessing right). That has significant limitations both in regards to total write speed as well as device longevity. For a database, I'd much rather have a supercap and get ultimate write performance without those downsides. Depends on the read/write ratio though; I could see a heavily read-biased system work well with their approach. Of course, a heavily read-based system would be better served by having a ton of RAM instead in most cases. Could be worst though--they could be misleading about the whole topic of write durability like Intel is. I consider claiming high performance when you don't always really have it, what Pliant is doing here, to be a much lesser sin than losing data at random and not being clear about when that can happen. I'd like FusionIO to put a big "expect your server to be down for many minutes after a power interruption" warning on their drives, too, while I'm wishing for complete vendor transparency here. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Interesting. Would have been nice if the test was with a raid-10 setup as raid-5 is not very good for writes... Would you get much of a performance increase with a write-cached ssd even if you got a raid controller with (battery-backed)cache? /Lars -----Ursprungligt meddelande----- Från: Greg Smith [mailto:greg@2ndquadrant.com] Skickat: den 29 januari 2011 01:27 Till: Lars Kopia: mark; pgsql-performance@postgresql.org Ämne: Re: [PERFORM] Migrating to Postgresql and new hardware Lars wrote: > Below is a quote from the Pliant datasheet: > "No Write Cache: > Pliant EFDs deliver outstanding > write performance > without any dependence on > write cache and thus does > not use battery/supercap." > I liked the article The Register wrote about them, with the headline "Pliant's SSDs are awesome, says Pliant". Of course they do. Check out the write benchmark figures in the information review at http://oliveraaltonen.com/2010/09/29/preliminary-benchmark-results-of-the-pliant-ssd-drives/ to see how badly performance suffers on their design from those decisions. The Fusion I/O devices get nearly an order of magnitude more write IOPS in those tests. As far as I've been able to tell, what Pliant does is just push writes out all the time without waiting for them to be aligned with block sizes, followed by cleaning up the wreckage later via their internal automatic maintenance ASICs (it's sort of an always on TRIM implementation if I'm guessing right). That has significant limitations both in regards to total write speed as well as device longevity. For a database, I'd much rather have a supercap and get ultimate write performance without those downsides. Depends on the read/write ratio though; I could see a heavily read-biased system work well with their approach. Of course, a heavily read-based system would be better served by having a ton of RAM instead in most cases. Could be worst though--they could be misleading about the whole topic of write durability like Intel is. I consider claiming high performance when you don't always really have it, what Pliant is doing here, to be a much lesser sin than losing data at random and not being clear about when that can happen. I'd like FusionIO to put a big "expect your server to be down for many minutes after a power interruption" warning on their drives, too, while I'm wishing for complete vendor transparency here. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books