Thread: Disk performance
Hi all,

as we ran into some limitations of our cheap disk setup, I would really
like to see how cheap disks compare to expensive disk setups.

We have a 12 GB RAM machine with an Intel i7-975, using 3 disks
"Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)".
One disk is for the system, WAL etc., and one SW RAID-0 with two disks
is for the PostgreSQL data.

I ran a few tests as described in
http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm

# time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s

real 4m48.658s
user 0m0.580s
sys 0m51.579s

# time dd if=bigfile of=/dev/null bs=8k
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s

real 3m42.879s
user 0m0.468s
sys 0m18.721s

IMHO this looks quite fast compared to the values mentioned in the
article. What values would you expect from a very expensive setup with
many spindles, SCSI, a RAID controller, battery-backed cache etc.? How
much faster will it be?

Of course, you can't give me exact results, but I would just like to get
an idea of how much faster an expensive disk setup could be. Would it be
10% faster, 100% or 1000% faster? If you can give me any hints, I would
greatly appreciate it.

kind regards
Janning
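The throughput figures dd reports above can be re-derived from the block size, block count and elapsed time. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and not part of the original post, and it assumes dd reports decimal megabytes (1 MB = 1,000,000 bytes).

```python
def throughput_mb_per_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (assumed dd unit)."""
    return total_bytes / seconds / 1_000_000


BLOCK_SIZE = 8 * 1024  # bs=8k
BLOCK_COUNT = 3_000_000  # count=3000000
total_bytes = BLOCK_SIZE * BLOCK_COUNT  # 24576000000, as printed by dd

# Elapsed times are the ones dd printed in the post above.
write_rate = throughput_mb_per_s(total_bytes, 276.03)
read_rate = throughput_mb_per_s(total_bytes, 222.841)

print(f"bytes written: {total_bytes}")
print(f"write: {write_rate:.1f} MB/s")
print(f"read:  {read_rate:.1f} MB/s")
```

Both results agree with the dd output in the post (about 89 MB/s written, about 110 MB/s read).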
On 06/15/10 14:59, Janning wrote:
> Hi all,
>
> as we encountered some limitations of our cheap disk setup, I really would
> like to see how cheap they are compared to expensive disk setups.
>
> We have a 12 GB RAM machine with intel i7-975 and using
> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)"
> One disk for the system and WAL etc. and one SW RAID-0 with two disks for
> postgresql data.
>
> Now I ran a few tests as described in
> http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm
>
> # time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
> 3000000+0 records in
> 3000000+0 records out
> 24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s
>
> real 4m48.658s
> user 0m0.580s
> sys 0m51.579s
>
> # time dd if=bigfile of=/dev/null bs=8k
> 3000000+0 records in
> 3000000+0 records out
> 24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s
>
> real 3m42.879s
> user 0m0.468s
> sys 0m18.721s

The figures are ok if the tests were done on a single drive (i.e. not
your RAID-0 array).

> IMHO it is looking quite fast compared to the values mentioned in the article.
> What values do you expect with a very expensive setup like many spindles,
> scsi, raid controller, battery cache etc. How much faster will it be?

For a start, you are attempting to use RAID-0 with two disks here. This
means you have twice as much risk that a drive failure will cause total
data loss. In any kind of serious setup this would be the first thing to
replace.

> Of course, you can't give me exact results, but I would just like to get an
> idea about how much faster an expensive disk setup could be.
> Would it be like 10% faster, 100% or 1000% faster? If you can give me any
> hints, I would greatly appreciate it.

There is no magic here - scalability of drives can be approximated
linearly:

a) faster drives: 15,000 RPM drives will be almost exactly 15000/7200
times faster at random access

b) more drives: depending on your RAID schema, each parallel drive or
drive combination will grow your speed linearly. For example, a 3-drive
RAID-0 will be 3/2 times faster than a 2-drive RAID-0. Of course, you
would not use RAID-0 anywhere serious. But an 8-drive RAID-10 array will
be 8/4=2 times faster than a 4-drive RAID-10 array.

Finally, it all depends on your expected load vs budget. If you are
unsure of what you want and what you need, but don't expect serious
write loads, make a 4-drive RAID-10 array of your cheap 7200 RPM drives,
invest in more RAM and don't worry about it.

Drive controllers are another issue and there is somewhat more magic
here. If the above paragraph describes you well, you probably don't need
a RAID controller. There are many different kinds of these with
extremely different prices, and many different configuration options, so
nowadays it isn't practical to think about those until you really need to.
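The linear approximation in points a) and b) above can be written out as a short calculation. This is only a sketch of the rule of thumb as stated in this message, not a performance model; the function names are hypothetical, and multiplying the two factors together is an added assumption that the message itself does not make.

```python
def rpm_speedup(new_rpm: float, old_rpm: float) -> float:
    """Rule of thumb a): random access scales with rotation speed."""
    return new_rpm / old_rpm


def spindle_speedup(new_drives: int, old_drives: int) -> float:
    """Rule of thumb b): speed scales with drive count at the same RAID level."""
    return new_drives / old_drives


print(f"15k vs 7200 RPM:        {rpm_speedup(15000, 7200):.2f}x")
print(f"3-drive vs 2-drive R0:  {spindle_speedup(3, 2):.2f}x")
print(f"8-drive vs 4-drive R10: {spindle_speedup(8, 4):.2f}x")

# Assumption: the two effects are independent and simply multiply.
combined = rpm_speedup(15000, 7200) * spindle_speedup(8, 4)
print(f"8x 15k R10 vs 4x 7200 R10 (assumed multiplicative): {combined:.2f}x")
```

Under these assumptions the answer to "10%, 100% or 1000% faster?" lands in the low hundreds of percent for random access, before any controller cache effects are considered.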
On Tuesday 15 June 2010 15:16:19 Ivan Voras wrote:
> On 06/15/10 14:59, Janning wrote:
> > Hi all,
> >
> > as we encountered some limitations of our cheap disk setup, I really
> > would like to see how cheap they are compared to expensive disk setups.
> >
> > We have a 12 GB RAM machine with intel i7-975 and using
> > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)"
> > One disk for the system and WAL etc. and one SW RAID-0 with two disks for
> > postgresql data.
> >
> > Now I ran a few tests as described in
> > http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm
> >
> > # time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
> > 3000000+0 records in
> > 3000000+0 records out
> > 24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s
> >
> > real 4m48.658s
> > user 0m0.580s
> > sys 0m51.579s
> >
> > # time dd if=bigfile of=/dev/null bs=8k
> > 3000000+0 records in
> > 3000000+0 records out
> > 24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s
> >
> > real 3m42.879s
> > user 0m0.468s
> > sys 0m18.721s
>
> The figures are ok if the tests were done on a single drive (i.e. not
> your RAID-0 array).

Ahh, I meant RAID-1, of course. Sorry about that.
I tested my RAID-1 too and it looks much the same. Not much difference.

> > IMHO it is looking quite fast compared to the values mentioned in the
> > article. What values do you expect with a very expensive setup like many
> > spindles, scsi, raid controller, battery cache etc. How much faster will
> > it be?
>
> For a start, you are attempting to use RAID-0 with two disks here. This
> means you have twice as much risk that a drive failure will cause total
> data loss. In any kind of serious setup this would be the first thing to
> replace.

I did it already :-)

> > Of course, you can't give me exact results, but I would just like to get
> > an idea about how much faster an expensive disk setup could be.
> > Would it be like 10% faster, 100% or 1000% faster? If you can give me any
> > hints, I would greatly appreciate it.
>
> There is no magic here - scalability of drives can be approximated
> linearly:
>
> a) faster drives: 15,000 RPM drives will be almost exactly 15000/7200
> times faster at random access

ok.

> b) more drives: depending on your RAID schema, each parallel drive or
> drive combination will grow your speed linearly. For example, a 3-drive
> RAID-0 will be 3/2 times faster than a 2-drive RAID-0. Of course, you
> would not use RAID-0 anywhere serious. But an 8-drive RAID-10 array will
> be 8/4=2 times faster than a 4-drive RAID-10 array.

So RAID-10 with 4 disks is 2 times faster than a RAID-1, I got it. So as
I need much more power, I should look for a RAID-10 with 8 or more 15k RPM
disks.

> Finally, it all depends on your expected load vs budget. If you are
> unsure of what you want and what you need, but don't expect serious
> write loads, make a 4-drive RAID-10 array of your cheap 7200 RPM drives,
> invest in more RAM and don't worry about it.

ok, I will look for a hoster who can provide this. Most hosters normally
offer lots of RAM and CPU but no advanced disk configuration.

> Drive controllers are another issue and there is somewhat more magic
> here. If the above paragraph describes you well, you probably don't need
> a RAID controller. There are many different kinds of these with
> extremely different prices, and many different configuration options, so
> nowadays it isn't practical to think about those until you really need to.

Thanks very much for your help.
It gave me a good idea of what to do. If you have further recommendations,
I would be glad to hear them.

kind regards
Janning
> thanks very much for your help.
> It gave me a good idea of what to do. If you have further
> recommendations, I would be glad to hear them.

I guess you should give more info about the expected workload of your
server(s)... otherwise you risk spending too much money, or spending it
in the wrong way...
On Tuesday, June 15, 2010, Janning <ml@planwerk6.de> wrote:
> ok, I will look for a hoster who can provide this. Most hosters normally
> offer lots of ram and cpu but no advanced disk configuration.

I've noticed that too. Even Rackspace doesn't offer a standard config
that anyone would actually want to use for a database server. I know
they can custom build something, but is there really no demand for
servers with real storage subsystems?

-- 
"No animals were harmed in the recording of this episode. We tried but
that damn monkey was just too fast."
On 15 June 2010 18:22, Janning <ml@planwerk6.de> wrote:

>> The figures are ok if the tests were done on a single drive (i.e. not
>> your RAID-0 array).
>
> Ahh, I meant raid-1, of course. Sorry for this.
> I tested my raid 1 too and it looks quite the same. Not much difference.

This is expected: a RAID-1 array (mirroring) will have the performance
of the slowest drive (or a single drive if they are equal).

>> There is no magic here - scalability of drives can be approximated
>> linearly:
>>
>> a) faster drives: 15,000 RPM drives will be almost exactly 15000/7200
>> times faster at random access
>
> ok.

(or if you are looking at raw numbers: a 15,000 RPM drive will sustain
15000/60=250 random IOs per second (IOPS); but now you are entering
magic territory - depending on the exact type of your load you can get
much better results, but not significantly worse).

>> b) more drives: depending on your RAID schema, each parallel drive or
>> drive combination will grow your speed linearly. For example, a 3-drive
>> RAID-0 will be 3/2 times faster than a 2-drive RAID-0. Of course, you
>> would not use RAID-0 anywhere serious. But an 8-drive RAID-10 array will
>> be 8/4=2 times faster than a 4-drive RAID-10 array.
>
> So RAID-10 with 4 disks is 2 times faster than a RAID-1, I got it. So as I
> need much more power I should look for a RAID-10 with 8 or more 15k RPM disks.

Yes, if you expect serious write or random IO load.

To illustrate: if you are trying to power a generic web site, for
example a blog, you can expect that most of your load will be read-only
(mostly pageviews) and, unless you plan on having a really large site
(many authors for example), that your database will largely fit into
RAM, so you don't have to invest in disk drives as it will be served
from cache. On the other hand, a financial application will do a lot of
transactions and you will almost certainly need good storage
infrastructure - this is where estimates like 250 IOPS for a 15,000 RPM
drive come into play.

> thanks very much for your help.
> It gave me a good idea of what to do. If you have further recommendations, I
> would be glad to hear them.

I can point you to a dedicated mailing list: pgsql-performance @
postgresql.org for questions about performance such as yours.
Janning wrote:
> IMHO it is looking quite fast compared to the values mentioned in the article.

The tests in the article were using the 2006 versions of the same drive
you have, so I'd certainly hope yours are faster now.

> What values do you expect with a very expensive setup like many spindles,
> scsi, raid controller, battery cache etc. How much faster will it be?

If you look at my "Database Hardware Benchmarking" talk at
http://projects.2ndquadrant.com/talks I give examples of some of this.
Page 9 shows how much of a speedup I saw going from one cheap drive to
three, for example, and P32 shows that in the mixed I/O bonnie++ seeks
tests, tripling the number of drives increases the seeks rating it
computes from 177 to 371.

If you add in a RAID controller, the sequential read/write numbers
increase no differently than if you add disks with software RAID. What a
controller does significantly increase is what I call the "Commit Rate",
which is how many small writes you can get per second for database
commits. The commit rate for regular drives is proportional to their
rotation rate, between 100-250 commits/second without a battery-backed
RAID controller. As you can also see on P32, it jumps to thousands of
commits/second with one.

Presuming you have reasonable sequential performance and a
battery-backed controller to make the commit rate reasonable, database
applications will then normally bottleneck at how fast they can seek
around. It is extremely hard to estimate how fast that scales upwards as
you add more disks to an array and insert a read/write cache into the
system.

-- 
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
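The claim above that the commit rate of a regular drive is proportional to its rotation rate can be illustrated with a small calculation. This sketch assumes the simplest model, one physical commit per platter rotation with no write cache in the path; that model and the function name are assumptions for illustration, not something taken from the talk.

```python
def max_commits_per_second(rpm: float) -> float:
    """Assumed model: at most one physical commit per platter rotation."""
    return rpm / 60.0


for rpm in (7200, 10000, 15000):
    print(f"{rpm:>5} RPM: about {max_commits_per_second(rpm):.0f} commits/second")
```

Under that assumption a 7200 RPM drive tops out near 120 commits/second and a 15,000 RPM drive near 250, which falls inside the 100-250 range quoted above. A battery-backed controller cache removes the wait for the platter, which is why the measured rate jumps to thousands.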
Ivan Voras wrote:
> (or if you are looking at raw numbers: a 15,000 RPM drive will sustain
> 15000/60=250 random IOs per second (IOPS)

That's only taking into account the rotation speed: a 15K drive can do
250 physical commits per second if you never seek anywhere. A true IOPS
number also considers average seek latency. A decent 15K drive will be
around 4ms there, which makes for 167 IOPS total.

Random note: this discussion is on the wrong list. There are more
people interested in this topic who post regularly on pgsql-performance
than pgsql-general.

-- 
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
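One way to arrive at the 167 IOPS figure above is to add the average seek time to the average rotational latency, taking the latter as half a rotation. The message does not spell out its arithmetic, so treat this as a sketch under that assumption; the function name is illustrative, and the 8.5 ms seek time used for the 7200 RPM case is a hypothetical value, not a figure from this thread.

```python
def random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Estimate random IOPS from seek time plus average rotational latency.

    Assumes average rotational latency is half of one full rotation.
    """
    rotation_ms = 60_000.0 / rpm  # time for one full rotation
    avg_rotational_latency_ms = rotation_ms / 2.0
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms)


# 15K drive with the ~4 ms average seek quoted above: 2 ms + 4 ms per IO.
print(f"15000 RPM, 4.0 ms seek: {random_iops(15000, 4.0):.0f} IOPS")

# Hypothetical 7200 RPM desktop drive; the 8.5 ms seek time is an assumption.
print(f" 7200 RPM, 8.5 ms seek: {random_iops(7200, 8.5):.0f} IOPS")
```

With a 4 ms seek this gives about 167 IOPS for the 15K drive, matching the number in the message, compared with the 250 obtained from rotation speed alone.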