Thread: Write performance
Hi, at the moment we encounter some performance problems with our database server. We have a 12 GB RAM machine with intel i7-975 and using 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" One disk for the system and WAL etc. and one SW RAID-0 with two disks for postgresql data. Our database is about 24GB. Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and reads of about 1000 blocks per second on our disk which holds the data directories of postgresql (WAL are on a different disk) 3000 blocks ~ about 3 MB/s write 1000 blocks ~ about 1 MB/s read At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load (so 4 of 8 cpu cores are in use for io wait) We know, its a poor man disk setup (but we can not find a hoster with rather advanced disk configuration at an affordable price). Anyway, we ran some tests on it: # time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync" 3000000+0 records in 3000000+0 records out 24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s real 4m48.658s user 0m0.580s sys 0m51.579s # time dd if=bigfile of=/dev/null bs=8k 3000000+0 records in 3000000+0 records out 24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s real 3m42.879s user 0m0.468s sys 0m18.721s Of course, writing large chunks is quite a different usage pattern. But I am wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can run a test with 89 MB/s writing and 110MB/s reading. Can you give some hints, if this numbers seems to be reasonable? kind regards Janning
On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote: > Hi, > > at the moment we encounter some performance problems with our database server. > > We have a 12 GB RAM machine with intel i7-975 and using > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" > One disk for the system and WAL etc. and one SW RAID-0 with two disks for > postgresql data. Our database is about 24GB. > > Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and > reads of about 1000 blocks per second on our disk which holds the data > directories of postgresql (WAL are on a different disk) > > 3000 blocks ~ about 3 MB/s write > 1000 blocks ~ about 1 MB/s read > > At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load > (so 4 of 8 cpu cores are in use for io wait) > > We know, its a poor man disk setup (but we can not find a hoster with rather > advanced disk configuration at an affordable price). Anyway, we ran some tests > on it: > > > # time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync" > 3000000+0 records in > 3000000+0 records out > 24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s > > real 4m48.658s > user 0m0.580s > sys 0m51.579s > > # time dd if=bigfile of=/dev/null bs=8k > 3000000+0 records in > 3000000+0 records out > 24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s > > real 3m42.879s > user 0m0.468s > sys 0m18.721s > > > > Of course, writing large chunks is quite a different usage pattern. But I am > wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can > run a test with 89 MB/s writing and 110MB/s reading. > > Can you give some hints, if this numbers seems to be reasonable? > > kind regards > Janning > Yes, these are typical random I/O versus sequential I/O rates for hard drives. Your I/O is extremely under-powered relative to your CPU/memory. For DB servers, many times you need much more I/O instead. Cheers, Ken
On Thu, 24 Jun 2010, Janning wrote: > We have a 12 GB RAM machine with intel i7-975 and using > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" Those discs are 1.5TB, not 1.5GB. > One disk for the system and WAL etc. and one SW RAID-0 with two disks for > postgresql data. Our database is about 24GB. Beware of RAID-0 - make sure you can recover the data when (not if) a disc fails. > Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and > reads of about 1000 blocks per second on our disk which holds the data > directories of postgresql (WAL are on a different disk) > > 3000 blocks ~ about 3 MB/s write > 1000 blocks ~ about 1 MB/s read > > At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load > (so 4 of 8 cpu cores are in use for io wait) Not quite sure what situation you are measuring these figures under. However, as a typical figure, let's say you are doing random access with 8kB blocks (as in Postgres), and the access time on your drive is 8.5ms (as with these drives). For each drive, you will be able to read/write approximately 8kB / 0.0085s, giving 941kB per second. If you have multiple processes all doing random access, then you may be able to utilise both discs and get double that. > Of course, writing large chunks is quite a different usage pattern. But I am > wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can > run a test with 89 MB/s writing and 110MB/s reading. That's quite right, and typical performance figures for a drive like that. Matthew -- Don't criticise a man until you have walked a mile in his shoes; and if you do at least he will be a mile behind you and bare footed.
On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote: > On Thu, 24 Jun 2010, Janning wrote: > > We have a 12 GB RAM machine with intel i7-975 and using > > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" > > Those discs are 1.5TB, not 1.5GB. sorry, my fault. > > One disk for the system and WAL etc. and one SW RAID-0 with two disks for > > postgresql data. Our database is about 24GB. > > Beware of RAID-0 - make sure you can recover the data when (not if) a disc > fails. oh sorry again, its a raid-1 of course. shame on me. > > Our munin graph reports at 9:00 a clock writes of 3000 blocks per second > > and reads of about 1000 blocks per second on our disk which holds the > > data directories of postgresql (WAL are on a different disk) > > > > 3000 blocks ~ about 3 MB/s write > > 1000 blocks ~ about 1 MB/s read > > > > At the same time we have nearly 50% CPU I/O wait and only 12% user CPU > > load (so 4 of 8 cpu cores are in use for io wait) > > Not quite sure what situation you are measuring these figures under. > However, as a typical figure, let's say you are doing random access with > 8kB blocks (as in Postgres), and the access time on your drive is 8.5ms > (as with these drives). > > For each drive, you will be able to read/write approximately 8kB / > 0.0085s, giving 941kB per second. If you have multiple processes all doing > random access, then you may be able to utilise both discs and get double > that. So with your calculation I have a maximum of 2MB/s random access. So i really need to upgrade my disk configuration! > > Of course, writing large chunks is quite a different usage pattern. But I > > am wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if > > i can run a test with 89 MB/s writing and 110MB/s reading. > > That's quite right, and typical performance figures for a drive like that. thanks for your help. kind regards Janning > Matthew > > -- > Don't criticise a man until you have walked a mile in his shoes; and if > you do at least he will be a mile behind you and bare footed.
thanks for your quick response, kenneth On Thursday 24 June 2010 14:47:34 you wrote: > On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote: > > Hi, > > > > at the moment we encounter some performance problems with our database > > server. > > > > We have a 12 GB RAM machine with intel i7-975 and using > > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" > > One disk for the system and WAL etc. and one SW RAID-0 with two disks for > > postgresql data. Our database is about 24GB. [...] > Your I/O is extremely under-powered relative to your > CPU/memory. For DB servers, many times you need much more I/O > instead. So at the moment we are using this machine as our primary database server: http://www.hetzner.de/en/hosting/produkte_rootserver/eq9/ Sadly, our hoster is not offering advanced disk setup. Now we have two options 1. buying a server on our own and renting a co-location. I fear we do not know enough about hardware to vote for this option. I think for co-locating your own server one should have more knowledge about hardware. 2. renting a server from a hoster with an advanced disk setup. Can anybody recommend a good hosting solution in germany with a good disk setup for postgresql? kind regards Janning
As others have already pointed out, your disk performance here is completely typical of a single pair of drives doing random read/write activity. So the question you should be asking is how to reduce the amount of reading and writing needed to run your application. The suggestions at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server address that. Increases to shared_buffers and checkpoint_segments in particular can dramatically reduce the amount of I/O needed to run an application. On the last server I turned, random reads went from a constant stream of 1MB/s (with default value of shared_buffers at 32MB) to an average of 0.1MB/s just by adjusting those two parameters upwards via those guidelines. If you haven't already made large increases to those values, I'd suggest starting there before presuming you must get a different disk setup. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
On Thursday 24 June 2010 15:16:05 Janning wrote: > On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote: > > On Thu, 24 Jun 2010, Janning wrote: > > > We have a 12 GB RAM machine with intel i7-975 and using > > > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)" > > > > > For each drive, you will be able to read/write approximately 8kB / > > 0.0085s, giving 941kB per second. If you have multiple processes all > > doing random access, then you may be able to utilise both discs and get > > double that. > > So with your calculation I have a maximum of 2MB/s random access. So i > really need to upgrade my disk configuration! i was looking at tomshardware.com and the fastest disk is Maxtor Atlas 15K II * 8K147S0,SAS,147 GB, 16 MB Cache,15000 rpm with 5.5 ms random access time. So even if i switch to those disks i can only reach a perfomace gain of 1.5, right? To achieve a better disk performance by factor of ten, i need a raid-10 setup with 12 disks (so i have 6 raid-1 bundles). Or are there other factors with high end disks? kind regards Janning
On 2010-06-24 15:45, Janning Vygen wrote: > On Thursday 24 June 2010 15:16:05 Janning wrote: > >> On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote: >> >>> On Thu, 24 Jun 2010, Janning wrote: >>> >>>> We have a 12 GB RAM machine with intel i7-975 and using >>>> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)" >>>> >>>> >>> For each drive, you will be able to read/write approximately 8kB / >>> 0.0085s, giving 941kB per second. If you have multiple processes all >>> doing random access, then you may be able to utilise both discs and get >>> double that. >>> >> So with your calculation I have a maximum of 2MB/s random access. So i >> really need to upgrade my disk configuration! >> > i was looking at tomshardware.com and the fastest disk is > > Maxtor Atlas 15K II * 8K147S0,SAS,147 GB, 16 MB Cache,15000 rpm > > with 5.5 ms random access time. > > So even if i switch to those disks i can only reach a perfomace gain of 1.5, > right? > > To achieve a better disk performance by factor of ten, i need a raid-10 setup > with 12 disks (so i have 6 raid-1 bundles). Or are there other factors with > high end disks? > Well. On the write-side, you can add in a Raid controller with Battery backed write cache to not make the writes directly hit disk. This improves the amount of writing you can do. On the read-side you can add more memory to your server so a significant part of your most active dataset is cached in memory. It depends on the actual sizes and workload what gives the most benefit for you. -- Jesper
On Jun 24, 2010, at 6:16 AM, Janning wrote: > On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote: >> On Thu, 24 Jun 2010, Janning wrote: >>> We have a 12 GB RAM machine with intel i7-975 and using >>> 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)" >> >> Those discs are 1.5TB, not 1.5GB. > > sorry, my fault. > >>> One disk for the system and WAL etc. and one SW RAID-0 with two disks for >>> postgresql data. Our database is about 24GB. >> >> Beware of RAID-0 - make sure you can recover the data when (not if) a disc >> fails. > > oh sorry again, its a raid-1 of course. shame on me. If your WAL is not on RAID but your data is, you will lose data if the WAL log drive dies. You will then have a difficulttime recovering data from the data drives even though they are RAID protected. Most likely indexes and some datawill be corrupted since the last checkpoint. I have lost a WAL before, and the result was a lot of corrupted systemindexes that had to be rebuilt in single user mode, and one system table (stats related) that had to be purged andregenerated from scratch. This was not fun. Most of the data was fine, but the cleanup is messy if you lose WAL, andthere is no guarantee that your data is safe if you don't have the WAL available.