Thread: Really bad diskio
Hello all,

I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1GB RAM and a
3ware SATA RAID. Currently the database is only 16GB, with two tables of
500,000+ rows, one table of 200,000+ rows, and a few small tables. The
larger tables get updated about every two hours. The problem I'm having
with this server (which is in production) is disk IO: on the larger
tables I'm getting disk IO wait averages of ~70-90%. I've been tweaking
the Linux kernel as specified in the PostgreSQL documentation and
switched to the deadline scheduler, but nothing seems to fix this. The
queries are as optimized as I can get them. fsync is off in an attempt
to help performance; still nothing. Are there any settings I should be
looking at that could improve on this?

Thanks for any help in advance.

Ron
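For readers wondering what "settings" a reply might target: on 7.4, the usual first-pass knobs for a 1 GB machine look something like the sketch below. The values are illustrative guesses for this class of hardware, not recommendations made anywhere in this thread.

```
# postgresql.conf sketch for 7.4 on a ~1 GB box -- illustrative values only
shared_buffers = 10000           # in 8 kB pages, ~80 MB
effective_cache_size = 65536     # in 8 kB pages; tell the planner ~512 MB of OS cache
sort_mem = 8192                  # kB per sort operation
checkpoint_segments = 8          # fewer, larger checkpoints for bulk update jobs
fsync = true                     # fsync off risks corruption and does not help read-bound IO
```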
Ron Wills wrote:
> Hello all
>
> I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and
> an 3Ware SATA raid.

2 drives?
4 drives?
8 drives?

RAID 1? 0? 10? 5?

> Currently the database is only 16G with about 2
> tables with 500000+ row, one table 200000+ row and a few small
> tables. The larger tables get updated about every two hours. The
> problem I having with this server (which is in production) is the disk
> IO. On the larger tables I'm getting disk IO wait averages of
> ~70-90%. I've been tweaking the linux kernel as specified in the
> PostgreSQL documentations and switched to the deadline
> scheduler. Nothing seems to be fixing this. The queries are as
> optimized as I can get them. fsync is off in an attempt to help
> preformance still nothing. Are there any setting I should be look at
> the could improve on this???
>
> Thanks for and help in advance.
>
> Ron
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
On Fri, 2005-07-15 at 14:39 -0600, Ron Wills wrote:
> I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and
> an 3Ware SATA raid. [...] On the larger tables I'm getting disk IO
> wait averages of ~70-90%.

Can you please characterize this a bit better? Send the output of
vmstat or iostat over several minutes, or similar diagnostic
information.

Also please describe your hardware more.

Regards,
Jeff Baker
At Fri, 15 Jul 2005 13:45:07 -0700, Joshua D. Drake wrote:
> 2 drives?
> 4 drives?
> 8 drives?
>
> RAID 1? 0? 10? 5?

3 drives, RAID 5. I don't believe it's the raid. I've tested this by
moving the database between the mirrored software raid, where the root
is found, and the SATA raid. Neither relieved the IO problems.

I was also thinking this could be the transactional subsystem getting
overloaded. There are several automated processes that use the DB. Most
are just selects, but the data updates and one process that updates the
smaller tables are the heavier queries. On their own they seem to work
ok (still high IO, but fairly quick). But if even the simplest select is
called during a heavier operation, then everything goes through the
roof. Maybe there's something I'm missing here as well?
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote:
> I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and
> an 3Ware SATA raid.

Operating System? Which file system are you using? I was having a
similar problem just a few days ago and learned that ext3 was the
culprit.

-Dan
On Fri, Jul 15, 2005 at 03:04:35PM -0600, Ron Wills wrote:
> 3 drives raid 5. I don't believe it's the raid. I've tested this by
> moving the database to the mirrors software raid where the root is
> found and onto the the SATA raid. Neither relieved the IO problems.

What filesystem is this?

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"If you don't know where you're going, you will very likely end up
somewhere else."
On Fri, 2005-07-15 at 15:04 -0600, Ron Wills wrote:
> 3 drives raid 5. I don't believe it's the raid. I've tested this by
> moving the database to the mirrors software raid where the root is
> found and onto the the SATA raid. Neither relieved the IO problems.

Hard or soft RAID? Which controller? Many of the 3Ware controllers
(85xx and 95xx) have extremely bad RAID 5 performance.

Did you take any pgbench or other benchmark figures before you started
using the DB?

-jwb
At Fri, 15 Jul 2005 14:00:07 -0700, Jeffrey W. Baker wrote:
> Can you please characterize this a bit better? Send the output of
> vmstat or iostat over several minutes, or similar diagnostic
> information.
>
> Also please describe your hardware more.

Here's a bit of a dump of the system that should be useful.

Processors x2:

vendor_id   : AuthenticAMD
cpu family  : 6
model       : 8
model name  : AMD Athlon(tm) MP 2400+
stepping    : 1
cpu MHz     : 2000.474
cache size  : 256 KB

MemTotal: 903804 kB

Mandrake 10.0, Linux kernel 2.6.3-19mdk

The raid controller, which is using the hardware raid configuration:

3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
scsi0 : 3ware 9000 Storage Controller
3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe8020000, IRQ: 17.
3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4.
  Vendor: 3ware    Model: Logical Disk 00    Rev: 1.00
  Type:   Direct-Access                      ANSI SCSI revision: 00
SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB)
SCSI device sda: drive cache: write back, no read (daft)

This is also on a ReiserFS 3.6 filesystem.
Here's the iostat for 10 mins, every 10 secs. I've removed the stats
from the idle drives to reduce the size of this email; one row per
sample below, with the first row being the since-boot average.

Linux 2.6.3-19mdksmp (photo_server)    07/15/2005

 %user  %nice   %sys %iowait  %idle |    tps  Blk_read/s  Blk_wrtn/s    Blk_read   Blk_wrtn  (sda)
  2.85   1.53   2.15   39.52  53.95 |  82.49     4501.73      188.38  1818836580   76110154
  0.30   0.00   1.00   96.30   2.40 |  87.80     6159.20      340.00       61592       3400
  2.50   0.00   1.45   94.35   1.70 |  89.60     5402.40      320.80       54024       3208
  1.00   0.10   1.35   97.55   0.00 | 105.20     5626.40      332.80       56264       3328
  0.40   0.00   1.00   87.40  11.20 |  92.61     4484.32      515.48       44888       5160
  0.45   0.00   1.00   92.66   5.89 |  89.10     4596.00      225.60       45960       2256
  0.30   0.00   0.80   96.30   2.60 |  86.49     3877.48      414.01       38736       4136
  0.50   0.00   1.00   98.15   0.35 |  97.10     4710.49      405.19       47152       4056
  0.35   0.00   1.00   98.65   0.00 |  93.30     5324.80      186.40       53248       1864
  0.40   0.00   1.10   96.70   1.80 | 117.88     5481.72      402.80       54872       4032
  0.50   0.00   1.05   98.30   0.15 | 124.00     6081.60      403.20       60816       4032
  8.75   0.00   2.55   84.46   4.25 | 125.20     5609.60      228.80       56096       2288
  2.25   0.00   1.30   96.00   0.45 | 176.98     6166.17      686.29       61600       6856
  5.95   0.00   2.25   88.09   3.70 | 154.55     7879.32      295.70       78872       2960
 10.29   0.00   3.40   81.97   4.35 | 213.19    11422.18      557.84      114336       5584
  1.90   0.10   3.25   94.75   0.00 | 227.80    12330.40      212.80      123304       2128
  0.55   0.00   0.85   96.80   1.80 |  96.30     3464.80      568.80       34648       5688
  0.70   0.00   1.10   97.25   0.95 |  92.60     4989.60      237.60       49896       2376
  2.75   0.00   2.10   93.55   1.60 | 198.40    10031.63      458.86      100216       4584
  0.65   0.00   2.40   95.90   1.05 | 250.25    14174.63      231.77      141888       2320
  0.60   0.00   2.15   97.20   0.05 | 285.50    12127.20      423.20      121272       4232
  0.60   0.00   2.90   95.65   0.85 | 393.70    14383.20      534.40      143832       5344
  0.55   0.00   2.15   96.15   1.15 | 252.15    11801.80      246.15      118136       2464
  0.75   0.00   3.45   95.15   0.65 | 396.00    19980.80      261.60      199808       2616
  0.70   0.00   2.70   95.70   0.90 | 286.20    14182.40      467.20      141824       4672
  0.70   0.00   2.70   95.65   0.95 | 344.20    15838.40      473.60      158384       4736
  0.75   0.00   1.70   97.50   0.05 | 178.72     7495.70      412.39       75032       4128
  1.05   0.05   1.30   97.05   0.55 | 107.89     4334.87      249.35       43392       2496
  0.55   0.00   1.30   98.10   0.05 | 107.01     6345.55      321.12       63392       3208
  0.65   0.00   1.05   97.55   0.75 | 107.79     3908.89      464.34       39128       4648
  0.50   0.00   1.15   97.75   0.60 | 109.21     4162.56      434.83       41584       4344
  0.75   0.00   1.15   98.00   0.10 | 104.19     4796.81      211.58       48064       2120
  0.70   0.00   1.05   97.85   0.40 | 105.50     4690.40      429.60       46904       4296
  0.75   0.00   1.10   98.15   0.00 | 107.51     4525.33      357.96       45208       3576
  2.80   0.00   1.65   92.81   2.75 | 123.18     3810.59      512.29       38144       5128
  0.60   0.00   1.05   97.10   1.25 | 104.60     3780.00      236.00       37800       2360
  0.70   0.00   1.10   95.96   2.25 | 117.08     3817.78      466.73       38216       4672
  0.65   0.00   0.90   96.65   1.80 | 117.20     3629.60      477.60       36296       4776
  0.80   0.00   1.10   97.50   0.60 | 112.79     4258.94      326.07       42632       3264
  1.05   0.15   1.20   97.50   0.10 | 125.83     2592.99      522.12       25904       5216
  0.60   0.00   0.55   98.20   0.65 | 104.90      823.98      305.29        8248       3056
  0.50   0.00   0.65   98.75   0.10 | 109.80      734.40      468.80        7344       4688
  1.15   0.00   1.05   97.75   0.05 | 107.70      751.20      463.20        7512       4632
  6.50   0.00   1.85   90.25   1.40 |  98.00      739.14      277.08        7384       2768
  0.20   0.00   0.40   82.75  16.65 |  83.13      550.90      360.08        5520       3608
  2.65   0.30   2.15   82.91  11.99 | 100.00     1136.46      503.50       11376       5040
  1.00   6.25   2.15   89.70   0.90 | 170.17     4106.51      388.39       41024       3880
  0.75   0.15   1.75   73.70  23.65 | 234.60     5107.20      232.80       51072       2328
  0.15   0.00   0.65   49.48  49.73 | 175.52     1431.37      122.28       14328       1224
  0.15   0.00   0.55   50.22  49.08 | 173.50     1464.00      119.20       14640       1192
  2.00   0.00   0.60   76.18  21.22 | 130.60     1044.80      203.20       10448       2032
  0.90   0.10   0.75   97.55   0.70 |  92.09     1024.22      197.80       10232       1976
  0.25   0.00   0.40   73.78  25.57 |  92.81      582.83      506.99        5840       5080
  0.20   0.00   0.55   98.85   0.40 |  90.80      657.60      383.20        6576       3832
 16.46   0.00   4.25   77.09   2.20 |  99.60     1174.83      549.85       11760       5504
  8.05   0.00   2.60   56.92  32.43 | 172.30     2063.20      128.00       20632       1280
 20.84   0.00   4.75   52.82  21.59 | 174.30     1416.80      484.00       14168       4840
  1.30   0.00   1.60   56.93  40.17 | 181.02     2858.74      418.78       28616       4192
 19.17   0.00   4.44   49.78  26.61 | 162.20     1286.40      373.60       12864       3736
  0.15   0.00   0.60   50.85  48.40 | 178.08     1436.64       97.70       14352        976

> Regards,
> Jeff Baker
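Side note: a log like the one above can be summarized rather than eyeballed. Here is an illustrative sketch (not from the thread) that parses the default `iostat <interval>` output layout and averages the %iowait and read-throughput columns; `SAMPLE` is a two-interval excerpt.

```python
# Summarize iostat output: average %iowait and sda read throughput.
# SAMPLE is a small excerpt of the log above; the parsing assumes the
# default "iostat <interval>" layout (avg-cpu header line, then values).

SAMPLE = """\
avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.30    0.00    1.00   96.30    2.40

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              87.80      6159.20       340.00      61592       3400

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.50    0.00    1.45   94.35    1.70

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              89.60      5402.40       320.80      54024       3208
"""

def summarize(text):
    """Return (mean %iowait, mean Blk_read/s for sda) over all samples."""
    iowait, reads = [], []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("avg-cpu"):
            iowait.append(float(lines[i + 1].split()[3]))  # %iowait column
        elif line.startswith("sda"):
            reads.append(float(line.split()[2]))           # Blk_read/s column
    return sum(iowait) / len(iowait), sum(reads) / len(reads)

avg_iowait, avg_read = summarize(SAMPLE)
print(f"avg %iowait: {avg_iowait:.2f}, avg Blk_read/s: {avg_read:.2f}")
```

Run over the full ten-minute log, this kind of summary makes it easy to see that iowait sits in the 90s for most of the window.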
At Fri, 15 Jul 2005 14:17:34 -0700, Jeffrey W. Baker wrote:
> Hard or soft RAID? Which controller? Many of the 3Ware controllers
> (85xx and 95xx) have extremely bad RAID 5 performance.
>
> Did you take any pgbench or other benchmark figures before you started
> using the DB?

No, unfortunately. I'm more or less just the developer for the
automation systems, and I admin the system to keep everything going. I
have very little say in the hardware used, and I don't have any physical
access to the machine; it's a province away :P. But for what this system
is, this IO seems unreasonable. I run development on a 1.4Ghz Athlon
Gentoo system with no raid, and I can't reproduce this kind of IO :(.

> -jwb
On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote:
> Here's a bit of a dump of the system that should be useful.
> [...]

These I/O numbers are not so horrible, really. 100% iowait is not
necessarily a symptom of misconfiguration. It just means you are disk
limited. With a database 20 times larger than main memory, this is no
surprise.

If I had to speculate about the best way to improve your performance, I
would say:

1a) Get a better RAID controller. The 3ware hardware RAID5 is very bad.
1b) Get more disks.
2) Get a (much) newer kernel.
3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs.

By the way, are you experiencing bad application performance, or are you
just unhappy with the iostat figures?

Regards,
jwb
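The "disk limited" diagnosis is easy to sanity-check from figures already in the thread (16 GB database, MemTotal of 903804 kB):

```python
# Back-of-the-envelope: how much larger is the database than main memory?
db_size_kb = 16 * 1024 * 1024   # 16 GB database, expressed in kB
mem_total_kb = 903804           # MemTotal from the /proc/meminfo dump above
ratio = db_size_kb / mem_total_kb
print(f"database is about {ratio:.1f}x the size of RAM")  # ~18.6x
```

So roughly a 19:1 ratio. With reads dominating the iostat output, heavy iowait is the expected behavior of a cache-starved workload rather than a sign of misconfiguration.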
At Fri, 15 Jul 2005 14:53:26 -0700, Jeffrey W. Baker wrote:
> These I/O numbers are not so horrible, really. 100% iowait is not
> necessarily a symptom of misconfiguration. It just means you are disk
> limited. With a database 20 times larger than main memory, this is no
> surprise.
>
> If I had to speculate about the best way to improve your performance, I
> would say:
>
> 1a) Get a better RAID controller. The 3ware hardware RAID5 is very bad.
> 1b) Get more disks.
> 2) Get a (much) newer kernel.
> 3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs.

Not good news :(. I can't change the hardware; hopefully a kernel update
and XFS or JFS will make an improvement. I was hoping for software raid
(it has always worked well for me), but the client didn't feel
comfortable with it :P.

> By the way, are you experiencing bad application performance, or are you
> just unhappy with the iostat figures?

It's affecting the whole system. It sends the load averages through the
roof (from 4 to 12), and processes that would normally take only a few
minutes start taking over an hour, until it clears up. Well, I guess
I'll have to drum up some more programming magic... and I'm starting to
run out of tricks... I love my job some days :$

> Regards,
> jwb
At Fri, 15 Jul 2005 14:39:36 -0600, Ron Wills wrote:

I just wanted to thank everyone for their help. I believe we found a
solution that will help with this problem, between the hardware
configuration and caching the larger tables into smaller data sets. A
valuable lesson learned from this ;)
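The "caching larger tables into smaller data sets" approach usually means maintaining summary tables that routine queries hit instead of the big tables. A hypothetical sketch of the idea, using Python's sqlite3 as a stand-in for PostgreSQL 7.4 (table and column names invented, not from the thread):

```python
# Sketch: cache an expensive aggregate of a big table into a small
# summary table, so routine SELECTs read a handful of rows instead of
# scanning 500,000+. sqlite3 stands in for the thread's PostgreSQL 7.4.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE photos (day TEXT, size_bytes INTEGER)")
cur.executemany("INSERT INTO photos VALUES (?, ?)",
                [("2005-07-14", 100), ("2005-07-14", 200), ("2005-07-15", 300)])

# The expensive aggregate runs once per update cycle (every two hours in
# the thread's workload), not once per client query...
cur.execute("CREATE TABLE photo_stats AS "
            "SELECT day, COUNT(*) AS photos, SUM(size_bytes) AS bytes "
            "FROM photos GROUP BY day")

# ...and routine SELECTs then read the small summary set.
rows = cur.execute("SELECT day, photos, bytes FROM photo_stats "
                   "ORDER BY day").fetchall()
print(rows)  # [('2005-07-14', 2, 300), ('2005-07-15', 1, 300)]
```

The trade-off is staleness between refreshes, which fits this workload since the large tables only change every two hours anyway.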