Thread: High load average with PostgreSQL 7.4.2 on debian/ibm eserver.
Hello, I'm using PostgreSQL 7.4.2 (package from backports.org) on a Debian (woody) box. The machine is IBM eServer 345 with two 2.8 Xeon CPUs, it has 1024MB of RAM and two 15k RPM SCSI disks running in hardware RAID1, which is provided by the onboard LSI Logic controller (LSI53C1030). The database consists of two rather large tables (currently about 15 million rows in one table and about 5 million in the other one). Both tables have 5 indexes (4 btree/1 hash). Application running on the server INSERTs a lot of stuff to the tables (which is not the target use of the DB, it'll add data periodically, about 300 rows per 10 minutes). Queries (SELECTs) run perfectly fine on the database, thanks to the indexes we have here probably. Performance issue, I'm experiencing here, is somewhat weird - server gets high average load (from 5 up to 15, 8 on average). Standard performance monitoring utilities (like top) show that CPUs are not loaded (below 20%, often near zero). With kernel 2.6.x which I was using earlier, top showed very high "wa" values (which indicate I/O waiting, AFAIK). I've googled some issues with 2.6 kernels and LSI Logic controllers running RAID, so I've downgraded the kernel to 2.4.26. The machine started to behave a bit better, but still high load states look weird. Unfortunately, top with 2.4 kernels does not show "wa" column, so I can't be sure if the load is caused by waiting for disks, but high idle values and high average load would suggest it. With kernel 2.6 swap was almost always 100% free, with 2.4.26 Linux eats below 5 megabytes of swapspace. PostgreSQL is running with shared_mem set to 48000, sort_mem = 4096, fsync off. Whole config is available here: http://ludojad.itpp.pl/~eleven/pg-high-load.conf I've also made some iostat report (using iostat 3 1000 as suggested in one of the posts): http://ludojad.itpp.pl/~eleven/iostat.log Any solutions I should consider? I'd be grateful getting some hints on this. -- 11.
On Tue, 29 Jun 2004 17:55:37 +0200, eleven@ludojad.itpp.pl <eleven@ludojad.itpp.pl> wrote: > Performance issue, I'm experiencing here, is somewhat > weird - server gets high average load (from 5 up to 15, > 8 on average). Standard performance monitoring > utilities (like top) show that CPUs are not loaded > (below 20%, often near zero). So ... you never actually say what the performance issue you experience is. Having a high load average is not necessarily a performance issue. What is it that you want to fix?
eleven@ludojad.itpp.pl wrote: > <>I'm using PostgreSQL 7.4.2 (package from backports.org) > on a Debian (woody) box. The machine is IBM eServer 345 > with two 2.8 Xeon CPUs, it has 1024MB of RAM and > two 15k RPM SCSI disks running in hardware RAID1, which > is provided by the onboard LSI Logic controller (LSI53C1030). <snip> > <>With kernel 2.6.x which I was using earlier, > top showed very high "wa" values (which indicate I/O waiting, AFAIK) It sounds like you are very much bound by disk I/O. Your iostat output indicates a good amount of I/O going on--I bet an iostat -x /dev/sdX would show very high await times (time in ms before an IO request to the device is serviced). If your RAID controller has a battery-backed cache, check that you have write-back (as opposed to write-through) enabled. This will cause the controller to report data written only to RAID cache and not yet flushed to disk as sync'd. You can experience large gains in write performance this way. If write-back is already enabled, or enabling it does not give a large enough performance boost, you may need to buy more disks. In general, if you have the budget for lots of disks, RAID 10 is the best you can do performance-wise; if your budget for disks is limited, RAID 5 is the next best thing. Also, you will get more bang for your buck with a larger number of 10k disks than a smaller number of 15k disks. Good luck, Bill Montgomery
Re: High load average with PostgreSQL 7.4.2 on debian/ibm eserver.
From
eleven@ludojad.itpp.pl
Date:
On Tue, Jun 29, 2004 at 09:17:36AM -0700, Marc wrote: > > Performance issue, I'm experiencing here, is somewhat > > weird - server gets high average load (from 5 up to 15, > > 8 on average). Standard performance monitoring > > utilities (like top) show that CPUs are not loaded > > (below 20%, often near zero). > So ... you never actually say what the performance issue you > experience is. Having a high load average is not necessarily a > performance issue. Well, if the server's CPUs are idle and the machine is starting to hog itself, one can suspect something bad going on. > What is it that you want to fix? Basically, I'm wondering if I'm already on the edge of performance capabilities of this machine/configuration, or maybe there's some abnormal behaviour happening (which could be noticed by somebody from this mailing list, hopefully). In particular - could someone tell me if those iostat values can tell if I'm close to upper performance boundary of fast SCSI (Ultra 320, 15k RPM) disks? -- 11.
On Tue, 2004-06-29 at 09:55, eleven@ludojad.itpp.pl wrote: > Hello, > > I'm using PostgreSQL 7.4.2 (package from backports.org) > on a Debian (woody) box. The machine is IBM eServer 345 > with two 2.8 Xeon CPUs, it has 1024MB of RAM and > two 15k RPM SCSI disks running in hardware RAID1, which > is provided by the onboard LSI Logic controller (LSI53C1030). > > The database consists of two rather large tables > (currently about 15 million rows in one table > and about 5 million in the other one). Both tables > have 5 indexes (4 btree/1 hash). > Application running on the server INSERTs a lot of > stuff to the tables (which is not the target use of > the DB, it'll add data periodically, about > 300 rows per 10 minutes). Queries (SELECTs) run perfectly > fine on the database, thanks to the indexes we have > here probably. > > Performance issue, I'm experiencing here, is somewhat > weird - server gets high average load (from 5 up to 15, > 8 on average). Standard performance monitoring > utilities (like top) show that CPUs are not loaded > (below 20%, often near zero). Anytime you have high load but low CPU utilization, you usually have an I/O bound system. Ad disks to your RAID (big RAID 5 or RAID 1+0) and make sure you have battery backed cache set to write back. also, put as memory as you can in the machine. Jumping up to 2 Gigs may help quite a bit as well.
Eleven, > In particular - could someone tell me if those iostat > values can tell if I'm close to upper performance boundary > of fast SCSI (Ultra 320, 15k RPM) disks? It's quite possible that you need to improve your disk array; certainly I would have spec'd a lot more disk than you're using (like raid 0+1 with 6 disks or RAID 5 with seven disks). However, there's the other end as well; it's quite possible that your queries are doing seq scans and other disk-intensive operations that could be avoided. Have you analyed this at all? -- -Josh Berkus Aglio Database Solutions San Francisco
eleven@ludojad.itpp.pl wrote: > Whole config is available here: > http://ludojad.itpp.pl/~eleven/pg-high-load.conf effective_cache_size = 4000 # typically 8KB each #random_page_cost = 4 # units are one sequential page fetch cost #cpu_tuple_cost = 0.01 # (same) #cpu_index_tuple_cost = 0.001 # (same) #cpu_operator_cost = 0.0025 # (same) These values are too higher for your hardware, try to execute the explain analyze for the queries that are running on your box and repeat it lowering these values, I bet postgres is running seq scan instead of an index scan. These are the value that I use for a configuration closer to your: effective_cache_size = 20000 random_page_cost = 2.0 cpu_tuple_cost = 0.005 cpu_index_tuple_cost = 0.0005 cpu_operator_cost = 0.0025 last question, do you use the autovacuum daemon ? If no => you have to use it If yes => did you apply the patch that will not fail with big tables like yours ? if you can post the autovacuum daemon log ( last lines ). Regards Gaetano Mendola