Thread: Most effective tuning choices for busy website?
Hello, I'm the fellow who was interviewed in the fall about using PostgreSQL on 1-800-Save-A-Pet.com: http://techdocs.postgresql.org/techdocs/interview-stosberg.php The site traffic continues to grow, and we are now seeing parts of the day where the CPU load (according to MRTG graphs) on the database server is stuck at 100%. I would like to improve this, and I'm not sure where to look first. The machine is a dedicated PostgreSQL server which two web server boxes talk to. I've used PQA to analyze my queries and happy overall with how they are running. About 55% of the query time is going to variations of the pet searching query, which seems like where it should be going. The query is frequent and complex. It has already been combed over for appropriate indexing. I'm more interested at this point in tuning the software and hardware infrastructure, but would like to get a sense about which choices will bring the greatest reward. Let me explain some avenues I'm considering. - We are currently running 7.4. If I upgrade to 8.0 and DBD::Pg 1.42, then the "server side prepare" feature will be available for use. We do run the same queries a number of times. - PhpPgAds seems to sucking up about 7.5% of our query time and is unrelated to the core application. We could move this work to another machine. The queries it generates seem like they have some room to optimized, or simply don't need to be run in some cases. However, I would like to stay out of modifying third-party code and PHP if possible. - I saw the hardware tip to "Separate the Transaction Log from the Database". We have about 60% SELECT statements and 14% UPDATE statements. Focusing more on SELECT performance seems more important for us. - We have tried to tune 'shared_buffers' some, but haven't seen a noticeable performance improvement. Our hardware: Dual 3 Ghz processors 3 GB RAM, running on FreeBSD. I'm not quite sure how to check our average connection usage, but maybe this is helpful: When I do: select count(*) from pg_stat_activity ; I get values around 170. We have these values: max_connections = 400 shared_buffers = 4096 Most other values in postgresql.conf are still at the their defaults. Any suggestions are which avenues might offer the most bang for the buck are appreciated! ( I have already found: http://www.powerpostgresql.com/PerfList/ and it has been a very helpful source of suggestions. ) Mark
Mark Stosberg wrote: > I've used PQA to analyze my queries and happy overall with how they are > running. About 55% of the query time is going to variations of the pet > searching query, which seems like where it should be going. The query is > frequent and complex. It has already been combed over for appropriate > indexing. It might be worth posting the EXPLAIN ANALYZE and relevant schema definitions for this query, in case there is additional room for optimization. > Our hardware: Dual 3 Ghz processors 3 GB RAM, running on FreeBSD. Disk? You are presumably using Xeon processors, right? If so, check the list archives for information on the infamous "context switching storm" that causes performance problems for some people using SMP Xeons. -Neil
Neil Conway wrote: > Mark Stosberg wrote: >> I've used PQA to analyze my queries and happy overall with how they are >> running. About 55% of the query time is going to variations of the pet >> searching query, which seems like where it should be going. The query is >> frequent and complex. It has already been combed over for appropriate >> indexing. > > It might be worth posting the EXPLAIN ANALYZE and relevant schema > definitions for this query, in case there is additional room for > optimization. > >> Our hardware: Dual 3 Ghz processors 3 GB RAM, running on FreeBSD. > > Disk? > > You are presumably using Xeon processors, right? If so, check the list > archives for information on the infamous "context switching storm" that > causes performance problems for some people using SMP Xeons. I wanted to follow-up to report a positive outcome to tuning this Xeon SMP machine on FreeBSD. We applied the following techniques, and saw the average CPU usage drop by about 25%. - in /etc/sysctl.conf, we set it to use raw RAM for shared memory: kern.ipc.shm_use_phys=1 - We updated our kernel config and postmaster.conf to set shared_buffers to about 8000. - We disabled hyperthreading in the BIOS, which had a label like "Logical Processors? : Disabled". I recall there was tweak my co-worker made that's not on my list. I realize it's not particularly scientific because we changed several things at once...but at least it is working well enough for now. Mark
On 06/01/2005-07:19PM, Mark Stosberg wrote: > > - I saw the hardware tip to "Separate the Transaction Log from the > Database". We have about 60% SELECT statements and 14% UPDATE > statements. Focusing more on SELECT performance seems more important > for us. > I would think that would help SELECT If the spindle isn't busy writing Transaction log it can be reading for your SELECTs. You did say you were CPU bound though. -- ------------------------------------------------------------ Christopher Weimann http://www.k12usa.com K12USA.com Cool Tools for Schools! ------------------------------------------------------------