Re: Finding bottleneck - Mailing list pgsql-performance
From | Kari Lavikka |
---|---|
Subject | Re: Finding bottleneck |
Date | |
Msg-id | Pine.HPX.4.62.0508191413480.3361@purple.bdb.fi Whole thread Raw |
In response to | Re: Finding bottleneck (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Finding bottleneck
|
List | pgsql-performance |
On Mon, 8 Aug 2005, Tom Lane wrote: > What that sounds like to me is a machine with inadequate disk I/O bandwidth. > Your earlier comment that checkpoint drives the machine into the ground > fits right into that theory, too. You said there is "almost no IO-wait" > but are you sure you are measuring that correctly? Reducing checkpoint_timeout to 600 seconds had a positive effect. Previous value was 1800 seconds. We have a spare disk array from the old server and I'm planning to use it as a tablespace for the comment table (the 100M+ rows one) as Ron suggested. >> Queries accumulate and when checkpointing is over, there can be >> something like 400 queries running but over 50% of cpu is just idling. > > 400 queries? Are you launching 400 separate backends to do that? > Some sort of connection pooling seems like a good idea, if you don't > have it in place already. If the system's effective behavior in the > face of heavy load is to start even more concurrent backends, that > could easily drive things into the ground. Ok, I implemented connection pooling using pgpool and it increased performance a lot! We are now delivering about 1500 dynamic pages a second without problems. Each of the eight single-cpu webservers are running a pgpool instance with 20 connections. However, those configuration changes didn't have significant effect to oprofile results. AtEOXact_CatCache consumes even more cycles. This isn't a problem right now but it may be in the future... CPU: AMD64 processors, speed 2190.23 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000 samples % symbol name 1147870 21.1602 AtEOXact_CatCache 187466 3.4558 hash_seq_search 174357 3.2142 AllocSetAlloc 170896 3.1504 nocachegetattr 131724 2.4282 ExecMakeFunctionResultNoSets 125292 2.3097 SearchCatCache 117264 2.1617 StrategyDirtyBufferList 105741 1.9493 hash_search 98245 1.8111 FunctionCall2 97878 1.8043 yyparse 90932 1.6763 LWLockAcquire 83555 1.5403 LWLockRelease 81045 1.4940 _bt_compare ... and so on ... ----->8 Signigicant rows from current postgresql.conf 8<----- max_connections = 768 # unnecessarily large with connection pooling shared_buffers = 15000 work_mem = 2048 maintenance_work_mem = 32768 max_fsm_pages = 1000000 max_fsm_relations = 5000 bgwriter_percent = 2 fsync = true wal_buffers = 512 checkpoint_segments = 200 # less would probably be enuff with 600sec timeout checkpoint_timeout = 600 effective_cache_size = 500000 random_page_cost = 1.5 default_statistics_target = 150 stats_start_collector = true stats_command_string = true |\__/| ( oo ) Kari Lavikka - tuner@bdb.fi - (050) 380 3808 __ooO( )Ooo_______ _____ ___ _ _ _ _ _ _ _ ""
pgsql-performance by date: