Re: Performance Bottleneck - Mailing list pgsql-performance
From | Martin Foster |
---|---|
Subject | Re: Performance Bottleneck |
Date | |
Msg-id | X0ZPc.23543$yT2.23053@clgrps13 Whole thread Raw |
In response to | Re: Performance Bottleneck (Gaetano Mendola <mendola@bigfoot.com>) |
Responses |
Re: Performance Bottleneck
Re: Performance Bottleneck Re: Performance Bottleneck |
List | pgsql-performance |
Gaetano Mendola wrote: > Martin Foster wrote: > >> I run a Perl/CGI driven website that makes extensive use of PostgreSQL >> (7.4.3) for everything from user information to formatting and display >> of specific sections of the site. The server itself, is a dual >> processor AMD Opteron 1.4Ghz w/ 2GB Ram and 2 x 120GB hard drives >> mirrored for redundancy running under FreeBSD 5.2.1 (AMD64). >> >> Recently loads on the site have increased during peak hours to the >> point of showing considerable loss in performance. This can be >> observed when connections move from the 120 concurrent connections to >> PostgreSQL to roughly 175 or more. Essentially, the machine seems >> to struggle to keep up with continual requests and slows down >> respectively as resources are tied down. >> >> Code changes have been made to the scripts to essentially back off in >> high load working environments which have worked to an extent. >> However, as loads continue to increase the database itself is not >> taking well to the increased traffic taking place. >> >> Having taken a look at 'Tuning PostgreSQL for Performance' >> (http://www.varlena.com/GeneralBits/Tidbits/perf.html) using it as >> best I could in order to set my settings. However, even with >> statistics disabled and ever setting tweaked things still consider to >> deteriorate. >> >> Is there anything anyone can recommend in order to give the system a >> necessary speed boost? It would seem to me that a modest dataset of >> roughly a Gig combined with that type of hardware should be able to >> handle substantially more load then what it is. Can anyone provide me >> with clues as where to pursue? Would disabling 'fsync' provide more >> performance if I choose that information may be lost in case of a crash? >> >> If anyone needs access to logs, settings et cetera. Please ask, I >> simply wish to test the waters first on what is needed. Thanks! > > > Tell us about your tipical queries, show us your configuration file. > The access are only in read only mode or do you have concurrent writers > and readers ? During peak hours your processors are tied to 100% ? > What say the vmstat and the iostat ? > > May be you are not using indexes some where, or may be yes but the > planner is not using it... In two words we needs other informations > in order to help you. > > > > Regards > Gaetano Mendola > > I included all the files in attachments, which will hopefully cut down on any replied to Emails. As for things like connection pooling, the web server makes use of Apache::DBI to pool the connections for the Perl scripts being driven on that server. For the sake of being thorough, a quick 'apachectl status' was thrown in when the database was under a good load. Since it would rather slow things down to wait for the servers to really get bogged down with load averages of 20.00 and more, I opted to choose a period of time where we are a bit busier then normal. You will be able to see how the system behaves under a light load and subsequently reaching 125 or so concurrent connections. The queries themselves are simple, normally drawing information from one table with few conditions or in the most complex cases using joins on two table or sub queries. These behave very well and always have, the problem is that these queries take place in rather large amounts due to the dumb nature of the scripts themselves. Over a year ago when I was still using MySQL for the project, the statistics generated would report well over 65 queries per second under loads ranging from 130 to 160 at peak but averaged over the weeks of operation. Looking at the Apache status, one can see that it averages only roughly 2.5 requests per second giving you a slight indication as to what is taking place. A quick run of 'systat -ifstat' shows the following graph: /0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10 Load Average >>>>>>>>>>> Interface Traffic Peak Total lo0 in 0.000 KB/s 0.000 KB/s 37.690 GB out 0.000 KB/s 0.000 KB/s 37.690 GB em0 in 34.638 KB/s 41.986 KB/s 28.998 GB out 70.777 KB/s 70.777 KB/s 39.553 GB Em0 is a full duplexed 100Mbs connection to an internal switch that supports the servers directly. Load on the loopback was cut down considerably once I stopped using pg_autovaccum since its performance benefits under low load were buried under the hindrance it caused when traffic was high. I am sure that there are some places that could benefit from some optimization. Especially in the case of indexes, however as a whole the problem seems to be related more to the massive onslaught of queries then it does anything else. Also note that some of these scripts run for longer durations even if they are web based. Some run as long as 30 minutes, making queries to the database from periods of wait from five seconds to twenty-five seconds. Under high duress the timeouts should back out, based on the time needed for the query to respond, normally averaging 0.008 seconds. Does this help at all, or is more detail needed on the matter? Martin Foster martin@ethereal-realms.org # ----------------------------- # PostgreSQL configuration file # ----------------------------- # # This file consists of lines of the form: # # name = value # # (The '=' is optional.) White space may be used. Comments are introduced # with '#' anywhere on a line. The complete list of option names and # allowed values can be found in the PostgreSQL documentation. The # commented-out settings shown in this file represent the default values. # # Any option can also be given as a command line switch to the # postmaster, e.g. 'postmaster -c log_connections=on'. Some options # can be changed at run-time with the 'SET' SQL command. # # This file is read on postmaster startup and when the postmaster # receives a SIGHUP. If you edit the file on a running system, you have # to SIGHUP the postmaster for the changes to take effect, or use # "pg_ctl reload". #--------------------------------------------------------------------------- # CONNECTIONS AND AUTHENTICATION #--------------------------------------------------------------------------- # - Connection Settings - tcpip_socket = true ssl = false max_connections = 512 superuser_reserved_connections = 2 rendezvous_name='io' port = 5432 #unix_socket_directory = '/var/postgres' #unix_socket_group = 'postgres' #unix_socket_permissions = 0777 # octal # - Security & Authentication - authentication_timeout = 20 # 1-600, in seconds password_encryption = true db_user_namespace = false #--------------------------------------------------------------------------- # RESOURCE USAGE (except WAL) #--------------------------------------------------------------------------- # - Memory - shared_buffers = 8192 # min 16, at least max_connections*2, 8KB each sort_mem = 2048 # min 64, size in KB vacuum_mem = 65536 # min 1024, size in KB # - Free Space Map - #max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each #max_fsm_relations = 1000 # min 100, ~50 bytes each # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 #preload_libraries = '' #--------------------------------------------------------------------------- # WRITE AHEAD LOG #--------------------------------------------------------------------------- # - Settings - fsync = true # turns forced synchronization on or off wal_sync_method = fsync # the default varies across platforms: # fsync, fdatasync, open_sync, or open_datasync wal_buffers = 128 # min 4, 8KB each # - Checkpoints - checkpoint_segments = 24 # in logfile segments, min 1, 16MB each checkpoint_timeout = 300 # range 30-3600, in seconds checkpoint_warning = 30 # 0 is off, in seconds commit_delay = 15000 # range 0-100000, in microseconds commit_siblings = 64 # range 1-1000 #--------------------------------------------------------------------------- # QUERY TUNING #--------------------------------------------------------------------------- # - Planner Method Enabling - enable_hashagg = true enable_hashjoin = true enable_indexscan = true enable_mergejoin = true enable_nestloop = true enable_seqscan = true enable_sort = true enable_tidscan = true # - Planner Cost Constants - effective_cache_size = 16000 # typically 8KB each random_page_cost = 2 # units are one sequential page fetch cost cpu_tuple_cost = 0.01 # (same) cpu_index_tuple_cost = 0.001 # (same) cpu_operator_cost = 0.0025 # (same) # - Genetic Query Optimizer - geqo = false geqo_threshold = 11 geqo_effort = 1 geqo_generations = 0 geqo_pool_size = 0 # default based on tables in statement, # range 128-1024 geqo_selection_bias = 2.0 # range 1.5-2.0 # - Other Planner Options - default_statistics_target = 100 # range 1-1000 from_collapse_limit = 8 join_collapse_limit = 8 # 1 disables collapsing of explicit JOINs #--------------------------------------------------------------------------- # ERROR REPORTING AND LOGGING #--------------------------------------------------------------------------- # - Syslog - #syslog = 2 # range 0-2; 0=stdout; 1=both; 2=syslog #syslog_facility = 'LOCAL0' #syslog_ident = 'postgres' # - When to Log - client_min_messages = notice # Values, in order of decreasing detail: # debug5, debug4, debug3, debug2, debug1, # log, info, notice, warning, error log_min_messages = info # Values, in order of decreasing detail: # debug5, debug4, debug3, debug2, debug1, # info, notice, warning, error, log, fatal, # panic log_error_verbosity = default # terse, default, or verbose messages log_min_error_statement = panic # Values in order of increasing severity: # debug5, debug4, debug3, debug2, debug1, # info, notice, warning, error, panic(off) log_min_duration_statement = -1 # Log all statements whose # execution time exceeds the value, in # milliseconds. Zero prints all queries. # Minus-one disables. silent_mode = false # DO NOT USE without Syslog! # - What to Log - #debug_print_parse = false #debug_print_rewritten = false #debug_print_plan = false #debug_pretty_print = false #log_connections = false #log_duration = false #log_pid = false #log_statement = false #log_timestamp = false #log_hostname = false #log_source_port = false #--------------------------------------------------------------------------- # RUNTIME STATISTICS #--------------------------------------------------------------------------- # - Statistics Monitoring - log_parser_stats = false log_planner_stats = false log_executor_stats = false log_statement_stats = false # - Query/Index Statistics Collector - stats_start_collector = false stats_command_string = false stats_block_level = false stats_row_level = false stats_reset_on_server_start = true #--------------------------------------------------------------------------- # CLIENT CONNECTION DEFAULTS #--------------------------------------------------------------------------- # - Statement Behavior - search_path = '$user,public' # schema names check_function_bodies = true default_transaction_isolation = 'read committed' default_transaction_read_only = false statement_timeout = 25000 # 0 is disabled, in milliseconds # - Locale and Formatting - datestyle = 'iso, us' #timezone = unknown # actually, defaults to TZ environment setting #australian_timezones = false #extra_float_digits = 0 # min -15, max 2 #client_encoding = sql_ascii # actually, defaults to database encoding # These settings are initialized by initdb -- they may be changed lc_messages = 'C' # locale for system error message strings lc_monetary = 'C' # locale for monetary formatting lc_numeric = 'C' # locale for number formatting lc_time = 'C' # locale for time formatting # - Other Defaults - explain_pretty_print = true #dynamic_library_path = '$libdir' #max_expr_depth = 10000 # min 10 #--------------------------------------------------------------------------- # LOCK MANAGEMENT #--------------------------------------------------------------------------- deadlock_timeout = 5000 # in milliseconds #max_locks_per_transaction = 64 # min 10, ~260*max_connections bytes each #--------------------------------------------------------------------------- # VERSION/PLATFORM COMPATIBILITY #--------------------------------------------------------------------------- # - Previous Postgres Versions - #add_missing_from = true #regex_flavor = advanced # advanced, extended, or basic #sql_inheritance = true # - Other Platforms & Clients - #transform_null_equals = false martin@elara ~$ apachectl status Apache Server Status for www.ethereal-realms.org Server Version: Apache/1.3.29 (Unix) mod_perl/1.27 mod_gzip/1.3.26.1a mod_ssl/2.8.16 OpenSSL/0.9.7c Server Built: May 24 2004 00:32:11 _________________________________________________________________ Current Time: Tuesday, 03-Aug-2004 21:34:37 MDT Restart Time: Tuesday, 03-Aug-2004 03:54:21 MDT Parent Server Generation: 0 Server uptime: 17 hours 40 minutes 16 seconds Total accesses: 168852 - Total Traffic: 406.0 MB CPU Usage: u283.96 s41.27 cu92.51 cs13.68 - .678% CPU load 2.65 requests/sec - 6.5 kB/second - 2521 B/request 127 requests currently being processed, 59 idle servers W_WWWW_WW.WWKKWK__W._WW_.WK._KK._..K.WW__WK_WW_WWWK_.__WWW_W.WWW __..__W._W__WWWWWW._WKWW_W_KKW_WWW___K_.WKWW__KWWWWW_WWWWKKKWW.W __.WK_WW.W_W_WWWW_KWWWWW_KW.WW.W__W_W___W___WWWKW__W_W._.WWW.WWK W___WKKW.K_W.W...KW....WW_...................................... ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ Scoreboard Key: "_" Waiting for Connection, "S" Starting up, "R" Reading Request, "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup, "L" Logging, "G" Gracefully finishing, "." Open slot with no current process martin@io ~$ vmstat procs memory page disks faults cpu r b w avm fre flt re pi po fr sr ad4 ad6 in sy cs us sy id 0 0 0 498532 122848 3306 0 0 0 740 0 0 0 788 0 1675 16 21 63 martin@io ~$ pstat -s Device 1K-blocks Used Avail Capacity /dev/ar0s1b 4194304 24 4194280 0% martin@io ~$ iostat tty ad4 ad6 ar0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 17 0.50 0 0.00 0.50 0 0.00 20.28 23 0.45 16 0 20 1 63 martin@io ~# ps -U postgres | wc -l 127 martin@io ~$ ps -uax USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND postgres 32084 0.0 0.2 91616 3764 p0- R Mon12PM 4:08.99 /usr/local/bin/postmaster -D /var/postgres (postgres) postgres 80333 0.0 2.1 94620 44372 ?? S 8:57PM 0:01.00 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80599 0.0 2.1 94652 44780 ?? S 8:59PM 0:00.97 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80616 0.0 2.4 94424 50396 ?? S 8:59PM 0:00.89 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80715 0.0 2.2 94444 46804 ?? S 9:00PM 0:00.68 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80788 0.0 2.1 94424 43944 ?? S 9:00PM 0:00.93 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80811 0.0 2.1 94424 43884 ?? S 9:00PM 0:00.94 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80902 0.0 2.1 94424 43380 ?? S 9:01PM 0:00.76 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 80949 0.0 2.2 94424 45248 ?? S 9:01PM 0:00.67 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 81020 0.0 2.1 94424 42924 ?? S 9:02PM 0:00.74 postmaster: ethereal ethereal 192.168.1.6 idle in trans martin@io ~# date Tue Aug 3 21:28:53 MDT 2004 martin@io ~$ systat -vmstat 1 users Load 4.15 3.76 3.54 Aug 3 21:30 Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER Tot Share Tot Share Free in out in out Act 293708 4120 482092 6736 132204 count All 1910640 5224 2462084 17060 pages Interrupts Proc:r p d s w Csw Trp Sys Int Sof Flt 212 cow 728 total 5 139 215023802 3112 1078 58023548 397288 wire 100 0: clk 295304 act 6: fdc0 59.6%Sys 0.9%Intr 38.9%User 0.0%Nice 0.6%Idl 1151936 inact 128 8: rtc | | | | | | | | | | 92748 cache 53 10: bge ==============================>>>>>>>>>>>>>>>>>> 39456 free 447 11: em0 daefr 14: ata Namei Name-cache Dir-cache 1345 prcfr Calls hits % hits % react 2761 2761 100 pdwake 1111 zfod pdpgs Disks ad4 ad6 ar0 fd0 1059 ofod intrn KB/t 0.00 0.00 16.00 0.00 95 %slo-z 204096 buf tps 0 0 35 0 1587 tfree 591 dirtybuf MB/s 0.00 0.00 0.54 0.00 120525 desiredvnodes % busy 0 0 5 0 30131 numvnodes 22695 freevnodes martin@io ~$ vmstat procs memory page disks faults cpu r b w avm fre flt re pi po fr sr ad4 ad6 in sy cs us sy id 0 0 0 317216 128932 3121 0 0 0 736 0 0 0 789 0 1682 16 21 63 martin@io ~$ pstat -s Device 1K-blocks Used Avail Capacity /dev/ar0s1b 4194304 24 4194280 0% martin@io ~$ iostat tty ad4 ad6 ar0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 17 0.50 0 0.00 0.50 0 0.00 20.32 23 0.45 16 0 20 1 63 martin@io ~$ ps -U postgres | wc -l 72 martin@io ~$ ps -uax USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND postgres 32084 0.0 0.2 91616 3764 p0- S Mon12PM 3:17.42 /usr/local/bin/postmaster -D /var/postgres (postgres) postgres 51100 0.0 1.7 94412 35828 ?? S 3:36PM 0:01.02 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51134 0.0 1.8 94620 37060 ?? S 3:36PM 0:01.04 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51147 0.0 1.8 94620 36980 ?? S 3:37PM 0:00.66 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51240 0.0 1.8 94412 37156 ?? S 3:38PM 0:00.62 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51276 0.0 0.7 94412 15180 ?? S 3:38PM 0:00.41 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51277 0.0 1.9 94492 40776 ?? S 3:38PM 0:00.57 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51287 0.0 1.8 94620 37228 ?? S 3:38PM 0:00.63 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51368 0.0 1.7 94412 35348 ?? S 3:40PM 0:00.55 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51493 0.0 1.7 94412 35920 ?? S 3:42PM 0:00.49 postmaster: ethereal ethereal 192.168.1.6 idle in trans postgres 51495 0.0 1.7 94412 36192 ?? S 3:42PM 0:00.89 postmaster: ethereal ethereal 192.168.1.6 idle in trans martin@io ~$ date Tue Aug 3 16:09:06 MDT 2004 martin@io ~$ systat -vmstat 1 users Load 0.53 0.54 0.47 Aug 3 16:14 Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER Tot Share Tot Share Free in out in out Act 205172 4312 326472 6736 120128 count All 1909516 4944 2215992 11896 pages Interrupts Proc:r p d s w Csw Trp Sys Int Sof Flt 197 cow 642 total 96 177814901 2186 947 20014746 386844 wire 100 0: clk 206544 act 6: fdc0 28.8%Sys 0.5%Intr 26.8%User 0.0%Nice 43.9%Idl 1263216 inact 128 8: rtc | | | | | | | | | | 69644 cache 55 10: bge ==============+>>>>>>>>>>>>> 50484 free 359 11: em0 daefr 14: ata Namei Name-cache Dir-cache 1235 prcfr Calls hits % hits % react 1709 1709 100 pdwake 1000 zfod pdpgs Disks ad4 ad6 ar0 fd0 938 ofod intrn KB/t 0.00 0.00 16.00 0.00 93 %slo-z 204096 buf tps 0 0 31 0 1482 tfree 56 dirtybuf MB/s 0.00 0.00 0.49 0.00 120525 desiredvnodes % busy 0 0 2 0 30131 numvnodes 7645 freevnodes
pgsql-performance by date: