Thread: Performance Bottleneck
I run a Perl/CGI driven website that makes extensive use of PostgreSQL (7.4.3) for everything from user information to formatting and display of specific sections of the site. The server itself is a dual processor AMD Opteron 1.4GHz with 2GB RAM and 2 x 120GB hard drives mirrored for redundancy, running under FreeBSD 5.2.1 (AMD64).

Recently, loads on the site have increased during peak hours to the point of showing considerable loss in performance. This can be observed when connections move from about 120 concurrent connections to PostgreSQL to roughly 175 or more. Essentially, the machine seems to struggle to keep up with continual requests and slows down accordingly as resources are tied up.

Code changes have been made to the scripts to essentially back off under high load, which has worked to an extent. However, as loads continue to increase, the database itself is not taking well to the increased traffic.

I have taken a look at 'Tuning PostgreSQL for Performance' (http://www.varlena.com/GeneralBits/Tidbits/perf.html) and used it as best I could to choose my settings. However, even with statistics disabled and every setting tweaked, things still continue to deteriorate.

Is there anything anyone can recommend in order to give the system a necessary speed boost? It would seem to me that a modest dataset of roughly a gig combined with that type of hardware should be able to handle substantially more load than it is. Can anyone provide me with clues as to where to pursue this? Would disabling 'fsync' provide more performance if I accept that information may be lost in case of a crash?

If anyone needs access to logs, settings, et cetera, please ask; I simply wish to test the waters first on what is needed. Thanks!

Martin Foster
martin@ethereal-realms.org
On Tue, 3 Aug 2004, Martin Foster wrote:

> to roughly 175 or more. Essentially, the machine seems to struggle
> to keep up with continual requests and slows down accordingly as
> resources are tied up.

I suggest you try to find the queries that are slow and check whether the plans for those queries are optimal.

There are logging options for logging queries that run longer than a user-set limit. That can help in finding the slow queries. Just doing some logging for some typical page fetches often shows things that can be done better. For example, it's not uncommon to see the same information being pulled several times by mistake.

Maybe you can also try something like connection pooling. I'm not sure how much that can give, but for small queries the connection time is usually the big part.

> Would disabling 'fsync' provide more performance if I accept that
> information may be lost in case of a crash?

I would not do that. In most cases the performance increase is modest, and the risk of data corruption after a crash is much bigger, so it's not worth it. If you have a lot of small inserts then it might be faster with this, but if possible it's much better to try to do more work per transaction instead.

-- 
/Dennis Björklund
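For reference, the logging option Dennis mentions is log_min_duration_statement, which is present (but disabled) in the configuration posted later in this thread. A minimal sketch of enabling it, with a threshold chosen purely for illustration:

    # postgresql.conf -- log any statement that runs longer than 200 ms
    # (value is in milliseconds; -1 disables, 0 logs every statement)
    log_min_duration_statement = 200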
Hello,

It sounds to me like you are I/O bound. 2 x 120GB hard drives just isn't going to cut it with that many connections (as a general rule). Are you swapping?

Sincerely,

Joshua D. Drake

Martin Foster wrote:
> I run a Perl/CGI driven website that makes extensive use of PostgreSQL
> (7.4.3) for everything from user information to formatting and display
> of specific sections of the site. The server itself is a dual
> processor AMD Opteron 1.4GHz with 2GB RAM and 2 x 120GB hard drives
> mirrored for redundancy, running under FreeBSD 5.2.1 (AMD64).

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
PostgreSQL support, programming, shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
Martin Foster wrote:
> I run a Perl/CGI driven website that makes extensive use of PostgreSQL
> (7.4.3) for everything from user information to formatting and display
> of specific sections of the site. The server itself is a dual
> processor AMD Opteron 1.4GHz with 2GB RAM and 2 x 120GB hard drives
> mirrored for redundancy, running under FreeBSD 5.2.1 (AMD64).
>
> [...]
>
> Is there anything anyone can recommend in order to give the system a
> necessary speed boost? It would seem to me that a modest dataset of
> roughly a gig combined with that type of hardware should be able to
> handle substantially more load than it is. Can anyone provide me with
> clues as to where to pursue this? Would disabling 'fsync' provide more
> performance if I accept that information may be lost in case of a crash?

Tell us about your typical queries, and show us your configuration file. Is the access read-only, or do you have concurrent writers and readers? During peak hours, are your processors tied up at 100%? What do vmstat and iostat say?

Maybe you are missing an index somewhere, or maybe one exists but the planner is not using it... In two words: we need more information in order to help you.

Regards
Gaetano Mendola
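For what it's worth, the most useful numbers here are interval samples taken while the slowdown is actually happening, rather than since-boot cumulative figures. On FreeBSD that might look like the following (the 5-second interval is just an example):

    $ vmstat 5       # one line every 5 seconds: run queue, paging, faults, CPU
    $ iostat -w 5    # per-disk KB/t, tps and MB/s every 5 seconds
    $ top            # live per-process CPU and memory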
Gaetano Mendola wrote:
> Tell us about your typical queries, and show us your configuration
> file. Is the access read-only, or do you have concurrent writers and
> readers? During peak hours, are your processors tied up at 100%?
> What do vmstat and iostat say?
>
> Maybe you are missing an index somewhere, or maybe one exists but the
> planner is not using it... In two words: we need more information in
> order to help you.

I included all the files as attachments, which will hopefully cut down on back-and-forth emails. As for things like connection pooling, the web server makes use of Apache::DBI to pool the connections for the Perl scripts being driven on that server. For the sake of being thorough, a quick 'apachectl status' was thrown in while the database was under a good load.

Since it would rather slow things down to wait for the servers to really get bogged down with load averages of 20.00 and more, I opted instead for a period of time when we are a bit busier than normal. You will be able to see how the system behaves under a light load and subsequently reaching 125 or so concurrent connections.

The queries themselves are simple, normally drawing information from one table with few conditions or, in the most complex cases, using joins on two tables or subqueries. These behave very well and always have; the problem is that these queries take place in rather large numbers due to the dumb nature of the scripts themselves.
Over a year ago, when I was still using MySQL for the project, the statistics generated would report well over 65 queries per second under loads ranging from 130 to 160 users at peak, averaged over weeks of operation. Looking at the Apache status, one can see that it now averages only roughly 2.5 requests per second, which gives a slight indication as to what is taking place.

A quick run of 'systat -ifstat' shows the following:

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   >>>>>>>>>>>

      Interface          Traffic               Peak                Total
        lo0  in      0.000 KB/s          0.000 KB/s           37.690 GB
             out     0.000 KB/s          0.000 KB/s           37.690 GB
        em0  in     34.638 KB/s         41.986 KB/s           28.998 GB
             out    70.777 KB/s         70.777 KB/s           39.553 GB

em0 is a full-duplex 100Mbps connection to an internal switch that supports the servers directly. Load on the loopback was cut down considerably once I stopped using pg_autovacuum, since its performance benefits under low load were buried under the hindrance it caused when traffic was high.

I am sure that there are some places that could benefit from some optimization, especially in the case of indexes. However, as a whole the problem seems to be related more to the massive onslaught of queries than to anything else.

Also note that some of these scripts run for longer durations even if they are web based. Some run as long as 30 minutes, making queries to the database with wait periods of five to twenty-five seconds between them. Under high duress the timeouts back off, based on the time needed for a query to respond, which normally averages 0.008 seconds.

Does this help at all, or is more detail needed on the matter?

Martin Foster
martin@ethereal-realms.org

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload".
#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

tcpip_socket = true
ssl = false
max_connections = 512
superuser_reserved_connections = 2
rendezvous_name = 'io'
port = 5432
#unix_socket_directory = '/var/postgres'
#unix_socket_group = 'postgres'
#unix_socket_permissions = 0777 # octal

# - Security & Authentication -

authentication_timeout = 20     # 1-600, in seconds
password_encryption = true
db_user_namespace = false

#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 8192           # min 16, at least max_connections*2, 8KB each
sort_mem = 2048                 # min 64, size in KB
vacuum_mem = 65536              # min 1024, size in KB

# - Free Space Map -

#max_fsm_pages = 20000          # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000       # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''

#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

fsync = true                    # turns forced synchronization on or off
wal_sync_method = fsync         # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 128               # min 4, 8KB each

# - Checkpoints -

checkpoint_segments = 24        # in logfile segments, min 1, 16MB each
checkpoint_timeout = 300        # range 30-3600, in seconds
checkpoint_warning = 30         # 0 is off, in seconds
commit_delay = 15000            # range 0-100000, in microseconds
commit_siblings = 64            # range 1-1000

#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Enabling -

enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = true
enable_sort = true
enable_tidscan = true

# - Planner Cost Constants -

effective_cache_size = 16000    # typically 8KB each
random_page_cost = 2            # units are one sequential page fetch cost
cpu_tuple_cost = 0.01           # (same)
cpu_index_tuple_cost = 0.001    # (same)
cpu_operator_cost = 0.0025      # (same)

# - Genetic Query Optimizer -

geqo = false
geqo_threshold = 11
geqo_effort = 1
geqo_generations = 0
geqo_pool_size = 0              # default based on tables in statement,
                                # range 128-1024
geqo_selection_bias = 2.0       # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 100 # range 1-1000
from_collapse_limit = 8
join_collapse_limit = 8         # 1 disables collapsing of explicit JOINs

#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Syslog -

#syslog = 2                     # range 0-2; 0=stdout; 1=both; 2=syslog
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

client_min_messages = notice    # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   log, info, notice, warning, error
log_min_messages = info         # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic
log_error_verbosity = default   # terse, default, or verbose messages
log_min_error_statement = panic # Values in order of increasing severity:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, panic(off)
log_min_duration_statement = -1 # Log all statements whose
                                # execution time exceeds the value, in
                                # milliseconds. Zero prints all queries.
                                # Minus-one disables.
silent_mode = false             # DO NOT USE without Syslog!

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_duration = false
#log_pid = false
#log_statement = false
#log_timestamp = false
#log_hostname = false
#log_source_port = false

#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false

# - Query/Index Statistics Collector -

stats_start_collector = false
stats_command_string = false
stats_block_level = false
stats_row_level = false
stats_reset_on_server_start = true

#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

search_path = '$user,public'    # schema names
check_function_bodies = true
default_transaction_isolation = 'read committed'
default_transaction_read_only = false
statement_timeout = 25000       # 0 is disabled, in milliseconds

# - Locale and Formatting -

datestyle = 'iso, us'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'C'               # locale for system error message strings
lc_monetary = 'C'               # locale for monetary formatting
lc_numeric = 'C'                # locale for number formatting
lc_time = 'C'                   # locale for time formatting

# - Other Defaults -

explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000         # min 10

#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

deadlock_timeout = 5000         # in milliseconds
#max_locks_per_transaction = 64 # min 10, ~260*max_connections bytes each

#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false


martin@elara ~$ apachectl status

Apache Server Status for www.ethereal-realms.org

Server Version: Apache/1.3.29 (Unix) mod_perl/1.27 mod_gzip/1.3.26.1a
                mod_ssl/2.8.16 OpenSSL/0.9.7c
Server Built: May 24 2004 00:32:11
_________________________________________________________________

Current Time: Tuesday, 03-Aug-2004 21:34:37 MDT
Restart Time: Tuesday, 03-Aug-2004 03:54:21 MDT
Parent Server Generation: 0
Server uptime: 17 hours 40 minutes 16 seconds
Total accesses: 168852 - Total Traffic: 406.0 MB
CPU Usage: u283.96 s41.27 cu92.51 cs13.68 - .678% CPU load
2.65 requests/sec - 6.5 kB/second - 2521 B/request
127 requests currently being processed, 59 idle servers
W_WWWW_WW.WWKKWK__W._WW_.WK._KK._..K.WW__WK_WW_WWWK_.__WWW_W.WWW
__..__W._W__WWWWWW._WKWW_W_KKW_WWW___K_.WKWW__KWWWWW_WWWWKKKWW.W
__.WK_WW.W_W_WWWW_KWWWWW_KW.WW.W__W_W___W___WWWKW__W_W._.WWW.WWK
W___WKKW.K_W.W...KW....WW_......................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"L" Logging, "G" Gracefully finishing, "." Open slot with no current process

martin@io ~$ vmstat
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad4 ad6   in   sy   cs us sy id
 0 0 0  498532 122848 3306   0   0   0 740   0   0   0  788    0 1675 16 21 63

martin@io ~$ pstat -s
Device          1K-blocks     Used    Avail Capacity
/dev/ar0s1b       4194304       24  4194280     0%

martin@io ~$ iostat
      tty             ad4              ad6              ar0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   17  0.50   0  0.00   0.50   0  0.00  20.28  23  0.45  16  0 20  1 63

martin@io ~# ps -U postgres | wc -l
     127

martin@io ~$ ps -uax
USER       PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
postgres 32084  0.0  0.2 91616  3764 p0-  R    Mon12PM   4:08.99 /usr/local/bin/postmaster -D /var/postgres (postgres)
postgres 80333  0.0  2.1 94620 44372  ??  S     8:57PM   0:01.00 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80599  0.0  2.1 94652 44780  ??  S     8:59PM   0:00.97 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80616  0.0  2.4 94424 50396  ??  S     8:59PM   0:00.89 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80715  0.0  2.2 94444 46804  ??  S     9:00PM   0:00.68 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80788  0.0  2.1 94424 43944  ??  S     9:00PM   0:00.93 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80811  0.0  2.1 94424 43884  ??  S     9:00PM   0:00.94 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80902  0.0  2.1 94424 43380  ??  S     9:01PM   0:00.76 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 80949  0.0  2.2 94424 45248  ??  S     9:01PM   0:00.67 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 81020  0.0  2.1 94424 42924  ??  S     9:02PM   0:00.74 postmaster: ethereal ethereal 192.168.1.6 idle in trans

martin@io ~# date
Tue Aug  3 21:28:53 MDT 2004

martin@io ~$ systat -vmstat 1
[systat -vmstat snapshot, 1 user; key readings:]
Load Average: 4.15 3.76 3.54 (Aug 3 21:30)
CPU: 59.6% Sys, 0.9% Intr, 38.9% User, 0.0% Nice, 0.6% Idle
Memory (KB): 295304 act, 1151936 inact, 92748 cache, 39456 free, 397288 wire
Disks: ar0 16.00 KB/t, 35 tps, 0.54 MB/s, 5% busy; ad4/ad6/fd0 idle
Interrupts: 728 total (clk 100, rtc 128, bge 53, em0 447)

martin@io ~$ vmstat
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad4 ad6   in   sy   cs us sy id
 0 0 0  317216 128932 3121   0   0   0 736   0   0   0  789    0 1682 16 21 63

martin@io ~$ pstat -s
Device          1K-blocks     Used    Avail Capacity
/dev/ar0s1b       4194304       24  4194280     0%

martin@io ~$ iostat
      tty             ad4              ad6              ar0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   17  0.50   0  0.00   0.50   0  0.00  20.32  23  0.45  16  0 20  1 63

martin@io ~$ ps -U postgres | wc -l
      72

martin@io ~$ ps -uax
USER       PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
postgres 32084  0.0  0.2 91616  3764 p0-  S    Mon12PM   3:17.42 /usr/local/bin/postmaster -D /var/postgres (postgres)
postgres 51100  0.0  1.7 94412 35828  ??  S     3:36PM   0:01.02 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51134  0.0  1.8 94620 37060  ??  S     3:36PM   0:01.04 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51147  0.0  1.8 94620 36980  ??  S     3:37PM   0:00.66 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51240  0.0  1.8 94412 37156  ??  S     3:38PM   0:00.62 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51276  0.0  0.7 94412 15180  ??  S     3:38PM   0:00.41 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51277  0.0  1.9 94492 40776  ??  S     3:38PM   0:00.57 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51287  0.0  1.8 94620 37228  ??  S     3:38PM   0:00.63 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51368  0.0  1.7 94412 35348  ??  S     3:40PM   0:00.55 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51493  0.0  1.7 94412 35920  ??  S     3:42PM   0:00.49 postmaster: ethereal ethereal 192.168.1.6 idle in trans
postgres 51495  0.0  1.7 94412 36192  ??  S     3:42PM   0:00.89 postmaster: ethereal ethereal 192.168.1.6 idle in trans

martin@io ~$ date
Tue Aug  3 16:09:06 MDT 2004

martin@io ~$ systat -vmstat 1
[systat -vmstat snapshot, 1 user; key readings:]
Load Average: 0.53 0.54 0.47 (Aug 3 16:14)
CPU: 28.8% Sys, 0.5% Intr, 26.8% User, 0.0% Nice, 43.9% Idle
Memory (KB): 206544 act, 1263216 inact, 69644 cache, 50484 free, 386844 wire
Disks: ar0 16.00 KB/t, 31 tps, 0.49 MB/s, 2% busy; ad4/ad6/fd0 idle
Interrupts: 642 total (clk 100, rtc 128, bge 55, em0 359)
> The queries themselves are simple, normally drawing information from
> one table with few conditions or, in the most complex cases, using
> joins on two tables or subqueries. These behave very well and always
> have; the problem is that these queries take place in rather large
> numbers due to the dumb nature of the scripts themselves.

Hmm, maybe this "dumb" thing is where to look? I'm no expert, but I have had the same situation with a very dumb PHP application, namely osCommerce, which averaged about 140 (!) queries per page!

I added some traces and logging to the queries, only to see that the programmers did something like this (pseudo-code):

    for id in id_list:
        select stuff from database where id = id

Geee... I replaced it with:

    select stuff from database where id in (id_list)

and this alone saved about 20 requests. The code was peppered with queries like that. In the end it went from 140 queries to about 20, which is still way too much IMHO, but I couldn't go lower without an extensive rewrite.

If you have a script making many selects, it's worth grouping them, even using stored procedures. For instance, using the classical "tree in a table" to store a tree of product categories:

    create table categories
    (
        id serial primary key,
        parent_id integer references categories(id)
        -- etc.
    );

you basically have these choices in order to display the tree:

- select the rows with parent_id = 0 (the roots), then for each element select its children, and so on; OR
- make a stored procedure which does that (at least 3x faster and a lot less CPU overhead); OR
- (if you have, say, 50 rows in the table, which was my case) select the entire table and build the tree in the script. This was a little bit faster than the stored procedure.

Could you give an example of your "dumb" scripts? It's good to optimize a database, but it's even better to remove useless queries...
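In the thread's own Perl/DBI terms, the grouping trick above might look like the following sketch; the handle, table and column names are placeholders for the example, not Martin's actual schema:

    # Assumes $dbh is an open DBI handle and @ids a non-empty list of ids.
    #
    # One query per id (N round trips):
    #   my $sth = $dbh->prepare('SELECT name FROM items WHERE id = ?');
    #   $sth->execute($_) for @ids;
    #
    # Grouped into a single IN list, still fully parameterized:
    my $placeholders = join ',', ('?') x @ids;
    my $sth = $dbh->prepare(
        "SELECT id, name FROM items WHERE id IN ($placeholders)"
    );
    $sth->execute(@ids);    # one round trip instead of N
    while (my $row = $sth->fetchrow_hashref) {
        # process each row here
    }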
On Wed, Aug 04, 2004 at 03:49:11AM +0000, Martin Foster wrote:
> Also note that some of these scripts run for longer durations even if
> they are web based. Some run as long as 30 minutes, making queries
> to the database with wait periods of five to twenty-five seconds
> between them. Under high duress the timeouts back off, based on the
> time needed for a query to respond, which normally averages 0.008
> seconds.

I would start by EXPLAIN ANALYZE'ing those 30 minute queries.

> martin@io ~$ vmstat
>  procs      memory      page                    disks     faults      cpu
>  r b w     avm    fre  flt  re  pi  po  fr  sr ad4 ad6   in   sy   cs us sy id
>  0 0 0  498532 122848 3306   0   0   0 740   0   0   0  788    0 1675 16 21 63

vmstat without a "delay" argument (e.g. 'vmstat 1') gives you a cumulative average since boot. You'd probably get better information by doing a real-time sampling of stats during normal and heavy load.

> martin@io ~$ ps -uax
> USER       PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
> postgres 32084  0.0  0.2 91616  3764 p0-  R    Mon12PM   4:08.99 /usr/local/bin/postmaster -D /var/postgres (postgres)
> postgres 80333  0.0  2.1 94620 44372  ??  S     8:57PM   0:01.00 postmaster: ethereal ethereal 192.168.1.6 idle in trans
> [...]
> postgres 81020  0.0  2.1 94424 42924  ??  S     9:02PM   0:00.74 postmaster: ethereal ethereal 192.168.1.6 idle in trans

All the connections in your email are idle. You may benefit from using pgpool instead of Apache::DBI (I've never tried it).

http://www.mail-archive.com/pgsql-announce@postgresql.org/msg00760.html
Michael Adler wrote:
> On Wed, Aug 04, 2004 at 03:49:11AM +0000, Martin Foster wrote:
>> Also note that some of these scripts run for longer durations even if
>> they are web based. Some run as long as 30 minutes, making queries
>> to the database with wait periods of five to twenty-five seconds
>> between them.
>
> I would start by EXPLAIN ANALYZE'ing those 30 minute queries.

The Apache process will run for 30 minutes at a time, not the query itself. Essentially, while that process is running it will check for new records in the table at varying intervals, since it increases its timeouts based on load, or lack of activity, in order to reduce the load on the database.

> vmstat without a "delay" argument (e.g. 'vmstat 1') gives you a
> cumulative average since boot. You'd probably get better information
> by doing a real-time sampling of stats during normal and heavy load.
>
> All the connections in your email are idle. You may benefit from
> using pgpool instead of Apache::DBI (I've never tried it).
>
> http://www.mail-archive.com/pgsql-announce@postgresql.org/msg00760.html

I will take a look at pgpool and see if it will serve as the solution I need. The pre-forking of children sounds like a good choice; however, since overhead is already a point of worry, I almost wonder if I should host it on another server in order to keep that overhead off the database server itself.

Does anyone have experience with running it on the same machine as the database versus on a different machine? Of course, if this works as it should, I could easily put an older database server back into operation, provided pgpool does weighted load balancing.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
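Since pgpool speaks the ordinary frontend/backend protocol, moving to it is mostly a matter of pointing the scripts at pgpool's listen port instead of the postmaster's. A minimal sketch; the database name, host, credentials, and port (9999 is pgpool's usual default) are illustrative assumptions:

    use DBI;

    # Connect to pgpool rather than directly to PostgreSQL; pgpool then
    # hands the session to one of its pre-forked backend connections.
    my $dbh = DBI->connect(
        'dbi:Pg:dbname=ethereal;host=127.0.0.1;port=9999',  # assumed values
        'ethereal', 'secret',                               # placeholder credentials
        { RaiseError => 1, AutoCommit => 1 },
    );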
Gaetano Mendola wrote:
> Let's start from your postgres configuration:
>
> shared_buffers = 8192    <==== This is really too small for your configuration
> sort_mem = 2048
>
> wal_buffers = 128    <==== This is really too small for your configuration
>
> effective_cache_size = 16000
>
> Change these values to:
>
> shared_buffers = 50000
> sort_mem = 16084
>
> wal_buffers = 1500
>
> effective_cache_size = 32000
>
> To bump up the shm usage you have to configure your OS in order to be
> allowed to use that amount of SHM.
> These are the numbers that feel right for your HW; the second step now
> is to analyze your queries.
>
> Show us the EXPLAIN ANALYZE of those queries; how many rows the tables
> contain, and the table schemas, could also be useful.

I will look into moving up those values and will see how they interact with the system once I get back from work.

Since it was requested, I have a visual representation of an older schema, one that was used under MySQL. Note that all of the timestamps are now properly set to LOCALTIME on PostgreSQL.

http://prdownloads.sourceforge.net/ethereal-realms/ethereal-3_0_0.png?download

The row counts for the tables of note are as follows:

    Puppeteer     1606
    Puppet       33176
    Realm           83
    Post         36156
    Audit        61961

The Post table is continually cleared of old information, since the information is very time-critical and archiving would only hinder performance. As a result, its size will vary wildly based on the time of day, since users (Puppeteers) tend to post more during peak hours.

NOTE: The scripts make use of different schemas containing the same structure in order to virtualize the scripts and support more than one site on the same hardware.

On a side note, this would be a normal post-authentication session once in a realm, for getting new posts:

* Script is executed and the schema is determined through a stored procedure;
* Formatting information is fetched from Tag and RealmDesign as needed;
* Script retrieves stored parameters from the Param table;
* Script decodes, analyzes and authenticates against Puppeteer;
* Script scans the Puppet and Post tables to generate posts;
* Sub-query to determine ignored puppeteers/users;
* Sub-query to determine ignored puppets/handles; and
* Loop over the above if necessary until expiry of the script, delaying each iteration by 5 to 25 seconds.

This should provide an idea of that portion. Of course, the flow changes when one posts, but that is handled by a different script instance, as is authentication, et cetera.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Apache processes running for 30 minutes?

My advice: use frames and JavaScript!

In your webpage, you have two frames: "content" and "refresh". "content" starts empty (say, just a title on top of the page). "refresh" is refreshed every five seconds from a script on your server. That script generates JavaScript which document.write()'s new entries into the "content" frame, thus adding new records to the upper frame.

Thus, the refreshing uses a new request every 5 seconds, which terminates very fast and does not hog an Apache process. Turn the keepalive timeout down as well.
On Thu, Aug 05, 2004 at 08:40:35AM +0200, Pierre-Frédéric Caillaud wrote:
> Apache processes running for 30 minutes?
>
> My advice: use frames and JavaScript!

My advice: stay away from frames and JavaScript if you can avoid them. The first is severely outdated technology, and the other might well be disabled on the client side.

/* Steinar */
-- 
Homepage: http://www.sesse.net/
Martin Foster wrote:
> I included all the files as attachments, which will hopefully cut down
> on back-and-forth emails. As for things like connection pooling, the
> web server makes use of Apache::DBI to pool the connections for the
> Perl scripts being driven on that server. For the sake of being
> thorough, a quick 'apachectl status' was thrown in while the database
> was under a good load.

Let's start from your postgres configuration:

shared_buffers = 8192    <==== This is really too small for your configuration
sort_mem = 2048

wal_buffers = 128    <==== This is really too small for your configuration

effective_cache_size = 16000

Change these values to:

shared_buffers = 50000
sort_mem = 16084

wal_buffers = 1500

effective_cache_size = 32000

To bump up the shm usage you have to configure your OS in order to be allowed to use that amount of SHM.
These are the numbers that feel right for your HW; the second step now is to analyze your queries.

> The queries themselves are simple, normally drawing information from
> one table with few conditions or, in the most complex cases, using
> joins on two tables or subqueries. These behave very well and always
> have; the problem is that these queries take place in rather large
> numbers due to the dumb nature of the scripts themselves.

Show us the EXPLAIN ANALYZE of those queries; how many rows the tables contain, and the table schemas, could also be useful.

regards
Gaetano Mendola
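For reference, on FreeBSD the SysV shared memory ceiling that Gaetano mentions is raised through sysctls. A sketch of what that might look like; the values are illustrative only (sized with headroom above 50000 buffers x 8 KB, roughly 400 MB), not a recommendation:

    # /etc/sysctl.conf (FreeBSD) -- illustrative values only
    kern.ipc.shmmax=536870912   # largest single shared memory segment, in bytes (512 MB)
    kern.ipc.shmall=131072      # total shared memory, in 4 KB pages (= 512 MB)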
On Wed, 2004-08-04 at 17:25 +0200, Gaetano Mendola wrote:
> > The queries themselves are simple, normally drawing information from
> > one table with few conditions or, in the most complex cases, using
> > joins on two tables or subqueries. These behave very well and always
> > have; the problem is that these queries take place in rather large
> > numbers due to the dumb nature of the scripts themselves.
>
> Show us the EXPLAIN ANALYZE of those queries; how many rows the tables
> contain, and the table schemas, could also be useful.

If the queries themselves are optimized as much as they can be, and, as you say, it's just the sheer number of similar queries hitting the database, you could try using prepared queries for the ones that are most often executed, to eliminate some of the overhead.

I've had relatively good success with this in the past, and it doesn't take very much code modification.

-- 
Mike Benoit <ipso@snappymail.ca>
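In Perl/DBI this can be as little as switching to prepare_cached(), which keeps the statement handle alive across calls and pairs naturally with Apache::DBI's persistent connections. A sketch; the query is abbreviated from Martin's later message, and whether the prepare maps to a server-side PREPARE depends on the DBD::Pg version in use:

    # Assumes $dbh is the Apache::DBI-managed database handle.
    sub fetch_new_posts {
        my ($dbh, $last_id, $login) = @_;

        # Prepared once per Apache child, then re-executed with new values.
        my $sth = $dbh->prepare_cached(q{
            SELECT PostIDNumber, PuppeteerLogin, PuppetName, PostClass
              FROM Post
             WHERE PostIDNumber > ?
               AND (PostTo = 'all' OR PostTo = ?)
             ORDER BY PostIDNumber
             LIMIT 100
        });
        $sth->execute($last_id, $login);
        return $sth->fetchall_arrayref({});
    }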
Gaetano Mendola wrote:
> Let's start from your postgres configuration:
>
> shared_buffers = 8192    <==== This is really too small for your configuration
> sort_mem = 2048
>
> wal_buffers = 128    <==== This is really too small for your configuration
>
> effective_cache_size = 16000
>
> Change these values to:
>
> shared_buffers = 50000
> sort_mem = 16084
>
> wal_buffers = 1500
>
> effective_cache_size = 32000
>
> To bump up the shm usage you have to configure your OS in order to be
> allowed to use that amount of SHM.
>
> These are the numbers that feel right for your HW; the second step now
> is to analyze your queries.

These changes have yielded some visible improvements, with load averages rarely climbing to anything noticeable. However, I do have a question on the matter: why do these values seem to be far higher than what a frequently pointed-to document would indicate as necessary?

http://www.varlena.com/GeneralBits/Tidbits/perf.html

I am simply curious, as this clearly shows that my understanding of PostgreSQL is lacking when it comes to tweaking for the hardware.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Mike Benoit wrote:
> If the queries themselves are optimized as much as they can be, and,
> as you say, it's just the sheer number of similar queries hitting the
> database, you could try using prepared queries for the ones that are
> most often executed, to eliminate some of the overhead.
>
> I've had relatively good success with this in the past, and it doesn't
> take very much code modification.

One of the biggest problems is most probably related to the indexes. Since the performance penalty of logging the information needed to see which queries are used and which are not is a problem in itself, I cannot really make use of that for now.

However, I am curious how one would go about preparing a query. Is this similar to DBI's prepare statement with placeholders, simply changing the values passed on execute? Or is this something at the database level, such as a view, et cetera?

    SELECT Post.PostIDNumber, Post.$format, Post.PuppeteerLogin,
           Post.PuppetName, Post.PostCmd, Post.PostClass
      FROM Post
     WHERE Post.PostIDNumber > ?::INT
       AND (Post.PostTo = 'all' OR Post.PostTo = ?)
       AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
                          FROM PuppetIgnore
                         WHERE PuppetIgnore.PuppetIgnore = 'global'
                           AND PuppetIgnore.PuppeteerLogin = ?
                           AND PuppetIgnore.PuppetLogin = Post.PuppeteerLogin)
            OR Post.PuppeteerLogin IS NULL)
       AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
                          FROM PuppetIgnore
                         WHERE PuppetIgnore.PuppetIgnore = 'single'
                           AND PuppetIgnore.PuppeteerLogin = ?
                           AND PuppetIgnore.PuppetName = Post.PuppetName)
            OR Post.PuppetName IS NULL)
     ORDER BY Post.PostIDNumber
     LIMIT 100

The range is determined from the previous run or through the query listed below. It was determined that limiting by an INT was far faster than limiting by timestamp.

    SELECT MIN(PostIDNumber)
      FROM Post
     WHERE RealmName = ?
       AND PostClass IN ('general', 'play')
       AND PostTo = 'all'

The above simply provides a starting point, nothing more. Once posts are pulled, the script will pass in the last pulled number so as to start from a fresh point.

Under MySQL, time was stored as an INT, which may have helped it handle timestamps more efficiently. It also made use of three or more queries, of which two were done to generate an IN statement for the query actually running at the time.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
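Given the predicates above, one avenue worth checking with EXPLAIN ANALYZE is whether matching indexes exist. A sketch of candidates for this particular query, assuming none are present yet (the thread does not say either way):

    -- Drives the PostIDNumber range scan plus the ORDER BY ... LIMIT:
    CREATE INDEX post_idnumber_idx
        ON Post (PostIDNumber);

    -- Each NOT EXISTS probe looks up one puppeteer's ignore entries:
    CREATE INDEX puppetignore_probe_idx
        ON PuppetIgnore (PuppeteerLogin, PuppetIgnore);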
Martin Foster wrote:
> These changes have yielded some visible improvements, with load
> averages rarely climbing to anything noticeable. However, I do have
> a question on the matter: why do these values seem to be far higher
> than what a frequently pointed-to document would indicate as
> necessary?
>
> http://www.varlena.com/GeneralBits/Tidbits/perf.html
>
> I am simply curious, as this clearly shows that my understanding of
> PostgreSQL is lacking when it comes to tweaking for the hardware.

Unfortunately there is no "tuning wizard" for Postgres, so each one of us has his own "school". The numbers I gave you are oversized so as to be sure of achieving improvements. Now you can start to decrease these values (starting with wal_buffers) in order to find the right compromise for your HW.

Regards
Gaetano Mendola
Martin Foster <martin@ethereal-realms.org> writes: > Gaetano Mendola wrote: >> change this values in: >> shared_buffers = 50000 >> sort_mem = 16084 >> >> wal_buffers = 1500 This value of wal_buffers is simply ridiculous. There isn't any reason to set wal_buffers higher than the amount of WAL log data that will be generated by a single transaction, because whatever is in the buffers will be flushed at transaction commit. If you are mainly dealing with heavy concurrency then it's the mean time between transaction commits that matters, and that's even less than the average transaction length. Even if you are mainly interested in the performance of large updating transactions that are not concurrent with anything else (bulk data load, perhaps), I'm not sure that I see any value in setting wal_buffers so high. The data will have to go to disk before commit in any case, and buffering so much of it just means that you are going to have a serious spike in disk traffic right before commit. It's almost certainly better to keep wal_buffers conservatively small and let the data trickle out as the transaction proceeds. I don't actually think there is anything very wrong with the default value (8) ... perhaps it is too small, but it's not two orders of magnitude too small. In 8.0, the presence of the background writer may make it useful to run with wal_buffers somewhat higher than before, but I still doubt that order-of-a-thousand buffers would be useful. The RAM would almost certainly be better spent on general-purpose disk buffers or kernel cache. Note though that this is just informed opinion, as I've never done or seen any benchmarks that examine the results of changing wal_buffers while holding other things constant. Has anyone tried it? regards, tom lane
On Fri, 2004-08-06 at 17:24, Gaetano Mendola wrote:
> Unfortunately there is no "tuning wizard" for Postgres, so each one of
> us has his own "school". The numbers I gave you are oversized so as
> to be sure of achieving improvements. Now you can start to decrease
> these values (starting with wal_buffers) in order to find the right
> compromise for your HW.

FYI, my school of tuning is to change one thing at a time by some reasonable percentage (say, shared_buffers from 1000 to 2000) and measure the change under simulated load. Make another change, test it, and chart the shape of the change line. It should look something like this for most folks:

    shared_buffers | q/s (more is better)
               100 | 20
               200 | 45
               400 | 80
              1000 | 100
        ... levels out here ...
              8000 | 110
             10000 | 108
             20000 | 40
             30000 | 20

Note it going back down as we exceed our memory and start swapping shared_buffers. Where that happens on your machine is determined by many things, like your machine's memory, memory bandwidth, type of load, etc., but it will happen on most machines, and when it does, it often happens at the worst times: under heavy parallel load.

Unless testing shows it's faster, 10000 or 25% of memory (whichever is less) is usually a pretty good setting for shared_buffers. Large datasets may require more than 10000, but going over 25% on machines with large memory is usually a mistake, especially on servers that do anything other than just PostgreSQL.

You're absolutely right about one thing: there's no automatic wizard for tuning this stuff.
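A sketch of what such a measurement loop might look like with contrib/pgbench; it assumes a scratch database created beforehand with 'pgbench -i bench' and reuses the data directory shown in the ps output earlier, so treat it as illustrative rather than a turnkey script:

    #!/bin/sh
    # Measure throughput at several shared_buffers settings, one change at a time.
    for buf in 1000 2000 4000 8000 16000; do
        # Edit the setting; shared_buffers needs a full restart on 7.4.
        perl -pi -e "s/^shared_buffers *=.*/shared_buffers = $buf/" \
            /var/postgres/postgresql.conf
        pg_ctl restart -w -D /var/postgres
        echo "shared_buffers = $buf"
        pgbench -c 50 -t 200 bench    # 50 clients, 200 transactions each
    done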
Scott Marlowe wrote:
> FYI, my school of tuning is to change one thing at a time by some
> reasonable percentage (say, shared_buffers from 1000 to 2000) and
> measure the change under simulated load.
>
> [...]
>
> Unless testing shows it's faster, 10000 or 25% of memory (whichever is
> less) is usually a pretty good setting for shared_buffers. Large
> datasets may require more than 10000, but going over 25% on machines
> with large memory is usually a mistake, especially on servers that do
> anything other than just PostgreSQL.
>
> You're absolutely right about one thing: there's no automatic wizard
> for tuning this stuff.

Which rather points out the crux of the problem. This is a live system, meaning changes made need to be as informed as possible, and changing values just for the sake of testing can lead to problems in service.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
On Fri, 2004-08-06 at 22:02, Martin Foster wrote:
> [...]
> Which rather points out the crux of the problem. This is a live system,
> meaning changes need to be as informed as possible, and changing values
> just for the sake of testing can lead to problems in service.

But if you make those changes slowly, as I was showing, you should see the small deleterious effects long before they become catastrophic.
Jumping shared_buffers straight to 50000 is not a good idea, especially if the sweet spot is likely lower than that.
Scott Marlowe wrote:
> [...]
> But if you make those changes slowly, as I was showing, you should see
> the small deleterious effects long before they become catastrophic.
> Jumping shared_buffers straight to 50000 is not a good idea, especially
> if the sweet spot is likely lower than that.

While I agree, there is also the issue that getting consistent results from this site is very difficult, since load depends on the whims of users visiting one of three sites hosted on the same hardware.

That being said, having wal_buffers at 8 certainly would not be a good idea, since the database logs themselves were warning of excessive writes in that region. I am not hoping for a perfect mix of settings that will solve all my problems, but a good baseline that lets me carry a fair load would certainly be welcome.

Right now, the load being handled is not much more than what a single-processor system did with half the memory. Surely this architecture should be able to take more of a beating than this?

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Scott Marlowe wrote:
> [...]
> But if you make those changes slowly, as I was showing, you should see
> the small deleterious effects long before they become catastrophic.
> Jumping shared_buffers straight to 50000 is not a good idea, especially
> if the sweet spot is likely lower than that.

As you can see, 50000 is less than 20% of his total memory, and I strongly feel that 50000 is not oversized for his hardware (nor is that wal_buffers value). Maybe it is oversized for his database activity, but for sure those values cannot be a source of problems.

I'd like to have a wizard that could run even for hours in order to find a good compromise for all the GUC parameters; maybe a genetic algorithm could help.

Regards
Gaetano Mendola
Tom Lane wrote:
> Martin Foster <martin@ethereal-realms.org> writes:
> > Gaetano Mendola wrote:
> > > Change these values to:
> > > shared_buffers = 50000
> > > sort_mem = 16084
> > >
> > > wal_buffers = 1500
>
> This value of wal_buffers is simply ridiculous.

I instead think a default of wal_buffers = 8 (64KB) is ridiculous.

> There isn't any reason to set wal_buffers higher than the amount of
> WAL log data that will be generated by a single transaction, because
> whatever is in the buffers will be flushed at transaction commit.
> If you are mainly dealing with heavy concurrency then it's the mean time
> between transaction commits that matters, and that's even less than the
> average transaction length.

I partially agree with you, but tell me how to decide that value without knowing anything about the typical queries or the typical load. I suggested that the OP keep wal_buffers high in order to eliminate one degree of freedom from his performance problems. You can see that from my earlier reply:

========================================================================
Gaetano Mendola wrote:
Unfortunately there is no tuning wizard for postgres, so each of us has
his own "school". The values I gave you are oversized to be sure of
achieving an improvement. Now you can start to decrease them (starting
from wal_buffers) in order to find the right compromise for your HW.
========================================================================

However, wal_buffers = 1500 means ~12 MB (1500 x 8KB pages), which is not so expensive on a server with 2GB of RAM, and I think that is a good compromise if you are not starving for RAM.

I had a discussion about fine-tuning a postgres server with a client. My question was: are you planning to have someone periodically look at your server activity in order to use your hardware to its best? Easy answer: no, because when the server is overloaded I will buy a bigger one, which is less expensive than paying someone, considering also that shareholders prefer increasing capex to paying salaries (if the company closes, the hardware can be sold :-( ). This is the real world out there.

Regards
Gaetano Mendola
>> This value of wal_buffers is simply ridiculous.
>
> I instead think a default of wal_buffers = 8 (64KB) is ridiculous.

There is no point in making WAL buffers higher than 8. I have done much testing of this and it made not the slightest difference to performance that I could measure.

Chris
On 8/3/2004 2:05 PM, Martin Foster wrote:
> [...]
> Recently loads on the site have increased during peak hours to the
> point of showing considerable loss in performance. This can be observed
> when connections move from 120 concurrent connections to PostgreSQL to
> roughly 175 or more. Essentially, the machine seems to struggle to keep
> up with continual requests and slows down respectively as resources are
> tied down.

Have you taken a look at pgpool? I know, it sounds silly to *reduce* the number of DB connections through a connection pool, but it can help.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
Jan Wieck wrote:
> [...]
> Have you taken a look at pgpool? I know, it sounds silly to *reduce*
> the number of DB connections through a connection pool, but it can help.

I am currently making use of Apache::DBI, which overrides the DBI::disconnect call and keeps a pool of active connections for use when need be. Since it offloads the pooling to the webserver, it seems more advantageous than pgpool, which, while able to run on an external system, adds another layer of complexity.

Anyone had any experience with both Apache::DBI and pgpool? For my needs they seem to do essentially the same thing; it is simply that one is invisible to the code while the other requires adding the complexity of a proxy.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
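For concreteness, the Apache::DBI arrangement being described amounts to a few lines in the mod_perl startup file. A minimal sketch; the DSN, user, and password are invented placeholders, not the site's real settings:

    # startup.pl -- loaded once by mod_perl at server start.
    use strict;
    use Apache::DBI ();   # must load before DBI so it can take over connect/disconnect
    use DBI ();

    # Optional: open the connection when each Apache child is forked,
    # so the first request served does not pay the connect cost.
    Apache::DBI->connect_on_init(
        "dbi:Pg:dbname=realms;host=localhost",   # placeholder DSN
        "wwwuser", "secret",                     # placeholder credentials
        { AutoCommit => 1, RaiseError => 1 },
    );

    1;

After this, every DBI->connect in the children with the same DSN and attributes is handed the cached handle, one per Apache child per distinct DSN, which is exactly why the number of database connections tracks the number of Apache children rather than being shared the way pgpool shares them.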
Christopher Kings-Lynne wrote:
> There is no point in making WAL buffers higher than 8. I have done much
> testing of this and it made not the slightest difference to performance
> that I could measure.

No point? I had it at 64, if memory serves, and the logs were warning me that raising this value would be desirable because of the excessive IO brought on by the logs filling far too often. It would seem to me that 8 is a bit low in at least a few circumstances.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
On Fri, 2004-08-06 at 23:18 +0000, Martin Foster wrote:
> Mike Benoit wrote:
> > If the queries themselves are optimized as much as they can be, and,
> > as you say, it is just the sheer number of similar queries hitting
> > the database, you could try using prepared queries for the ones that
> > are most often executed to eliminate some of the overhead.
> >
> > I've had relatively good success with this in the past, and it
> > doesn't take very much code modification.
>
> One of the biggest problems is most probably related to the indexes.
> Since logging the information needed to see which queries are used and
> which are not carries a slight performance penalty, I cannot really
> make use of it for now.
>
> However, I am curious how one would go about preparing a query. Is this
> similar to the DBI prepare statement with placeholders, simply changing
> the values passed on execute? Or is this something database-level, such
> as a view, et cetera?

Yes, always optimize your queries and GUC settings first and foremost; that's where you are likely to gain the most performance. After that, if you still want to push things even further, I would try prepared queries.

I'm not familiar with DBI::Prepare at all, but I don't think it's what you're looking for. This is what you want:

http://www.postgresql.org/docs/current/static/sql-prepare.html

--
Mike Benoit <ipso@snappymail.ca>
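A minimal sketch of what the PREPARE/EXECUTE approach looks like when driven from DBI; the table, columns, and connection details are invented for illustration:

    use strict;
    use DBI;

    # Placeholder DSN/credentials; substitute your own.
    my $dbh = DBI->connect("dbi:Pg:dbname=realms", "wwwuser", "secret",
                           { RaiseError => 1, AutoCommit => 1 });

    # Plan the statement once per connection...
    $dbh->do(q{
        PREPARE get_user (text) AS
        SELECT username, last_seen FROM users WHERE username = $1
    });

    # ...then execute it cheaply as many times as needed.
    for my $name (qw(alice bob)) {
        my $lit = $dbh->quote($name);   # quote explicitly; EXECUTE takes literals
        my ($user, $seen) = $dbh->selectrow_array("EXECUTE get_user($lit)");
        print "$user last seen $seen\n" if defined $user;
    }

The prepared plan lasts only for the life of the session, which is why this pairs naturally with persistent connections of the kind Apache::DBI or pgpool keeps open.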
On Aug 8, 2004, at 1:29 AM, Martin Foster wrote:
> I am currently making use of Apache::DBI, which overrides the
> DBI::disconnect call and keeps a pool of active connections for use
> when need be. Since it offloads the pooling to the webserver, it seems
> more advantageous than pgpool, which, while able to run on an external
> system, adds another layer of complexity.

Apache::DBI is not the same sort of pool as pgpool. DB connections are not shared among all your apache children (a common misconception), so if you have 300 apache kids you can have 300 db connections. With pgpool, connections are shared among all of them, so even though you have 300 kids you only have, say, 32 db connections.

> Anyone had any experience with both Apache::DBI and pgpool? For my
> needs they seem to do essentially the same thing; it is simply that one
> is invisible to the code while the other requires adding the complexity
> of a proxy.

Both are invisible to the app (with pgpool it thinks it is connecting to a regular old PG server).

And I've been running pgpool in production for months. It just sits there; it doesn't take much to set up or configure. Works like a champ.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
On 8/8/2004 8:10 AM, Jeff wrote:
> Apache::DBI is not the same sort of pool as pgpool. DB connections are
> not shared among all your apache children (a common misconception), so
> if you have 300 apache kids you can have 300 db connections. With
> pgpool, connections are shared among all of them, so even though you
> have 300 kids you only have, say, 32 db connections.

And this is exactly where the pgpool advantage lies. Especially with the TPC-W, Apache is serving a mix of PHP (or whatever CGI technique is used) and static content like images. Since the 200+ Apache kids serve any of that content at random, and the emulated browsers very much encourage ramping up to MaxClients children by using up to 4 concurrent image connections, one ends up with MaxClients DB connections that are all used relatively infrequently. In contrast, the real pgpool yields fewer, more active DB connections, which is better for performance.

> And I've been running pgpool in production for months. It just sits
> there; it doesn't take much to set up or configure. Works like a champ.

And it buys you an extra admin feature people like to forget about: one can shut down one pool for one web application only. That gives you instant single-user access to one database without shutting down the whole webserver or tampering with the pg_hba.conf file.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
> And this is exactly where the pgpool advantage lies. Especially with
> the TPC-W, Apache is serving a mix of PHP (or whatever CGI technique is
> used) and static content like images. [...] In contrast, the real
> pgpool yields fewer, more active DB connections, which is better for
> performance.

There are two well-worn and very mature techniques for dealing with the issue of web apps using one DB connection per apache process, both of which work extremely well and attack the issue at its source.

1) Use a front-end caching proxy like Squid as an accelerator. Static content will be served by the accelerator 99% of the time. Additionally, large pages can be served immediately to the accelerator by Apache, which can then go on to serve another request without waiting for the end user's dial-up connection to pull the data down. Massive speedup, fewer apache processes needed.

2) Serve static content off an entirely separate apache server from the dynamic content, using a separate domain (e.g. 'static.foo.com').

Personally I favour number 1. Our last biggish peak saw 6000 open HTTP and HTTPS connections and only 200 apache children, all of them nice and busy, not hanging around on street corners looking bored. During quiet times Apache drops back to its configured minimum of 40 kids. Option 2 has the advantage that you can use a leaner build for the 'dynamic' apache server, but with RAM so plentiful these days that's a less useful property.

Basically this puts the 'pooling' back in the stateless HTTP area where it truly belongs and can be proven not to have any peculiar side effects (especially when it comes to transaction safety). Even better, so long as you use URL parameters for searches and the like, you can have the accelerator cache those pages for a certain time too, so long as slightly stale results are OK.

I'm sure pgpool and the like have their place, but being band-aids for poorly configured websites probably isn't the best use for them.

M
Jeff wrote:
> Apache::DBI is not the same sort of pool as pgpool. DB connections are
> not shared among all your apache children (a common misconception).
> [...]

Seems that you are right; I never noticed that in the documentation before. I always assumed it had something to do with the long-lasting/persistent scripts that would remain in transactions for extended periods of time.

Here is an odd question. While the server runs 7.4.x, the clients connect with 7.3.x libraries. Would this in itself make a difference in performance, as the protocols are different? At least based on pgpool's documentation it would seem so.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
On 8-8-2004 16:29, Matt Clark wrote:
> There are two well-worn and very mature techniques for dealing with the
> issue of web apps using one DB connection per apache process, both of
> which work extremely well and attack the issue at its source.
>
> 1) Use a front-end caching proxy like Squid as an accelerator. [...]
> Massive speedup, fewer apache processes needed.

Another version of this 1) is to run with a "content accelerator"; our "favourite" is to run Tux in front of Apache. It takes over the connection-handling stuff, has a very low memory profile (compared to Apache) and very little overhead. What it does is serve up all "simple" content (although you can have cgi/php/perl and other languages processed by it, entirely removing the need for apache in some cases) and forward/proxy everything it doesn't understand to an Apache/other webserver running on the same machine (on another port).

I think there are a few advantages over Squid; since it is partially done in kernel space it can be slightly faster in serving up content, apart from its simplicity, which will probably matter even more. You'll have no caching issues for pages that should not be cached, or for static files that change periodically (like every few seconds). AFAIK Tux can handle more than 10 times as many ab-generated requests per second as a default-compiled Apache on the same machine. And besides the speed-up, you can do any request you were able to do before, since Tux will simply forward to Apache anything it doesn't understand.

Anyway, apart from all that: reducing the number of apache connections is nice, but not really the same as reducing the number of pooled connections using a db-pool... You may even be able to run with 1000 http connections, 40 apache processes and 10 db connections. In the non-pooled setup, you'd still have 40 db connections.

In a simple test I did, pgpool seemed to have quite some overhead, though. So it should be tested well, to find the turnover point where it becomes a gain instead of a loss...

Best regards,

Arjen van der Meijden
Arjen van der Meijden wrote:
> [...]
> Anyway, apart from all that: reducing the number of apache connections
> is nice, but not really the same as reducing the number of pooled
> connections using a db-pool... You may even be able to run with 1000
> http connections, 40 apache processes and 10 db connections. In the
> non-pooled setup, you'd still have 40 db connections.
>
> In a simple test I did, pgpool seemed to have quite some overhead,
> though. So it should be tested well, to find the turnover point where
> it becomes a gain instead of a loss...

Other than images, there are very few static pages being loaded up by the user. Since they make up a very small portion of the traffic, it tends to be an optimization we can forgo for now.

I attempted to make use of pgpool. At the default of 32 pre-forked connections, the webserver almost immediately tapped out the pgpool pool, and content stopped being served because no new processes were being forked to make up for it.

So I raised it to a higher value (256), and it immediately segfaulted and dumped core. So I am not sure exactly how to proceed, since I really need the thing to fork additional servers as load hits, and not the other way around. Unless I had it configured oddly; but it seems to work differently than an Apache server does when handling content.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
> Jeff wrote:
> > Apache::DBI is not the same sort of pool as pgpool. [...]
>
> Seems that you are right; I never noticed that in the documentation
> before. I always assumed it had something to do with the
> long-lasting/persistent scripts that would remain in transactions for
> extended periods of time.
>
> Here is an odd question. While the server runs 7.4.x, the clients
> connect with 7.3.x libraries. Would this in itself make a difference in
> performance, as the protocols are different? At least based on pgpool's
> documentation it would seem so.

In this case the server falls back from the V3 protocol (employed in 7.4 or later) to the V2 protocol (employed from 6.4 to 7.3.x). As far as pgpool is concerned, the performance difference is significant. Of course, that depends on the implementation.

FYI, here is the outline of a test using pgbench.

H/W: Pentium4 2.4GHz x2 / memory 1GB / HDD IDE 80GB (all PCs the same spec)
S/W: RedHat Linux 9 / PostgreSQL 7.3.6 / 7.4.3

postgresql.conf:
tcpip_socket = true
max_connections = 512
shared_buffers = 2048

host A: pgbench, host B: pgpool, host C: PostgreSQL 7.3.6 or 7.4.3

pgbench parameters: -S -c 10 -t 1000

result:
                                              TPS          ratio(7.4.3)  ratio(7.3.6)
 -------------------------------------------------------------------------------------
 without pgpool                               4357.625059  100%          100%
 with pgpool (connection pool mode)           4330.290294   99.4%         94.1%
 with pgpool (replication mode)               4297.614996   98.6%         87.6%
 with pgpool (replication with strict mode)   4270.223136   98.0%         81.5%

--
Tatsuo Ishii
> Arjen van der Meijden wrote:
> > [...]
> > In a simple test I did, pgpool seemed to have quite some overhead,
> > though. So it should be tested well, to find the turnover point where
> > it becomes a gain instead of a loss...

I don't know what configuration you were using, but I noticed that UNIX domain sockets are preferred for the connection between clients and pgpool. When I tested using pgbench -C (which establishes a new connection for each transaction), the pgpool configuration was 10 times faster than the one without pgpool when using UNIX domain sockets, while there was only a 3.6x speedup with TCP/IP sockets.

> I attempted to make use of pgpool. At the default of 32 pre-forked
> connections, the webserver almost immediately tapped out the pgpool
> pool, and content stopped being served because no new processes were
> being forked to make up for it.
>
> So I raised it to a higher value (256), and it immediately segfaulted
> and dumped core.
> So I am not sure exactly how to proceed, since I really need the thing
> to fork additional servers as load hits, and not the other way around.

What version of pgpool did you test? I know that a certain version (2.0.2, actually) had such a problem. Can you try again with the latest version of pgpool? (It's 2.0.6.)

--
Tatsuo Ishii
On Sun, 8 Aug 2004, Matt Clark wrote:
> There are two well-worn and very mature techniques for dealing with the
> issue of web apps using one DB connection per apache process, both of
> which work extremely well and attack the issue at its source.
>
> 1) Use a front-end caching proxy like Squid as an accelerator. Static
> content will be served by the accelerator 99% of the time.
> Additionally, large pages can be served immediately to the accelerator
> by Apache, which can then go on to serve another request without
> waiting for the end user's dial-up connection to pull the data down.
> Massive speedup, fewer apache processes needed.

Squid also takes away the work of doing SSL (presuming you're running it on a different machine). Unfortunately it doesn't support HTTP/1.1, which means that most generated pages (those that don't set Content-Length) end up forcing squid to close and then reopen the connection to the web server.

Because you no longer need to worry about keeping Apache processes around to dribble data to people on the wrong end of modems, you can reduce MaxClients quite a bit (to, say, 10 or 20 per web server). This keeps the number of PostgreSQL connections down. I'd guess that above some point you're going to reduce performance by increasing MaxClients and running queries in parallel rather than queueing the requests and doing them serially.

I've also had some problems when Squid had a large number of connections open (several thousand), though that may have been because of my half_closed_clients setting. Squid 3 coped a lot better when I tried it (quite a few months ago now, using FreeBSD and the special kqueue system call) but crashed under some (admittedly synthetic) conditions.

> I'm sure pgpool and the like have their place, but being band-aids for
> poorly configured websites probably isn't the best use for them.

You still have periods of time when the web servers are busy using their CPUs to generate HTML rather than waiting for database queries. This is especially true if you cache a lot of data somewhere on the web servers themselves (which, in my experience, reduces the database load a great deal). If you REALLY need to reduce the number of connections (because you have a large number of web servers doing a lot of computation, say) then it might still be useful.
> Squid also takes away the work of doing SSL (presuming you're running
> it on a different machine). Unfortunately it doesn't support HTTP/1.1,
> which means that most generated pages (those that don't set
> Content-Length) end up forcing squid to close and then reopen the
> connection to the web server.

It is true that it doesn't support HTTP/1.1, but 'most generated pages'? Unless they are actually emitted progressively, they should have a perfectly good Content-Length header.

> I've also had some problems when Squid had a large number of
> connections open (several thousand), though that may have been because
> of my half_closed_clients setting. Squid 3 coped a lot better when I
> tried it (quite a few months ago now, using FreeBSD and the special
> kqueue system call) but crashed under some (admittedly synthetic)
> conditions.

It runs out of the box with a very conservative setting for the maximum number of open file descriptors; that may or may not be the cause of the problems you have seen. Certainly I ran squid with >16,000 connections back in 1999...

> You still have periods of time when the web servers are busy using
> their CPUs to generate HTML rather than waiting for database queries.
> [...] If you REALLY need to reduce the number of connections (because
> you have a large number of web servers doing a lot of computation, say)
> then it might still be useful.

Aha, a postgres-related topic in this thread! What you say is very true, but then, given that the connection overhead is so vanishingly small, why not simply run without a persistent DB connection in this case? I would maintain that if your webservers are holding open idle DB connections for so long that it's a problem, then simply close the connections!

M
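Whether the connect overhead really is that small on a given setup is easy to check directly. A minimal sketch using core DBI and Time::HiRes; the DSN, credentials, and database name are placeholders:

    use strict;
    use DBI;
    use Time::HiRes qw(gettimeofday tv_interval);

    my $dsn = "dbi:Pg:dbname=realms";   # placeholder DSN
    my $n   = 200;

    # Cost of N fresh connections, each running one trivial query.
    my $t0 = [gettimeofday];
    for (1 .. $n) {
        my $dbh = DBI->connect($dsn, "wwwuser", "secret", { RaiseError => 1 });
        $dbh->selectrow_array("SELECT 1");
        $dbh->disconnect;
    }
    printf "fresh connections:     %.1f ms/request\n",
           1000 * tv_interval($t0) / $n;

    # The same N queries over one persistent connection.
    my $dbh = DBI->connect($dsn, "wwwuser", "secret", { RaiseError => 1 });
    $t0 = [gettimeofday];
    $dbh->selectrow_array("SELECT 1") for 1 .. $n;
    printf "persistent connection: %.1f ms/request\n",
           1000 * tv_interval($t0) / $n;

If the two numbers are close on your hardware, Matt's argument holds and persistent connections buy you little; if they diverge sharply, that gap is what Apache::DBI or pgpool is saving you per request.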