Re: Performance Bottleneck - Mailing list pgsql-performance

From Martin Foster
Subject Re: Performance Bottleneck
Date
Msg-id X0ZPc.23543$yT2.23053@clgrps13
Whole thread Raw
In response to Re: Performance Bottleneck  (Gaetano Mendola <mendola@bigfoot.com>)
Responses Re: Performance Bottleneck  (Pierre-Frédéric Caillaud<lists@boutiquenumerique.com>)
Re: Performance Bottleneck  (Michael Adler <adler@pobox.com>)
Re: Performance Bottleneck  (Gaetano Mendola <mendola@bigfoot.com>)
List pgsql-performance
Gaetano Mendola wrote:
> Martin Foster wrote:
>
>> I run a Perl/CGI driven website that makes extensive use of PostgreSQL
>> (7.4.3) for everything from user information to formatting and display
>> of specific sections of the site.   The server itself, is a dual
>> processor AMD Opteron 1.4Ghz w/ 2GB Ram and 2 x 120GB hard drives
>> mirrored for redundancy running under FreeBSD 5.2.1 (AMD64).
>>
>> Recently loads on the site have increased during peak hours to the
>> point of showing considerable loss in performance.    This can be
>> observed when connections move from the 120 concurrent connections to
>> PostgreSQL to roughly 175 or more.     Essentially, the machine seems
>> to struggle to keep up with continual requests and slows down
>> respectively as resources are tied down.
>>
>> Code changes have been made to the scripts to essentially back off in
>> high load working environments which have worked to an extent.
>> However, as loads continue to increase the database itself is not
>> taking well to the increased traffic taking place.
>>
>> Having taken a look at 'Tuning PostgreSQL for Performance'
>> (http://www.varlena.com/GeneralBits/Tidbits/perf.html) using it as
>> best I could in order to set my settings.    However, even with
>> statistics disabled and ever setting tweaked things still consider to
>> deteriorate.
>>
>> Is there anything anyone can recommend in order to give the system a
>> necessary speed boost?   It would seem to me that a modest dataset of
>> roughly a Gig combined with that type of hardware should be able to
>> handle substantially more load then what it is.  Can anyone provide me
>> with clues as where to pursue?    Would disabling 'fsync' provide more
>> performance if I choose that information may be lost in case of a crash?
>>
>> If anyone needs access to logs, settings et cetera.   Please ask, I
>> simply wish to test the waters first on what is needed. Thanks!
>
>
> Tell us about your tipical queries, show us your configuration file.
> The access are only in read only mode or do you have concurrent writers
> and readers ? During peak hours your processors are tied to 100% ?
> What say the vmstat and the iostat ?
>
> May be you are not using indexes some where, or may be yes but the
> planner is not using it... In two words we needs other informations
> in order to help you.
>
>
>
> Regards
> Gaetano Mendola
>
>

I included all the files in attachments, which will hopefully cut down
on any replied to Emails.    As for things like connection pooling, the
web server makes use of Apache::DBI to pool the connections for the Perl
scripts being driven on that server.    For the sake of being thorough,
a quick 'apachectl status' was thrown in when the database was under a
good load.

Since it would rather slow things down to wait for the servers to really
get bogged down with load averages of 20.00 and more, I opted to choose
a period of time where we are a bit busier then normal.   You will be
able to see how the system behaves under a light load and subsequently
reaching 125 or so concurrent connections.

The queries themselves are simple, normally drawing information from one
table with few conditions or in the most complex cases using joins on
two table or sub queries.   These behave very well and always have, the
problem is that these queries take place in rather large amounts due to
the dumb nature of the scripts themselves.

Over a year ago when I was still using MySQL for the project, the
statistics generated would report well over 65 queries per second under
loads ranging from 130 to 160 at peak but averaged over the weeks of
operation.   Looking at the Apache status, one can see that it averages
only roughly 2.5 requests per second giving you a slight indication as
to what is taking place.

A quick run of 'systat -ifstat' shows the following graph:


                /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
Load Average   >>>>>>>>>>>

Interface           Traffic               Peak                Total
       lo0  in      0.000 KB/s          0.000 KB/s           37.690 GB
            out     0.000 KB/s          0.000 KB/s           37.690 GB

       em0  in     34.638 KB/s         41.986 KB/s           28.998 GB
            out    70.777 KB/s         70.777 KB/s           39.553 GB

Em0 is a full duplexed 100Mbs connection to an internal switch that
supports the servers directly.   Load on the loopback was cut down
considerably once I stopped using pg_autovaccum since its performance
benefits under low load were buried under the hindrance it caused when
traffic was high.

I am sure that there are some places that could benefit from some
optimization.  Especially in the case of indexes, however as a whole the
problem seems to be related more to the massive onslaught of queries
then it does anything else.

Also note that some of these scripts run for longer durations even if
they are web based.    Some run as long as 30 minutes, making queries to
the database from periods of wait from five seconds to twenty-five
seconds.     Under high duress the timeouts should back out, based on
the time needed for the query to respond, normally averaging 0.008 seconds.

Does this help at all, or is more detail needed on the matter?

    Martin Foster
    martin@ethereal-realms.org

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload".


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

tcpip_socket = true
ssl = false

max_connections = 512
superuser_reserved_connections = 2

rendezvous_name='io'
port = 5432

#unix_socket_directory = '/var/postgres'
#unix_socket_group = 'postgres'
#unix_socket_permissions = 0777    # octal

# - Security & Authentication -

authentication_timeout = 20    # 1-600, in seconds
password_encryption = true
db_user_namespace = false


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 8192        # min 16, at least max_connections*2, 8KB each
sort_mem = 2048            # min 64, size in KB
vacuum_mem = 65536        # min 1024, size in KB

# - Free Space Map -

#max_fsm_pages = 20000        # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000    # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000    # min 25
#preload_libraries = ''


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

fsync = true            # turns forced synchronization on or off
wal_sync_method = fsync        # the default varies across platforms:
                # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 128        # min 4, 8KB each

# - Checkpoints -

checkpoint_segments = 24    # in logfile segments, min 1, 16MB each
checkpoint_timeout = 300    # range 30-3600, in seconds
checkpoint_warning = 30        # 0 is off, in seconds
commit_delay = 15000        # range 0-100000, in microseconds
commit_siblings = 64        # range 1-1000


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Enabling -

enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = true
enable_sort = true
enable_tidscan = true

# - Planner Cost Constants -

effective_cache_size = 16000    # typically 8KB each
random_page_cost = 2        # units are one sequential page fetch cost
cpu_tuple_cost = 0.01        # (same)
cpu_index_tuple_cost = 0.001    # (same)
cpu_operator_cost = 0.0025    # (same)

# - Genetic Query Optimizer -

geqo = false
geqo_threshold = 11
geqo_effort = 1
geqo_generations = 0
geqo_pool_size = 0        # default based on tables in statement,
                # range 128-1024
geqo_selection_bias = 2.0    # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 100    # range 1-1000
from_collapse_limit = 8
join_collapse_limit = 8        # 1 disables collapsing of explicit JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Syslog -

#syslog = 2            # range 0-2; 0=stdout; 1=both; 2=syslog
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

client_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   log, info, notice, warning, error

log_min_messages = info        # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   info, notice, warning, error, log, fatal,
                #   panic

log_error_verbosity = default   # terse, default, or verbose messages

log_min_error_statement = panic # Values in order of increasing severity:
                 #   debug5, debug4, debug3, debug2, debug1,
                 #   info, notice, warning, error, panic(off)

log_min_duration_statement = -1 # Log all statements whose
                 # execution time exceeds the value, in
                 # milliseconds.  Zero prints all queries.
                 # Minus-one disables.

silent_mode = false         # DO NOT USE without Syslog!

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_duration = false
#log_pid = false
#log_statement = false
#log_timestamp = false
#log_hostname = false
#log_source_port = false


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false

# - Query/Index Statistics Collector -

stats_start_collector = false
stats_command_string = false
stats_block_level = false
stats_row_level = false
stats_reset_on_server_start = true


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

search_path = '$user,public'    # schema names
check_function_bodies = true
default_transaction_isolation = 'read committed'
default_transaction_read_only = false
statement_timeout = 25000    # 0 is disabled, in milliseconds

# - Locale and Formatting -

datestyle = 'iso, us'
#timezone = unknown        # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0        # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'C'        # locale for system error message strings
lc_monetary = 'C'        # locale for monetary formatting
lc_numeric = 'C'        # locale for number formatting
lc_time = 'C'            # locale for time formatting

# - Other Defaults -

explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000        # min 10


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

deadlock_timeout = 5000        # in milliseconds
#max_locks_per_transaction = 64    # min 10, ~260*max_connections bytes each


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced    # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false
martin@elara ~$ apachectl status

               Apache Server Status for www.ethereal-realms.org

   Server Version: Apache/1.3.29 (Unix) mod_perl/1.27 mod_gzip/1.3.26.1a
   mod_ssl/2.8.16 OpenSSL/0.9.7c
   Server Built: May 24 2004 00:32:11
     _________________________________________________________________

   Current Time: Tuesday, 03-Aug-2004 21:34:37 MDT
   Restart Time: Tuesday, 03-Aug-2004 03:54:21 MDT
   Parent Server Generation: 0
   Server uptime: 17 hours 40 minutes 16 seconds
   Total accesses: 168852 - Total Traffic: 406.0 MB
   CPU Usage: u283.96 s41.27 cu92.51 cs13.68 - .678% CPU load
   2.65 requests/sec - 6.5 kB/second - 2521 B/request
   127 requests currently being processed, 59 idle servers
W_WWWW_WW.WWKKWK__W._WW_.WK._KK._..K.WW__WK_WW_WWWK_.__WWW_W.WWW
__..__W._W__WWWWWW._WKWW_W_KKW_WWW___K_.WKWW__KWWWWW_WWWWKKKWW.W
__.WK_WW.W_W_WWWW_KWWWWW_KW.WW.W__W_W___W___WWWKW__W_W._.WWW.WWK
W___WKKW.K_W.W...KW....WW_......................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................

   Scoreboard Key:
   "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
   "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
   "L" Logging, "G" Gracefully finishing, "." Open slot with no current
   process
martin@io ~$ vmstat
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad4 ad6   in   sy  cs us sy id
 0 0 0  498532 122848 3306   0   0   0 740   0   0   0  788    0 1675 16 21 63

martin@io ~$ pstat -s
Device          1K-blocks     Used    Avail Capacity
/dev/ar0s1b       4194304       24  4194280     0%

martin@io ~$ iostat
      tty             ad4              ad6              ar0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   17  0.50   0  0.00   0.50   0  0.00  20.28  23  0.45  16  0 20  1 63

martin@io ~# ps -U postgres | wc -l
     127


martin@io ~$ ps -uax
USER       PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
postgres 32084  0.0  0.2 91616 3764  p0- R    Mon12PM   4:08.99 /usr/local/bin/postmaster -D /var/postgres (postgres)
postgres 80333  0.0  2.1 94620 44372  ??  S     8:57PM   0:01.00 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80599  0.0  2.1 94652 44780  ??  S     8:59PM   0:00.97 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80616  0.0  2.4 94424 50396  ??  S     8:59PM   0:00.89 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80715  0.0  2.2 94444 46804  ??  S     9:00PM   0:00.68 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80788  0.0  2.1 94424 43944  ??  S     9:00PM   0:00.93 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80811  0.0  2.1 94424 43884  ??  S     9:00PM   0:00.94 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80902  0.0  2.1 94424 43380  ??  S     9:01PM   0:00.76 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 80949  0.0  2.2 94424 45248  ??  S     9:01PM   0:00.67 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 81020  0.0  2.1 94424 42924  ??  S     9:02PM   0:00.74 postmaster: ethereal ethereal 192.168.1.6 idle in
trans


martin@io ~# date
Tue Aug  3 21:28:53 MDT 2004
martin@io ~$ systat -vmstat
    1 users    Load  4.15  3.76  3.54                  Aug  3 21:30

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act  293708    4120   482092     6736  132204 count
All 1910640    5224  2462084    17060         pages
                                                                 Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt    212 cow     728 total
     5      139      215023802 3112 1078  58023548 397288 wire    100 0: clk
                                                   295304 act         6: fdc0
59.6%Sys   0.9%Intr 38.9%User  0.0%Nice  0.6%Idl  1151936 inact   128 8: rtc
|    |    |    |    |    |    |    |    |    |      92748 cache    53 10: bge
==============================>>>>>>>>>>>>>>>>>>    39456 free    447 11: em0
                                                          daefr       14: ata
Namei         Name-cache    Dir-cache                1345 prcfr
    Calls     hits    %     hits    %                     react
     2761     2761  100                                   pdwake
                                     1111 zfod            pdpgs
Disks   ad4   ad6   ar0   fd0        1059 ofod            intrn
KB/t   0.00  0.00 16.00  0.00          95 %slo-z   204096 buf
tps       0     0    35     0        1587 tfree       591 dirtybuf
MB/s   0.00  0.00  0.54  0.00                      120525 desiredvnodes
% busy    0     0     5     0                       30131 numvnodes
                                                    22695 freevnodes

martin@io ~$ vmstat
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad4 ad6   in   sy  cs us sy id
 0 0 0  317216 128932 3121   0   0   0 736   0   0   0  789    0 1682 16 21 63

martin@io ~$ pstat -s
Device          1K-blocks     Used    Avail Capacity
/dev/ar0s1b       4194304       24  4194280     0%

martin@io ~$ iostat
      tty             ad4              ad6              ar0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   17  0.50   0  0.00   0.50   0  0.00  20.32  23  0.45  16  0 20  1 63


martin@io ~$ ps -U postgres | wc -l
      72


martin@io ~$ ps -uax
USER       PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
postgres 32084  0.0  0.2 91616 3764  p0- S    Mon12PM   3:17.42 /usr/local/bin/postmaster -D /var/postgres (postgres)
postgres 51100  0.0  1.7 94412 35828  ??  S     3:36PM   0:01.02 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51134  0.0  1.8 94620 37060  ??  S     3:36PM   0:01.04 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51147  0.0  1.8 94620 36980  ??  S     3:37PM   0:00.66 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51240  0.0  1.8 94412 37156  ??  S     3:38PM   0:00.62 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51276  0.0  0.7 94412 15180  ??  S     3:38PM   0:00.41 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51277  0.0  1.9 94492 40776  ??  S     3:38PM   0:00.57 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51287  0.0  1.8 94620 37228  ??  S     3:38PM   0:00.63 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51368  0.0  1.7 94412 35348  ??  S     3:40PM   0:00.55 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51493  0.0  1.7 94412 35920  ??  S     3:42PM   0:00.49 postmaster: ethereal ethereal 192.168.1.6 idle in
trans
postgres 51495  0.0  1.7 94412 36192  ??  S     3:42PM   0:00.89 postmaster: ethereal ethereal 192.168.1.6 idle in
trans


martin@io ~$ date
Tue Aug  3 16:09:06 MDT 2004

martin@io ~$ systat -vmstat
    1 users    Load  0.53  0.54  0.47                  Aug  3 16:14

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act  205172    4312   326472     6736  120128 count
All 1909516    4944  2215992    11896         pages
                                                                 Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt    197 cow     642 total
             96      177814901 2186  947  20014746 386844 wire    100 0: clk
                                                   206544 act         6: fdc0
28.8%Sys   0.5%Intr 26.8%User  0.0%Nice 43.9%Idl  1263216 inact   128 8: rtc
|    |    |    |    |    |    |    |    |    |      69644 cache    55 10: bge
==============+>>>>>>>>>>>>>                        50484 free    359 11: em0
                                                          daefr       14: ata
Namei         Name-cache    Dir-cache                1235 prcfr
    Calls     hits    %     hits    %                     react
     1709     1709  100                                   pdwake
                                     1000 zfod            pdpgs
Disks   ad4   ad6   ar0   fd0         938 ofod            intrn
KB/t   0.00  0.00 16.00  0.00          93 %slo-z   204096 buf
tps       0     0    31     0        1482 tfree        56 dirtybuf
MB/s   0.00  0.00  0.49  0.00                      120525 desiredvnodes
% busy    0     0     2     0                       30131 numvnodes
                                                     7645 freevnodes




pgsql-performance by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: NOT IN query takes forever
Next
From: Pierre-Frédéric Caillaud
Date:
Subject: Re: Performance Bottleneck