CPU bound at 99% - Mailing list pgsql-performance

From Bryan Buecking
Subject CPU bound at 99%
Date
Msg-id 20080422153045.GB5961@starling-software.com
Whole thread Raw
Responses Re: CPU bound at 99%
Re: CPU bound at 99%
Re: CPU bound at 99%
List pgsql-performance
Hi,

I'm running into an performance problem where a Postgres db is running
at 99% CPU (4 cores) with about 500 concurrent connection doing various
queries from a web application. This problem started about a week ago,
and has been steadily going downhill. I have been tweaking the config a
bit, mainly shared_memory but have seen no noticeable improvements.

at any given time there is about 5-6 postgres in startup
(ps auxwww | grep postgres | grep startup | wc -l)

about 2300 connections in idle
(ps auxwww | grep postgres | idle)

and loads of "FATAL: sorry, too many clients already" being logged.

The server that connects to the db is an apache server using persistent
connections. MaxClients is 2048 thus the high number of connections
needed. Application was written in PHP using the Pear DB class.

Here are some typical queries taking place

(table media has about 40,000 records and category about 40):

LOG: duration: 66141.530 ms  statement:
    SELECT COUNT(*) AS CNT
    FROM media m JOIN category ca USING(category_id)
    WHERE CATEGORY_ROOT(m.category_id) = '-1'
    AND m.deleted_on IS NULL

LOG:  duration: 57828.983 ms  statement:
    SELECT COUNT(*) AS CNT
    FROM media m JOIN category ca USING(category_id)
    WHERE CATEGORY_ROOT(m.category_id) = '-1'
    AND m.deleted_on IS NULL AND m.POSTED_ON + interval '7 day'

System
======
cpu             Xeon(R) CPU 5160 @ 3.00GHz stepping 06 x 4
                L1, L2 = 32K, 4096K
mem             8GB
dbms            postgresql-server 8.2.4
disks         scsi0 : LSI Logic SAS based MegaRAID driver
        SCSI device sda: 142082048 512-byte hdwr sectors (72746 MB)
        SCSI device sda: 142082048 512-byte hdwr sectors (72746 MB)

Stats
======

top - 00:28:40 up 12:43,  1 user,  load average: 46.88, 36.55, 37.65
Tasks: 2184 total,  63 running, 2119 sleeping,   1 stopped,   1 zombie
Cpu0: 99.3% us,  0.5% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 0.2% si
Cpu1: 98.3% us,  1.4% sy,  0.0% ni,  0.2% id,  0.0% wa,  0.0% hi, 0.0% si
Cpu2: 99.5% us,  0.5% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 0.0% si
Cpu3: 99.5% us,  0.5% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 0.0% si
Mem:   8166004k total,  6400368k used,  1765636k free,   112080k buffers
Swap:  1020088k total,        0k used,  1020088k free,  3558764k cached


$ vmstat 3
procs ---------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 4  0    0 559428 109440 3558684    0    0    11    27   31   117 96  2  2  0
 5  0    0 558996 109452 3558672    0    0     0    41 1171   835 93  1  7  0
 4  0    0 558996 109452 3558740    0    0     0    38 1172   497 98  1  1  0
11  0    0 554516 109452 3558740    0    0     0    19 1236   610 97  1  2  0
25  0    0 549860 109452 3558740    0    0     0    32 1228   332 99  1  0  0
12  0    0 555412 109452 3558740    0    0     0     4 1148   284 99  1  0  0
15  0    0 555476 109452 3558740    0    0     0    23 1202   290 99  1  0  0
15  0    0 555476 109452 3558740    0    0     0     1 1125   260 99  1  0  0
16  0    0 555460 109452 3558740    0    0     0    12 1214   278 99  1  0  0


# -----------------------------
# PostgreSQL configuration file
# -----------------------------

#data_directory = 'ConfigDir'           # use data in another directory
                                        # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'     # host-based authentication file
                                        # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
                                        # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '(none)'           # write an extra PID file
                                        # (change requires restart)


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = 'localhost'            # what IP address(es) to listen on;
                                        # comma-separated list of addresses;
                                        # defaults to 'localhost', '*' = all
                                        # (change requires restart)
port = 5432                             # (change requires restart)
max_connections = 2400                  # (change requires restart)
# Note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
superuser_reserved_connections = 3      # (change requires restart)
#unix_socket_directory = ''             # (change requires restart)
#unix_socket_group = ''                 # (change requires restart)
#unix_socket_permissions = 0777         # octal
                                        # (change requires restart)
#bonjour_name = ''                      # defaults to the computer name
                                        # (change requires restart)

# - Security & Authentication -

#authentication_timeout = 1min          # 1s-600s
#ssl = off                              # (change requires restart)
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''                # (change requires restart)
#krb_srvname = 'postgres'               # (change requires restart)
#krb_server_hostname = ''               # empty string matches any keytab entry
                                        # (change requires restart)
#krb_caseins_users = off                # (change requires restart)

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0                # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_interval = 0            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_count = 0               # TCP_KEEPCNT;
                                        # 0 selects the system default


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 600MB                  # min 128kB or max_connections*16kB
                                        # (change requires restart)
temp_buffers = 10MB                     # min 800kB
#max_prepared_transactions = 5          # can be 0 or more
                                        # (change requires restart)
# Note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 8MB                          # min 64kB
maintenance_work_mem = 512MB            # min 1MB
max_stack_depth = 8MB                   # min 100kB

# - Free Space Map -

max_fsm_pages = 1536000         # min max_fsm_relations*16, 6 bytes each
                                        # (change requires restart)
max_fsm_relations = 10000               # min 100, ~70 bytes each
                                        # (change requires restart)

# - Kernel Resource Usage -

max_files_per_process = 1024            # min 25
                                        # (change requires restart)
#shared_preload_libraries = ''          # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0                  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1               # 0-10000 credits
#vacuum_cost_page_miss = 10             # 0-10000 credits
#vacuum_cost_page_dirty = 20            # 0-10000 credits
#vacuum_cost_limit = 200                # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200ms                 # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0             # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5              # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333           # 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5              # 0-1000 buffers max written/round


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = on                             # turns forced synchronization on or off
#wal_sync_method = fsync                # the default is the first option
                                        # supported by the operating system:
                                        #   open_datasync
                                        #   fdatasync
                                        #   fsync
                                        #   fsync_writethrough
                                        #   open_sync
#full_page_writes = on                  # recover from partial page writes
#wal_buffers = 64kB                     # min 32kB
                                        # (change requires restart)
#commit_delay = 0                       # range 0-100000, in microseconds
#commit_siblings = 5                    # range 1-1000

# - Checkpoints -

#checkpoint_segments = 3                # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min              # range 30s-1h
#checkpoint_warning = 30s               # 0 is off

# - Archiving -

#archive_command = ''           # command to use to archive a logfile segment
#archive_timeout = 0            # force a logfile segment switch after this
                                # many seconds; 0 is off


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0                    # measured on an arbitrary scale
#random_page_cost = 4.0                 # same scale as above
#cpu_tuple_cost = 0.01                  # same scale as above
#cpu_index_tuple_cost = 0.005           # same scale as above
#cpu_operator_cost = 0.0025             # same scale as above
effective_cache_size = 1000MB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10         # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit
                                        # JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'             # Valid values are combinations of
                                        # stderr, syslog and eventlog,
                                        # depending on platform.

# This is used when logging to stderr:
#redirect_stderr = off                  # Enable capturing of stderr into log
                                        # files
                                        # (change requires restart)

# These are only used if redirect_stderr is on:
#log_directory = 'pg_log'               # Directory where log files are written
                                        # Can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
                                        # Can include strftime() escapes
#log_truncate_on_rotation = off # If on, any existing log file of the same
                                        # name as the new log file will be
                                        # truncated rather than appended to. But
                                        # such truncation only occurs on
                                        # time-driven rotation, not on restarts
                                        # or size-driven rotation. Default is
                                        # off, meaning append to existing files
                                        # in all cases.
#log_rotation_age = 1d                  # Automatic rotation of logfiles will
                                        # happen after that time.  0 to
                                        # disable.
#log_rotation_size = 10MB               # Automatic rotation of logfiles will
                                        # happen after that much log
                                        # output.  0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice           # Values, in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   log
                                        #   notice
                                        #   warning
                                        #   error

#log_min_messages = notice              # Values, in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   log
                                        #   fatal
                                        #   panic

#log_error_verbosity = default          # terse, default, or verbose messages

#log_min_error_statement = error        # Values in order of increasing severity:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   fatal
                                        #   panic (effectively off)

log_min_duration_statement = -1         # -1 is disabled, 0 logs all statements
                                        # and their durations.

#silent_mode = off                      # DO NOT USE without syslog or
                                        # redirect_stderr
                                        # (change requires restart)

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
log_disconnections = on
log_duration = on
log_line_prefix = '%u@%d %h %m'         # Special values:
                                        #   %u = user name
                                        #   %d = database name
                                        #   %r = remote host and port
                                        #   %h = remote host
                                        #   %p = PID
                                        #   %t = timestamp (no milliseconds)
                                        #   %m = timestamp with milliseconds
                                        #   %i = command tag
                                        #   %c = session id
                                        #   %l = session line number
                                        #   %s = session start timestamp
                                        #   %x = transaction id
                                        #   %q = stop here in non-session
                                        #        processes
                                        #   %% = '%'
                                        # e.g. '<%u%%%d> '
#log_statement = 'none'                 # none, ddl, mod, all
#log_hostname = off


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Query/Index Statistics Collector -

#stats_command_string = on
#update_process_title = on

#stats_start_collector = on             # needed for block or row stats
                                        # (change requires restart)
#stats_block_level = off
#stats_row_level = off
#stats_reset_on_server_start = off      # (change requires restart)


# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off


#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

#autovacuum = off                       # enable autovacuum subprocess?
                                        # 'on' requires stats_start_collector
                                        # and stats_row_level to also be on
#autovacuum_naptime = 1min              # time between autovacuum runs
#autovacuum_vacuum_threshold = 500      # min # of tuple updates before
                                        # vacuum
#autovacuum_analyze_threshold = 250     # min # of tuple updates before
                                        # analyze
#autovacuum_vacuum_scale_factor = 0.2   # fraction of rel size before
                                        # vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of rel size before
                                        # analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum
                                        # (change requires restart)
#autovacuum_vacuum_cost_delay = -1      # default vacuum cost delay for
                                        # autovacuum, -1 means use
                                        # vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
                                        # autovacuum, -1 means use
                                        # vacuum_cost_limit


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '"$user",public'         # schema names
#default_tablespace = ''                # a tablespace name, '' uses
                                        # the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0                  # 0 is disabled
#vacuum_freeze_min_age = 100000000

# - Locale and Formatting -

datestyle = 'iso, mdy'
#timezone = unknown                     # actually, defaults to TZ
                                        # environment setting
#timezone_abbreviations = 'Default'     # select the set of available timezone
                                        # abbreviations. Currently, there are
                                        #   Default
                                        #   Australia
                                        #   India
                                        # However you can also create your own
                                        # file in share/timezonesets/.
#extra_float_digits = 0                 # min -15, max 2
#client_encoding = sql_ascii            # actually, defaults to database
                                        # encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'C'                       # locale for system error message
                                        # strings
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'
#local_preload_libraries = ''


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1s
#max_locks_per_transaction = 64         # min 10
                                        # (change requires restart)
# Note: each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = off
#array_nulls = on
#backslash_quote = safe_encoding        # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#standard_conforming_strings = off
#regex_flavor = advanced                # advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off


#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------

#custom_variable_classes = ''           # list of custom variable class names

--
Bryan Buecking

pgsql-performance by date:

Previous
From: "Scott Marlowe"
Date:
Subject: Re: Oddly slow queries
Next
From: "Joshua D. Drake"
Date:
Subject: Re: CPU bound at 99%