Thread: Config review

Config review

From: Bryan Vest
Date:
PostgreSQL is the backbone of our spam filtering system. Currently the
performance is OK, but I wanted to know if someone could give this config a
quick run-down and see if there is anything we can adjust here to smooth
out the performance. The I/O wait times are outrageous; at times the load
will spike up to the 70-90 range.

Hardware:
Quad Opteron, 2 GHz
Tyan quad Opteron board
16 GB DDR RAM
Emulex LightPulse LP1050
EMC CLARiiON fiber array running RAID 5
-----------------------------------------
Software:
Red Hat Linux AS
PostgreSQL 7.4.6
-----------------------------------------
Detail:
pg_xlog is stored on a local 10k RPM SCSI drive.
The rest of the database is stored on the Fiber Array.

Currently the database is at a size of 87.6 GB. A VACUUM ANALYZE runs
every night and has been taking 4 or 5 hours to complete. Everything
seems to run fine for a while, then at some point the load goes through
the roof and the iowait % also goes way up. It will recover after a
little bit and then do the same thing all over again. When this happens,
access to the web-based user interface slows way down for our customers.
Any input for improvements to this config would be appreciated. Thanks.
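
(For reference, one way to see which device is actually saturating when the
load spikes, assuming the sysstat package's iostat is available; the commands
below are a sketch, not specific to this box:)

# Watch per-device utilization and service times while a spike is happening,
# and compare the local pg_xlog drive against the fiber array's LUN(s).
iostat -x 5     # %util near 100 with high await points at the saturated device
vmstat 5        # the "wa" column shows overall iowait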

------------------------------------------

------------------------------------------
Vacuum Output:

INFO:  analyzing "pg_catalog.pg_listener"
INFO:  "pg_listener": 0 pages, 0 rows sampled, 0 estimated total rows
INFO:  free space map: 79 relations, 1948399 pages stored; 5306160 total pages needed
DETAIL:  Allocated FSM size: 500 relations + 2000000 pages = 11769 kB shared memory.
VACUUM
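
(A side note on the output above: vacuum reports about 5.3 million free space
map pages needed but only 2 million allocated, so free space beyond that can't
be tracked for reuse. A minimal sketch of raising the FSM settings follows;
the path and values are illustrative only, and FSM settings on 7.4 only take
effect at postmaster start:)

# Append illustrative FSM settings (later entries in postgresql.conf win)
# and restart so they take effect.  The $PGDATA path is an assumption.
PGDATA=/var/lib/pgsql/data
cat >> $PGDATA/postgresql.conf <<'EOF'
max_fsm_pages = 6000000      # above the 5306160 pages VACUUM says are needed
max_fsm_relations = 1000
EOF
pg_ctl restart -D $PGDATA -m fast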
--------------------------------------------

<--config-->

tcpip_socket = true
max_connections = 800
#superuser_reserved_connections = 2
port = 5432
#port = 9999
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''              # what interface to listen on; defaults to any
#rendezvous_name = ''           # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 16000
sort_mem = 16384
vacuum_mem = 3200000

# - Free Space Map -

max_fsm_pages = 2000000
max_fsm_relations = 500

# - Kernel Resource Usage -

max_files_per_process = 100     # min 25
#preload_libraries = ''


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

fsync = true                    # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 64        # min 4, 8KB each

# - Checkpoints -

checkpoint_segments = 50        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 60        # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 10           # range 1-1000


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Enabling -

#enable_hashagg = true
#enable_hashjoin = true
enable_indexscan = true
#enable_mergejoin = true
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

effective_cache_size = 50000    # typically 8KB each
random_page_cost = 20           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)

# - Genetic Query Optimizer -

geqo = true
geqo_threshold = 11
geqo_effort = 1
geqo_generations = 0
geqo_pool_size = 0              # default based on tables in statement,
                                 # range 128-1024
geqo_selection_bias = 2.0       # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10 # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Syslog -

syslog = 2                      # range 0-2; 0=stdout; 1=both; 2=syslog
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

client_min_messages = error     # Values, in order of decreasing detail:
                                 #   debug5, debug4, debug3, debug2, debug1,
                                 #   log, info, notice, warning, error

log_min_messages = error        # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic

log_error_verbosity = terse   # terse, default, or verbose messages

#log_min_error_statement = panic # Values in order of increasing severity:
                                 #   debug5, debug4, debug3, debug2, debug1,
                                 #   info, notice, warning, error, panic(off)

#log_min_duration_statement = -1 # Log all statements whose
                                  # execution time exceeds the value, in
                                  # milliseconds.  Zero prints all queries.
                                  # Minus-one disables.

#silent_mode = false             # DO NOT USE without Syslog!

# - What to Log -

debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_connections = false
log_duration = false
log_pid = false
log_statement = false
log_timestamp = true
log_hostname = true
log_source_port = false


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false

# - Query/Index Statistics Collector -

#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'   # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0          # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.UTF-8'             # locale for system error message strings
lc_monetary = 'en_US.UTF-8'             # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'              # locale for number formatting
lc_time = 'en_US.UTF-8'                 # locale for time formatting

# - Other Defaults -

#explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000         # min 10


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000        # in milliseconds
max_locks_per_transaction = 200 # min 10, ~260*max_connections bytes each


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false

<--config-->


Thanks
--
---------------------------------------------------------------
Bryan Vest
ComNet Inc.
bright.net Network Administration/Network Operations
(888)-618-4638
net-admin@bright.net    Pager: pagenoc@bright.net
---------------------------------------------------------------

Re: Config review

From: Greg Stark
Date:
Bryan Vest <bvest@bright.net> writes:

> Currently the database is at a size of 87.6 GB. A VACUUM ANALYZE runs every
> night and has been taking 4 or 5 hours to complete. Everything seems to run
> fine for a while, then at some point the load goes through the roof and the
> iowait % also goes way up. It will recover after a little bit and then do the
> same thing all over again. When this happens, access to the web-based user
> interface slows way down for our customers. Any input for improvements to this
> config would be appreciated. Thanks.

While others have pointed out problems with the config, I don't think any of
them explains this irregular behaviour. From what you're describing, the
response time is OK most of the time except for these particular bursts?

Do they occur at regular intervals? Is it possible it's just the
checkpointing? Can you see which volumes the i/o traffic is on? Is it on the
local transaction log files or is it on the data files? Does the write i/o
spike upwards or is it just a storm of read i/o? Also, incidentally, is it
possible you have a cron job running vacuum and don't realize it?
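
(If checkpoints are the suspect, one hedged way to check on 7.4, which has no
checkpoint logging of its own, is to set checkpoint_warning very high so that
every WAL-triggered checkpoint leaves a log message, then see whether those
messages line up with the spikes. The paths and the exact log wording are
assumptions:)

PGDATA=/var/lib/pgsql/data                  # assumed data directory
cat >> $PGDATA/postgresql.conf <<'EOF'
checkpoint_warning = 3600    # warn about checkpoints closer than an hour apart
EOF
pg_ctl reload -D $PGDATA                    # no restart needed for this one

# syslog = 2 in the posted config, so the messages should end up in the
# LOCAL0 facility log (often /var/log/messages, but that's site-specific).
grep -i "checkpoints are occurring too frequently" /var/log/messages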

If it happens at irregular intervals, then it could be a single bad query
that's causing the problem. One bad query would cause a sequential scan of
your 87 GB and potentially push a lot of data out of the cache. I imagine
this might also be especially bad with the shared_buffers being out of whack.
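
(One way to catch such a query in the act, assuming the stats collector is
running with stats_command_string turned on, which it isn't in the posted
config, is to snapshot pg_stat_activity during a spike; "yourdb" is a
placeholder:)

# Show what every non-idle backend is running right now.
psql -U postgres -d yourdb -c \
  "SELECT procpid, usename, current_query
     FROM pg_stat_activity
    WHERE current_query NOT LIKE '<IDLE>%';"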

You might start by checking the easiest thing first: set
log_min_duration_statement to something high and slowly lower it until it's
printing a handful of queries during the heaviest period.
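
(A minimal sketch of doing that, with an assumed data directory and a
10-second starting threshold chosen purely as an example:)

PGDATA=/var/lib/pgsql/data
echo "log_min_duration_statement = 10000   # in ms; start high, lower over time" \
  >> $PGDATA/postgresql.conf
pg_ctl reload -D $PGDATA                    # picked up without a restart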

You could also look for a pgsql_tmp directory that indicates a disk sort is
happening, which would mean some query is trying to sort a lot of data. You
might have to lower sort_mem to a conservative value before you could see
that, though.

The pgsql_tmp directory appears (and disappears?) as needed; it's something
like this:

bash-2.05b# ls /var/lib/postgres/data/base/17150/pgsql_tmp
pgsql_tmp22184.0

--
greg

Re: Config review

From: Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Bryan Vest <bvest@bright.net> writes:
>> Currently the database is at a size of 87.6 GB. A VACUUM ANALYZE runs every
>> night and has been taking 4 or 5 hours to complete. Everything seems to run
>> fine for a while, then at some point the load goes through the roof and the
>> iowait % also goes way up. It will recover after a little bit and then do the
>> same thing all over again.

> While others have pointed out problems with the config I don't think any of
> them explains this irregular behaviour.

As Greg says, it might be checkpoints or a background query.  If it
actually is the vacuum itself causing the variation in load, the theory
that comes to mind is that the performance tanks when the vacuum run
switches from find-dead-tuples to clean-indexes mode; clean-indexes is
usually a lot more I/O intensive.

ISTM it actually doesn't matter much which of these explanations is
correct, because all three imply the same thing: not enough disk I/O
bandwidth.  The disk is near saturation already and any increase in
demand drives response time over the knee of the curve.

If you are using a RAID configuration it might just be that you need
to adjust the configuration (IIRC, there are some RAID setups that
are not very write-friendly).  Otherwise you may have little alternative
but to buy faster disks.
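
(A rough way to put a number on the available write bandwidth, sketched with
made-up mount points and best run off-hours since it generates real I/O, is to
time a large sequential write on the array and on the local WAL disk:)

# Write 1 GB in 8 kB blocks (PostgreSQL's page size) and force it to disk.
time sh -c 'dd if=/dev/zero of=/mnt/array/ddtest bs=8k count=131072 && sync'
time sh -c 'dd if=/dev/zero of=/mnt/waldisk/ddtest bs=8k count=131072 && sync'
rm -f /mnt/array/ddtest /mnt/waldisk/ddtest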

            regards, tom lane

Re: Config review

From: Andrew Sullivan
Date:
On Tue, Dec 07, 2004 at 10:07:54AM -0500, Tom Lane wrote:
> If you are using a RAID configuration it might just be that you need
> to adjust the configuration (IIRC, there are some RAID setups that
> are not very write-friendly).  Otherwise you may have little alternative
> but to buy faster disks.

It might be that altering the CLARiiON array from RAID 5 to RAID 1+0
would make a difference; but I'd be very surprised to learn that you
could get that array to go a whole lot faster.

One thing that might also be worth investigating is whether
performance actually goes up by moving the WAL into the array.
We've had some remarkably good experiences with our recently-acquired
EMC.
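
(The usual way to try that on 7.4 is to relocate pg_xlog with a symlink; a
sketch with assumed paths, to be done only with the postmaster cleanly shut
down:)

PGDATA=/var/lib/pgsql/data
ARRAY=/mnt/array/pg_xlog          # target directory on the EMC array

pg_ctl stop -D $PGDATA -m fast
mv $PGDATA/pg_xlog $ARRAY         # move the WAL onto the array
ln -s $ARRAY $PGDATA/pg_xlog      # leave a symlink where the server expects it
pg_ctl start -D $PGDATA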

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
When my information changes, I alter my conclusions.  What do you do sir?
        --attr. John Maynard Keynes