Thread: Variable (degrading) perfomance

Variable (degrading) perfomance

From
Vladimir Stankovic
Date:
Hi all,

It seems that I have an issue with the performance of a PostgreSQL server.

I'm running write-intensive, TPC-C like tests. The workload consist of
150 to 200 thousand transactions. The performance varies dramatically,
between 5 and more than 9 hours (I don't have the exact figure for the
longest experiment). Initially the server is relatively fast. It
finishes the first batch of 50k transactions in an hour. This is
probably due to the fact that the database is RAM-resident during this
interval. As soon as the database grows bigger than the RAM the
performance, not surprisingly, degrades, because of the slow disks.
My problem is that the performance is rather variable, and to me
non-deterministic. A 150k test can finish in approx. 3h30mins but
conversely  it can take more than 5h to complete.
Preferably I would like to see *steady-state* performance (where my
interpretation of the steady-state is that the average
throughput/response time does not change over time). Is the steady-state
achievable despite the MVCC and the inherent non-determinism between
experiments? What could be the reasons for the variable performance?
- misconfiguration of the PG parameters (e.g. autovacuum does not cope
with the dead tuples on the MVCC architecture)
- file fragmentation
- index bloat
- ???
The initial size of the database (actually the output of the 'du -h'
command) is ~ 400 MB. The size increases dramatically, somewhere between
600MB and 1.1GB

I have doubted the client application at some point too. However, other
server combinations using different DBMS exhibit steady state
performance.As a matter of fact when PG is paired with Firebird, through
statement-based replication middleware, the performance of the pair is
steady too.

The hardware configuration:
Client machine
- 1.5 GHz CPU Pentium 4
- 1GB Rambus RAM
- Seagate st340810a IDE disk (40GB), 5400 rpms

Server machine
- 1.5 GHz CPU Pentium 4
- 640 MB Rambus RAM
- Seagate Barracuda 7200.9 rpms
- Seagate st340810a IDE disk (40GB) - the WAL is stored on an ext2 partition

The Software configuration:
The client application is a multi-threaded Java client running on Win
2000 Pro sp4
The database server version is  8.1.5 running on Fedora Core 6.
Please find attached:
1 - the output of vmstat  taken after  the first  60k transactions were
executed
2 - the postgresql.conf file

Any help would be appreciated.

Best regards,
Vladimir
--

Vladimir Stankovic                        T: +44 20 7040 0273
Research Student/Research Assistant       F: +44 20 7040 8585
Centre for Software Reliability           E: V.Stankovic@city.ac.uk
City University
Northampton Square, London EC1V 0HB

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.


#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'        # use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf'    # host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf'    # IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'        # write an extra pid file


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost'        # what IP address(es) to listen on;
                    # comma-separated list of addresses;
                    # defaults to 'localhost', '*' = all
#port = 5432
#max_connections = 100
max_connections = 150
# note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 2
superuser_reserved_connections = 3 # 1 for the AUTOVACUUM proc
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777        # octal
#bonjour_name = ''            # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60        # 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''        # empty string matches any keytab entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0        # TCP_KEEPIDLE, in seconds;
                    # 0 selects the system default
#tcp_keepalives_interval = 0        # TCP_KEEPINTVL, in seconds;
                    # 0 selects the system default
#tcp_keepalives_count = 0        # TCP_KEEPCNT;
                    # 0 selects the system default


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

#shared_buffers = 1000            # min 16 or max_connections*2, 8KB each
shared_buffers = 16384
#temp_buffers = 1000            # min 100, 8KB each
#max_prepared_transactions = 5        # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
#work_mem = 1024            # min 64, size in KB
#maintenance_work_mem = 16384        # min 1024, size in KB
maintenance_work_mem = 16384
#max_stack_depth = 2048            # min 100, size in KB

# - Free Space Map -

#max_fsm_pages = 20000            # min max_fsm_relations*16, 6 bytes each
max_fsm_pages = 60000
#max_fsm_relations = 1000        # min 100, ~70 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000        # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0            # 0-1000 milliseconds
#vacuum_cost_page_hit = 1        # 0-10000 credits
#vacuum_cost_page_miss = 10        # 0-10000 credits
#vacuum_cost_page_dirty = 20        # 0-10000 credits
#vacuum_cost_limit = 200        # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200            # 10-10000 milliseconds between rounds
#bgwriter_lru_percent = 1.0        # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333        # 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

fsync = off                # turns forced synchronization on or off

#wal_sync_method = fsync        # the default is the first option
                    # supported by the operating system:
                    #   open_datasync
                    #   fdatasync
                    #   fsync
                    #   fsync_writethrough
                    #   open_sync
#full_page_writes = on            # recover from partial page writes
#wal_buffers = 8            # min 4, 8KB each
wal_buffers = 32
#commit_delay = 0            # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
checkpoint_segments = 9
#checkpoint_timeout = 300        # range 30-3600, in seconds
#checkpoint_warning = 30        # in seconds, 0 is off

# - Archiving -

#archive_command = ''            # command to use to archive a logfile
                    # segment


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#effective_cache_size = 1000        # typically 8KB each
effective_cache_size = 16384
#random_page_cost = 4            # units are one sequential page fetch
                    # cost
#cpu_tuple_cost = 0.01            # (same)
#cpu_index_tuple_cost = 0.001        # (same)
#cpu_operator_cost = 0.0025        # (same)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5            # range 1-10
#geqo_pool_size = 0            # selects default based on effort
#geqo_generations = 0            # selects default based on effort
#geqo_selection_bias = 2.0        # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10        # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit
                    # JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'        # Valid values are combinations of
                    # stderr, syslog and eventlog,
                    # depending on platform.

# This is used when logging to stderr:
redirect_stderr = on            # Enable capturing of stderr into log
                    # files

# These are only used if redirect_stderr is on:
log_directory = 'pg_log'        # Directory where log files are written
                    # Can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
                    # Can include strftime() escapes
log_truncate_on_rotation = on           # If on, any existing log file of the same
                    # name as the new log file will be
                    # truncated rather than appended to. But
                    # such truncation only occurs on
                    # time-driven rotation, not on restarts
                    # or size-driven rotation. Default is
                    # off, meaning append to existing files
                    # in all cases.
log_rotation_age = 1440            # Automatic rotation of logfiles will
                    # happen after so many minutes.  0 to
                    # disable.
log_rotation_size = 0            # Automatic rotation of logfiles will
                    # happen after so many kilobytes of log
                    # output.  0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice        # Values, in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   log
                    #   notice
                    #   warning
                    #   error

#log_min_messages = notice        # Values, in order of decreasing detail:
log_min_messages = notice
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic

#log_error_verbosity = default        # terse, default, or verbose messages

#log_min_error_statement = panic    # Values in order of increasing severity:
                     #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                     #   info
                    #   notice
                    #   warning
                    #   error
                    #   panic(off)

#log_min_duration_statement = -1    # -1 is disabled, 0 logs all statements
                    # and their durations, in milliseconds.

#silent_mode = off            # DO NOT USE without syslog or
                    # redirect_stderr

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_line_prefix = ''            # Special values:
                    #   %u = user name
                    #   %d = database name
                    #   %r = remote host and port
                    #   %h = remote host
                    #   %p = PID
                    #   %t = timestamp (no milliseconds)
                    #   %m = timestamp with milliseconds
                    #   %i = command tag
                    #   %c = session id
                    #   %l = session line number
                    #   %s = session start timestamp
                    #   %x = transaction id
                    #   %q = stop here in non-session
                    #        processes
                    #   %% = '%'
                    # e.g. '<%u%%%d> '
#log_statement = 'none'            # none, mod, ddl, all
#log_hostname = off


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

# - Query/Index Statistics Collector -

#stats_start_collector = on
stats_start_collector = on
#stats_command_string = off
stats_command_string = off #turn it on ON for debugging
#stats_block_level = off
stats_row_level = on
#stats_row_level = off
#stats_reset_on_server_start = off


#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

autovacuum = on            # enable autovacuum subprocess?
#autovacuum_naptime = 60        # time between autovacuum runs, in secs
#autovacuum_vacuum_threshold = 1000    # min # of tuple updates before
                    # vacuum
#autovacuum_analyze_threshold = 500    # min # of tuple updates before
                    # analyze

#autovacuum_vacuum_scale_factor = 0.4    # fraction of rel size before
                    # vacuum
autovacuum_vacuum_scale_factor = 0    # intensify autovacuum vacuums?

#autovacuum_analyze_scale_factor = 0.2    # fraction of rel size before
                    # analyze
autovacuum_analyze_scale_factor = 0    # intensify autovacuum analyzes?

#autovacuum_vacuum_cost_delay = -1    # default vacuum cost delay for
                    # autovac, -1 means use
                    # vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1    # default vacuum cost limit for
                    # autovac, -1 means use
                    # vacuum_cost_limit


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'        # schema names
#default_tablespace = ''        # a tablespace name, '' uses
                    # the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0            # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown            # actually, defaults to TZ
                    # environment setting
#australian_timezones = off
#extra_float_digits = 0            # min -15, max 2
#client_encoding = sql_ascii        # actually, defaults to database
                    # encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'C'            # locale for system error message
                    # strings
lc_monetary = 'C'            # locale for monetary formatting
lc_numeric = 'C'            # locale for number formatting
lc_time = 'C'                # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000        # in milliseconds
#max_locks_per_transaction = 64        # min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = off
#backslash_quote = safe_encoding    # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = off
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off


#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------

#custom_variable_classes = ''        # list of custom variable class names
-bash-3.1$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0 11    156   6976   4640 525572    0    0   288  1712  981  620 26 11 30 33  0
 0  3    156   7356   4648 525172    0    0   274  2245 1287  992 36 17  0 48  0
 0  4    156   6680   4664 525780    0    0   174  2665 1193  743 34 14  0 52  0
 0  3    156   7336   4640 524944    0    0   232  2185  694  348  7  4  0 89  0
 2  4    156   6536   4652 525920    0    0   346  2073 1677  999 50 19  0 32  0
 2  4    156   6892   4652 525620    0    0   535  1546 1210  903 36 14  0 50  0
 0  6    156   7416   4644 524924    0    0   141  2834 1196  755 39 10  0 51  0
 1  6    156   6676   4652 525788    0    0   306  1832 1374  965 44 17  0 40  0
 0  7    156   7328   4640 525072    0    0   148  2418 1077  747 31 11  0 58  0
 1  8    156   6656   4644 525808    0    0   385  2256 1144  824 34 10  0 56  0
 2 11    156   7368   4644 525160    0    0   354  1957 1380 1060 43 13  0 45  0
 1  8    156   6880   4648 525608    0    0   217  1865  997  695 33 11  0 56  0
 0  6    156   7340   4628 525028    0    0   109  2397 1096  780 30 14  0 56  0
 2  9    156   7032   4640 525484    0    0   173  2277  994  622 25 11  0 65  0
 4  6    156   7560   4628 525028    0    0   401  1701 1085  747 35 12  0 53  0
 3  8    156   6948   4632 525544    0    0   257  2070 1066  676 24 11  0 65  0
 0  5    156   6640   4636 525904    0    0    57  2720  999  581 20  8  0 72  0
 1 11    156   7052   4640 525416    0    0   226  1682 1246  778 36 11  0 53  0
 0  6    156   6684   4644 525776    0    0   105  1689  946  621 23 11  0 66  0
 5  9    156   6684   4644 525920    0    0    40  1808  686  335 10  5  0 85  0
 0  6    156   6960   4644 525476    0    0   210  2354 1446  950 37 19  0 44  0
 2  8    156   6652   4648 525828    0    0   105  2345 1115  776 29  9  0 62  0
 2  9    156   7056   4648 525464    0    0   274  1853 1453 1046 44 15  0 41  0
 1  9    156   7216   4636 525124    0    0   218  2261 1414 1031 42 17  0 41  0
 1  3    156   6784   4644 525740    0    0   217  1944 1161  813 31 14  0 55  0
 2  5    156   7300   4628 525212    0    0   233  2650 1231  823 37 15  0 48  0
 1  5    156   6532   4636 525856    0    0   218  2337 1314  876 36 17  0 47  0
 1  5    156   7224   4632 525300    0    0   217  2465 1060  688 24 13  0 63  0
 2  7    156   7448   4632 524936    0    0   298  2273 1196  779 37 13  0 50  0
 0  6    156   6952   4636 525512    0    0   169  2521 1184  847 31 16  0 53  0
 2 11    156   7628   4636 524776    0    0   210  1872 1332  961 43 17  0 41  0
 0  7    156   6772   4644 525756    0    0   437  2462 1425 1001 45 14  0 40  0
 0  5    156   6968   4636 525496    0    0   266  2201 1364  914 42 16  0 42  0
 0  4    156   7676   4624 524616    0    0   161  2817 1202  761 31 12  0 57  0
 0  9    156   7436   4628 524944    0    0   137  2549  890  457 15  9  0 76  0
 1  7    156   6764   4632 525668    0    0   365  2449 1352  920 39 13  0 48  0
 0  6    156   7000   4628 525388    0    0   331  1977 1645 1186 52 16  0 32  0
 0  8    156   7384   4620 525092    0    0   346  2185 1566 1038 44 19  0 37  0
 0 11    156   6768   4624 525676    0    0   209  2509 1313  893 38 16  0 46  0
 0  7    156   7044   4624 525440    0    0   267  1905 1408  988 37 18  0 45  0
 1  8    156   6576   4628 525904    0    0   137  1177  952  763 31  8  0 61  0
 7  6    156   7384   4608 525108    0    0   161  2317 1352  881 41 18  0 42  0
 0  3    156   7668   4612 524660    0    0   274  2753 1502 1022 53 17  0 30  0
 2  8    156   7296   4616 525176    0    0   217  2593 1220  761 35 15  0 50  0
 3  7    156   7516   4608 524716    0    0   242  2353 1569 1012 40 23  0 38  0
 3  3    156   7152   4612 525388    0    0   185  2409 1222  801 34 10  0 56  0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  7    156   7588   4604 524884    0    0   298  2161 1402  938 40 20  0 41  0
 4  9    156   6864   4612 525640    0    0   274  2045 1322  943 44 12  0 45  0
 2  8    156   7156   4604 525320    0    0   373  2189 1314  933 39 14  0 47  0
 3  8    156   6548   4608 525940    0    0   385  1925 1074  779 29 15  0 56  0
 3  7    156   6616   4588 525732    0    0   273  2138 1318 1005 40 18  0 43  0
 0 14    156   7192   4564 524816    0    0   303  7220 1414 1130 40 33  0 27  0
 1  9    156   6804   4564 525204    0    0   201  3297  989  853 27  8  0 65  0
 0  8    156   6680   4568 525408    0    0     9  2508  656  321  3  4  0 93  0
 0  8    156   7272   4560 524740    0    0   109  3266 1039  791 30 11  0 59  0
 0  8    156   6564   4572 525568    0    0   222  3681 1227  876 31  9  0 60  0

Re: Variable (degrading) perfomance

From
"Merlin Moncure"
Date:
On 6/11/07, Vladimir Stankovic <V.Stankovic@city.ac.uk> wrote:
> Hi all,
>
> It seems that I have an issue with the performance of a PostgreSQL server.
>
> I'm running write-intensive, TPC-C like tests. The workload consist of
> 150 to 200 thousand transactions. The performance varies dramatically,
> between 5 and more than 9 hours (I don't have the exact figure for the
> longest experiment). Initially the server is relatively fast. It
> finishes the first batch of 50k transactions in an hour. This is
> probably due to the fact that the database is RAM-resident during this
> interval. As soon as the database grows bigger than the RAM the
> performance, not surprisingly, degrades, because of the slow disks.
> My problem is that the performance is rather variable, and to me
> non-deterministic. A 150k test can finish in approx. 3h30mins but
> conversely  it can take more than 5h to complete.
> Preferably I would like to see *steady-state* performance (where my
> interpretation of the steady-state is that the average
> throughput/response time does not change over time). Is the steady-state
> achievable despite the MVCC and the inherent non-determinism between
> experiments? What could be the reasons for the variable performance?
> - misconfiguration of the PG parameters (e.g. autovacuum does not cope
> with the dead tuples on the MVCC architecture)
> - file fragmentation
> - index bloat
> - ???

vmstat is telling you that the server is i/o bound.  an iostat will
tell be helpful to tell you where things are binding up...either the
data volume, wal volume, or both.  I suspect your sorts are spilling
to disk which is likely the cause of the variable performance,
interacting with autovacuum.  Another possibility is vacuum is bogging
you down.  look for pg_tmp folders inside the database tree to see if
this is happening.   Also you want to see if your server is swapping.

first, I'd suggest bumping maintenance_work_mem to 256mb.   I'd also
suggest bumping work_mem higher, but you are going to have to
calculate how far to go based on how many active queries with sort are
going to fire simultaneously.  It can be a fine line because your a
bit underpowered memory but your database is small as well.  bumping
work_mem but throwing your server into swap solves nothing.

merlin