Thread: BUG #2104: pg_xlog/ trace files not reclaimed by server

BUG #2104: pg_xlog/ trace files not reclaimed by server

From
"Reuben Pasquini"
Date:
The following bug has been logged online:

Bug reference:      2104
Logged by:          Reuben Pasquini
Email address:      pasquini@imageworks.com
PostgreSQL version: 8.1
Operating system:   linux redhat-4.1
Description:        pg_xlog/ trace files not reclaimed by server
Details:

Hello!

It appears that when my 8.1 database is forced
to generate more than
            2*checkpoint_segments + 1
trace files under pg_xlog/, that the database
becomes confused and stops recycling the trace files.
Unfortunately - I haven't been able to catch this condition
until after it filled up the disk (hundreds of trace files)
and things crashed.  I did not have this problem running
postgres-7.4, but performance was not nearly so good as 8.1.

I'm running a high-transaction postgres-8.1 database
on a 2-proc linux box - looks like 4 procs due to hyperthreading:

Main memory size: 3931 Mbytes
4 GenuineIntel Intel(R) Xeon(TM) CPU 2.80GHz processors


Attached below is the postgresql.conf file
I was running with at the time of the last crash.
I've had a few crashes like this, but only diagnosed
the problem recently.
I increased the checkpoint-segments to 30 after the
last crash early thismorning to see if that works
around the problem.

Thanks for the help!
Reuben

---------------
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_address, require
# a postmaster shutdown and restart to take effect.


#---------------------------------------------------------------------------

# FILE LOCATIONS
#---------------------------------------------------------------------------


# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'           # use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf'     # host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf  # IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'           # write an extra pid file


#---------------------------------------------------------------------------

# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------


# - Connection Settings -

listen_addresses = '*'          # what IP address(es) to listen on;
                                        # comma-separated list of
addresses;
                                        # defaults to 'localhost', '*' =
all
#port = 5432
max_connections = 100
# note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777         # octal
#bonjour_name = ''                      # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60            # 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''               # empty string matches any keytab
entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0                # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_interval = 0            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_count = 0               # TCP_KEEPCNT;
                                        # 0 selects the system default


#---------------------------------------------------------------------------

# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------


# - Memory -

shared_buffers = 220000                 # min 16 or max_connections*2, 8KB
each
temp_buffers = 244000                   # min 100, 8KB each
max_prepared_transactions = 100         # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared
memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 204800                       # min 64, size in KB
maintenance_work_mem = 323840           # min 1024, size in KB
#max_stack_depth = 2048                 # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 8000000                 # min max_fsm_relations*16, 6 bytes
each
max_fsm_relations = 10000               # min 100, ~70 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000           # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0                  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1               # 0-10000 credits
#vacuum_cost_page_miss = 10             # 0-10000 credits
#vacuum_cost_page_dirty = 20            # 0-10000 credits
#vacuum_cost_limit = 200                # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200                   # 10-10000 milliseconds between
rounds
#bgwriter_lru_percent = 1.0             # 0-100% of LRU buffers
scanned/round
bgwriter_lru_maxpages = 100             # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333           # 0-100% of all buffers
scanned/round
bgwriter_all_maxpages = 100             # 0-1000 buffers max written/round


#---------------------------------------------------------------------------

# WRITE AHEAD LOG
#---------------------------------------------------------------------------


# - Settings -

fsync = off                             # turns forced synchronization on or
off
#wal_sync_method = fsync                # the default is the first option
                                        # supported by the operating
system:
                                        #   open_datasync
                                        #   fdatasync
                                        #   fsync
                                        #   fsync_writethrough
                                        #   open_sync
full_page_writes = off                  # recover from partial page writes
wal_buffers = 200                       # min 4, 8KB each
#commit_delay = 0                       # range 0-100000, in microseconds
#commit_siblings = 5                    # range 1-1000

# - Checkpoints -

checkpoint_segments = 30                # in logfile segments, min 1, 16MB
each
#checkpoint_timeout = 300               # range 30-3600, in seconds
#checkpoint_warning = 30                # in seconds, 0 is off

# - Archiving -

#archive_command = ''                   # command to use to archive a
logfile
                                        # segment


#---------------------------------------------------------------------------

# QUERY TUNING
#---------------------------------------------------------------------------


# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#effective_cache_size = 1000            # typically 8KB each
#random_page_cost = 4                   # units are one sequential page
fetch
                                        # cost
#cpu_tuple_cost = 0.01                  # (same)
#cpu_index_tuple_cost = 0.001           # (same)
#cpu_operator_cost = 0.0025             # (same)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10         # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit

                                        # JOINs


#---------------------------------------------------------------------------

# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------


# - Where to Log -

#log_destination = 'stderr'             # Valid values are combinations of
                                        # stderr, syslog and eventlog,
                                        # depending on platform.

# This is used when logging to stderr:
redirect_stderr = on                    # Enable capturing of stderr into
log
                                        # files

# These are only used if redirect_stderr is on:
log_directory = 'pg_log'                # Directory where log files are
written
                                        # Can be absolute or relative to
PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
                                        # Can include strftime() escapes
log_truncate_on_rotation = on           # If on, any existing log file of
the same
                                        # name as the new log file will be
                                        # truncated rather than appended to.
But
                                        # such truncation only occurs on
                                        # time-driven rotation, not on
restarts
                                        # or size-driven rotation. Default
is
                                        # off, meaning append to existing
files
                                        # in all cases.
log_rotation_age = 1440                 # Automatic rotation of logfiles
will
                                        # happen after so many minutes.  0
to
                                        # disable.
log_rotation_size = 0                   # Automatic rotation of logfiles
will
                                        # happen after so many kilobytes of
log
                                        # output.  0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice           # Values, in order of decreasing
detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   log
                                        #   notice
                                        #   warning
                                        #   error

#log_min_messages = notice              # Values, in order of decreasing
detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   log
                                        #   fatal
                                        #   panic

#log_error_verbosity = default          # terse, default, or verbose
messages

#log_min_error_statement = panic        # Values in order of increasing
severity:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   panic(off)

#log_min_duration_statement = -1        # -1 is disabled, 0 logs all
statements
                                        # and their durations, in
milliseconds.

#silent_mode = off                      # DO NOT USE without syslog or
                                        # redirect_stderr

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_line_prefix = ''                   # Special values:
                                        #   %u = user name
                                        #   %d = database name
                                        #   %r = remote host and port
                                        #   %h = remote host
                                        #   %p = PID
                                        #   %t = timestamp (no
milliseconds)
                                        #   %m = timestamp with
milliseconds
                                        #   %i = command tag
                                        #   %c = session id
                                        #   %l = session line number
                                        #   %s = session start timestamp
                                        #   %x = transaction id
                                        #   %q = stop here in non-session
                                        #        processes
                                        #   %% = '%'
                                        # e.g. '<%u%%%d> '
#log_statement = 'none'                 # none, mod, ddl, all
#log_hostname = off


#---------------------------------------------------------------------------

# RUNTIME STATISTICS
#---------------------------------------------------------------------------


# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

# - Query/Index Statistics Collector -

#stats_start_collector = on
#stats_command_string = off
#stats_block_level = off
#stats_row_level = off
#stats_reset_on_server_start = off


#---------------------------------------------------------------------------

# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------


#autovacuum = off                       # enable autovacuum subprocess?
#autovacuum_naptime = 60                # time between autovacuum runs, in
secs
#autovacuum_vacuum_threshold = 1000     # min # of tuple updates before
                                        # vacuum
#autovacuum_analyze_threshold = 500     # min # of tuple updates before
                                        # analyze
#autovacuum_vacuum_scale_factor = 0.4   # fraction of rel size before
                                        # vacuum
#autovacuum_analyze_scale_factor = 0.2  # fraction of rel size before
                                        # analyze
#autovacuum_vacuum_cost_delay = -1      # default vacuum cost delay for
                                        # autovac, -1 means use
                                        # vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
                                        # autovac, -1 means use
                                        # vacuum_cost_limit


#---------------------------------------------------------------------------

# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------


# - Statement Behavior -

#search_path = '$user,public'           # schema names
#default_tablespace = ''                # a tablespace name, '' uses
                                        # the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0                  # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown                     # actually, defaults to TZ
                                        # environment setting
#australian_timezones = off
#extra_float_digits = 0                 # min -15, max 2
#client_encoding = sql_ascii            # actually, defaults to database
                                        # encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'en_US.UTF-8'                     # locale for system error
message
                                        # strings
lc_monetary = 'en_US.UTF-8'                     # locale for monetary
formatting
lc_numeric = 'en_US.UTF-8'                      # locale for number
formatting
lc_time = 'en_US.UTF-8'                         # locale for time
formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'


#---------------------------------------------------------------------------

# LOCK MANAGEMENT
#---------------------------------------------------------------------------


#deadlock_timeout = 1000                # in milliseconds
#max_locks_per_transaction = 64         # min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there
are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---------------------------------------------------------------------------

# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------


# - Previous Postgres Versions -

#add_missing_from = off
#regex_flavor = advanced                # advanced, extended, or basic
#sql_inheritance = on
#default_with_oids = off
#escape_string_warning = off

# - Other Platforms & Clients -

#transform_null_equals = off


#---------------------------------------------------------------------------

# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------


#custom_variable_classes = ''           # list of custom variable class
names

Re: BUG #2104: pg_xlog/ trace files not reclaimed by server

From
Tom Lane
Date:
"Reuben Pasquini" <pasquini@imageworks.com> writes:
> It appears that when my 8.1 database is forced
> to generate more than
>             2*checkpoint_segments + 1
> trace files under pg_xlog/, that the database
> becomes confused and stops recycling the trace files.

That's fairly hard to believe, especially since you haven't presented
any actual evidence of it.

What might have happened is that checkpoints were failing for some
reason and so recycling of WAL segments couldn't be performed.  Was
there anything in the postmaster log about write failures?

            regards, tom lane

Re: BUG #2104: pg_xlog/ trace files not reclaimed by server

From
Reuben Pasquini
Date:
Hi Tom,

Unfortunately, I had to rebuild the database pretty quickly
to get the app it supports back up,
and I wiped out the log files.
The postmaster would dump core on startup - so I
just wiped the database, and recreated the schema
(my app can rebuild its state).
I didn't save the core file either - but it wasn't
not very informative - something like:

abort ();
failedToStartupChildProcesses ();
...

I'll save the logs and core if this happens again.
I may have misdiagnosed the cause of the problem,
but the pg_xlog/ directory does fill up with files until
the disk is full - 8 GB worth - when it enters this unstable
state.  I thought it might be related to exceeding
       2*checkpoint_segments + 1
because of what I read here:

http://www.postgresql.org/docs/8.1/static/wal-configuration.html

-------------

There will be at least one WAL segment file, and will normally not be
more than 2 * checkpoint_segments + 1 files. Each segment file is
normally 16 MB (though this size can be altered when building the
server). You can use this to estimate space requirements for WAL.
Ordinarily, when old log segment files are no longer needed, they are
recycled (renamed to become the next segments in the numbered sequence).
If, due to a short-term peak of log output rate, there are more than 2 *
checkpoint_segments + 1 segment files, the unneeded segment files will
be deleted instead of recycled until the system gets back under this limit.

-------------


The database ran fine for the last 10 days - seemed to recycle the
trace files under pg_xlog/ fine.  I was monitoring the
pg_xlog/ directory, because I knew that was what grew out of control
the last crash.  I noticed at the end of the day yesterday that
we had grown up to 25 trace files, and I knew my checkpoint-segment
was set to '12'.

Anyway - I could be completely off here.
I uped my checkpoint_segment setting to 30 (still have a 5 minute timeout),
and trace files are recycling ok:

$ ls -lrt pg_xlog/
total 278872
drwx------  2 monitor user     4096 Dec  9 04:59 archive_status
-rw-------  1 monitor user 16777216 Dec  9 11:00 0000000100000001000000C9
-rw-------  1 monitor user 16777216 Dec  9 11:00 0000000100000001000000C8
-rw-------  1 monitor user 16777216 Dec  9 11:02 0000000100000001000000C7
-rw-------  1 monitor user 16777216 Dec  9 11:05 0000000100000001000000B9
-rw-------  1 monitor user 16777216 Dec  9 11:05 0000000100000001000000BA
-rw-------  1 monitor user 16777216 Dec  9 11:06 0000000100000001000000BB
-rw-------  1 monitor user 16777216 Dec  9 11:07 0000000100000001000000BC
-rw-------  1 monitor user 16777216 Dec  9 11:07 0000000100000001000000BD
-rw-------  1 monitor user 16777216 Dec  9 11:08 0000000100000001000000BE
-rw-------  1 monitor user 16777216 Dec  9 11:09 0000000100000001000000BF
-rw-------  1 monitor user 16777216 Dec  9 11:09 0000000100000001000000C0
-rw-------  1 monitor user 16777216 Dec  9 11:10 0000000100000001000000C1
-rw-------  1 monitor user 16777216 Dec  9 11:11 0000000100000001000000C2
-rw-------  1 monitor user 16777216 Dec  9 11:11 0000000100000001000000C3
-rw-------  1 monitor user 16777216 Dec  9 11:12 0000000100000001000000C4
-rw-------  1 monitor user 16777216 Dec  9 11:13 0000000100000001000000C5
-rw-------  1 monitor user 16777216 Dec  9 11:13 0000000100000001000000C6

We'll see how that goes.  I'll let you know if I get another crash.

Cheers,
Reuben


Tom Lane wrote:

>"Reuben Pasquini" <pasquini@imageworks.com> writes:
>
>
>>It appears that when my 8.1 database is forced
>>to generate more than
>>            2*checkpoint_segments + 1
>>trace files under pg_xlog/, that the database
>>becomes confused and stops recycling the trace files.
>>
>>
>
>That's fairly hard to believe, especially since you haven't presented
>any actual evidence of it.
>
>What might have happened is that checkpoints were failing for some
>reason and so recycling of WAL segments couldn't be performed.  Was
>there anything in the postmaster log about write failures?
>
>            regards, tom lane
>
>
>