Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot - Mailing list pgsql-bugs
| From | Patrice Drolet |
|---|---|
| Subject | Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot |
| Date | |
| Msg-id | 39EF5992-B6B5-44D3-A7F6-F22E35DC5CAC@infodata.ca |
| In response to | Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot (Andres Freund <andres@anarazel.de>) |
| List | pgsql-bugs |
Hi,
Here is the log with verbose error reporting (messages translated from the French):
2015-04-25 14:25:59 EDT LOG: 00000: database system was shut down at 2015-04-25 14:25:39 EDT
2015-04-25 14:25:59 EDT LOCATION: StartupXLOG, src\backend\access\transam\xlog.c:6011
2015-04-25 14:25:59 EDT PANIC: XX000: could not fsync file "pg_replslot/node_win2008sec/state": Bad file descriptor
2015-04-25 14:25:59 EDT LOCATION: RestoreSlotFromDisk, src\backend\replication\slot.c:1115
2015-04-25 14:25:59 EDT LOG: 00000: startup process (PID 2696) was terminated by exception 0xC0000409
2015-04-25 14:25:59 EDT HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2015-04-25 14:25:59 EDT LOCATION: LogChildExit, src\backend\postmaster\postmaster.c:3336
2015-04-25 14:25:59 EDT LOG: 00000: aborting startup due to startup process failure
2015-04-25 14:25:59 EDT LOCATION: reaper, src\backend\postmaster\postmaster.c:2604
As I said, this is streaming replication between two 64-bit Windows servers running PostgreSQL 9.4.1.
Here is my postgresql.conf:
-----------------
wal_level = hot_standby
max_wal_senders = 3
checkpoint_segments = 16
wal_keep_segments = 32
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.
#data_directory = 'ConfigDir' # use data in another directory
# (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
# (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
# (change requires restart)
# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '' # write an extra PID file
# (change requires restart)
#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------
# - Connection Settings -
listen_addresses = '*' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
port = 5434 # (change requires restart)
max_connections = 100 # (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directories = '' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)
# - Security and Authentication -
#authentication_timeout = 1min # 1s-600s
#ssl = off # (change requires restart)
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
# (change requires restart)
#ssl_prefer_server_ciphers = on # (change requires restart)
#ssl_ecdh_curve = 'prime256v1' # (change requires restart)
#ssl_renegotiation_limit = 512MB # amount of data between renegotiations
#ssl_cert_file = 'server.crt' # (change requires restart)
#ssl_key_file = 'server.key' # (change requires restart)
#ssl_ca_file = '' # (change requires restart)
#ssl_crl_file = '' # (change requires restart)
#password_encryption = on
#db_user_namespace = off
# GSSAPI using Kerberos
#krb_server_keyfile = ''
#krb_caseins_users = off
# - TCP Keepalives -
# see "man 7 tcp" for details
#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------
# - Memory -
shared_buffers = 3072MB # min 128kB
# (change requires restart)
#huge_pages = try # on, off, or try
# (change requires restart)
temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
# (change requires restart)
# Note: Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
work_mem = 256MB # LIDI 4MB *** min 64kB
maintenance_work_mem = 256MB # min 1MB
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB # min 100kB
dynamic_shared_memory_type = windows # the default is the first option
# supported by the operating system:
# posix
# sysv
# windows
# mmap
# use none to disable dynamic shared memory
# - Disk -
#temp_file_limit = -1 # limits per-session temp file space
# in kB, or -1 for no limit
# - Kernel Resource Usage -
#max_files_per_process = 1000 # min 25
# (change requires restart)
#shared_preload_libraries = '' # (change requires restart)
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0 # 0-100 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits
# - Background Writer -
#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0 # 0-10.0 multiplier on buffers scanned/round
# - Asynchronous Behavior -
#effective_io_concurrency = 1 # 1-1000; 0 disables prefetching
#max_worker_processes = 8
#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------
# - Settings -
#wal_level = minimal # minimal, archive, hot_standby, or logical
# (change requires restart)
#fsync = on # turns forced synchronization on or off
#synchronous_commit = on # synchronization level;
# off, local, remote_write, or on
#wal_sync_method = fsync # the default is the first option
# supported by the operating system:
# open_datasync
# fdatasync (default on Linux)
# fsync
# fsync_writethrough
# open_sync
#full_page_writes = on # recover from partial page writes
#wal_log_hints = off # also do full page writes of non-critical updates
# (change requires restart)
#wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
# (change requires restart)
#wal_writer_delay = 200ms # 1-10000 milliseconds
#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000
# - Checkpoints -
checkpoint_segments = 90 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.8 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 disables
# - Archiving -
#archive_mode = off # allows archiving to be done
# (change requires restart)
#archive_command = '' # command to use to archive a logfile segment
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0 # force a logfile segment switch after this
# number of seconds; 0 disables
#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------
# - Sending Server(s) -
# Set these on the master and on any standby that will send replication data.
#max_wal_senders = 0 # max number of walsender processes
# (change requires restart)
#wal_keep_segments = 0 # in logfile segments, 16MB each; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
max_replication_slots = 1 # max number of replication slots
# (change requires restart)
# - Master Server -
# These settings are ignored on a standby server.
#synchronous_standby_names = '' # standby servers that provide sync rep
# comma-separated list of application_name
# from standby(s); '*' = all
#vacuum_defer_cleanup_age = 0 # number of xacts by which cleanup is delayed
# - Standby Servers -
# These settings are ignored on a master server.
#hot_standby = off # "on" allows queries during recovery
# (change requires restart)
#max_standby_archive_delay = 30s # max delay before canceling queries
# when reading WAL from archive;
# -1 allows indefinite delay
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
# query conflicts
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from master
# in milliseconds; 0 disables
#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------
# - Planner Method Configuration -
#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_indexonlyscan = on
#enable_material = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on
# - Planner Cost Constants -
#seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 2.0 # same scale as above
#cpu_tuple_cost = 0.01 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above
effective_cache_size = 6GB
# - Genetic Query Optimizer -
#geqo = on
geqo_threshold = 16
geqo_effort = 2 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0
#geqo_seed = 0.0 # range 0.0-1.0
# - Other Planner Options -
#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOIN clauses
#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
log_destination = 'stderr' # Valid values are combinations of
# stderr, csvlog, syslog, and eventlog,
# depending on platform. csvlog
# requires logging_collector to be on.
# This is used when logging to stderr:
logging_collector = on # Enable capturing of stderr and csvlog
# into log files. Required to be on for
# csvlogs.
# (change requires restart)
# These are only used if logging_collector is on:
#log_directory = 'pg_log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_file_mode = 0600 # creation mode for log files,
# begin with 0 to use octal notation
#log_truncate_on_rotation = off # If on, an existing log file with the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
# This is only relevant when logging to eventlog (win32):
#event_source = 'PostgreSQL'
# - When to Log -
#client_min_messages = notice # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# log
# notice
# warning
# error
#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#log_min_error_statement = error # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic (effectively off)
#log_min_duration_statement = 150 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds
# - What to Log -
#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_error_verbosity = verbose # terse, default, or verbose messages
#log_hostname = off
log_line_prefix = '%t ' # special values:
# %a = application name
# %u = user name
# %d = database name
# %r = remote host and port
# %h = remote host
# %p = process ID
# %t = timestamp without milliseconds
# %m = timestamp with milliseconds
# %i = command tag
# %e = SQL state
# %c = session ID
# %l = session line number
# %s = session start timestamp
# %v = virtual transaction ID
# %x = transaction ID (0 if none)
# %q = stop here in non-session
# processes
# %% = '%'
# e.g. '<%u%%%d> '
#log_lock_waits = off # log lock waits >= deadlock_timeout
#log_statement = 'none' # none, ddl, mod, all
#log_temp_files = -1 # log temporary files equal or larger
# than the specified size in kilobytes;
# -1 disables, 0 logs all temp files
log_timezone = 'US/Eastern'
#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------
# - Query/Index Statistics Collector -
#track_activities = on
track_counts = on
#track_io_timing = off
#track_functions = none # none, pl, all
#track_activity_query_size = 1024 # (change requires restart)
#update_process_title = on
#stats_temp_directory = 'pg_stat_tmp'
# - Statistics Monitoring -
#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off
#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------
autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
# their durations, > 0 logs only
# actions running at least this number
# of milliseconds.
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
#autovacuum_analyze_threshold = 50 # min number of row updates before
# analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
# (change requires restart)
#autovacuum_multixact_freeze_max_age = 400000000 # maximum multixact age
# before forced vacuum
# (change requires restart)
autovacuum_vacuum_cost_delay = 50ms # default vacuum cost delay for
# autovacuum, in milliseconds;
# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------
# - Statement Behavior -
#search_path = '"$user",public' # schema names
#default_tablespace = '' # a tablespace name, '' uses the default
#temp_tablespaces = '' # a list of tablespace names, '' uses
# only default tablespace
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#default_transaction_deferrable = off
#session_replication_role = 'origin'
#statement_timeout = 0 # in milliseconds, 0 is disabled
#lock_timeout = 0 # in milliseconds, 0 is disabled
#vacuum_freeze_min_age = 50000000
#vacuum_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_freeze_table_age = 150000000
#bytea_output = 'hex' # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'
# - Locale and Formatting -
datestyle = 'iso, ymd'
#intervalstyle = 'postgres'
timezone = 'US/Eastern'
#timezone_abbreviations = 'Default' # Select the set of available time zone
# abbreviations. Currently, there are
# Default
# Australia (historical usage)
# India
# You can create your own file in
# share/timezonesets/.
#extra_float_digits = 0 # min -15, max 3
#client_encoding = sql_ascii # actually, defaults to database
# encoding
# These settings are initialized by initdb, but they can be changed.
lc_messages = 'French_Canada.1252' # locale for system error message
# strings
lc_monetary = 'French_Canada.1252' # locale for monetary formatting
lc_numeric = 'French_Canada.1252' # locale for number formatting
lc_time = 'French_Canada.1252' # locale for time formatting
# default configuration for text search
default_text_search_config = 'pg_catalog.french'
# - Other Defaults -
#dynamic_library_path = '$libdir'
#local_preload_libraries = ''
#session_preload_libraries = ''
#------------------------------------------------------------------------------
# LOCK MANAGEMENT
#------------------------------------------------------------------------------
#deadlock_timeout = 1s
#max_locks_per_transaction = 64 # min 10
# (change requires restart)
# Note: Each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.
#max_pred_locks_per_transaction = 64 # min 10
# (change requires restart)
#------------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#------------------------------------------------------------------------------
# - Previous PostgreSQL Versions -
#array_nulls = on
#backslash_quote = safe_encoding # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#lo_compat_privileges = off
#quote_all_identifiers = off
#sql_inheritance = on
#standard_conforming_strings = on
#synchronize_seqscans = on
# - Other Platforms and Clients -
#transform_null_equals = off
#------------------------------------------------------------------------------
# ERROR HANDLING
#------------------------------------------------------------------------------
#exit_on_error = off # terminate session on any error?
#restart_after_crash = on # reinitialize after backend crash?
#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------
# These options allow settings to be loaded from files other than the
# default postgresql.conf.
#include_dir = 'conf.d' # include files ending in '.conf' from
# directory 'conf.d'
#include_if_exists = 'exists.conf' # include file only if it exists
#include = 'special.conf' # include file
#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------
# Add settings for extensions here
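One detail of the configuration above is easy to miss: checkpoint_segments is set twice, to 16 in the block at the top and to 90 under Checkpoints. When a parameter appears more than once in postgresql.conf, PostgreSQL uses the last occurrence. The Python sketch below lists the effective values under that rule. The function name effective_settings and the abbreviated sample text are illustrative assumptions, not part of the original report, and the parser is deliberately simplified: it requires an explicit "=" and does not handle "#" inside quoted values.

```python
def effective_settings(conf_text):
    """Return {name: value} for uncommented settings; later lines win.

    Simplified sketch (assumption): requires "name = value" syntax and
    treats every '#' as the start of a comment.
    """
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, value = line.partition("=")
        if sep:  # only lines that actually contain "="
            settings[name.strip()] = value.strip()
    return settings


# Abbreviated excerpt of the settings shown in this message.
sample = """
wal_level = hot_standby
max_wal_senders = 3
checkpoint_segments = 16
wal_keep_segments = 32
#checkpoint_warning = 30s
checkpoint_segments = 90    # in logfile segments, min 1, 16MB each
max_replication_slots = 1   # max number of replication slots
"""

result = effective_settings(sample)
for name in sorted(result):
    print(name, "=", result[name])
```

On a running server, the pg_settings view is the authoritative source for the values in effect; the sketch only helps when reading a posted file.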
> On 2015-04-25 at 08:33, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2015-04-24 10:10:06 +0000, pdrolet@infodata.ca wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference: 13143
>> Logged by: Patrice Drolet
>> Email address: pdrolet@infodata.ca
>> PostgreSQL version: 9.4.1
>> Operating system: Windows 2008r2
>> Description:
>>
>> I have experienced it many times. The master streams to the slave for days
>> and no problem (using a replication slot). If I stop the master, it does not
>> want to restart and I have this error in the log:
>>
>> 2015-04-24 04:47:12 EDT LOG: database system was shut down at
>> 2015-04-24 04:44:37 EDT
>> 2015-04-24 04:47:12 EDT PANIC: could not fsync file
>> "pg_replslot/node_win2012sec/state": Bad file descriptor
>> 2015-04-24 04:47:12 EDT LOG: startup process (PID 23180) exited with
>> exit code 3
>> 2015-04-24 04:47:12 EDT LOG: aborting startup due to startup process
>> failure
>>
>> To restart the server, I have to manually delete the folder in pg_replslot.
>> But then I need to re build the slave. Not very practical for a multi
>> gigabyte database.
>
> Obviously that's not how it supposed to be. I don't have access to a
> windows systems, much less a french one unfortunately.
>
> Could you:
> 1) describe your exact setup
> 2) Check that it's unrelated to any anti-virus software running?
> 3) configure 'log_error_verbosity = verbose'? Then we'll get line
> numbers, which will help narrowing down what's happening.
> 4) You could try to debug it by installing sysinternal's sysmon and
> recording what is exactly done with that file?
>
> Regards,
>
> Andres