Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot - Mailing list pgsql-bugs
From | Patrice Drolet |
---|---|
Subject | Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot |
Date | |
Msg-id | 39EF5992-B6B5-44D3-A7F6-F22E35DC5CAC@infodata.ca |
In response to | Re: BUG #13143: Cannot stop and restart a streaming server with a replication slot (Andres Freund <andres@anarazel.de>) |
List | pgsql-bugs |
Hi,

Here is the log with log_error_verbosity = verbose (messages translated from French):

2015-04-25 14:25:59 EDT LOG:  00000: database system was shut down at 2015-04-25 14:25:39 EDT
2015-04-25 14:25:59 EDT LOCATION:  StartupXLOG, src\backend\access\transam\xlog.c:6011
2015-04-25 14:25:59 EDT PANIC:  XX000: could not fsync file "pg_replslot/node_win2008sec/state": Bad file descriptor
2015-04-25 14:25:59 EDT LOCATION:  RestoreSlotFromDisk, src\backend\replication\slot.c:1115
2015-04-25 14:25:59 EDT LOG:  00000: startup process (PID 2696) was terminated by exception 0xC0000409
2015-04-25 14:25:59 EDT HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
2015-04-25 14:25:59 EDT LOCATION:  LogChildExit, src\backend\postmaster\postmaster.c:3336
2015-04-25 14:25:59 EDT LOG:  00000: aborting startup due to startup process failure
2015-04-25 14:25:59 EDT LOCATION:  reaper, src\backend\postmaster\postmaster.c:2604

As I said, this is streaming replication between two 64-bit Windows servers running PostgreSQL 9.4.1.

Here is my postgresql.conf:

-----------------
wal_level = hot_standby
max_wal_senders = 3
checkpoint_segments = 16
wal_keep_segments = 32

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'          # use data in another directory
                                       # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'    # host-based authentication file
                                       # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
                                       # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = ''                # write an extra PID file
                                       # (change requires restart)

#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = '*'                 # what IP address(es) to listen on;
                                       # comma-separated list of addresses;
                                       # defaults to 'localhost'; use '*' for all
                                       # (change requires restart)
port = 5434                            # (change requires restart)
max_connections = 100                  # (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).
#superuser_reserved_connections = 3    # (change requires restart)
#unix_socket_directories = ''          # comma-separated list of directories
                                       # (change requires restart)
#unix_socket_group = ''                # (change requires restart)
#unix_socket_permissions = 0777        # begin with 0 to use octal notation
                                       # (change requires restart)
#bonjour = off                         # advertise server via Bonjour
                                       # (change requires restart)
#bonjour_name = ''                     # defaults to the computer name
                                       # (change requires restart)

# - Security and Authentication -

#authentication_timeout = 1min         # 1s-600s
#ssl = off                             # (change requires restart)
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
                                       # (change requires restart)
#ssl_prefer_server_ciphers = on        # (change requires restart)
#ssl_ecdh_curve = 'prime256v1'         # (change requires restart)
#ssl_renegotiation_limit = 512MB       # amount of data between renegotiations
#ssl_cert_file = 'server.crt'          # (change requires restart)
#ssl_key_file = 'server.key'           # (change requires restart)
#ssl_ca_file = ''                      # (change requires restart)
#ssl_crl_file = ''                     # (change requires restart)
#password_encryption = on
#db_user_namespace = off

# GSSAPI using Kerberos
#krb_server_keyfile = ''
#krb_caseins_users = off

# - TCP Keepalives -
# see "man 7 tcp" for details

#tcp_keepalives_idle = 0               # TCP_KEEPIDLE, in seconds;
                                       # 0 selects the system default
#tcp_keepalives_interval = 0           # TCP_KEEPINTVL, in seconds;
                                       # 0 selects the system default
#tcp_keepalives_count = 0              # TCP_KEEPCNT;
                                       # 0 selects the system default

#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 3072MB                # min 128kB
                                       # (change requires restart)
#huge_pages = try                      # on, off, or try
                                       # (change requires restart)
temp_buffers = 8MB                     # min 800kB
#max_prepared_transactions = 0         # zero disables the feature
                                       # (change requires restart)
# Note: Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
work_mem = 256MB                       # LIDI 4MB *** min 64kB
maintenance_work_mem = 256MB           # min 1MB
#autovacuum_work_mem = -1              # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB                 # min 100kB
dynamic_shared_memory_type = windows   # the default is the first option
                                       # supported by the operating system:
                                       #   posix
                                       #   sysv
                                       #   windows
                                       #   mmap
                                       # use none to disable dynamic shared memory

# - Disk -

#temp_file_limit = -1                  # limits per-session temp file space
                                       # in kB, or -1 for no limit

# - Kernel Resource Usage -

#max_files_per_process = 1000          # min 25
                                       # (change requires restart)
#shared_preload_libraries = ''         # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0                 # 0-100 milliseconds
#vacuum_cost_page_hit = 1              # 0-10000 credits
#vacuum_cost_page_miss = 10            # 0-10000 credits
#vacuum_cost_page_dirty = 20           # 0-10000 credits
#vacuum_cost_limit = 200               # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms                # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100           # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0         # 0-10.0 multiplier on buffers scanned/round

# - Asynchronous Behavior -

#effective_io_concurrency = 1          # 1-1000; 0 disables prefetching
#max_worker_processes = 8

#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

#wal_level = minimal                   # minimal, archive, hot_standby, or logical
                                       # (change requires restart)
#fsync = on                            # turns forced synchronization on or off
#synchronous_commit = on               # synchronization level;
                                       # off, local, remote_write, or on
#wal_sync_method = fsync               # the default is the first option
                                       # supported by the operating system:
                                       #   open_datasync
                                       #   fdatasync (default on Linux)
                                       #   fsync
                                       #   fsync_writethrough
                                       #   open_sync
#full_page_writes = on                 # recover from partial page writes
#wal_log_hints = off                   # also do full page writes of non-critical updates
                                       # (change requires restart)
#wal_buffers = -1                      # min 32kB, -1 sets based on shared_buffers
                                       # (change requires restart)
#wal_writer_delay = 200ms              # 1-10000 milliseconds

#commit_delay = 0                      # range 0-100000, in microseconds
#commit_siblings = 5                   # range 1-1000

# - Checkpoints -

checkpoint_segments = 90               # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min              # range 30s-1h
checkpoint_completion_target = 0.8     # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s              # 0 disables

# - Archiving -

#archive_mode = off                    # allows archiving to be done
                                       # (change requires restart)
#archive_command = ''                  # command to use to archive a logfile segment
                                       # placeholders: %p = path of file to archive
                                       #               %f = file name only
                                       # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0                   # force a logfile segment switch after this
                                       # number of seconds; 0 disables

#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------

# - Sending Server(s) -

# Set these on the master and on any standby that will send replication data.

#max_wal_senders = 0                   # max number of walsender processes
                                       # (change requires restart)
#wal_keep_segments = 0                 # in logfile segments, 16MB each; 0 disables
#wal_sender_timeout = 60s              # in milliseconds; 0 disables

max_replication_slots = 1              # max number of replication slots
                                       # (change requires restart)

# - Master Server -

# These settings are ignored on a standby server.

#synchronous_standby_names = ''        # standby servers that provide sync rep
                                       # comma-separated list of application_name
                                       # from standby(s); '*' = all
#vacuum_defer_cleanup_age = 0          # number of xacts by which cleanup is delayed

# - Standby Servers -

# These settings are ignored on a master server.

#hot_standby = off                     # "on" allows queries during recovery
                                       # (change requires restart)
#max_standby_archive_delay = 30s       # max delay before canceling queries
                                       # when reading WAL from archive;
                                       # -1 allows indefinite delay
#max_standby_streaming_delay = 30s     # max delay before canceling queries
                                       # when reading streaming WAL;
                                       # -1 allows indefinite delay
#wal_receiver_status_interval = 10s    # send replies at least this often
                                       # 0 disables
#hot_standby_feedback = off            # send info from standby to prevent
                                       # query conflicts
#wal_receiver_timeout = 60s            # time that receiver waits for
                                       # communication from master
                                       # in milliseconds; 0 disables

#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_indexonlyscan = on
#enable_material = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0                   # measured on an arbitrary scale
random_page_cost = 2.0                 # same scale as above
#cpu_tuple_cost = 0.01                 # same scale as above
#cpu_index_tuple_cost = 0.005          # same scale as above
#cpu_operator_cost = 0.0025            # same scale as above
effective_cache_size = 6GB

# - Genetic Query Optimizer -

#geqo = on
geqo_threshold = 16
geqo_effort = 2                        # range 1-10
#geqo_pool_size = 0                    # selects default based on effort
#geqo_generations = 0                  # selects default based on effort
#geqo_selection_bias = 2.0             # range 1.5-2.0
#geqo_seed = 0.0                       # range 0.0-1.0

# - Other Planner Options -

#default_statistics_target = 100       # range 1-10000
#constraint_exclusion = partition      # on, off, or partition
#cursor_tuple_fraction = 0.1           # range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8               # 1 disables collapsing of explicit
                                       # JOIN clauses

#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - Where to Log -

log_destination = 'stderr'             # Valid values are combinations of
                                       # stderr, csvlog, syslog, and eventlog,
                                       # depending on platform. csvlog
                                       # requires logging_collector to be on.

# This is used when logging to stderr:
logging_collector = on                 # Enable capturing of stderr and csvlog
                                       # into log files. Required to be on for
                                       # csvlogs.
                                       # (change requires restart)

# These are only used if logging_collector is on:
#log_directory = 'pg_log'              # directory where log files are written,
                                       # can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
                                       # can include strftime() escapes
#log_file_mode = 0600                  # creation mode for log files,
                                       # begin with 0 to use octal notation
#log_truncate_on_rotation = off        # If on, an existing log file with the
                                       # same name as the new log file will be
                                       # truncated rather than appended to.
                                       # But such truncation only occurs on
                                       # time-driven rotation, not on restarts
                                       # or size-driven rotation. Default is
                                       # off, meaning append to existing files
                                       # in all cases.
#log_rotation_age = 1d                 # Automatic rotation of logfiles will
                                       # happen after that time. 0 disables.
#log_rotation_size = 10MB              # Automatic rotation of logfiles will
                                       # happen after that much log output.
                                       # 0 disables.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# This is only relevant when logging to eventlog (win32):
#event_source = 'PostgreSQL'

# - When to Log -

#client_min_messages = notice          # values in order of decreasing detail:
                                       #   debug5
                                       #   debug4
                                       #   debug3
                                       #   debug2
                                       #   debug1
                                       #   log
                                       #   notice
                                       #   warning
                                       #   error

#log_min_messages = warning            # values in order of decreasing detail:
                                       #   debug5
                                       #   debug4
                                       #   debug3
                                       #   debug2
                                       #   debug1
                                       #   info
                                       #   notice
                                       #   warning
                                       #   error
                                       #   log
                                       #   fatal
                                       #   panic

#log_min_error_statement = error       # values in order of decreasing detail:
                                       #   debug5
                                       #   debug4
                                       #   debug3
                                       #   debug2
                                       #   debug1
                                       #   info
                                       #   notice
                                       #   warning
                                       #   error
                                       #   log
                                       #   fatal
                                       #   panic (effectively off)

#log_min_duration_statement = 150      # -1 is disabled, 0 logs all statements
                                       # and their durations, > 0 logs only
                                       # statements running at least this number
                                       # of milliseconds

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_error_verbosity = verbose          # terse, default, or verbose messages
#log_hostname = off
log_line_prefix = '%t '                # special values:
                                       #   %a = application name
                                       #   %u = user name
                                       #   %d = database name
                                       #   %r = remote host and port
                                       #   %h = remote host
                                       #   %p = process ID
                                       #   %t = timestamp without milliseconds
                                       #   %m = timestamp with milliseconds
                                       #   %i = command tag
                                       #   %e = SQL state
                                       #   %c = session ID
                                       #   %l = session line number
                                       #   %s = session start timestamp
                                       #   %v = virtual transaction ID
                                       #   %x = transaction ID (0 if none)
                                       #   %q = stop here in non-session
                                       #        processes
                                       #   %% = '%'
                                       # e.g. '<%u%%%d> '
#log_lock_waits = off                  # log lock waits >= deadlock_timeout
#log_statement = 'none'                # none, ddl, mod, all
#log_temp_files = -1                   # log temporary files equal or larger
                                       # than the specified size in kilobytes;
                                       # -1 disables, 0 logs all temp files
log_timezone = 'US/Eastern'

#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------

# - Query/Index Statistics Collector -

#track_activities = on
track_counts = on
#track_io_timing = off
#track_functions = none                # none, pl, all
#track_activity_query_size = 1024      # (change requires restart)
#update_process_title = on
#stats_temp_directory = 'pg_stat_tmp'

# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------

autovacuum = on                        # Enable autovacuum subprocess? 'on'
                                       # requires track_counts to also be on.
#log_autovacuum_min_duration = -1      # -1 disables, 0 logs all actions and
                                       # their durations, > 0 logs only
                                       # actions running at least this number
                                       # of milliseconds.
#autovacuum_max_workers = 3            # max number of autovacuum subprocesses
                                       # (change requires restart)
#autovacuum_naptime = 1min             # time between autovacuum runs
#autovacuum_vacuum_threshold = 50      # min number of row updates before
                                       # vacuum
#autovacuum_analyze_threshold = 50     # min number of row updates before
                                       # analyze
#autovacuum_vacuum_scale_factor = 0.2  # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
                                       # (change requires restart)
#autovacuum_multixact_freeze_max_age = 400000000 # maximum multixact age
                                       # before forced vacuum
                                       # (change requires restart)
autovacuum_vacuum_cost_delay = 50ms    # default vacuum cost delay for
                                       # autovacuum, in milliseconds;
                                       # -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1     # default vacuum cost limit for
                                       # autovacuum, -1 means use
                                       # vacuum_cost_limit

#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '"$user",public'        # schema names
#default_tablespace = ''               # a tablespace name, '' uses the default
#temp_tablespaces = ''                 # a list of tablespace names, '' uses
                                       # only default tablespace
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#default_transaction_deferrable = off
#session_replication_role = 'origin'
#statement_timeout = 0                 # in milliseconds, 0 is disabled
#lock_timeout = 0                      # in milliseconds, 0 is disabled
#vacuum_freeze_min_age = 50000000
#vacuum_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_freeze_table_age = 150000000
#bytea_output = 'hex'                  # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'

# - Locale and Formatting -

datestyle = 'iso, ymd'
#intervalstyle = 'postgres'
timezone = 'US/Eastern'
#timezone_abbreviations = 'Default'    # Select the set of available time zone
                                       # abbreviations. Currently, there are
                                       #   Default
                                       #   Australia (historical usage)
                                       #   India
                                       # You can create your own file in
                                       # share/timezonesets/.
#extra_float_digits = 0                # min -15, max 3
#client_encoding = sql_ascii           # actually, defaults to database
                                       # encoding

# These settings are initialized by initdb, but they can be changed.
lc_messages = 'French_Canada.1252'     # locale for system error message
                                       # strings
lc_monetary = 'French_Canada.1252'     # locale for monetary formatting
lc_numeric = 'French_Canada.1252'      # locale for number formatting
lc_time = 'French_Canada.1252'         # locale for time formatting

# default configuration for text search
default_text_search_config = 'pg_catalog.french'

# - Other Defaults -

#dynamic_library_path = '$libdir'
#local_preload_libraries = ''
#session_preload_libraries = ''

#------------------------------------------------------------------------------
# LOCK MANAGEMENT
#------------------------------------------------------------------------------

#deadlock_timeout = 1s
#max_locks_per_transaction = 64        # min 10
                                       # (change requires restart)
# Note: Each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.
#max_pred_locks_per_transaction = 64   # min 10
                                       # (change requires restart)

#------------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#------------------------------------------------------------------------------

# - Previous PostgreSQL Versions -

#array_nulls = on
#backslash_quote = safe_encoding       # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#lo_compat_privileges = off
#quote_all_identifiers = off
#sql_inheritance = on
#standard_conforming_strings = on
#synchronize_seqscans = on

# - Other Platforms and Clients -

#transform_null_equals = off

#------------------------------------------------------------------------------
# ERROR HANDLING
#------------------------------------------------------------------------------

#exit_on_error = off                   # terminate session on any error?
#restart_after_crash = on              # reinitialize after backend crash?

#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------

# These options allow settings to be loaded from files other than the
# default postgresql.conf.

#include_dir = 'conf.d'                # include files ending in '.conf' from
                                       # directory 'conf.d'
#include_if_exists = 'exists.conf'     # include file only if it exists
#include = 'special.conf'              # include file

#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

# Add settings for extensions here

> On 2015-04-25 at 08:33, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2015-04-24 10:10:06 +0000, pdrolet@infodata.ca wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      13143
>> Logged by:          Patrice Drolet
>> Email address:      pdrolet@infodata.ca
>> PostgreSQL version: 9.4.1
>> Operating system:   Windows 2008r2
>> Description:
>>
>> I have experienced it many times. The master streams to the slave for days
>> with no problem (using a replication slot). If I stop the master, it does not
>> want to restart and I have this error in the log:
>>
>> 2015-04-24 04:47:12 EDT LOG:  database system was shut down at
>> 2015-04-24 04:44:37 EDT
>> 2015-04-24 04:47:12 EDT PANIC:  could not fsync file
>> "pg_replslot/node_win2012sec/state": Bad file descriptor
>> 2015-04-24 04:47:12 EDT LOG:  startup process (PID 23180) exited with
>> exit code 3
>> 2015-04-24 04:47:12 EDT LOG:  aborting startup due to startup process
>> failure
>>
>> To restart the server, I have to manually delete the folder in pg_replslot.
>> But then I need to rebuild the slave. Not very practical for a
>> multi-gigabyte database.
>
> Obviously that's not how it's supposed to be. I don't have access to a
> Windows system, much less a French one, unfortunately.
>
> Could you:
> 1) Describe your exact setup?
> 2) Check that it's unrelated to any anti-virus software running?
> 3) Configure 'log_error_verbosity = verbose'? Then we'll get line
>    numbers, which will help narrow down what's happening.
> 4) Try to debug it by installing Sysinternals' sysmon and
>    recording what exactly is done with that file?
>
> Regards,
>
> Andres