BDR not catching up - Mailing list pgsql-general
From | cchee-ob |
---|---|
Subject | BDR not catching up |
Date | |
Msg-id | 1457739190408-5892335.post@n5.nabble.com |
List | pgsql-general |
I'm getting this message repeating on the UDR node that I just added today. Is there any way to get it to start applying?

svp2=# select * from bdr.bdr_nodes;
     node_sysid      | node_timeline | node_dboid | node_status |    node_name    |            node_local_dsn             |            node_init_from_dsn
---------------------+---------------+------------+-------------+-----------------+---------------------------------------+-------------------------------------------
 6206439726032130602 |             1 |      16385 | r           | UDR1            |                                       |
 6260914790689848233 |             1 |      16385 | c           | UDR1-subscriber | host=10.253.0.8 port=5432 dbname=svp2 | host=10.253.228.105 port=5432 dbname=svp2
(2 rows)

t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG: 00000: per-db worker for node bdr (6260914790689848233,1,16385,) starting
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION: bdr_perdb_worker_main, bdr_perdb.c:707
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG: 00000: init_replica init from remote host=10.253.228.105 port=5432 dbname=svp2
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION: bdr_init_replica, bdr_init_replica.c:830
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG: 00000: found valid replication identifier 1
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION: bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG: 00000: launching catchup mode apply worker
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION: bdr_init_replica, bdr_init_replica.c:1043
t=2016-03-11 15:23:51 PST d= h= p=7226 a=DEBUG: 00000: Registering bdr apply catchup worker for bdr (6206439726032130602,1,16385,) to lsn 19E/10AC4F0
t=2016-03-11 15:23:51 PST d= h= p=7226 a=LOCATION: bdr_catchup_to_lsn, bdr_init_replica.c:1161
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG: 00000: registering background worker "bdr: catchup apply to 19E/10AC4F0"
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION: BackgroundWorkerStateChange, bgworker.c:347
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG: 00000: starting background worker process "bdr: catchup apply to 19E/10AC4F0"
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION: do_start_bgworker, postmaster.c:5412
t=2016-03-11 15:23:51 PST d= h= p=7227 a=NOTICE: 00000: version "1.0" of extension "btree_gist" is already installed
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:23:51 PST d= h= p=7227 a=NOTICE: 00000: version "0.9.2.0" of extension "bdr" is already installed
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:23:51 PST d= h= p=7227 a=NOTICE: 42622: identifier "bdr (6260914790689848233,1,16385,): apply catchup up to 19E/10AC4F0" will be truncated to "bdr (6260914790689848233,1,16385,): apply catchup up to 19E/10A"
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: truncate_identifier, scansup.c:195
t=2016-03-11 15:23:51 PST d= h= p=7227 a=DEBUG: 00000: found valid replication identifier 1
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:51 PST d= h= p=7227 a=INFO: 00000: starting up replication from 1 at 19D/D204D0C8
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: bdr_apply_main, bdr_apply.c:2550
t=2016-03-11 15:23:51 PST d= h= p=7227 a=DEBUG: 00000: bdr_apply: BEGIN origin(source, orig_lsn, timestamp): 19D/D204D3A0, 2016-03-11 13:49:47.293208-08
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: process_remote_begin, bdr_apply.c:198
t=2016-03-11 15:23:51 PST d= h= p=7227 a=ERROR: XX000: tuple natts mismatch, 26 vs 28
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: read_tuple_parts, bdr_apply.c:1892
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG: 00000: worker process: bdr: catchup apply to 19E/10AC4F0 (PID 7227) exited with exit code 1
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION: LogChildExit, postmaster.c:3325
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOG: 00000: unregistering background worker "bdr: catchup apply to 19E/10AC4F0"
t=2016-03-11 15:23:51 PST d= h= p=4718 a=LOCATION: ForgetBackgroundWorker, bgworker.c:376
t=2016-03-11 15:23:52 PST d= h= p=7226 a=ERROR: XX000: catchup worker exited before catching up to target LSN 19E/10AC4F0
t=2016-03-11 15:23:52 PST d= h= p=7226 a=LOCATION: bdr_catchup_to_lsn, bdr_init_replica.c:1273
t=2016-03-11 15:23:52 PST d= h= p=4718 a=LOG: 00000: worker process: bdr db: svp2 (PID 7226) exited with exit code 1
t=2016-03-11 15:23:52 PST d= h= p=4718 a=LOCATION: LogChildExit, postmaster.c:3325
t=2016-03-11 15:23:54 PST d= h= p=7228 a=DEBUG: 00000: autovacuum: processing database "bdr_supervisordb"
t=2016-03-11 15:23:54 PST d= h= p=7228 a=LOCATION: AutoVacWorkerMain, autovacuum.c:1684
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG: 00000: starting background worker process "bdr db: svp2"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION: do_start_bgworker, postmaster.c:5412
t=2016-03-11 15:23:57 PST d= h= p=7229 a=NOTICE: 00000: version "1.0" of extension "btree_gist" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7229 a=NOTICE: 00000: version "0.9.2.0" of extension "bdr" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG: 00000: per-db worker for node bdr (6260914790689848233,1,16385,) starting
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: bdr_perdb_worker_main, bdr_perdb.c:707
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG: 00000: init_replica init from remote host=10.253.228.105 port=5432 dbname=svp2
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: bdr_init_replica, bdr_init_replica.c:830
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG: 00000: found valid replication identifier 1
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG: 00000: launching catchup mode apply worker
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: bdr_init_replica, bdr_init_replica.c:1043
t=2016-03-11 15:23:57 PST d= h= p=7229 a=DEBUG: 00000: Registering bdr apply catchup worker for bdr (6206439726032130602,1,16385,) to lsn 19E/10BA488
t=2016-03-11 15:23:57 PST d= h= p=7229 a=LOCATION: bdr_catchup_to_lsn, bdr_init_replica.c:1161
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG: 00000: registering background worker "bdr: catchup apply to 19E/10BA488"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION: BackgroundWorkerStateChange, bgworker.c:347
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG: 00000: starting background worker process "bdr: catchup apply to 19E/10BA488"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION: do_start_bgworker, postmaster.c:5412
t=2016-03-11 15:23:57 PST d= h= p=7230 a=NOTICE: 00000: version "1.0" of extension "btree_gist" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7230 a=NOTICE: 00000: version "0.9.2.0" of extension "bdr" is already installed
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:23:57 PST d= h= p=7230 a=NOTICE: 42622: identifier "bdr (6260914790689848233,1,16385,): apply catchup up to 19E/10BA488" will be truncated to "bdr (6260914790689848233,1,16385,): apply catchup up to 19E/10B"
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: truncate_identifier, scansup.c:195
t=2016-03-11 15:23:57 PST d= h= p=7230 a=DEBUG: 00000: found valid replication identifier 1
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:23:57 PST d= h= p=7230 a=INFO: 00000: starting up replication from 1 at 19D/D204D0C8
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: bdr_apply_main, bdr_apply.c:2550
t=2016-03-11 15:23:57 PST d= h= p=7230 a=DEBUG: 00000: bdr_apply: BEGIN origin(source, orig_lsn, timestamp): 19D/D204D3A0, 2016-03-11 13:49:47.293208-08
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: process_remote_begin, bdr_apply.c:198
t=2016-03-11 15:23:57 PST d= h= p=7230 a=ERROR: XX000: tuple natts mismatch, 26 vs 28
t=2016-03-11 15:23:57 PST d= h= p=7230 a=LOCATION: read_tuple_parts, bdr_apply.c:1892
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG: 00000: worker process: bdr: catchup apply to 19E/10BA488 (PID 7230) exited with exit code 1
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION: LogChildExit, postmaster.c:3325
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOG: 00000: unregistering background worker "bdr: catchup apply to 19E/10BA488"
t=2016-03-11 15:23:57 PST d= h= p=4718 a=LOCATION: ForgetBackgroundWorker, bgworker.c:376
t=2016-03-11 15:23:58 PST d= h= p=7229 a=ERROR: XX000: catchup worker exited before catching up to target LSN 19E/10BA488
t=2016-03-11 15:23:58 PST d= h= p=7229 a=LOCATION: bdr_catchup_to_lsn, bdr_init_replica.c:1273
t=2016-03-11 15:23:58 PST d= h= p=4718 a=LOG: 00000: worker process: bdr db: svp2 (PID 7229) exited with exit code 1
t=2016-03-11 15:23:58 PST d= h= p=4718 a=LOCATION: LogChildExit, postmaster.c:3325
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG: 00000: starting background worker process "bdr db: svp2"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION: do_start_bgworker, postmaster.c:5412
t=2016-03-11 15:24:03 PST d= h= p=7231 a=NOTICE: 00000: version "1.0" of extension "btree_gist" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7231 a=NOTICE: 00000: version "0.9.2.0" of extension "bdr" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG: 00000: per-db worker for node bdr (6260914790689848233,1,16385,) starting
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: bdr_perdb_worker_main, bdr_perdb.c:707
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG: 00000: init_replica init from remote host=10.253.228.105 port=5432 dbname=svp2
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: bdr_init_replica, bdr_init_replica.c:830
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG: 00000: found valid replication identifier 1
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG: 00000: launching catchup mode apply worker
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: bdr_init_replica, bdr_init_replica.c:1043
t=2016-03-11 15:24:03 PST d= h= p=7231 a=DEBUG: 00000: Registering bdr apply catchup worker for bdr (6206439726032130602,1,16385,) to lsn 19E/10E9D58
t=2016-03-11 15:24:03 PST d= h= p=7231 a=LOCATION: bdr_catchup_to_lsn, bdr_init_replica.c:1161
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG: 00000: registering background worker "bdr: catchup apply to 19E/10E9D58"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION: BackgroundWorkerStateChange, bgworker.c:347
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG: 00000: starting background worker process "bdr: catchup apply to 19E/10E9D58"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION: do_start_bgworker, postmaster.c:5412
t=2016-03-11 15:24:03 PST d= h= p=7232 a=NOTICE: 00000: version "1.0" of extension "btree_gist" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7232 a=NOTICE: 00000: version "0.9.2.0" of extension "bdr" is already installed
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: ExecAlterExtensionStmt, extension.c:2700
t=2016-03-11 15:24:03 PST d= h= p=7232 a=NOTICE: 42622: identifier "bdr (6260914790689848233,1,16385,): apply catchup up to 19E/10E9D58" will be truncated to "bdr (6260914790689848233,1,16385,): apply catchup up to 19E/10E"
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: truncate_identifier, scansup.c:195
t=2016-03-11 15:24:03 PST d= h= p=7232 a=DEBUG: 00000: found valid replication identifier 1
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: bdr_establish_connection_and_slot, bdr.c:572
t=2016-03-11 15:24:03 PST d= h= p=7232 a=INFO: 00000: starting up replication from 1 at 19D/D204D0C8
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: bdr_apply_main, bdr_apply.c:2550
t=2016-03-11 15:24:03 PST d= h= p=7232 a=DEBUG: 00000: bdr_apply: BEGIN origin(source, orig_lsn, timestamp): 19D/D204D3A0, 2016-03-11 13:49:47.293208-08
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: process_remote_begin, bdr_apply.c:198
t=2016-03-11 15:24:03 PST d= h= p=7232 a=ERROR: XX000: tuple natts mismatch, 26 vs 28
t=2016-03-11 15:24:03 PST d= h= p=7232 a=LOCATION: read_tuple_parts, bdr_apply.c:1892
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG: 00000: worker process: bdr: catchup apply to 19E/10E9D58 (PID 7232) exited with exit code 1
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION: LogChildExit, postmaster.c:3325
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOG: 00000: unregistering background worker "bdr: catchup apply to 19E/10E9D58"
t=2016-03-11 15:24:03 PST d= h= p=4718 a=LOCATION: ForgetBackgroundWorker, bgworker.c:376
t=2016-03-11 15:24:04 PST d= h= p=7231 a=ERROR: XX000: catchup worker exited before catching up to target LSN 19E/10E9D58
t=2016-03-11 15:24:04 PST d= h= p=7231 a=LOCATION: bdr_catchup_to_lsn, bdr_init_replica.c:1273
t=2016-03-11 15:24:04 PST d= h= p=4718 a=LOG: 00000: worker process: bdr db: svp2 (PID 7231) exited with exit code 1
t=2016-03-11 15:24:04 PST d= h= p=4718 a=LOCATION: LogChildExit, postmaster.c:3325

This is from the primary node:

svp2=# SELECT slot_name, database, active, pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes FROM pg_replication_slots WHERE plugin = 'bdr';
                slot_name                | database | active | retained_bytes
-----------------------------------------+----------+--------+----------------
 bdr_16385_6260914790689848233_1_16385__ | svp2     | f      |      687816472
(1 row)

And this same scenario happens every time I try to add a new node.

Thank you,
Carter

--
View this message in context: http://postgresql.nabble.com/BDR-not-catching-up-tp5892335.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
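
For anyone skimming a log like the one above, here is a minimal sketch of how the repeating failures could be pulled out of the noise. It assumes every entry starts with the `t=... d=... h=... p=... a=` prefix seen in this post (inferred from the output, not from the poster's actual `log_line_prefix` setting). The names `ENTRY`, `error_messages`, `summarize`, and `SAMPLE_LOG` are hypothetical helpers for illustration only, and the sketch only counts errors; it does not diagnose their cause.

```python
import re
from collections import Counter

# One log entry in the layout seen in this post (an assumption inferred from
# the output): "t=<timestamp> d=<db> h=<host> p=<pid> a=<SEVERITY>: <message>"
ENTRY = re.compile(
    r"t=(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \w+) "
    r"d=(?P<db>\S*) h=(?P<host>\S*) p=(?P<pid>\d+) "
    r"a=(?P<level>[A-Z]+): +(?P<msg>.*)"
)

# A few entries copied verbatim from the log in this post.
SAMPLE_LOG = """
t=2016-03-11 15:23:51 PST d= h= p=7227 a=ERROR: XX000: tuple natts mismatch, 26 vs 28
t=2016-03-11 15:23:51 PST d= h= p=7227 a=LOCATION: read_tuple_parts, bdr_apply.c:1892
t=2016-03-11 15:23:52 PST d= h= p=7226 a=ERROR: XX000: catchup worker exited before catching up to target LSN 19E/10AC4F0
t=2016-03-11 15:23:57 PST d= h= p=7230 a=ERROR: XX000: tuple natts mismatch, 26 vs 28
"""


def error_messages(lines):
    """Yield (timestamp, pid, message) for every entry whose severity is ERROR."""
    for line in lines:
        m = ENTRY.match(line.strip())
        if m and m.group("level") == "ERROR":
            yield m.group("ts"), int(m.group("pid")), m.group("msg")


def summarize(lines):
    """Count how often each distinct ERROR message occurs."""
    return Counter(msg for _ts, _pid, msg in error_messages(lines))


if __name__ == "__main__":
    for message, count in summarize(SAMPLE_LOG.splitlines()).most_common():
        print(count, message)
```

Messages that embed a changing value, such as the target LSN in the catchup error, are counted as separate entries by this sketch.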