Postgres - BDR issue - Mailing list pgsql-hackers

From Rahul Goel
Subject Postgres - BDR issue
Date
Msg-id CAH8foHB3DpYV8b+2fWjek35DJfHc4SPG2ymrpxSyw7OCk_eihg@mail.gmail.com
Responses Re: Postgres - BDR issue  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
Hi

I am facing the following issue while setting up BDR:

I have two nodes (for simplicity, I will refer to them as node 1 and node 2). The BDR group was created from node 1. When a new Postgres node (node 2) joins the group, node_status in the bdr.bdr_nodes table on the new node (node 2) shows 'r', but it remains 'i' on the upstream master (node 1). I can see that a conflict occurred on the bdr.bdr_nodes table and that node 1 is unable to update the status of node 2, but I have not been able to find a solution.

Node 1 (BDR group was created from this node):
(Masked DB Name, and password)

psql -U postgres -d xyz -c "select * from bdr.bdr_nodes;"

     node_sysid      | node_timeline | node_dboid | node_status |   node_name   |                              node_local_dsn                               |                            node_init_from_dsn
---------------------+---------------+------------+-------------+---------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------
 6197340597374984280 |             1 |      16385 | r           | 10.42.157.193 | port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password |
 6197344706786291803 |             1 |      12156 | i           | 10.42.99.96   | port=5432 dbname=xyzdb host=10.42.99.96 user=postgres password=password   | port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password
(2 rows)

Logs
< 2015-09-22 14:16:36.244 UTC >STATEMENT:  CREATE SCHEMA public;
< 2015-09-22 14:16:42.615 UTC >LOG:  registering background worker "bdr db: xyzdb"
< 2015-09-22 14:16:42.615 UTC >LOG:  starting background worker process "bdr db: xyzdb"
< 2015-09-22 14:23:16.498 UTC >LOG:  logical decoding found consistent point at 0/879E980
< 2015-09-22 14:23:16.498 UTC >DETAIL:  There are no running transactions.
< 2015-09-22 14:23:16.498 UTC >LOG:  exported logical decoding snapshot: "00000511-1" with 0 transaction IDs
< 2015-09-22 14:23:25.284 UTC >LOG:  starting logical decoding for slot "bdr_16385_6197344706786291803_1_12156__"
< 2015-09-22 14:23:25.284 UTC >DETAIL:  streaming transactions committing after 0/879E9B8, reading WAL from 0/879E980
< 2015-09-22 14:23:25.284 UTC >LOG:  logical decoding found consistent point at 0/879E980
< 2015-09-22 14:23:25.284 UTC >DETAIL:  There are no running transactions.
< 2015-09-22 14:23:25.294 UTC >LOG:  could not receive data from client: Connection reset by peer
< 2015-09-22 14:23:25.294 UTC >LOG:  unexpected EOF on standby connection
< 2015-09-22 14:23:26.299 UTC >LOG:  registering background worker "bdr (6197340597374984280,1,16385,)->bdr (6197344706786291803,1,"
< 2015-09-22 14:23:26.299 UTC >LOG:  starting background worker process "bdr (6197340597374984280,1,16385,)->bdr (6197344706786291803,1,"
< 2015-09-22 14:23:26.311 UTC >LOG:  starting logical decoding for slot "bdr_16385_6197344706786291803_1_12156__"
< 2015-09-22 14:23:26.311 UTC >DETAIL:  streaming transactions committing after 0/87B0998, reading WAL from 0/879E9B8
< 2015-09-22 14:23:26.313 UTC >LOG:  logical decoding found consistent point at 0/879E9B8
< 2015-09-22 14:23:26.313 UTC >DETAIL:  Logical decoding will begin using saved snapshot.
< 2015-09-22 14:23:26.539 UTC >LOG:  CONFLICT: remote UPDATE on relation bdr.bdr_nodes originating at node 6197344706786291803:1:12156 at ts 2015-09-22 14:23:21.776464+00; row was previously updated at node 0:0. Resolution: last_update_wins_keep_local; PKEY: node_sysid[text]:6197344706786291803 node_timeline[oid]:1 node_dboid[oid]:12156 node_status[char]:i node_name[text]:10.42.99.96 node_local_dsn[text]:port=5432 dbname=xyzdb host=10.42.99.96 user=postgres password=password node_init_from_dsn[text]:port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password
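
To make the conflict line above easier to read, here is a minimal Python sketch that pulls out the operation, relation, originating node, and resolution. The regular expression is an assumption inferred from this single log line, not a documented BDR log format, and `parse_conflict` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one CONFLICT line in this post; it is an
# assumption, not a documented BDR log format.
CONFLICT_RE = re.compile(
    r"CONFLICT: remote (?P<op>\w+) on relation (?P<relation>\S+) "
    r"originating at node (?P<origin>\d+:\d+:\d+) "
    r"at ts (?P<ts>.+?); .*?"
    r"Resolution: (?P<resolution>\w+);"
)


def parse_conflict(line):
    """Return a dict of conflict fields, or None if the line does not match."""
    match = CONFLICT_RE.search(line)
    return match.groupdict() if match else None


# The conflict line from node 1's log, with the trailing PKEY details left out.
sample = (
    "< 2015-09-22 14:23:26.539 UTC >LOG:  CONFLICT: remote UPDATE on relation "
    "bdr.bdr_nodes originating at node 6197344706786291803:1:12156 at ts "
    "2015-09-22 14:23:21.776464+00; row was previously updated at node 0:0. "
    "Resolution: last_update_wins_keep_local;"
)

info = parse_conflict(sample)
print(info["op"], info["relation"], info["origin"], info["resolution"])
```

Read this way, the log says that the UPDATE to bdr.bdr_nodes coming from node 2 was resolved as last_update_wins_keep_local on node 1, which matches node 1 still showing 'i' for node 2.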



Node 2 (note the node status here: node 2 is shown as ready, whereas node 1 still reports it as initializing):
(Masked DB Name, and password)

[root@3c8668f9183c /]# psql -U postgres -d hubub -c "select * from bdr.bdr_nodes;"

     node_sysid      | node_timeline | node_dboid | node_status |   node_name   |                               node_local_dsn                                |                             node_init_from_dsn
---------------------+---------------+------------+-------------+---------------+-----------------------------------------------------------------------------+-----------------------------------------------------------------------------
 6197340597374984280 |             1 |      16385 | r           | 10.42.157.193 | port=5432 dbname=hubub host=10.42.157.193 user=postgres password=qsV6hKyW94 |
 6197344706786291803 |             1 |      12156 | r           | 10.42.99.96   | port=5432 dbname=hubub host=10.42.99.96 user=postgres password=qsV6hKyW94   | port=5432 dbname=hubub host=10.42.157.193 user=postgres password=qsV6hKyW94
(2 rows)
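
To state the discrepancy between the two bdr.bdr_nodes outputs explicitly, here is a small Python sketch that compares the node_status values each node reports. The values are copied by hand from the two outputs in this post; `status_mismatches` is a hypothetical helper, not part of BDR.

```python
# node_status values copied from the two bdr.bdr_nodes outputs in this post,
# keyed by (node_sysid, node_timeline, node_dboid).
node1_view = {
    ("6197340597374984280", 1, 16385): "r",
    ("6197344706786291803", 1, 12156): "i",
}
node2_view = {
    ("6197340597374984280", 1, 16385): "r",
    ("6197344706786291803", 1, 12156): "r",
}


def status_mismatches(view_a, view_b):
    """Return {node_key: (status_in_a, status_in_b)} for nodes whose status differs."""
    mismatches = {}
    for key in sorted(set(view_a) | set(view_b)):
        status_a, status_b = view_a.get(key), view_b.get(key)
        if status_a != status_b:
            mismatches[key] = (status_a, status_b)
    return mismatches


for key, (on_node1, on_node2) in status_mismatches(node1_view, node2_view).items():
    print(f"node {key}: node 1 reports {on_node1!r}, node 2 reports {on_node2!r}")
```

The only row that differs is the one for node 2 itself: 'i' as seen from node 1 and 'r' as seen from node 2.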


Logs
< 2015-09-22 14:23:11.824 UTC >LOG:  registering background worker "bdr db: xyzdb"
< 2015-09-22 14:23:11.824 UTC >LOG:  starting background worker process "bdr db: xyzdb"
< 2015-09-22 14:23:11.875 UTC >LOG:  Creating replica with: /usr/pgsql-9.4/bin/bdr_initial_load --snapshot 00000511-1 --source "port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password" --target "port=5432 dbname=xyzdb host=10.42.99.96 user=postgres password=password" --tmp-directory "/tmp/postgres-bdr-00000511-1.259", --pg-dump-path "/usr/pgsql-9.4/bin/pg_dump", --pg-restore-path "/usr/pgsql-9.4/bin/pg_restore"
Dumping remote database "port=5432 dbname=xyzdb host=10.42.157.193 user=postgres password=password fallback_application_name='bdr (6197344706786291803,1,12156,): init_replica dump'" with 1 concurrent workers to "/tmp/postgres-bdr-00000511-1.259"
Restoring dump to local DB "port=5432 dbname=xyzdb host=10.42.99.96 user=postgres password=password fallback_application_name='bdr (6197344706786291803,1,12156,): init_replica restore' options='-c bdr.do_not_replicate=on -c bdr.permit_unsafe_ddl_commands=on -c bdr.skip_ddl_replication=on -c bdr.skip_ddl_locking=on'" with 1 concurrent workers from "/tmp/postgres-bdr-00000511-1.259"
< 2015-09-22 14:23:20.632 UTC >LOG:  registering background worker "bdr: catchup apply to 0/87B0DE0"
< 2015-09-22 14:23:20.632 UTC >LOG:  starting background worker process "bdr: catchup apply to 0/87B0DE0"
< 2015-09-22 14:23:20.653 UTC >LOG:  bdr apply finished processing; replayed to 0/87B0DE0 of required 0/87B0DE0
< 2015-09-22 14:23:20.654 UTC >LOG:  worker process: bdr: catchup apply to 0/87B0DE0 (PID 275) exited with exit code 0
< 2015-09-22 14:23:20.654 UTC >LOG:  unregistering background worker "bdr: catchup apply to 0/87B0DE0"
< 2015-09-22 14:23:21.655 UTC >LOG:  registering background worker "bdr (6197344706786291803,1,12156,)->bdr (6197340597374984280,1,"
< 2015-09-22 14:23:21.655 UTC >LOG:  starting background worker process "bdr (6197344706786291803,1,12156,)->bdr (6197340597374984280,1,"
< 2015-09-22 14:23:21.684 UTC >LOG:  logical decoding found consistent point at 0/86E0910
< 2015-09-22 14:23:21.684 UTC >DETAIL:  There are no running transactions.
< 2015-09-22 14:23:21.685 UTC >LOG:  exported logical decoding snapshot: "0000055B-1" with 0 transaction IDs
< 2015-09-22 14:23:21.691 UTC >LOG:  starting logical decoding for slot "bdr_12156_6197340597374984280_1_16385__"
< 2015-09-22 14:23:21.691 UTC >DETAIL:  streaming transactions committing after 0/86E0948, reading WAL from 0/86E0910
< 2015-09-22 14:23:21.691 UTC >LOG:  logical decoding found consistent point at 0/86E0910
< 2015-09-22 14:23:21.691 UTC >DETAIL:  There are no running transactions.

Thanks in advance for the help!

Regards
Rahul Goel
647 949 1679
