Thread: bdr manual cleanup required
I am trying to repair a broken BDR cluster setup, and so far everything I have tried has failed. On the original node that ran bdr.bdr_group_create I am getting the following error:
2015-12-04 19:34:29.063 UTC,,,22991,,5661eac4.59cf,1,,2015-12-04 19:34:28 UTC,3/0,0,ERROR,55000,"previous init failed, manual cleanup is required","Found bdr.bdr_nodes entry for bdr (6224504646761731677,1,16389,) with state=i in remote bdr.bdr_nodes","Remove all replication identifiers and slots corresponding to this node from the init target node then drop and recreate this database and try again",,,,,,,"bdr (6224504646761731677,1,16389,): perdb"
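For reference, the stale entry the error refers to (the tuple 6224504646761731677,1,16389 is the node's sysid, timeline and database oid) can be seen on the init target node with a query along these lines (column names as in BDR 0.9.x):

-- On the init target node: the leftover bdr.bdr_nodes row from the failed join.
SELECT node_sysid, node_timeline, node_dboid, node_status
FROM bdr.bdr_nodes
WHERE node_status = 'i';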
Is there a way to get the cluster into a correct state without having to drop the db?
Thanks
-Selim
Did you try this:
https://github.com/2ndQuadrant/bdr/issues/127 :
<<<
BEGIN;
-- Disable BDR's DDL safeguards for this transaction only, so the cleanup
-- below is neither blocked by the global DDL lock nor replicated.
SET LOCAL bdr.skip_ddl_locking = on;
SET LOCAL bdr.permit_unsafe_ddl_commands = on;
SET LOCAL bdr.skip_ddl_replication = on;
-- Remove the security label that marks the database as BDR-enabled
-- (replace mydb with your database name).
SECURITY LABEL FOR bdr ON DATABASE mydb IS NULL;
-- Drop the local BDR metadata about nodes and connections.
DELETE FROM bdr.bdr_connections;
DELETE FROM bdr.bdr_nodes;
-- Tell the BDR workers that the connection configuration changed.
SELECT bdr.bdr_connections_changed();
COMMIT;
-- Terminate the per-database BDR worker so it stops running against the removed metadata.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = current_database() AND application_name LIKE '%): perdb';
>>>
So far, I have never run into a situation where I had to destroy all the databases on all nodes.
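Note that the error hint also asks for the replication slots and replication identifiers that correspond to the failed node to be removed on the init target node. The slot part is plain PostgreSQL and looks roughly like this (the LIKE pattern is only illustrative, narrow it to the failed node's slot):

-- On the init target node: find slots left behind by the failed join.
-- BDR slot names normally start with 'bdr_' and embed the joining
-- node's identity.
SELECT slot_name, active
FROM pg_replication_slots
WHERE slot_name LIKE 'bdr_%';

-- Drop the slot(s) belonging to the node being cleaned up; a slot can
-- only be dropped while it is not active.
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name LIKE 'bdr_%'   -- narrow this down to the failed node's slot!
  AND NOT active;

-- Replication identifiers: on the BDR-patched 9.4 they can be listed via
-- pg_replication_identifier and dropped with pg_replication_identifier_drop();
-- on stock 9.5+ the equivalent is pg_replication_origin / pg_replication_origin_drop().
-- Exact names may vary by version.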
Sylvain
Thanks Sylvain, I ran the cleanup SQL above on all nodes, dropped the db on all but the first node, and rejoined them to the cluster.
Unfortunately, the node_status still says "i" for the second and third nodes when I look at bdr.bdr_nodes on the first node.
On the second node, node_status is "r" for all nodes, and on the third node it is "i" only for the second node.
There are no warning or error entries in the log files on any node, but replication works only from the first node to the second and third nodes, and from the second node to the third node.
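Here is roughly what I am checking on each node to compare their views of the cluster (column names as in BDR 0.9.x, so adjust if yours differ):

-- Run on every node: this node's view of the cluster membership.
SELECT node_sysid, node_timeline, node_dboid, node_name, node_status
FROM bdr.bdr_nodes
ORDER BY node_sysid;

-- Outgoing replication from this node: slots and walsender status.
SELECT slot_name, active FROM pg_replication_slots WHERE slot_name LIKE 'bdr_%';
SELECT application_name, state, sent_location, replay_location
FROM pg_stat_replication;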
-Selim
I noticed this 'i' state with bdr 0.9.1 (https://github.com/2ndQuadrant/bdr/issues/145),
but this is not the same problem, as far as I understand.
In my case, I noticed the problem when constantly updating the database (I was not able to reproduce it with 0.9.3).
Note that I sometimes saw this 'i' state with only two nodes and version 0.9.3, but it did not seem to affect replication, even if I am not comfortable with it ...
Sylvain
Are you adding more than one node at once?
BDR isn't currently smart enough to handle that. Make sure to wait until one node is fully synced up before adding another.
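In practice that means doing something like the following on each newly joined node before starting the next join (a sketch; bdr.bdr_node_join_wait_for_ready() should be available in BDR 0.9.x):

-- On the node that just ran bdr.bdr_group_join(...): block until the
-- join and initial sync have finished before adding the next node.
SELECT bdr.bdr_node_join_wait_for_ready();

-- Sanity check: every node should report node_status = 'r' on every node.
SELECT node_name, node_status FROM bdr.bdr_nodes;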
On 09/12/2015 05:18, Craig Ringer wrote:
> Are you adding more than one node at once?
>
> BDR isn't currently smart enough to handle that. Make sure to wait
> until one node is fully synced up before adding another.
In other words, with more than two nodes, one should not attempt to add a new node while the other nodes are not yet in the 'r' (ready) state?
But what about getting this 'i' state with only two nodes? In my case, with two nodes only, both nodes had the state 'r' on one side, while the states were 'r' and 'i' on the other side.
Thank you,
Sylvain
I really couldn't say with the available information.
Can you provide a step-by-step description of the process by which you set up these nodes?