Re: [BDR] Node Join Question - Mailing list pgsql-general

From Craig Ringer
Subject Re: [BDR] Node Join Question
Date
Msg-id CAMsr+YEVFW5eEf1Gxb198UpBdiQSf-AASQac21mHUoA643nbNA@mail.gmail.com
In response to Re: [BDR] Node Join Question  ("Wayne E. Seguin" <wayneeseguin@gmail.com>)
Responses Re: [BDR] Node Join Question  ("Wayne E. Seguin" <wayneeseguin@gmail.com>)
List pgsql-general


On 12 May 2015 at 14:36, Wayne E. Seguin <wayneeseguin@gmail.com> wrote:
> Also,
>
> Is there a way to remove these things from the init target node easier?
>
> d= p=504 a=ERROR:  55000: previous init failed, manual cleanup is required
> d= p=504 a=DETAIL:  Found bdr.bdr_nodes entry for bdr (6147869128174526660,1,16908,) with state=i in remote bdr.bdr_nodes
> d= p=504 a=HINT:  Remove all replication identifiers and slots corresponding to this node from the init target node then drop and recreate this database and try again

Now that we have SQL-level join it'd probably make sense to provide a cleanup function for failed node joins. At this point there's no such function.


Take note of the node identity given in the error (here 6147869128174526660,1,16908), as it corresponds to the replication identifier name and slot name.

On the join target node, drop the slot for the failed node:

     SELECT pg_drop_replication_slot(slot_name)
     FROM pg_replication_slots
     WHERE slot_name = bdr.bdr_format_slot_name('6147869128174526660',1,16908);

where the sysid, timeline ID and database OID are those given in the error. You must run this from the target node's database, as it'll only consider slots for the current database.

Then drop the replication identifier that was used:

    SELECT pg_replication_identifier_drop(...)

passing the identifier's name, which you can look up in pg_catalog.pg_replication_identifier. There isn't an equivalent of bdr.bdr_format_slot_name for replication identifiers; I'll look at adding one. In the meantime, look it up visually or write a simple function to format the string.
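As a rough sketch of the visual lookup, something like the following should narrow the catalog down to the failed node's identifier. It assumes the catalog exposes the name in a column called riname and that the name embeds the sysid from the error; check both with \d pg_catalog.pg_replication_identifier before relying on it.

```sql
-- Assumption: the name column is riname and the identifier name contains
-- the failed node's sysid. Confirm the timeline and database OID in the
-- returned name also match the error before dropping anything.
SELECT riname
FROM pg_catalog.pg_replication_identifier
WHERE riname LIKE '%6147869128174526660%';

-- Then pass the exact name returned above. The placeholder is hypothetical
-- and must be replaced by hand.
SELECT pg_replication_identifier_drop('<riname from the query above>');
```

If more than one row comes back, stop and work out which one belongs to the failed join rather than dropping all of them.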

Then delete the bdr.bdr_nodes entry for the failed-to-join node and any bdr.bdr_connections entries for it.
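A minimal sketch of those deletes, using the identity from the error. The column names (node_sysid, node_timeline, node_dboid and their conn_ counterparts) are my assumption about the catalog layout; verify them against your BDR version first.

```sql
BEGIN;

-- Assumption: bdr.bdr_connections keys rows by conn_sysid/conn_timeline/conn_dboid.
DELETE FROM bdr.bdr_connections
WHERE conn_sysid = '6147869128174526660'
  AND conn_timeline = 1
  AND conn_dboid = 16908;

-- Assumption: bdr.bdr_nodes keys rows by node_sysid/node_timeline/node_dboid.
DELETE FROM bdr.bdr_nodes
WHERE node_sysid = '6147869128174526660'
  AND node_timeline = 1
  AND node_dboid = 16908;

-- Check the reported row counts look right, then:
COMMIT;
```

Doing it in one transaction means you can ROLLBACK if the row counts are not what you expect.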

You *must* drop and re-create the database on the failed-to-join node, making a new blank db (preferably from template0).
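For example, connected to some other database on the failed-to-join node (postgres, say), and with mydb standing in as a hypothetical name for your database:

```sql
-- mydb is a placeholder name. DROP DATABASE fails while any session is
-- still connected to it, so disconnect clients from that database first.
DROP DATABASE mydb;

-- template0 gives a pristine database with no local additions.
CREATE DATABASE mydb TEMPLATE template0;
```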



