Re: BDR to ignore table exists error - Mailing list pgsql-general

From Nikhil
Subject Re: BDR to ignore table exists error
Msg-id CALo-6YM1zHdP1c+UOs0m3BvnUjEtb9BPvt5iB-d+dU4H5jLKfA@mail.gmail.com
In response to Re: BDR to ignore table exists error  (Martín Marqués <martin@2ndquadrant.com>)
Responses Re: BDR to ignore table exists error
Re: BDR to ignore table exists error
List pgsql-general

Once the node that was down is brought back up, the replication slot does not become active again. The reason is that the replication slot is trying to create a partition table that already exists. Because of this error, the slot stays stuck in the inactive state. Is there any way to ignore this error?

On 28-May-2016 4:56 PM, "Martín Marqués" <martin@2ndquadrant.com> wrote:
On 27/05/16 at 06:33, Nikhil wrote:
> Hello,
>
>
> I have a BDR setup with two nodes. If I bring one node down, I see that
> the replication slot becomes inactive with the error below.

If you take down one of the nodes of a BDR mesh, the replication slots
from each of the upstream nodes it connects to will switch to inactive.
That's how replication slots work.

> <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%DETAIL:  streaming transactions committing after 0/111A9148, reading WAL from 0/110F03F8
> <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%LOG:  logical decoding found consistent point at 0/110F03F8
> <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%DETAIL:  Logical decoding will begin using saved snapshot.
> <10.106.43.152(43253)nsxpostgres798452016-05-25 23:58:19 GMTnsxdb%LOG:  unexpected EOF on standby connection

Downstream node got disconnected, which is sensible given that you took
that node down.

> <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.437 ms
> <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.462 ms
> <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.096 ms
> <127.0.0.1(31185)nsxroot792492016-05-25 23:58:19 GMTnsxdb%LOG:  duration: 0.101 ms
> <3462016-05-25 23:58:20 GMT%LOG:  starting background worker process "bdr (6288505144157102317,1,16384,)->bdr (6288512113617339435,2,"

It seems you brought up postgres on the downstream node again and it
connected to the replication slot.

> <798462016-05-25 23:58:20 GMT%ERROR:  relation "af_npx_device_l3_16_149_10" already exists

I'm not sure what happened here. Does that relation exist?

Run \d+ af_npx_device_l3_16_149_10 with psql on both nodes.

Also, did replication resume? Check with the lag query from the BDR
documentation.
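
For illustration, a slot status and lag check along these lines can be run with psql on each node. This is a sketch, not necessarily the exact query from the BDR documentation. It assumes a PostgreSQL 9.4-based BDR install, where the WAL functions still carry the pg_xlog_* names (they were renamed in PostgreSQL 10), and it assumes the BDR slots use the 'bdr' output plugin.

```sql
-- Sketch: list the BDR replication slots on this node, whether a peer is
-- currently connected (active), and how much WAL each slot is holding back.
-- Assumes PostgreSQL 9.4 function names and plugin = 'bdr'.
SELECT slot_name,
       database,
       active,
       restart_lsn,
       pg_xlog_location_diff(pg_current_xlog_insert_location(),
                             restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE plugin = 'bdr';
```

If active stays false and retained_bytes keeps growing while the peer is up, the downstream is not consuming changes from that slot.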

Regards,

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
