On 29/05/16 at 06:01, Nikhil wrote:
> Nik>> skip_ddl_locking is set to True in my configuration. As this
> was preventing single node from doing DDL operation (if one is down
> majority is not there for doing DDL on available node)
Well, you have to be prepared to deal with burn wounds if you play with fire. ;)
If you decide to run with skip_ddl_locking on, you have to make sure all DDL happens on a single node; otherwise you end up with conflicts like this one.
I suggest you find out why the table was already created on the downstream node (as a forensics task, so you can avoid bumping into the same issue).

> Nik>> DDL used is
>
> ERROR: relation "af_npx_l3_16_146_10" already exists
> <596802016-05-29 08:53:07 GMT%CONTEXT: during DDL replay of ddl
> statement: CREATE TABLE public.af_npx_license_l3_16_146_10
> (CONSTRAINT af_npx_license_l3_16_146_10_rpt_sample_time_check CHECK
> (((rpt_sample_time OPERATOR(pg_catalog.>=) 1464170400) AND
> (rpt_sample_time OPERATOR(pg_catalog.<=) 1464173999))) ) INHERITS
> (public.af_npx_l3) WITH (oids=OFF)
> <554132016-05-29 08:53:07 GMT%LOG: worker process: bdr
> (6288512113617339435,2,16384,)->bdr (6288505144157102317,1, (PID 59680)
> exited with exit code 1
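As a starting point for that forensics task, something along these lines could be run on the downstream node before touching anything. It is only a sketch: it assumes the conflicting table lives in the public schema (as the CREATE TABLE in the log suggests), and it relies on nothing but the standard PostgreSQL catalogs.

```sql
-- Who owns the already-existing table, and in which schema does it live?
SELECT c.oid,
       n.nspname                    AS schema_name,
       c.relname                    AS table_name,
       pg_get_userbyid(c.relowner)  AS owner
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'af_npx_l3_16_146_10';

-- Does it already hold rows? (ONLY skips any child tables.)
-- Assumes schema "public"; adjust if the query above says otherwise.
SELECT count(*) FROM ONLY public.af_npx_l3_16_146_10;
```

If the row count is not zero, those rows were written locally on this node, and dropping the table would discard them, so it is worth saving them somewhere first.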
On the node where the CREATE TABLE is trying to get applied run this:
    BEGIN;
    SET LOCAL bdr.skip_ddl_replication TO 'on';
    SET LOCAL bdr.skip_ddl_locking TO 'on';
    DROP TABLE af_npx_l3_16_146_10;
    END;
After that, the DDL that's stuck will get applied and the stream of changes will continue.
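To confirm that the replay actually went through, a quick check like the following might help. Again just a sketch: it assumes the table ends up in the public schema and uses only stock PostgreSQL 9.4 functions and views, nothing BDR-specific.

```sql
-- Returns the table name once the replayed CREATE TABLE has been applied,
-- and NULL while the table is still missing.
SELECT to_regclass('public.af_npx_l3_16_146_10');

-- Replication slots on this node; "active" should be true for the
-- slots of peers that are up and streaming.
SELECT slot_name, active
FROM pg_replication_slots;
```

If the apply worker keeps exiting with exit code 1, the server log on this node should show which statement it is stuck on now.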
By the looks of what you're dealing with, I wouldn't be surprised if the replication gets stuck again on another DDL conflict.
I suggest rethinking the locking strategy, because this shows that there's something fishy there.