Thread: BUG #12808: BDR lock adding Postgis extension in one node

BUG #12808: BDR lock adding Postgis extension in one node

From
rmelendo@teltronic.es
Date:
The following bug has been logged on the website:

Bug reference:      12808
Logged by:          Ruth Melendo
Email address:      rmelendo@teltronic.es
PostgreSQL version: 9.4.1
Operating system:   Ubuntu 14.04 LTS
Description:

Using BDR with 2 nodes and trying to add a PostGIS extension in node 1, it
adds the extension and automatically adds “spatial_ref_sys”.

In node2, it tries to add the extension and gets an error because table
“spatial_ref_sys” does not exists. The bbdd gets locked after that.

Re: BUG #12808: BDR lock adding Postgis extension in one node

From
Andres Freund
Date:
Hi,

On 2015-02-26 16:29:25 +0000, rmelendo@teltronic.es wrote:
> Using BDR with 2 nodes and trying to add a PostGIS extension in node 1,=
 it
> adds the extension and automatically adds =E2=80=9Cspatial_ref_sys=E2=80=
=9D.
>=20
> In node2, it tries to add the extension and gets an error because table
> =E2=80=9Cspatial_ref_sys=E2=80=9D does not exists. The bbdd gets locked=
 after that.

(this isn't the best forum for this, but I don't really know what to
recommend besides -general)

What's the exact error message? Does it out because of the extension or
because of inserts into that table?

Which version of bdr does this happen with? Because I seem to recall
fixing a similar problem... If not, we'll look into the issue.

Greetings,

Andres Freund

--=20
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #12808: BDR lock adding Postgis extension in one node

From
"Ruth Melendo"
Date:
The error got when adding a PostGIS extension from PGAdmin in node 1. =
Extension was created and the depending table too but in node 2 got the =
error when adding the extension because the depending table was not =
created.=20

But, although the problem was originated for this, my main dude is how =
to fix the database then it=C2=B4s locked. What do I have to do? Because =
now, I cannot do any DDL or DML against it.


Ruth Patricia Melendo Ventura
Software Engineer
TELTRONIC, S.A.U.
T: +34 976 465656   Ext. 179
F: +34 976 465722=20
www.teltronic.es


-----Mensaje original-----
De: Andres Freund [mailto:andres@2ndquadrant.com]=20
Enviado el: viernes, 27 de febrero de 2015 15:24
Para: rmelendo@teltronic.es
CC: pgsql-bugs@postgresql.org
Asunto: Re: [BUGS] BUG #12808: BDR lock adding Postgis extension in one =
node

Hi,

On 2015-02-26 16:29:25 +0000, rmelendo@teltronic.es wrote:
> Using BDR with 2 nodes and trying to add a PostGIS extension in node=20
> 1, it adds the extension and automatically adds =
=E2=80=9Cspatial_ref_sys=E2=80=9D.
>=20
> In node2, it tries to add the extension and gets an error because=20
> table =E2=80=9Cspatial_ref_sys=E2=80=9D does not exists. The bbdd gets =
locked after that.

(this isn't the best forum for this, but I don't really know what to =
recommend besides -general)

What's the exact error message? Does it out because of the extension or =
because of inserts into that table?

Which version of bdr does this happen with? Because I seem to recall =
fixing a similar problem... If not, we'll look into the issue.

Greetings,

Andres Freund

--=20
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #12808: BDR lock adding Postgis extension in one node

From
'Andres Freund'
Date:
Hi,

[some information has been acquired privately]

On 2015-02-27 15:30:20 +0100, Ruth Melendo wrote:
> The error got when adding a PostGIS extension from PGAdmin in node
> 1. Extension was created and the depending table too but in node 2 got
> the error when adding the extension because the depending table was
> not created.

> But, although the problem was originated for this, my main dude is how
> to fix the database then it=C2=B4s locked. What do I have to do? Becaus=
e
> now, I cannot do any DDL or DML against it.

The reason it's locked is that it's waiting for DDL to be
replicated. And that fails due to the spatial_ref_sys error.

In the version you're using the easiest way to resolve this is probably
to skip replication of the problematic transaction, and then fix up
things afterwards by hand.

To do that connect to the database you performed the CREATE EXTENSION
on. There get a list of the outgoing connections with a query like:

SELECT slot_name, datoid, database, active
FROM pg_replication_slots
WHERE slot_name LIKE 'bdr_'||(SELECT oid FROM pg_database WHERE datname =3D=
 current_database())||'_%';
=E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=AC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=AC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=AC=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=90
=E2=94=82                slot_name                =E2=94=82 datoid =E2=94=
=82 database =E2=94=82 active =E2=94=82
=E2=94=9C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=BC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=BC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=BC=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=A4
=E2=94=82 bdr_17911_6120567807158814813_1_16385__ =E2=94=82  17911 =E2=94=
=82 node_03  =E2=94=82 t      =E2=94=82
=E2=94=82 bdr_17911_6120567807158814813_1_17153__ =E2=94=82  17911 =E2=94=
=82 node_03  =E2=94=82 f      =E2=94=82
=E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=B4=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=B4=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=B4=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=98
(2 rows)

In my three node setup those two consume changes from the node_03
database.

To consume the problematic transaction, use something like:

COPY (SELECT * FROM pg_logical_slot_get_binary_changes('bdr_17911_6120567=
807158814813_1_16385__', NULL, 1, 'interactive', 'true')) TO '/dev/null';

for each of the remote nodes. After that you should be able to execute
DDL again.

Then please reconnect and do
SET bdr.skip_ddl_replication =3D on;
DROP EXTENSION postgis;

That should bring the nodes back in sync with regard to the postgis
extension.


We'll work on making it possible to replicate extensions that insert
data into their own tables during creation.

Greetings,

Andres Freund

--=20
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #12808: BDR lock adding Postgis extension in one node

From
"Ruth Melendo"
Date:
Thanks for your explanation Andres. It worked for me.

Just 3  more questions:

1.- Should I get a newer version of BDR? Does mine have important bugs? =
I=C2=B4m testing that for production env in a near future and the =
application in production in my company depends on these results.

2.- I don=E2=80=99t understand exactly what I=C2=B4m doing with this =
sentence: COPY (SELECT * FROM =
pg_logical_slot_get_binary_changes('bdr_17911_6120567807158814813_1_16385=
__', NULL, 1, 'interactive', 'true')) TO '/dev/null';=20
I suppose I=C2=B4m discarding the changes. That will make my changes to =
be discarded or are you counting with bdr.initial_copy activated to get =
them again?=20

3.- When I do this:
SET bdr.skip_ddl_replication =3D on;
Is it the same as commenting the bdr properties in postgres.conf?

As I=C2=B4m testing for production env, I need to write a procedure to =
recover database when getting an error like this one. I have gotten =
similar errors more times when doing tests, in transactions DDL or DML, =
I guess my fault in any configuration property. When I get to that =
situation, should I follow the same instructions?

Instructions to recover database when gets corrupted after bdr error:

1- Skip replication of the problematic transaction.=20
    SELECT slot_name, datoid, database, active FROM pg_replication_slots =
WHERE slot_name LIKE 'bdr_'||(SELECT oid FROM pg_database WHERE datname =
=3D     current_database())||'_%';

    COPY (SELECT * FROM =
pg_logical_slot_get_binary_changes('bdr_17911_6120567807158814813_1_16385=
__', NULL, 1, 'interactive', 'true')) TO '/dev/null';

2.- Reconnect to the database and execute

    SET bdr.skip_ddl_replication =3D on;
    DROP EXTENSION postgis;

Should it be something like that or should I delete the data directory =
in the problematic node and create the cluster again so that the =
bdr.init_copy bring the database to a right state?

Thanks again!

Ruth Patricia Melendo Ventura
Software Engineer
TELTRONIC, S.A.U.
T: +34 976 465656   Ext. 179
F: +34 976 465722=20
www.teltronic.es


-----Mensaje original-----
De: 'Andres Freund' [mailto:andres@2ndquadrant.com]=20
Enviado el: s=C3=A1bado, 28 de febrero de 2015 10:36
Para: Ruth Melendo
CC: pgsql-bugs@postgresql.org
Asunto: Re: [BUGS] BUG #12808: BDR lock adding Postgis extension in one =
node

Hi,

[some information has been acquired privately]

On 2015-02-27 15:30:20 +0100, Ruth Melendo wrote:
> The error got when adding a PostGIS extension from PGAdmin in node 1.=20
> Extension was created and the depending table too but in node 2 got=20
> the error when adding the extension because the depending table was=20
> not created.

> But, although the problem was originated for this, my main dude is how =

> to fix the database then it=C2=B4s locked. What do I have to do? =
Because=20
> now, I cannot do any DDL or DML against it.

The reason it's locked is that it's waiting for DDL to be replicated. =
And that fails due to the spatial_ref_sys error.

In the version you're using the easiest way to resolve this is probably =
to skip replication of the problematic transaction, and then fix up =
things afterwards by hand.

To do that connect to the database you performed the CREATE EXTENSION =
on. There get a list of the outgoing connections with a query like:

SELECT slot_name, datoid, database, active FROM pg_replication_slots =
WHERE slot_name LIKE 'bdr_'||(SELECT oid FROM pg_database WHERE datname =
=3D current_database())||'_%'; =
=E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=AC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=AC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=AC=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=90
=E2=94=82                slot_name                =E2=94=82 datoid =
=E2=94=82 database =E2=94=82 active =E2=94=82
=E2=94=9C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=BC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=BC=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=BC=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=A4
=E2=94=82 bdr_17911_6120567807158814813_1_16385__ =E2=94=82  17911 =
=E2=94=82 node_03  =E2=94=82 t      =E2=94=82
=E2=94=82 bdr_17911_6120567807158814813_1_17153__ =E2=94=82  17911 =
=E2=94=82 node_03  =E2=94=82 f      =E2=94=82
=E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=B4=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=
=E2=94=80=E2=94=B4=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=
=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=B4=E2=94=80=E2=94=80=E2=94=80=E2=94=
=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=98
(2 rows)

In my three node setup those two consume changes from the node_03 =
database.

To consume the problematic transaction, use something like:

COPY (SELECT * FROM =
pg_logical_slot_get_binary_changes('bdr_17911_6120567807158814813_1_16385=
__', NULL, 1, 'interactive', 'true')) TO '/dev/null';

for each of the remote nodes. After that you should be able to execute =
DDL again.

Then please reconnect and do
SET bdr.skip_ddl_replication =3D on;
DROP EXTENSION postgis;

should bring the nodes back in sync with regard to the postgis =
extension.That=20


We'll work on making it possible to replicate extensions that insert =
data into their own tables during creation.

Greetings,

Andres Freund

--=20
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services