Thread: BDR: Can a node live alone after being detached
Hello all, Is it possible to completely detach a node so that it can live alone, in particular that DDL are again possible on that node? I tried with a simple node without success: <<< postgres=# create database test template template0; CREATE DATABASE postgres=# \c test You are now connected to database "test" as user "dbadmin". test=# CREATE EXTENSION btree_gist; CREATE EXTENSION test=# CREATE EXTENSION bdr; CREATE EXTENSION -- DDL are still possible test=# create table before_node_creation (i int primary key not null); CREATE TABLE -- Now create the group. After that, DDL are now forbidden test=# SELECT bdr.bdr_group_create(local_node_name := 'node1',node_external_dsn := 'host=172.27.118.64 port=5432 dbname=test'); test=# create table after_node_creation (i int primary key not null); ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock HINT: BDR is probably still starting up, wait a while -- Now detach the group, in the hope to create some table -- But this does not work. DDL are still forbidden test=# select bdr.bdr_part_by_node_names('{node1}'); test=# create table after_node_creation (i int primary key not null); ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock HINT: BDR is probably still starting up, wait a while test=# >>> Thanks and Regards, Sylvain NB: This post was part of a bigger - may be too dense - post: "BDR: Node join and leave questions"
On 15 June 2015 at 17:19, Sylvain MARECHAL <marechal.sylvain2@gmail.com> wrote: > Is it possible to completely detach a node so that it can live alone Yes. On a different node to the one you want to remove, bdr.bdr_part_by_node_names(ARRAY['the_node_to_remove']) to safely remove it without disrupting the other nodes. > in particular that DDL are again possible on that node? DDL is possible on a BDR node anyway, just with some limitations. > I tried with a simple node without success: > <<< > postgres=# create database test template template0; > CREATE DATABASE > postgres=# \c test > You are now connected to database "test" as user "dbadmin". > test=# CREATE EXTENSION btree_gist; > CREATE EXTENSION > test=# CREATE EXTENSION bdr; > CREATE EXTENSION > > -- DDL are still possible > test=# create table before_node_creation (i int primary key not null); > CREATE TABLE > > -- Now create the group. After that, DDL are now forbidden > test=# SELECT bdr.bdr_group_create(local_node_name := > 'node1',node_external_dsn := 'host=172.27.118.64 port=5432 dbname=test'); > > test=# create table after_node_creation (i int primary key not null); > ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock > HINT: BDR is probably still starting up, wait a while Well, DDL is only forbidden because it hasn't successfully joined yet. Check the PostgreSQL logs to see what it's doing. > -- Now detach the group, in the hope to create some table > -- But this does not work. DDL are still forbidden > test=# select bdr.bdr_part_by_node_names('{node1}'); You shouldn't part a node from its self. The next revision will prevent this with an error. > test=# create table after_node_creation (i int primary key not null); > ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock > HINT: BDR is probably still starting up, wait a while Currently, once detached, the BDR extension isn't disabled on the node. Support for that is possible, but not yet implemented. We need to add a two-phase part protocol where the node confirms it has left the system and disables BDR. At present removed nodes aren't really a focus; it's expected that you're removing the node because you're going to retire it and will be deleting the cluster or dropping the database. In the mean time you can remove the security label on the database to disable BDR once the node has been parted, so that BDR no longer activates on that node. The command filter prevents this so you'll have to do this with the command filter off. BEGIN; SET LOCAL bdr.permit_unsafe_ddl_commands = true; SET LOCAL bdr.skip_ddl_locking = true; security label for 'bdr' on database bdr_testdb is '{"bdr": false}'; COMMIT; Out of interest, why do you want to detach a node and keep using it as a standalone DB? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi Craig, Thanks for your response. Le 16/06/2015 03:58, Craig Ringer a écrit : > On 15 June 2015 at 17:19, Sylvain MARECHAL <marechal.sylvain2@gmail.com> wrote: >> Is it possible to completely detach a node so that it can live alone > Yes. On a different node to the one you want to remove, > bdr.bdr_part_by_node_names(ARRAY['the_node_to_remove']) to safely > remove it without disrupting the other nodes. > >> in particular that DDL are again possible on that node? > DDL is possible on a BDR node anyway, just with some limitations. > >> I tried with a simple node without success: >> <<< >> postgres=# create database test template template0; >> CREATE DATABASE >> postgres=# \c test >> You are now connected to database "test" as user "dbadmin". >> test=# CREATE EXTENSION btree_gist; >> CREATE EXTENSION >> test=# CREATE EXTENSION bdr; >> CREATE EXTENSION >> >> -- DDL are still possible >> test=# create table before_node_creation (i int primary key not null); >> CREATE TABLE >> >> -- Now create the group. After that, DDL are now forbidden >> test=# SELECT bdr.bdr_group_create(local_node_name := >> 'node1',node_external_dsn := 'host=172.27.118.64 port=5432 dbname=test'); >> >> test=# create table after_node_creation (i int primary key not null); >> ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock >> HINT: BDR is probably still starting up, wait a while > Well, DDL is only forbidden because it hasn't successfully joined yet. > Check the PostgreSQL logs to see what it's doing. Nothing special in the logs here. What I would like to emphasize is that a single node can not live alone. > >> -- Now detach the group, in the hope to create some table >> -- But this does not work. DDL are still forbidden >> test=# select bdr.bdr_part_by_node_names('{node1}'); > You shouldn't part a node from its self. The next revision will > prevent this with an error. Ok, this was not clear for me. > >> test=# create table after_node_creation (i int primary key not null); >> ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock >> HINT: BDR is probably still starting up, wait a while > Currently, once detached, the BDR extension isn't disabled on the node. > > Support for that is possible, but not yet implemented. We need to add > a two-phase part protocol where the node confirms it has left the > system and disables BDR. At present removed nodes aren't really a > focus; it's expected that you're removing the node because you're > going to retire it and will be deleting the cluster or dropping the > database. > > In the mean time you can remove the security label on the database to > disable BDR once the node has been parted, so that BDR no longer > activates on that node. The command filter prevents this so you'll > have to do this with the command filter off. > > BEGIN; > SET LOCAL bdr.permit_unsafe_ddl_commands = true; > SET LOCAL bdr.skip_ddl_locking = true; > security label for 'bdr' on database bdr_testdb is '{"bdr": false}'; > COMMIT; Ok, I will do this as a workaround. But having a function doing the detach() properly would be really nice. > > Out of interest, why do you want to detach a node and keep using it as > a standalone DB? > > This is related to the design of our application: We would like to provide to our users a flexible way to add and remove nodes, for failover (2 nodes) and scalability reasons (possibly more than 2 nodes with load balancing). If for some reason (long time maintenance for one of the machine, testing the application ...) the user wants to remove the failover, we would like him to be able to keep at least one DB without the need of dropping/recreating it to minimize the down time. Of course, in a real production scenario, the user won't probably change often its configuration, but I think this is really useful to have this kind of flexibility when taking control of the application. Sylvain
On 16 June 2015 at 18:40, Sylvain MARECHAL <marechal.sylvain2@gmail.com> wrote: > Nothing special in the logs here. What I would like to emphasize is that a > single node can not live alone. Yes, that's right. "Single node BDR", i.e. running with BDR enabled but no peers, would be nice to support for testing and robustness. It's just not a key priority at this point. >>> -- Now detach the group, in the hope to create some table >>> -- But this does not work. DDL are still forbidden >>> test=# select bdr.bdr_part_by_node_names('{node1}'); >> >> You shouldn't part a node from its self. The next revision will >> prevent this with an error. > > > Ok, this was not clear for me. Or anyone else, hence the coming docs and code changes. > Ok, I will do this as a workaround. > But having a function doing the detach() properly would be really nice. We have a very long list of "would be nice"s, but have to focus on things that are core requirements for functionality or for specific customer needs. At this point this comes under neither category, so I don't anticipate working on it soon. If you're interested in getting into the codebase it would be an interesting and manageable project, though... > Of course, in a real production scenario, the user won't probably change > often its configuration, but I think this is really useful to have this kind > of flexibility when taking control of the application. Absolutely. The trouble is that all such things have trade-offs. For example, with the ability to re-attach a node that you asked about, doing so can't be done without accumulating lots of upstream WAL. It'd be effectively identical to just shutting down the node then starting it up again - with all the same costs and downsides. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
>> You shouldn't part a node from its self. The next revision will
>> prevent this with an error.
>
>
> Ok, this was not clear for me.
Or anyone else, hence the coming docs and code changes.
Suppose I have 2 nodes, 'node1' and 'node2'. So I am OK to call bdr_part_by_node_names('node2') on node1 to detach node2. But then, is it safe to call bdr_part_by_node_names('node1') to do the cleanup on node2, or should I better call pg_drop_replication_slot('...') instead?
Thanks and Regards,
Sylvain
On 26 June 2015 at 04:59, Sylvain MARECHAL <marechal.sylvain2@gmail.com> wrote: > >> >> You shouldn't part a node from its self. The next revision will >> >> prevent this with an error. >> > >> > >> > Ok, this was not clear for me. >> >> Or anyone else, hence the coming docs and code changes. >> > Sorry to bother again about that, but what about the "detached node cleanup" > best practice? > > Suppose I have 2 nodes, 'node1' and 'node2'. So I am OK to call > bdr_part_by_node_names('node2') on node1 to detach node2. But then, is it > safe to call bdr_part_by_node_names('node1') to do the cleanup on node2, or > should I better call pg_drop_replication_slot('...') instead? BDR 0.9.1 drops the slot on remaining nodes correctly on node part, so no further action is required there. Just part_by_node_names the node you are removing, running the query from one of the nodes you wish to retain. (0.9.1 enforces this). If you want to retain the node you're removing as a standalone, it should be sufficient to drop any slots on that node, truncate bdr.bdr_nodes and bdr.bdr_connections, and remove the bdr security label with: SECURITY LABEL FOR bdr ON DATABASE thedb IS NULL; then restart the DB. You may also wish to remove 'bdr' from 'shared_preload_libraries'. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Le 26/06/2015 03:26, Craig Ringer a écrit : > [...] >> Sorry to bother again about that, but what about the "detached node cleanup" >> best practice? >> >> Suppose I have 2 nodes, 'node1' and 'node2'. So I am OK to call >> bdr_part_by_node_names('node2') on node1 to detach node2. But then, is it >> safe to call bdr_part_by_node_names('node1') to do the cleanup on node2, or >> should I better call pg_drop_replication_slot('...') instead? > BDR 0.9.1 drops the slot on remaining nodes correctly on node part, so > no further action is required there. Just part_by_node_names the node > you are removing, running the query from one of the nodes you wish to > retain. (0.9.1 enforces this). > > If you want to retain the node you're removing as a standalone, it > should be sufficient to drop any slots on that node, truncate > bdr.bdr_nodes and bdr.bdr_connections, and remove the bdr security > label with: > > SECURITY LABEL FOR bdr ON DATABASE thedb IS NULL; > > then restart the DB. You may also wish to remove 'bdr' from > 'shared_preload_libraries'. > Thanks a lot, Sylvain