Thread: BDR: no free replication state could be found
Hi, I am testing BDR functionality with Postgres 9.4. I went through the bdrdemo example with a 3 node cluster and then tried to set up my own db.
My "max_replication_slots" is set to 6. After removing the bdrdemo db I am having trouble starting up the postgres instance unless I increase the value of "max_replication_slots". I get the following error in the log:
"starting up replication identifier with ckpt at 0/28E8250",,,,,,,,,""
"recovered replication state of node 1 to 0/54DDCD0",,,,,,,,,""
"recovered replication state of node 2 to 0/1ECBEA0",,,,,,,,,""
"recovered replication state of node 3 to 0/59FB1C0",,,,,,,,,""
"recovered replication state of node 4 to 0/2AA5320",,,,,,,,,""
"recovered replication state of node 5 to 0/27F2F98",,,,,,,,,""
"recovered replication state of node 6 to 0/59F35A8",,,,,,,,,""
"no free replication state could be found, increase max_replication_slots",,,,,,,,,""
pg_replication_slots is only reporting two slots:
postgres=# SELECT * FROM pg_catalog.pg_replication_slots;
                slot_name                | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
-----------------------------------------+--------+-----------+--------+----------+--------+------+--------------+-------------
 bdr_19685_6199712740068695651_1_18817__ | bdr    | logical   |  19685 | deliver  | t      |      |         2280 | 0/28EA5E0
 bdr_19685_6197393155020108291_1_48609__ | bdr    | logical   |  19685 | deliver  | t      |      |         2280 | 0/28EA5E0
How can I get rid of the stale node recovery on startup?
Thanks
-Selim
On 9 October 2015 at 06:54, Selim Tuvi <stuvi@ilm.com> wrote:
> How can I get rid of the stale node recovery on startup?

Can you show the output of

select * from pg_replication_identifier;

please? On all nodes. Also pg_catalog.pg_replication_slots on the other nodes.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
node: deliver_sing (the problem node):

postgres=# SELECT * FROM pg_catalog.pg_replication_identifier;
 riident |                 riname
---------+----------------------------------------
       1 | bdr_6197393155020108291_1_47458_16385_
       2 | bdr_6199712740068695651_1_16385_16385_
       3 | bdr_6197393155020108291_1_47458_17167_
       4 | bdr_6199712740068695651_1_16385_17167_
       5 | bdr_6199712740068695651_1_18817_17951_
       6 | bdr_6197393155020108291_1_48609_17951_
       7 | bdr_6197393155020108291_1_48609_19685_
       8 | bdr_6199712740068695651_1_18817_19685_
(8 rows)

postgres=# SELECT * FROM pg_catalog.pg_replication_slots;
                slot_name                | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
-----------------------------------------+--------+-----------+--------+----------+--------+------+--------------+-------------
 bdr_19685_6199712740068695651_1_18817__ | bdr    | logical   |  19685 | deliver  | t      |      |         2299 | 0/290AB88
 bdr_19685_6197393155020108291_1_48609__ | bdr    | logical   |  19685 | deliver  | t      |      |         2299 | 0/290AB88
(2 rows)

node: deliver_sf:

postgres=# SELECT * FROM pg_catalog.pg_replication_identifier;
 riident |                 riname
---------+----------------------------------------
       1 | bdr_6199712740068695651_1_16385_47458_
       2 | bdr_6199711219508308907_1_17167_47458_
       3 | bdr_6199712740068695651_1_18817_48609_
       4 | bdr_6199711219508308907_1_17951_48609_
       5 | bdr_6199711219508308907_1_19685_48609_
(5 rows)

postgres=# SELECT * FROM pg_catalog.pg_replication_slots;
                slot_name                | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
-----------------------------------------+--------+-----------+--------+----------+--------+------+--------------+-------------
 bdr_48609_6199712740068695651_1_18817__ | bdr    | logical   |  48609 | deliver  | t      |      |         4744 | 0/5BC0DF0
 bdr_48609_6199711219508308907_1_19685__ | bdr    | logical   |  48609 | deliver  | t      |      |         4744 | 0/5BC0DF0
(2 rows)

node: deliver_lon:

postgres=# SELECT * FROM pg_catalog.pg_replication_identifier;
 riident |                 riname
---------+----------------------------------------
       1 | bdr_6197393155020108291_1_47458_16385_
       2 | bdr_6199711219508308907_1_17167_16385_
       3 | bdr_6199711219508308907_1_17951_17173_
       4 | bdr_6199711219508308907_1_17951_18817_
       5 | bdr_6197393155020108291_1_48609_18817_
       6 | bdr_6199711219508308907_1_19685_18817_
(6 rows)

postgres=# SELECT * FROM pg_catalog.pg_replication_slots;
                slot_name                | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
-----------------------------------------+--------+-----------+--------+----------+--------+------+--------------+-------------
 bdr_18817_6199711219508308907_1_19685__ | bdr    | logical   |  18817 | deliver  | t      |      |         2217 | 0/2B04738
 bdr_18817_6197393155020108291_1_48609__ | bdr    | logical   |  18817 | deliver  | t      |      |         2217 | 0/2B04738
(2 rows)

Thanks
-Selim
On 10 October 2015 at 02:53, Selim Tuvi <stuvi@ilm.com> wrote:
> node: deliver_sing (the problem node):
>
> postgres=# SELECT * FROM pg_catalog.pg_replication_identifier;
>  riident |                 riname
> ---------+----------------------------------------
>        1 | bdr_6197393155020108291_1_47458_16385_
>        2 | bdr_6199712740068695651_1_16385_16385_
>        3 | bdr_6197393155020108291_1_47458_17167_
>        4 | bdr_6199712740068695651_1_16385_17167_
>        5 | bdr_6199712740068695651_1_18817_17951_
>        6 | bdr_6197393155020108291_1_48609_17951_
>        7 | bdr_6197393155020108291_1_48609_19685_
>        8 | bdr_6199712740068695651_1_18817_19685_
> (8 rows)

> On 9 October 2015 at 06:54, Selim Tuvi <stuvi@ilm.com> wrote:
>> "recovered replication state of node 6 to 0/59F35A8",,,,,,,,,""
>> "no free replication state could be found, increase max_replication_slots",,,,,,,,,""

The number of supported replication identifiers (in BDR 9.4) is controlled by max_replication_slots, hence the error message. Here there are 8 identifiers but max_replication_slots is 6, so startup fails after recovering the state of the sixth. This should be documented; I'll amend the docs appropriately.

https://github.com/2ndQuadrant/bdr/issues/133

The identifiers aren't currently dropped during node part, which should be changed. It hasn't come up so far because frequent node addition and removal is something to be avoided, and because most deployments configure room for more slots than needed to avoid future restarts.

https://github.com/2ndQuadrant/bdr/issues/134

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
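
To make the mismatch concrete, here is a minimal Python sketch (not part of the original thread) that takes the identifier names posted for deliver_sing and reports which ones point at a database that no longer has a replication slot. It assumes the last numeric field of each riname is the local database OID. That layout is only inferred from comparing the riname values with the datoid column in the posted output and is not confirmed anywhere in the thread. The function names are hypothetical. The sketch only reports; it does not drop anything, and the thread does not establish a safe cleanup procedure.

```python
# Values copied from the thread (node deliver_sing).
MAX_REPLICATION_SLOTS = 6      # setting reported by the poster
LIVE_DATOIDS = {19685}         # datoid column of pg_replication_slots

RINAMES = [
    "bdr_6197393155020108291_1_47458_16385_",
    "bdr_6199712740068695651_1_16385_16385_",
    "bdr_6197393155020108291_1_47458_17167_",
    "bdr_6199712740068695651_1_16385_17167_",
    "bdr_6199712740068695651_1_18817_17951_",
    "bdr_6197393155020108291_1_48609_17951_",
    "bdr_6197393155020108291_1_48609_19685_",
    "bdr_6199712740068695651_1_18817_19685_",
]


def local_dboid(riname):
    """Return the trailing numeric field of an identifier name.

    ASSUMPTION: names look like 'bdr_<sysid>_<timeline>_<dboid>_<dboid>_'
    and the second OID is the local database. Inferred from the thread's
    output only.
    """
    parts = riname.split("_")
    # Expected pieces: ['bdr', sysid, timeline, oid, oid, '']
    if len(parts) != 6 or parts[0] != "bdr" or parts[5] != "":
        raise ValueError(f"unexpected identifier name: {riname!r}")
    return int(parts[4])


def stale_identifiers(rinames, live_datoids):
    """Identifiers whose local database OID has no current slot."""
    return [name for name in rinames if local_dboid(name) not in live_datoids]


stale = stale_identifiers(RINAMES, LIVE_DATOIDS)
print(f"{len(RINAMES)} identifiers, max_replication_slots = {MAX_REPLICATION_SLOTS}")
print(f"{len(stale)} identifiers reference a database with no current slot:")
for name in stale:
    print("  ", name)
```

Under that assumption, six of the eight identifiers on deliver_sing belong to databases that were dropped (OIDs 16385, 17167, 17951), which would explain why startup needs more than six replication states even though only two slots are listed.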