Re: repmgr won't update witness after failover - Mailing list pgsql-general
From | Aviel Buskila |
---|---|
Subject | Re: repmgr won't update witness after failover |
Date | |
Msg-id | CAB3=tTGy2DCiceEC8=i9ZkYOTm_3OGx-Wf5HJtDKggoruHVj_g@mail.gmail.com |
In response to | Re: repmgr won't update witness after failover (Jony Cohen <jony.cohenjo@gmail.com>) |
List | pgsql-general |
Hey,
Thanks for the reply, this helped me very much.
Kind Regards,
Aviel Buskila.
Hi,
The clone command just clones the data from node2 to node1; you also need to register it with the `force` option to override the old record (as if you were building a new replica node).
see:
Regards,
- Jony
On Sun, Aug 16, 2015 at 3:19 PM, Aviel Buskila <aviel33@gmail.com> wrote:
Hey,
I think I know what the problem is.
After the first failover, when I clone the old master to be a standby with the 'repmgr standby clone' command, it seems that nothing updates the repl_nodes table with the new standby in my cluster, so on the next failover repmgrd fails to find a standby to fail over to.
I confirmed this by manually updating the repl_nodes table after the clone so that the old master is now recorded as a standby database.
Now my question is: where is it supposed to happen that, after I issue 'repmgr standby clone', repl_nodes is also updated with the new standby server?
Best regards,
Aviel Buskila
2015-08-16 12:11 GMT+03:00 Aviel Buskila <aviel33@gmail.com>:
hey,
I have tried to set the configuration all over again, now the status of 'repl_nodes' before the failover is:
 id |  type   | upstream_node_id |   cluster    | name  |                    conninfo                    | priority | active
----+---------+------------------+--------------+-------+------------------------------------------------+----------+--------
  1 | master  |                  | cluster_name | node1 | host=node1 dbname=repmgr port=5432 user=repmgr |      100 | t
  2 | standby |                1 | cluster_name | node2 | host=node2 dbname=repmgr port=5432 user=repmgr |      100 | t
  3 | witness |                  | cluster_name | node3 | host=node3 dbname=repmgr port=5499 user=repmgr |      100 | t
repmgr is started on node2 and node3 (standby and witness). Now, when I kill the postgres master process, I see the following messages in the
repmgrd log:
[WARNING] connection to master has been lost, trying to recover... 60 seconds before failover decision
[WARNING] connection to master has been lost, trying to recover... 50 seconds before failover decision
[WARNING] connection to master has been lost, trying to recover... 40 seconds before failover decision
[WARNING] connection to master has been lost, trying to recover... 30 seconds before failover decision
[WARNING] connection to master has been lost, trying to recover... 20 seconds before failover decision
[WARNING] connection to master has been lost, trying to recover... 10 seconds before failover decision
and then, when it tries to elect node2 for promotion, it shows the following messages:
[DEBUG] connecting to: 'host=node2 user=repmgr dbname=repmgr fallback_application_name='repmgr''
[WARNING] unable to determine a valid master server; waiting 10 seconds to retry...
[ERROR] unable to determine a valid master node, terminating...
[INFO] repmgrd terminating..
what am I doing wrong?
On 14/08/15 at 04:14, Aviel Buskila wrote:
> Hey,
> yes I did .. and still it won't fail back..
Can you send over the output of "repmgr cluster show" before and after
the failover process?
And the output of SELECT * FROM repmgr_schema.repl_nodes; after the failover
(replace repmgr_schema with the schema name you have configured).
Also, which version of repmgr are you running?
> 2015-08-13 16:23 GMT+03:00 Jony Vesterman Cohen <jony.cohenjo@gmail.com>:
>
>> Hi, did you make the old master follow the new one using repmgr?
>>
>> It doesn't update itself automatically...
>> From the looks of it repmgr thinks you have 2 masters - the old one
>> offline and the new one online.
Regards,
--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
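The re-registration step Jony describes in this thread (register the re-cloned old master with the `force` option so that its stale row in repl_nodes is overwritten) might look roughly like the sketch below. It assumes a repmgr release whose command line accepts `-f <config>` and `--force` ahead of `standby register`; the config path, the `DRY_RUN` switch, and the `run` and `reregister_standby` helpers are hypothetical names made up for this example and are not part of repmgr. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: re-register a re-cloned old master as a standby.
# REPMGR_CONF is an assumed path; adjust it to your installation.
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr/repmgr.conf}"
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: run on the old master after 'repmgr standby clone'
# has finished and PostgreSQL has been started there as a standby.
reregister_standby() {
    # Overwrite the node's old 'master' record in repl_nodes.
    run repmgr -f "$REPMGR_CONF" --force standby register
    # Check that the node is now listed as a standby.
    run repmgr -f "$REPMGR_CONF" cluster show
}

reregister_standby
```

Running it with `DRY_RUN=0` on the re-cloned node would execute the two commands for real; comparing `repmgr cluster show` before and after is the same check Martín asks for earlier in the thread.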