Thread: Error promoting slave on cascading replication using replication slots
Hi,

I'm configuring a cascading replication environment with replication slots, but I run into a problem when the master goes down and I promote a slave. All servers start from a cluster created from scratch, with default configuration options.

The process I'm using to set up the cascading replication is:

1 - On master:
    wal_level = hot_standby
    max_wal_senders = 3
    max_replication_slots = 3
    hot_standby = on

2 - On slave1:
    Stop the server
    Apply the same configuration as above
    Erase the old cluster
    Run pg_basebackup -v -P -R -X stream -c fast -h IP -U postgres -D PGDATA

3 - On master:
    pg_create_physical_replication_slot('NAME')

4 - On slave1:
    Add primary_slot_name to recovery.conf
    Start the cluster

Everything runs smoothly, according to "SELECT * FROM pg_stat_replication" and "SELECT * FROM pg_replication_slots". Steps 2, 3 and 4 are then repeated on slave2, which points to slave1.

The problem happens when I stop the master and run pg_ctl -D /var/lib/postgresql/9.4/main promote on slave1. At that point, slave2 logs the following and stops receiving WAL through the replication slot:

    2015-12-17 11:23:06 BRST [944-2] LOG: replication terminated by primary server
    2015-12-17 11:23:06 BRST [944-3] DETAIL: End of WAL reached on timeline 1 at 0/30001A0.
    2015-12-17 11:23:06 BRST [944-4] LOG: fetching timeline history file for timeline 2 from primary server
    2015-12-17 11:23:06 BRST [937-7] LOG: record with zero length at 0/30001A0
    2015-12-17 11:23:06 BRST [944-5] LOG: restarted WAL streaming at 0/3000000 on timeline 1
    2015-12-17 11:23:06 BRST [944-6] LOG: replication terminated by primary server
    2015-12-17 11:23:06 BRST [944-7] DETAIL: End of WAL reached on timeline 1 at 0/30001A0.
    2015-12-17 11:23:11 BRST [944-8] LOG: restarted WAL streaming at 0/3000000 on timeline 1
    2015-12-17 11:23:11 BRST [944-9] LOG: replication terminated by primary server

I found an instruction to add the following line to recovery.conf:

    recovery_target_timeline = 'latest'

When this line is added, slave2 keeps replicating from slave1:

    2015-12-17 13:37:54 BRST [868-2] LOG: replication terminated by primary server
    2015-12-17 13:37:54 BRST [868-3] DETAIL: End of WAL reached on timeline 1 at 0/3001358.
    2015-12-17 13:37:54 BRST [868-4] LOG: fetching timeline history file for timeline 2 from primary server
    2015-12-17 13:37:54 BRST [863-7] LOG: new target timeline is 2
    2015-12-17 13:37:54 BRST [863-8] LOG: record with zero length at 0/3001358
    2015-12-17 13:37:54 BRST [868-5] LOG: restarted WAL streaming at 0/3000000 on timeline 2

My question is: is this the right procedure, or am I missing something?

Best regards,
--
Álvaro Nunes Melo
Atua Sistemas de Informação
alvaro@atua.com.br
http://www.atua.com.br
(54) 9976-0106
(54) 3045-8100
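The two log excerpts above differ in which timeline slave2 resumes streaming on: it stays on timeline 1 in the failing case and moves to timeline 2 in the working one. As a rough sketch (the function name last_streaming_timeline is made up for illustration, and it assumes the log message wording shown in this thread), a small shell helper can pull that number out of a standby's log:

```shell
#!/bin/sh
# Hypothetical helper: read a standby's server log on stdin and print the
# timeline from the most recent "restarted WAL streaming" message.
# Prints nothing if no such message is present.
last_streaming_timeline() {
    # Keep only the timeline number from matching lines, then take the last one.
    sed -n 's|.*restarted WAL streaming at [0-9A-F/]* on timeline \([0-9][0-9]*\).*|\1|p' \
        | tail -n 1
}
```

Feeding it slave2's log after promoting slave1 should print the new timeline if the standby followed the switch, and the old one if it did not. Where the log file lives depends on the installation.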
Re: Error promoting slave on cascading replication using replication slots
From: Andreas Kretschmer
Alvaro Melo <al_nunes@atua.com.br> wrote:
>
> I found a instruction to add the following line to recovery.conf:
> recovery_target_timeline = 'latest'
>
> When this line is added, slave2 keeps its replication with slave 1:
> 2015-12-17 13:37:54 BRST [868-2] LOG: replication terminated by primary server
> 2015-12-17 13:37:54 BRST [868-3] DETAIL: End of WAL reached on timeline 1 at 0/3001358.
> 2015-12-17 13:37:54 BRST [868-4] LOG: fetching timeline history file for timeline 2 from primary server
> 2015-12-17 13:37:54 BRST [863-7] LOG: new target timeline is 2
> 2015-12-17 13:37:54 BRST [863-8] LOG: record with zero length at 0/3001358
> 2015-12-17 13:37:54 BRST [868-5] LOG: restarted WAL streaming at 0/3000000 on timeline 2
>
> My question is: is this the right procedure, or am I missing something?

Yeah, this is the right procedure, afaik. Slave2 is now in sync with the new timeline, everything is okay.

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
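For reference, the resulting recovery.conf on slave2 could look roughly like what the sketch below prints. This is only an illustration under assumptions: the function name print_recovery_conf, the host address and the slot name are placeholders, and the primary_conninfo that pg_basebackup -R actually writes usually carries more fields than shown here.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal recovery.conf (PostgreSQL 9.4 style)
# for a standby that streams from an upstream server through a replication slot.
#   $1 = upstream host (placeholder value supplied by the caller)
#   $2 = name of the physical replication slot created on the upstream
print_recovery_conf() {
    upstream_host=$1
    slot_name=$2
    cat <<EOF
standby_mode = 'on'
primary_conninfo = 'host=${upstream_host} user=postgres'
primary_slot_name = '${slot_name}'
recovery_target_timeline = 'latest'
EOF
}
```

For example, print_recovery_conf 192.0.2.11 slave2_slot prints the four settings for a standby pointing at 192.0.2.11. The last line is the one discussed in this thread: it lets the standby follow the new timeline after its upstream is promoted.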