Strange replication problem - segment restored from archive but still requested from master - Mailing list pgsql-general
From | Piotr Gasidło |
---|---|
Subject | Strange replication problem - segment restored from archive but still requested from master |
Date | |
Msg-id | CAF8akQtSgY+K4Z24qmBbRjZBEN9q+C8rGenq36KqmUGRTBqLhQ@mail.gmail.com Whole thread Raw |
Responses |
Re: Strange replication problem - segment restored from
archive but still requested from master
Re: Strange replication problem - segment restored from archive but still requested from master |
List | pgsql-general |
Got strange problem. Unable to repeat, but got logs. Simple master-slave using streaming replication. Master is running. Slave is down. Segment 0000000400004C4D00000090 was successfully archived and send from master to slave. Now I've started slave, and: ay 21 21:23:37 d8 postgres[50645]: [3-1] 2015-05-21 21:23:37.033 CEST @ 50645 LOG: database system was shut down in recovery at 2015-05-21 21:22:03 CEST May 21 21:23:37 d8 postgres[50645]: [4-1] 2015-05-21 21:23:37.034 CEST @ 50645 LOG: entering standby mode May 21 21:23:37 d8 postgres[50645]: [5-1] 2015-05-21 21:23:37.058 CEST @ 50645 LOG: restored log file "0000000400004C4D00000088" from archive May 21 21:23:37 d8 postgres[50645]: [6-1] 2015-05-21 21:23:37.120 CEST @ 50645 LOG: redo starts at 4C4D/88493B50 May 21 21:23:37 d8 postgres[50645]: [7-1] 2015-05-21 21:23:37.226 CEST @ 50645 LOG: restored log file "0000000400004C4D00000089" from archive May 21 21:23:37 d8 postgres[50645]: [8-1] 2015-05-21 21:23:37.426 CEST @ 50645 LOG: restored log file "0000000400004C4D0000008A" from archive May 21 21:23:37 d8 postgres[50645]: [9-1] 2015-05-21 21:23:37.750 CEST @ 50645 LOG: restored log file "0000000400004C4D0000008B" from archive May 21 21:23:38 d8 postgres[50645]: [10-1] 2015-05-21 21:23:38.376 CEST @ 50645 LOG: restored log file "0000000400004C4D0000008C" from archive May 21 21:23:38 d8 postgres[50645]: [11-1] 2015-05-21 21:23:38.690 CEST @ 50645 LOG: restored log file "0000000400004C4D0000008D" from archive May 21 21:23:38 d8 postgres[50645]: [12-1] 2015-05-21 21:23:38.855 CEST @ 50645 LOG: restored log file "0000000400004C4D0000008E" from archive May 21 21:23:39 d8 postgres[50645]: [13-1] 2015-05-21 21:23:39.275 CEST @ 50645 LOG: restored log file "0000000400004C4D0000008F" from archive May 21 21:23:39 d8 postgres[50645]: [14-1] 2015-05-21 21:23:39.654 CEST @ 50645 LOG: restored log file "0000000400004C4D00000090" from archive May 21 21:23:40 d8 postgres[50645]: [15-1] 2015-05-21 21:23:40.222 CEST @ 50645 LOG: consistent recovery state reached at 4C4D/90FFF9C8 May 21 21:23:40 d8 postgres[50644]: [3-1] 2015-05-21 21:23:40.222 CEST @ 50644 LOG: database system is ready to accept read only connections May 21 21:23:40 d8 postgres[50645]: [16-1] 2015-05-21 21:23:40.223 CEST @ 50645 LOG: unexpected pageaddr 4C46/E000000 in log segment 0000000400004C4D00000091, offset 0 May 21 21:23:40 d8 postgres[50699]: [4-1] 2015-05-21 21:23:40.232 CEST @ 50699 LOG: started streaming WAL from primary at 4C4D/90000000 on timeline 4 May 21 21:23:40 d8 postgres[50699]: [5-1] 2015-05-21 21:23:40.232 CEST @ 50699 FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 0000000400004C4D00000090 has already been removed May 21 21:23:40 d8 postgres[50645]: [17-1] 2015-05-21 21:23:40.255 CEST @ 50645 LOG: restored log file "0000000400004C4D00000090" from archive May 21 21:23:40 d8 postgres[50703]: [4-1] 2015-05-21 21:23:40.268 CEST @ 50703 LOG: started streaming WAL from primary at 4C4D/90000000 on timeline 4 May 21 21:23:40 d8 postgres[50703]: [5-1] 2015-05-21 21:23:40.268 CEST @ 50703 FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 0000000400004C4D00000090 has already been removed May 21 21:23:40 d8 postgres[50703]: [5-2] ... (and so on) So, as I understand: - slave was started and entered restore, - slave restored 0000000400004C4D00000090 from archive, reached consistent recovery state - now, it connected to master and noticed, that, it has 0000000400004C4D00000091 segment uncomplete - and then, started yelling about missing segment (master deleted it already after archiving) on master Why? I fixed it by stopping slave, doing some operations on master, copying 0000000400004C4D00000091 and 0000000400004C4D00000092 from master, and staring the slave: May 21 21:24:56 d8 postgres[50644]: [4-1] 2015-05-21 21:24:56.160 CEST @ 50644 LOG: received fast shutdown request May 21 21:24:56 d8 postgres[50644]: [5-1] 2015-05-21 21:24:56.160 CEST @ 50644 LOG: aborting any active transactions May 21 21:24:56 d8 postgres[50657]: [3-1] 2015-05-21 21:24:56.162 CEST @ 50657 LOG: shutting down May 21 21:24:56 d8 postgres[50657]: [4-1] 2015-05-21 21:24:56.174 CEST @ 50657 LOG: database system is shut down May 21 21:28:54 d8 postgres[53574]: [1-1] 2015-05-21 21:28:54.781 CEST @ 53574 LOG: loaded library "pg_stat_statements" May 21 21:28:55 d8 postgres[53574]: [2-1] 2015-05-21 21:28:55.123 CEST @ 53574 LOG: ending log output to stderr May 21 21:28:55 d8 postgres[53574]: [2-2] 2015-05-21 21:28:55.123 CEST @ 53574 HINT: Future log output will go to log destination "syslog". May 21 21:28:55 d8 postgres[53575]: [3-1] 2015-05-21 21:28:55.124 CEST @ 53575 LOG: database system was shut down in recovery at 2015-05-21 21:24:56 CEST May 21 21:28:55 d8 postgres[53575]: [4-1] 2015-05-21 21:28:55.126 CEST @ 53575 LOG: entering standby mode May 21 21:28:55 d8 postgres[53575]: [5-1] 2015-05-21 21:28:55.150 CEST @ 53575 LOG: restored log file "0000000400004C4D00000088" from archive May 21 21:28:55 d8 postgres[53575]: [6-1] 2015-05-21 21:28:55.209 CEST @ 53575 LOG: redo starts at 4C4D/88493B50 May 21 21:28:55 d8 postgres[53575]: [7-1] 2015-05-21 21:28:55.314 CEST @ 53575 LOG: restored log file "0000000400004C4D00000089" from archive May 21 21:28:55 d8 postgres[53575]: [8-1] 2015-05-21 21:28:55.512 CEST @ 53575 LOG: restored log file "0000000400004C4D0000008A" from archive May 21 21:28:55 d8 postgres[53575]: [9-1] 2015-05-21 21:28:55.831 CEST @ 53575 LOG: restored log file "0000000400004C4D0000008B" from archive May 21 21:28:56 d8 postgres[53575]: [10-1] 2015-05-21 21:28:56.508 CEST @ 53575 LOG: restored log file "0000000400004C4D0000008C" from archive May 21 21:28:56 d8 postgres[53575]: [11-1] 2015-05-21 21:28:56.818 CEST @ 53575 LOG: restored log file "0000000400004C4D0000008D" from archive May 21 21:28:56 d8 postgres[53575]: [12-1] 2015-05-21 21:28:56.982 CEST @ 53575 LOG: restored log file "0000000400004C4D0000008E" from archive May 21 21:28:57 d8 postgres[53575]: [13-1] 2015-05-21 21:28:57.415 CEST @ 53575 LOG: restored log file "0000000400004C4D0000008F" from archive May 21 21:28:57 d8 postgres[53575]: [14-1] 2015-05-21 21:28:57.790 CEST @ 53575 LOG: restored log file "0000000400004C4D00000090" from archive May 21 21:28:58 d8 postgres[53575]: [15-1] 2015-05-21 21:28:58.328 CEST @ 53575 LOG: consistent recovery state reached at 4C4D/90FFF9C8 May 21 21:28:58 d8 postgres[53574]: [3-1] 2015-05-21 21:28:58.328 CEST @ 53574 LOG: database system is ready to accept read only connections May 21 21:28:58 d8 postgres[53575]: [16-1] 2015-05-21 21:28:58.349 CEST @ 53575 LOG: restored log file "0000000400004C4D00000091" from archive May 21 21:28:59 d8 postgres[53575]: [17-1] 2015-05-21 21:28:59.962 CEST @ 53575 LOG: restored log file "0000000400004C4D00000092" from archive May 21 21:29:00 d8 postgres[53575]: [18-1] 2015-05-21 21:29:00.037 CEST @ 53575 LOG: unexpected pageaddr 4C45/DC000000 in log segment 0000000400004C4D00000093, offset 0 May 21 21:29:00 d8 postgres[53640]: [4-1] 2015-05-21 21:29:00.047 CEST @ 53640 LOG: started streaming WAL from primary at 4C4D/92000000 on timeline 4 What was wrong? Postgresql 9.3.6 on slave and 9.3.4 on master. Unable to retest if master was also 9.3.6. -- Piotr Gasidło
pgsql-general by date: