Thread: Re: [GENERAL] Strange replication problem - segment restored from archive but still requested from master

On Fri, May 22, 2015 at 5:52 AM, Piotr Gasidło <quaker@barbara.eu.org> wrote:
> Got strange problem. Unable to repeat, but got logs.
>
> Simple master-slave using streaming replication.
> Master is running. Slave is down.
> Segment 0000000400004C4D00000090 was successfully archived and send
> from master to slave.
>
> Now I've started slave, and:
>
> ay 21 21:23:37 d8 postgres[50645]: [3-1] 2015-05-21 21:23:37.033 CEST
> @ 50645   LOG:  database system was shut down in recovery at
> 2015-05-21 21:22:03 CEST
> May 21 21:23:37 d8 postgres[50645]: [4-1] 2015-05-21 21:23:37.034 CEST
> @ 50645   LOG:  entering standby mode
> May 21 21:23:37 d8 postgres[50645]: [5-1] 2015-05-21 21:23:37.058 CEST
> @ 50645   LOG:  restored log file "0000000400004C4D00000088" from
> archive
> May 21 21:23:37 d8 postgres[50645]: [6-1] 2015-05-21 21:23:37.120 CEST
> @ 50645   LOG:  redo starts at 4C4D/88493B50
> May 21 21:23:37 d8 postgres[50645]: [7-1] 2015-05-21 21:23:37.226 CEST
> @ 50645   LOG:  restored log file "0000000400004C4D00000089" from
> archive
> May 21 21:23:37 d8 postgres[50645]: [8-1] 2015-05-21 21:23:37.426 CEST
> @ 50645   LOG:  restored log file "0000000400004C4D0000008A" from
> archive
> May 21 21:23:37 d8 postgres[50645]: [9-1] 2015-05-21 21:23:37.750 CEST
> @ 50645   LOG:  restored log file "0000000400004C4D0000008B" from
> archive
> May 21 21:23:38 d8 postgres[50645]: [10-1] 2015-05-21 21:23:38.376
> CEST @ 50645   LOG:  restored log file "0000000400004C4D0000008C" from
> archive
> May 21 21:23:38 d8 postgres[50645]: [11-1] 2015-05-21 21:23:38.690
> CEST @ 50645   LOG:  restored log file "0000000400004C4D0000008D" from
> archive
> May 21 21:23:38 d8 postgres[50645]: [12-1] 2015-05-21 21:23:38.855
> CEST @ 50645   LOG:  restored log file "0000000400004C4D0000008E" from
> archive
> May 21 21:23:39 d8 postgres[50645]: [13-1] 2015-05-21 21:23:39.275
> CEST @ 50645   LOG:  restored log file "0000000400004C4D0000008F" from
> archive
> May 21 21:23:39 d8 postgres[50645]: [14-1] 2015-05-21 21:23:39.654
> CEST @ 50645   LOG:  restored log file "0000000400004C4D00000090" from
> archive
> May 21 21:23:40 d8 postgres[50645]: [15-1] 2015-05-21 21:23:40.222
> CEST @ 50645   LOG:  consistent recovery state reached at
> 4C4D/90FFF9C8
> May 21 21:23:40 d8 postgres[50644]: [3-1] 2015-05-21 21:23:40.222 CEST
> @ 50644   LOG:  database system is ready to accept read only
> connections
> May 21 21:23:40 d8 postgres[50645]: [16-1] 2015-05-21 21:23:40.223
> CEST @ 50645   LOG:  unexpected pageaddr 4C46/E000000 in log segment
> 0000000400004C4D00000091, offset 0
> May 21 21:23:40 d8 postgres[50699]: [4-1] 2015-05-21 21:23:40.232 CEST
> @ 50699   LOG:  started streaming WAL from primary at 4C4D/90000000 on
> timeline 4
> May 21 21:23:40 d8 postgres[50699]: [5-1] 2015-05-21 21:23:40.232 CEST
> @ 50699   FATAL:  could not receive data from WAL stream: ERROR:
> requested WAL segment 0000000400004C4D00000090 has already been
> removed
> May 21 21:23:40 d8 postgres[50645]: [17-1] 2015-05-21 21:23:40.255
> CEST @ 50645   LOG:  restored log file "0000000400004C4D00000090" from
> archive
> May 21 21:23:40 d8 postgres[50703]: [4-1] 2015-05-21 21:23:40.268 CEST
> @ 50703   LOG:  started streaming WAL from primary at 4C4D/90000000 on
> timeline 4
> May 21 21:23:40 d8 postgres[50703]: [5-1] 2015-05-21 21:23:40.268 CEST
> @ 50703   FATAL:  could not receive data from WAL stream: ERROR:
> requested WAL segment 0000000400004C4D00000090 has already been
> removed
> May 21 21:23:40 d8 postgres[50703]: [5-2]
> ...
> (and so on)
>
> So, as I understand:
> - slave was started and entered restore,
> - slave restored 0000000400004C4D00000090 from archive, reached
> consistent recovery state
> - now, it connected to master and noticed, that, it has
> 0000000400004C4D00000091 segment uncomplete
> - and then, started yelling about missing segment (master deleted it
> already after archiving) on master
>
> Why?

Thanks for the report! This seems to be a bug.

This problem happens when WAL record is stored in separate two WAL files and
there is no valid latter WAL file in the standby. In your case, the former file
is 0000000400004C4D00000090 and the latter is 0000000400004C4D00000091.

In this case, the first half of WAL record can be read from the former WAL file,
but the remaining half not because no valid latter file exists in the standby.
Then the standby tries to retrieve the latter WAL file via replication.
The problem here is that the standby tries to start the replication from the
starting point of WAL record, i.e., that's the location of the former WAL file.
So the already-read WAL file is requested via replication.

To address this problem, maybe we can just use the location where the remaining
half of WAL record starts from as the replication starting location. We would
need to look at carefully the side effect of that change, though.

Regards,

--
Fujii Masao


2015-05-22 6:55 GMT+02:00 Fujii Masao <masao.fujii@gmail.com>:
> Thanks for the report! This seems to be a bug.
>
> This problem happens when WAL record is stored in separate two WAL files and
> there is no valid latter WAL file in the standby. In your case, the former file
> is 0000000400004C4D00000090 and the latter is 0000000400004C4D00000091.

I've got bot files, when I've pg_xdumplog-ed them:

$ cat dump1
0000000400004C4D00000090.dump
...
rmgr: Heap2       len (rec/tot):     24/  7772, tx:          0, lsn:
4C4D/90FF6DB8, prev 4C4D/90FF51E8, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13843 remxid 472083193
rmgr: Heap2       len (rec/tot):    504/   536, tx:          0, lsn:
4C4D/90FF8C30, prev 4C4D/90FF6DB8, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13843; cutoff xid 422325322 ntuples 40
rmgr: Heap2       len (rec/tot):     24/  5632, tx:          0, lsn:
4C4D/90FF8E48, prev 4C4D/90FF8C30, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13844; cutoff xid 422325322 ntuples 30
rmgr: Heap2       len (rec/tot):     24/  7528, tx:          0, lsn:
4C4D/90FFA460, prev 4C4D/90FF8E48, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13845 remxid 472083193
rmgr: Heap2       len (rec/tot):    564/   596, tx:          0, lsn:
4C4D/90FFC1E0, prev 4C4D/90FFA460, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13845; cutoff xid 422325322 ntuples 45
rmgr: Heap2       len (rec/tot):     24/  5316, tx:          0, lsn:
4C4D/90FFC438, prev 4C4D/90FFC1E0, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13846 remxid 472083193
rmgr: Heap2       len (rec/tot):    420/   452, tx:          0, lsn:
4C4D/90FFD900, prev 4C4D/90FFC438, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13846; cutoff xid 422325322 ntuples 33
rmgr: Heap2       len (rec/tot):     24/  7856, tx:          0, lsn:
4C4D/90FFDAC8, prev 4C4D/90FFD900, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13847; cutoff xid 422325322 ntuples 53
rmgr: Heap2       len (rec/tot):     20/    52, tx:          0, lsn:
4C4D/90FFF990, prev 4C4D/90FFDAC8, bkp: 0000, desc: visible: rel
1663/131438/10079072; blk 13847
0000000400004C4D00000091.dump
rmgr: Heap2       len (rec/tot):    600/   632, tx:          0, lsn:
4C4D/910014A0, prev 4C4D/90FFF9C8, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13848; cutoff xid 422325322 ntuples 48
rmgr: Heap2       len (rec/tot):     24/  6640, tx:          0, lsn:
4C4D/91001718, prev 4C4D/910014A0, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13849; cutoff xid 422325322 ntuples 42
rmgr: Heap2       len (rec/tot):     24/  7848, tx:          0, lsn:
4C4D/91003120, prev 4C4D/91001718, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13850; cutoff xid 422325322 ntuples 36
rmgr: Heap2       len (rec/tot):     20/    52, tx:          0, lsn:
4C4D/91004FE0, prev 4C4D/91003120, bkp: 0000, desc: visible: rel
1663/131438/10079072; blk 13850
rmgr: Heap2       len (rec/tot):     24/  6208, tx:          0, lsn:
4C4D/91005018, prev 4C4D/91004FE0, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13851 remxid 472083193
rmgr: Heap2       len (rec/tot):    564/   596, tx:          0, lsn:
4C4D/91006870, prev 4C4D/91005018, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13851; cutoff xid 422325322 ntuples 45
rmgr: Heap2       len (rec/tot):     24/  7340, tx:          0, lsn:
4C4D/91006AC8, prev 4C4D/91006870, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13852; cutoff xid 422325322 ntuples 39
rmgr: Heap2       len (rec/tot):     24/  7744, tx:          0, lsn:
4C4D/91008790, prev 4C4D/91006AC8, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13853 remxid 472083193
rmgr: Heap2       len (rec/tot):    492/   524, tx:          0, lsn:
4C4D/9100A5E8, prev 4C4D/91008790, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13853; cutoff xid 422325322 ntuples 39
rmgr: Heap2       len (rec/tot):     24/  7152, tx:          0, lsn:
4C4D/9100A7F8, prev 4C4D/9100A5E8, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13854; cutoff xid 422325322 ntuples 32
rmgr: Heap2       len (rec/tot):     24/  7380, tx:          0, lsn:
4C4D/9100C400, prev 4C4D/9100A7F8, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13855; cutoff xid 422325322 ntuples 42
rmgr: Heap2       len (rec/tot):     24/  6784, tx:          0, lsn:
4C4D/9100E0F0, prev 4C4D/9100C400, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13856 remxid 472083193
rmgr: Heap2       len (rec/tot):    408/   440, tx:          0, lsn:
4C4D/9100FB70, prev 4C4D/9100E0F0, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13856; cutoff xid 422325322 ntuples 32
rmgr: Heap2       len (rec/tot):     24/  7312, tx:          0, lsn:
4C4D/9100FD28, prev 4C4D/9100FB70, bkp: 1000, desc: clean: rel
1663/131438/10079072; blk 13857 remxid 472083193
rmgr: Heap2       len (rec/tot):    432/   464, tx:          0, lsn:
4C4D/910119D0, prev 4C4D/9100FD28, bkp: 0000, desc: freeze_page: rel
1663/131438/10079072; blk 13857; cutoff xid 422325322 ntuples 34
rmgr: Heap2       len (rec/tot):     24/  7920, tx:          0, lsn:
4C4D/91011BA0, prev 4C4D/910119D0, bkp: 1000, desc: freeze_page: rel
1663/131438/10079072; blk 13858; cutoff xid 422325322 ntuples 60
...

I this is not enought, I can attach both files.


--
Piotr Gasidło