Re: Replication failure, slave requesting old segments - Mailing list pgsql-general

From	Phil Endecott
Subject	Re: Replication failure, slave requesting old segments
Date	August 13, 2018 01:25:38
Msg-id	1534101938762@dmwebmail.dmwebmail.chezphil.org
In response to	Re: Replication failure, slave requesting old segments (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses	Re: Replication failure, slave requesting old segments
List	pgsql-general

Hi Adrian,

Adrian Klaver wrote:
> On 08/11/2018 12:42 PM, Phil Endecott wrote:
>> Hi Adrian,
>> 
>> Adrian Klaver wrote:
>>> Looks like the master recycled the WAL's while the slave could not 
>>> connect.
>> 
>> Yes but... why is that a problem?  The master is copying the WALs to
>> the backup server using scp, where they remain forever.  The slave gets
>
> To me it looks like that did not happen:
>
> 2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file 
> "0000000100000007000000D0" from archive
> scp: backup/postgresql/archivedir/0000000100000007000000D1: No such file 
> or directory
> 2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from 
> primary at 7/D0000000 on timeline 1
> 2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from 
> WAL stream: ERROR:  requested WAL segment 0000000100000007000000D0 has 
> already been removed
>
> Above 0000000100000007000000D0 is gone/recycled on the master and the 
> archived version does not seem to be complete as the streaming 
> replication is trying to find it.

The files on the backup server were all 16 MB.
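
A size check like this can be scripted. The following is only a minimal sketch, assuming the default 16 MiB segment size and that archived segments carry plain 24-hex-digit names; the function name `wrong_size_segments` is hypothetical. A correct size does not prove that the contents are intact, it only rules out an obviously truncated copy.

```python
import re
from pathlib import Path

# Assumption: the cluster uses the default WAL segment size of 16 MiB.
SEGMENT_SIZE = 16 * 1024 * 1024

# Plain WAL segment names are 24 upper-case hex digits
# (timeline, log id, segment number; 8 digits each).
SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def wrong_size_segments(archive_dir, segment_size=SEGMENT_SIZE):
    """Return (name, size) for archived segments whose size is not segment_size."""
    bad = []
    for path in sorted(Path(archive_dir).iterdir()):
        # Skip anything that is not a plain segment file (history files, notes, ...).
        if path.is_file() and SEGMENT_NAME.match(path.name):
            size = path.stat().st_size
            if size != segment_size:
                bad.append((path.name, size))
    return bad
```

Run on the backup server against the archive directory (`backup/postgresql/archivedir` in the scp error quoted above), an empty result means every archived segment has the expected size.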

> Below you kick the master and it coughs up the files to the archive 
> including *D0 and *D1 on up to *D4 and then the streaming picks using *D5.

When I kicked it, the master wrote D1 to D4 to the backup.  It did not
change D0 (its modification time on the backup is from before the "kick").
The slave re-read D0, again, as it had been doing throughout this period,
and then read D1 to D4.

> Best guess is the archiving did not work as expected during:
>
> "(During this time the master was also down for a shorter period.)"

Around the time the master was down, the WAL segment names were CB and CC.
Files CD to CF were written between the master coming up and the slave
coming up.  The slave had no trouble restoring those segments when it started.
The problematic segments D0 and D1 were the ones that were "current"
when the slave restarted, at which time the master was up consistently.
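
For reference, the relationship between the position in the "started streaming WAL from primary at 7/D0000000 on timeline 1" message and the segment name 0000000100000007000000D0 can be sketched as below. This is an illustration only, assuming the default 16 MiB segment size; the helper name `wal_segment_name` is hypothetical and not part of PostgreSQL.

```python
# Assumption: default WAL segment size of 16 MiB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Map a timeline and an LSN such as '7/D0000000' to a WAL segment file name."""
    high_hex, low_hex = lsn.split("/")
    # An LSN is a 64-bit byte position written as two 32-bit hex halves.
    position = (int(high_hex, 16) << 32) | int(low_hex, 16)
    segno = position // segment_size
    # Number of segments per 4 GiB "log id" (256 with 16 MiB segments).
    segments_per_log = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_log,
        segno % segments_per_log,
    )


print(wal_segment_name(1, "7/D0000000"))  # 0000000100000007000000D0
```

So a standby that asks to stream from 7/D0000000 on timeline 1 is asking the master for segment D0 from its beginning, which matches the segment named in the FATAL message.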

Regards, Phil.
