Re: Stronger safeguard for archive recovery not to miss data - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Stronger safeguard for archive recovery not to miss data
Date
Msg-id 20201126.162840.1665375222523010434.horikyota.ntt@gmail.com
In response to Stronger safeguard for archive recovery not to miss data  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Stronger safeguard for archive recovery not to miss data  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
RE: Stronger safeguard for archive recovery not to miss data  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
At Thu, 26 Nov 2020 07:18:39 +0000, "osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com> wrote in 
> Hello
> 
> 
> The attached patch is intended to prevent a scenario in which
> archive recovery encounters WAL generated with wal_level=minimal
> and the server nevertheless continues to work, as discussed in the thread [1].
> The motivation is to protect users from ending up with a replica
> that could be missing data, or with a server that is missing data after targeted recovery.
> 
> We reached a consensus in that thread on how to fix this:
> change the ereport level from WARNING to FATAL in CheckRequiredParameterValues().
> 
> To test this fix, I did the following:
> 1 - take a base backup while wal_level is replica
> 2 - stop the server and change wal_level from replica to minimal
> 3 - restart the server (to generate XLOG_PARAMETER_CHANGE)
> 4 - stop the server and set wal_level back to replica
> 5 - start the server again
> 6 - execute archive recovery in both of these cases:
>     (1) by editing postgresql.conf and
>     touching recovery.signal in the base backup from step 1
>     (2) by making a replica with standby.signal
> * While wal_level was replica, I enabled archive_mode in this test.
> 
> First, I confirmed that the server started up without this patch.
> After applying this safeguard patch, I confirmed that
> the server can no longer start up in this scenario.
> The log with this patch shows the following:
> 
> 2020-11-26 06:49:46.003 UTC [19715] FATAL:  WAL was generated with wal_level=minimal, data may be missing
> 2020-11-26 06:49:46.003 UTC [19715] HINT:  This happens if you temporarily set wal_level=minimal without taking a new basebackup.
> 
> Lastly, this should be backpatched.
> Any comments?

Perhaps we need a TAP test that performs the above steps.
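
As a rough illustration of what such a test would have to drive (not the TAP test itself, which would be written in Perl), the quoted steps can be laid out as a dry-run command list. This is only a sketch under some assumptions: the paths and the names Step and reproduction_plan are made up for illustration, the settings are overridden through pg_ctl's -o option rather than by editing postgresql.conf, the original server was started with wal_level=replica and archiving configured, and restore_command is already set in the backup's postgresql.conf. Case (2) with standby.signal is left out.

```python
import shlex
from typing import List, NamedTuple


class Step(NamedTuple):
    note: str
    argv: List[str]


def reproduction_plan(pgdata: str, backup_dir: str) -> List[Step]:
    """Return the quoted steps as an ordered, dry-run command list."""

    def pg_ctl(datadir: str, action: str, *extra: str) -> List[str]:
        # -w makes pg_ctl wait until the start or stop has completed.
        return ["pg_ctl", "-D", datadir, "-w", action, *extra]

    # Assumption: settings are overridden with "-o" instead of editing
    # postgresql.conf. wal_level=minimal also needs archiving and WAL
    # senders turned off, or the server refuses to start.
    minimal = "-c wal_level=minimal -c archive_mode=off -c max_wal_senders=0"
    replica = "-c wal_level=replica -c archive_mode=on"

    return [
        Step("1: base backup while wal_level=replica",
             ["pg_basebackup", "-D", backup_dir]),
        Step("2: stop the server", pg_ctl(pgdata, "stop")),
        Step("3: start with wal_level=minimal",
             pg_ctl(pgdata, "start", "-o", minimal)),
        Step("4: stop the server", pg_ctl(pgdata, "stop")),
        Step("5: start with wal_level=replica again",
             pg_ctl(pgdata, "start", "-o", replica)),
        Step("6 (1): request archive recovery in the step 1 backup",
             ["touch", backup_dir + "/recovery.signal"]),
        Step("6 (1): start from the backup; FATAL is expected with the patch",
             pg_ctl(backup_dir, "start")),
    ]


if __name__ == "__main__":
    # Print the commands only; nothing is executed here.
    for step in reproduction_plan("/tmp/pgdata", "/tmp/backup"):
        print(f"# {step.note}")
        print(shlex.join(step.argv))
```

The list is only printed, so it can be reviewed before anything touches a real cluster. A real test would also have to check the server log for the FATAL message after the last step.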

> [1] https://www.postgresql.org/message-id/TYAPR01MB29901EBE5A3ACCE55BA99186FE320%40TYAPR01MB2990.jpnprd01.prod.outlook.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


