Re: Stronger safeguard for archive recovery not to miss data - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Stronger safeguard for archive recovery not to miss data
Msg-id 0cf55744297b3028485ecb9933c4256bd2985fe8.camel@cybertec.at
In response to Re: Stronger safeguard for archive recovery not to miss data  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses RE: Stronger safeguard for archive recovery not to miss data  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
On Wed, 2021-01-20 at 13:10 +0900, Fujii Masao wrote:
> +                                errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));
> 
> Isn't it impossible to do this in normal archive recovery case? In that case,
> since the server crashed and the database got corrupted, probably
> we cannot take a new base backup.
> 
> Originally even when users accidentally set wal_level to minimal, they could
> start the server from the backup by disabling hot_standby and salvage the data.
> But with the patch, we lost the way to do that. Right? I was wondering that
> WARNING was used intentionally there for that case.

I would argue that if you set "wal_level" to "minimal", do some work,
reset it to "replica" and then recover past that point, one of two
things will happen:

1. Since "archive_mode" had to be off while "wal_level" was "minimal",
   you are missing some WAL segments and cannot recover past that
   point anyway.

2. You are missing some relevant WAL records, and your recovered
   database is corrupted.  This is very likely, because you probably
   switched to "minimal" with the intention to generate less WAL.

Now I see your point that a corrupted database may be better than no
database at all, but wouldn't you agree that a warning in the log
(that nobody reads) is too little information?

Normally we don't take such a relaxed attitude towards data corruption.

Perhaps there could be a GUC "recovery_allow_data_corruption" to
override this check, but I'd say that a warning is too little.
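
To illustrate the idea, here is a rough, standalone sketch of how such an
override might behave.  It is not taken from the patch: the GUC exists only
as the suggestion above, and the CheckResult type and the function name are
made up for illustration.  The WalLevel enum is redefined locally so that
the example compiles on its own.

```c
#include <stdbool.h>

/* Local stand-in that follows the ordering of PostgreSQL's WalLevel enum. */
typedef enum
{
    WAL_LEVEL_MINIMAL = 0,
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

/* Hypothetical outcome of the check; the real code would call ereport(). */
typedef enum
{
    CHECK_OK,       /* nothing to report */
    CHECK_WARNING,  /* continue, but log that data may be corrupted */
    CHECK_FATAL     /* refuse to continue archive recovery */
} CheckResult;

/* Hypothetical GUC from the suggestion above; off by default. */
static bool recovery_allow_data_corruption = false;

/*
 * Decide how to react when archive recovery meets WAL that was
 * generated with wal_level = minimal.
 */
static CheckResult
check_wal_level_for_archive_recovery(WalLevel wal_level,
                                     bool archive_recovery_requested)
{
    /* Crash recovery, or WAL written at replica or higher, is fine. */
    if (!archive_recovery_requested || wal_level > WAL_LEVEL_MINIMAL)
        return CHECK_OK;

    /* Only an explicit opt-in downgrades the failure to a warning. */
    return recovery_allow_data_corruption ? CHECK_WARNING : CHECK_FATAL;
}
```

The point of the sketch is that the default is to fail, and salvaging
data from a possibly corrupted cluster requires a deliberate setting
rather than a warning that is easy to overlook.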

>                 if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
>                         ereport(ERROR,
>                                         (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
>                                          errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
> 
> With the patch, we never reach the above code?

Right, that would have to go.  I didn't notice that.

Yours,
Laurenz Albe
