RE: Stronger safeguard for archive recovery not to miss data - Mailing list pgsql-hackers
From | osumi.takamichi@fujitsu.com |
---|---|
Subject | RE: Stronger safeguard for archive recovery not to miss data |
Date | |
Msg-id | OSBPR01MB488847BC24220F046DB6BEB4ED769@OSBPR01MB4888.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Stronger safeguard for archive recovery not to miss data (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Stronger safeguard for archive recovery not to miss data
|
List | pgsql-hackers |
On Tuesday, April 6, 2021 3:24 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Tue, 6 Apr 2021 04:11:35 +0000, "osumi.takamichi@fujitsu.com" > <osumi.takamichi@fujitsu.com> wrote in > > On Tuesday, April 6, 2021 9:41 AM Fujii Masao > > <masao.fujii@oss.nttdata.com> > > > On 2021/04/05 23:54, osumi.takamichi@fujitsu.com wrote: > > > >> This makes me think that we should document this risk.... Thought? > > > > +1. We should notify the risk when user changes > > > > the wal_level higher than minimal to minimal to invoke a > > > > carefulness of user for such kind of operation. > > > > > > I removed the HINT message "or recover to the point in ..." and > > > added the following note into the docs. > > > > > > Note that changing <varname>wal_level</varname> to > > > <literal>minimal</literal> makes any base backups taken before > > > unavailable for archive recovery and standby server, which may > > > lead to database loss. > > Thank you for updating the patch. Let's make the sentence more strict. > > > > My suggestion for this explanation is > > "In order to prevent database corruption, changing wal_level to > > minimal from higher level in the middle of WAL archiving requires > > careful attention. It makes any base backups taken before the > > operation unavailable for archive recovery and standby server. Also, > > it may lead to whole database loss when archive recovery fails with an > > error for that change. > > Take a new base backup immediately after making wal_level back to higher > level." > > The first sentense looks like somewhat nanny-ish. The database is not > corrupt at the time of this error. Yes. Excuse me for misleading sentence. I just wanted to write why the error was introduced, but it was not necessary. We should remove and fix the first part of the sentence. > We just lose updates after the last read > segment at this point. As Fujii-san said, we can continue recoverying using > crash recovery and we will reach having a corrupt database after that. OK. Thank you for explanation. > About the last sentence, I prefer more flat wording, such as "You need to take > a new base backup..." Either is fine to me. > > Then, we can be consistent with our new hint message, "Use a backup > > taken after setting wal_level to higher than minimal.". > > > Is it better to add something similar to "Take an offline backup when > > you stop the server and change the wal_level" around the end of this part as > another option for safeguard, also? > > Backup policy is completely a matter of DBAs. OK. No problem. No need to add it. > If flipping wal_level alone > highly causes unstartable corruption,,, I think it is a bug. > > For the performance technique part, what we need to explain is same. > > Might be good, but in simpler wording. Yeah, I agree. > > Another minor thing I felt we need to do might be to add double quotes to > wrap minimal in errhint. > > Since the error about hot_standby has gone, either will do for me. Thanks for sharing your thoughts. Best Regards, Takamichi Osumi
pgsql-hackers by date: