RE: Stronger safeguard for archive recovery not to miss data - Mailing list pgsql-hackers
From | osumi.takamichi@fujitsu.com |
---|---|
Subject | RE: Stronger safeguard for archive recovery not to miss data |
Date | |
Msg-id | OSBPR01MB488886DB12B35261CDB44DBAED7B9@OSBPR01MB4888.jpnprd01.prod.outlook.com |
In response to | Re: Stronger safeguard for archive recovery not to miss data (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Stronger safeguard for archive recovery not to miss data
RE: Stronger safeguard for archive recovery not to miss data |
List | pgsql-hackers |
Hi,

On Wednesday, March 31, 2021 3:06 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> At Wed, 31 Mar 2021 15:03:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > At Wed, 31 Mar 2021 02:11:48 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> > > > So, I would revert all the changes in xlog.c except changing the
> > > > warning to an error:
> > > >
> > > > -        ereport(WARNING,
> > > > -                (errmsg("WAL was generated with wal_level=minimal, data may be missing"),
> > > > -                 errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));
> > > > +        ereport(FATAL,
> > > > +                (errmsg("WAL was generated with wal_level=minimal, cannot continue recovering"),
> > > > +                 errdetail("This happens if you temporarily set wal_level=minimal on the server."),
> > > > +                 errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));
> > >
> > > I guess that users usually encounter this error because they have
> > > not taken base backups yet after setting wal_level to higher than
> > > minimal and have to use the old base backup for archive recovery. So
> > > I'm not sure how much only this HINT is helpful for them. Isn't it
> > > better to append something like "If there is no such backup, recover
> > > to the point in time before wal_level is set to minimal even though
> > > which cause data loss, to start the server." into HINT?
> >
> > I agree that the hint doesn't make sense.
> >
> > For the primary case,
> >
> > HINT: Restart with archive recovery turned off. The past backups are no
> > longer usable. You need to take a new one after restart.
> >
> > If it's the replica case, it would be..
> >
> > HINT: Start from a fresh standby created from the current primary server.
>
> Start from a fresh backup...

Thank you for sharing your ideas about the hint. We absolutely need to change the message.
In my opinion, combining your basic idea with Fujii-san's would be best. I updated the patch and made v05.

The changes I made are:
* rewording of the errhint, although it has become long
* fix of the typo in the TAP test
* modification of my past changes so that the conditions in CheckRequiredParameterValues are not changed
* rename of the test file to 024_archive_recovery.pl, because two test files have been added since the last update of this patch
* another pgindent run to check my alignment

By the way, when I build postgres with this patch and the --enable-coverage option, the regression test results become unstable. Does someone know the reason? When it fails, I get stderr like the following:

t/001_start_stop.pl .. 10/24
#   Failed test 'pg_ctl start: no stderr'
#   at t/001_start_stop.pl line 48.
#          got: 'profiling:/home/k5user/new_disk/recheck/PostgreSQL-Source-Dev/src/backend/executor/execMain.gcda:Merge mismatch for function 15
# '
#     expected: ''
t/001_start_stop.pl .. 24/24 # Looks like you failed 1 test of 24.
t/001_start_stop.pl .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/24 subtests

A similar phenomenon was observed in [1], where the solution seems to be upgrading gcc to a version higher than 7. I did so, but I still get this unstable error with --enable-coverage. It does not happen when I remove the --enable-coverage option; in that case make check-world passes.

[1] - https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg323147.html

Best Regards,
Takamichi Osumi
Attachment