RE: Stronger safeguard for archive recovery not to miss data - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Stronger safeguard for archive recovery not to miss data
Date
Msg-id OSBPR01MB488886D6E30B32428C08E766ED789@OSBPR01MB4888.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Stronger safeguard for archive recovery not to miss data  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Stronger safeguard for archive recovery not to miss data  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
On Thursday, April 1, 2021 5:25 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> On 2021/04/01 12:45, osumi.takamichi@fujitsu.com wrote:
> > Thank you for sharing your ideas about the hint. Absolutely need to change
> the message.
> > In my opinion, combining the basic idea of yours and Fujii-san's would be
> the best.
> >
> > Updated the patch and made v05. The changes I made are
> >
> > * rewording of errhint although this has become long !
> > * fix of the typo in the TAP test
> > * modification of my past changes not to change conditions in
> > CheckRequiredParameterValues
> > * rename of the test file to 024_archive_recovery.pl because two files are
> made
> >     since the last update of this patch
> > * pgindent is conducted to check my alignment again.
> 
> Thanks for updating the patch!
> 
> +                 errhint("Use a backup taken after setting
> wal_level to higher than minimal "
> +                         "or recover to the point in
> time before wal_level becomes
> +minimal even though it causes data loss")));
> 
> ISTM that "or recover to the point in time before wal_level was changed
>   to minimal even though it may cause data loss" sounds better. Thought?
Adopted.


> +# Check if standby.signal exists
> +my $pgdata = $new_node->data_dir;
> +ok (-f "${pgdata}/standby.signal", 'standby.signal was created');
> 
> +# Check if recovery.signal exists
> +my $path = $another_node->data_dir;
> +ok (-f "${path}/recovery.signal", 'recovery.signal was created');
> 
> Why are these tests necessary?
> These seem to test that init_from_backup() works expectedly based on the
> parameter "standby". But if we are sure that init_from_backup() works fine,
> these tests don't seem to be necessary.
Absolutely, you are right. Fixed.


> +use Config;
> 
> This is not necessary?
Removed.


> +# Make the wal_level back to replica
> +$node->append_conf('postgresql.conf', $replica_config); $node->restart;
> +check_wal_level('replica', 'wal_level went back to replica again');
> 
> IMO it's better to comment why this server restart is necessary.
> As far as I understand correctly, this is necessary to ensure the WAL file
> containing the record about the change of wal_level (to minimal) is archived,
> so that the subsequent archive recovery will be able to replay it.
OK, added some comments. Further, I felt the way I wrote this part was not good at all and self-evident
and developers who read this test would feel uneasy about that point.
So, a little bit fixed that test so that we can get clearer conviction for wal archive.


> > By the way, when I build postgres with this patch and enable-coverage
> > option, the results of RT becomes unstable. Does someone know the
> reason ?
> > When it fails, I get stderr like below
> 
> I have no idea about this. Does this happen even without the patch?
Unfortunately, no. I get this only with --enable-coverage and with my patch,
althought regression tests have passed with this patch.
OSS HEAD doesn't produce the stderr even with --enable-coverage.


Best Regards,
    Takamichi Osumi


Attachment

pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Unused variable found in AttrDefaultFetch
Next
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Stronger safeguard for archive recovery not to miss data