RE: Stronger safeguard for archive recovery not to miss data - Mailing list pgsql-hackers
From | osumi.takamichi@fujitsu.com |
---|---|
Subject | RE: Stronger safeguard for archive recovery not to miss data |
Date | |
Msg-id | OSBPR01MB4888297D2CEA401A2B05C1BDED769@OSBPR01MB4888.jpnprd01.prod.outlook.com |
In response to | RE: Stronger safeguard for archive recovery not to miss data ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>) |
List | pgsql-hackers |
On Tuesday, April 6, 2021 8:32 AM Osumi, Takamichi/大墨 昂道 <osumi.takamichi@fujitsu.com>
> On Monday, April 5, 2021 11:49 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com>
> > On Mon Apr 5, 2021 12:35 PM Fujii Masao <masao.fujii@oss.nttdata.com>
> > wrote:
> > > >>> By the way, when I build postgres with this patch and
> > > >>> enable-coverage option, the results of RT becomes unstable. Does
> > > >>> someone know the reason ?
> > > >>> When it fails, I get stderr like below
> > > >>
> > > >> I have no idea about this. Does this happen even without the patch?
> > > > Unfortunately, no. I get this only with --enable-coverage and with
> > > > my patch, although regression tests have passed with this patch.
> > > > OSS HEAD doesn't produce the stderr even with --enable-coverage.
> > >
> > > Could you check whether the latest patch still causes this issue or not?
> > > If it still causes, could you check which part (the change of xlog.c
> > > or the addition of regression test) caused the issue?
> > v07 reproduces the phenomena, even with make coverage-clean between
> > tests.
> > The possibility is not high though.
> >
> > We cannot do the regression test separately from xlog.c because it
> > uses the new error message of xlog.c.
> > Applying only the TAP test should fail because we get a warning not error.
> >
> > Therefore, I took the changes of xlog.c only and I'm doing the RT in a
> > loop now. If we can get the stderr again, then we can guess xlog.c is
> > the cause, right ?
> >
> > I think I can report the result tomorrow.
> > Just in case, I'm running the RT for OSS HEAD in parallel...
> > although I cannot reproduce it with it at all.
> I really apologize that this OSS HEAD reproduced that stderr with success of
> RT.
> I executed check-world in parallel with -j option so the reason should be what
> Tsunakawa-san told us.
> Its probability is pretty low.
> I'm so sorry for making noises loudly.
> Therefore, I don't have any concern left.
This is *not* due to the patch, but I am recording it here for future analysis.

The phenomenon occurs with very low probability. In other cases, the
combination of --enable-coverage and make check-world causes an error like
the one below. I used gcc 8.

    #   Failed test 'pg_ctl start: no stderr'
    #   at t/001_start_stop.pl line 48.
    #          got: 'profiling:/home/(path/to/oss/head)/src/backend/utils/adt/regproc.gcda:Merge mismatch for function 24
    # '
    #     expected: ''
    # Looks like you failed 1 test of 24.
    make[2]: *** [Makefile:50: check] Error 1
    make[1]: *** [Makefile:43: check-pg_ctl-recurse] Error 2
    make[1]: *** Waiting for unfinished jobs....
    make: *** [GNUmakefile:71: check-world-src/bin-recurse] Error 2
    make: *** Waiting for unfinished jobs....

The steps I used are

    $ git clone and cd to OSS HEAD
    $ ./configure --enable-coverage --enable-cassert --enable-debug --enable-tap-tests --with-icu CFLAGS=-O0 --prefix=/where/to/put/binary
    $ make -j4 2> make.log
    $ make check-world -j4 2> make_check_world.log

Best Regards,
	Takamichi Osumi
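P.S. Because the failure is rare, it may help to scan saved logs for the gcov warning rather than watch for it by eye. Below is a minimal sketch, assuming the warning always has the same shape as the single line quoted in this mail. The function name find_merge_mismatches and the script name are hypothetical, not part of the PostgreSQL tree, and which log file actually captures the warning depends on how the tests were run.

```python
import re
import sys

# Matches the gcov runtime warning quoted in this mail, for example:
#   profiling:/some/path/regproc.gcda:Merge mismatch for function 24
# Assumption: other occurrences follow the same format.
MERGE_MISMATCH = re.compile(
    r"profiling:(?P<gcda>.+?\.gcda):Merge mismatch for function (?P<func>\d+)"
)


def find_merge_mismatches(text):
    """Return (gcda_path, function_number) pairs found in the given log text."""
    return [
        (m.group("gcda"), int(m.group("func")))
        for m in MERGE_MISMATCH.finditer(text)
    ]


if __name__ == "__main__":
    # Hypothetical usage: python find_mismatch.py make_check_world.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            for gcda, func in find_merge_mismatches(f.read()):
                print(f"{path}: {gcda} (function {func})")
```

Running this over the logs of repeated check-world runs would show which .gcda files are affected and how often, which might help confirm that the parallel -j execution is the cause.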