On Sat, Apr 05, 2025 at 11:07:13AM -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Wed, Apr 02, 2025 at 05:29:00PM -0700, Noah Misch wrote:
> >> Here it is. Making it fail three times took looping 1383s, 5841s, and 2594s.
> >> Hence, it couldn't be expected to catch the regression before commit, but it
> >> would have made sufficient buildfarm and CI noise in the day after commit.
>
> > Hmm. Not much of a fan of the addition of a test that has less than
> > 1% of reproducibility for the problem, even if it's good to see that
> > this can be made portable to run down to v13.
>
> Yeah, it's good to have a test but I doubt we should commit it.
> Too many buildfarm cycles will be expended for too little result.

Current extent of our archive recovery restartpoint test coverage:

$ grep -c 'restartpoint starting' $(grep -rl 'restored log file' **/log) | grep -v :0
src/bin/pg_combinebackup/tmp_check/log/002_compare_backups_pitr1.log:1
src/test/recovery/tmp_check/log/020_archive_status_standby2.log:1
src/test/recovery/tmp_check/log/002_archiving_standby.log:1
src/test/recovery/tmp_check/log/020_archive_status_standby.log:1
src/test/recovery/tmp_check/log/035_standby_logical_decoding_standby.log:2

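For anyone rerunning that count: the `**` glob needs zsh, or bash with `shopt -s globstar`. Below is a rough equivalent using find, as a sketch only. The function name `count_archive_restartpoints` is hypothetical and nothing like it exists in the tree. It assumes the logs sit under directories named `log`, as in the listing above, and it prints paths relative to the root it is given, so they may carry a leading `./`.

```shell
# Hypothetical helper, not part of the PostgreSQL tree.
# For every file under a directory named "log" that contains
# 'restored log file' (i.e. the node restored WAL from the archive),
# print "path:count" where count is the number of
# 'restartpoint starting' lines. Files with a zero count are skipped.
count_archive_restartpoints() {
    root=${1:-.}
    find "$root" -type f -path '*/log/*' \
        -exec grep -l 'restored log file' {} + |
    while IFS= read -r f; do
        # grep -c exits 1 when the count is 0; "|| true" keeps that
        # from aborting callers that run under "set -e".
        n=$(grep -c 'restartpoint starting' "$f" || true)
        if [ "$n" -gt 0 ]; then
            printf '%s:%s\n' "$f" "$n"
        fi
    done
    return 0
}
```

Run it from the top of a build tree after a check-world, e.g. `count_archive_restartpoints .`
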
Since the 2025-02 releases made non-toy-size archive recoveries fail easily,
that's not enough. If the proposed 3-second test is the wrong thing, what
instead?