Re: BUG #17928: Standby fails to decode WAL on termination of primary - Mailing list pgsql-bugs

From Alexander Lakhin
Subject Re: BUG #17928: Standby fails to decode WAL on termination of primary
Date
Msg-id c3f123ca-03f7-b7ce-1e49-f8fee4c16545@gmail.com
In response to Re: BUG #17928: Standby fails to decode WAL on termination of primary  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BUG #17928: Standby fails to decode WAL on termination of primary
List pgsql-bugs
Hello,

16.09.2023 03:20, Thomas Munro wrote:
> On Sat, Sep 16, 2023 at 12:03 PM Michael Paquier <michael@paquier.xyz> wrote:
>>> [1] https://github.com/macdice/postgres/commits/fix-12
>> Hmm.  What was the test that failed?
> $ make -s -C src/test/recovery/ check PROVE_TESTS=t/039*
> t/039_end_of_wal.pl .. 4/?
> #   Failed test 'xlp_magic zero'
> #   at t/039_end_of_wal.pl line 312.
>
> not ok 5 - xlp_magic zero
>
> Where the log should say "invalid magic number 0000" I see:
>
> 2023-09-16 12:13:07.331 NZST [156812] LOG:  record with incorrect
> prev-link 0/16B60C0 at 0/16B6120
>
> It has to do with initial WAL position after initdb, because I get
> this only on Debian, on REL_12_STABLE (with the commit listed above on
> my public fix-12 branch) and only with --with-icu, but not without it,
> and I can't repro it on my other local OSes.

I tried to reproduce the failure on Debian 9, 10, and 11, but have not succeeded yet.
However, I got a different error on Debian 9:
t/039_end_of_wal.pl .. Dubious, test returned 25 (wstat 6400, 0x1900)
No subtests run
...
cat src/test/recovery/tmp_check/log/regress_log_039_end_of_wal
could not find match in header access/xlog_internal.h

It looks like the "@{^CAPTURE}" construct used in scan_server_header()
is not supported by Perl 5.24, which is the version shipped with Debian stretch:
https://perldoc.perl.org/variables/@%7B%5ECAPTURE%7D
I replaced it with
@match = ($1);
and that worked for me.
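
For illustration, here is a minimal sketch of another way to avoid
"@{^CAPTURE}": a match evaluated in list context returns its capture
groups directly, and that also works on older Perls. The helper name
first_capture() and the sample header line are made up for this example;
this is not the actual scan_server_header() code.

```perl
use strict;
use warnings;

# Hypothetical helper (not the real scan_server_header()): return the capture
# groups of the first line matching $regexp, without using @{^CAPTURE}.
sub first_capture
{
	my ($regexp, @lines) = @_;
	foreach my $line (@lines)
	{
		# In list context a successful match returns ($1, $2, ...),
		# so this also works on Perl versions that lack @{^CAPTURE}.
		if (my @match = $line =~ /^$regexp/)
		{
			return @match;
		}
	}
	die "could not find match\n";
}

# Made-up header line, for illustration only.
my @header = ('#define EXAMPLE_MAGIC 0xD101');
my ($magic) = first_capture('#define EXAMPLE_MAGIC (.*)', @header);
print "$magic\n";    # prints 0xD101
```

Unlike "@match = ($1);", this form keeps every capture group, so it should
still behave if a pattern with more than one group is passed in.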

Also, I observed that setting "wal_log_hints = on" in extra.config, which I use via
"TEMP_CONFIG=extra.config make check-world", makes the test fail too, though
check-world passes fine without the new test.
Maybe that's not an issue, and there are probably other parameters that
might affect this test, but I'm somewhat confused by the fact that only this
test breaks with it.

Best regards,
Alexander


