Re: BUG #17928: Standby fails to decode WAL on termination of primary - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #17928: Standby fails to decode WAL on termination of primary
Date
Msg-id ZNxqFgo243DOar4L@paquier.xyz
Whole thread Raw
In response to Re: BUG #17928: Standby fails to decode WAL on termination of primary  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BUG #17928: Standby fails to decode WAL on termination of primary
List pgsql-bugs
On Wed, Aug 16, 2023 at 03:29:49PM +1200, Thomas Munro wrote:
> I hacked on this idea for quite a long time yesterday and today and
> came up with a set of tests for the main end-of-WAL conditions:
>
> ▶ 1/1 - xl_tot_len zero                                  OK
> ▶ 1/1 - xl_tot_len short                                 OK
> ▶ 1/1 - xl_prev bad                                      OK
> ▶ 1/1 - xl_crc bad                                       OK
> ▶ 1/1 - xlp_magic zero                                   OK
> ▶ 1/1 - xlp_magic bad                                    OK
> ▶ 1/1 - xlp_pageaddr bad                                 OK
> ▶ 1/1 - xlp_info bad                                     OK
> ▶ 1/1 - xlp_info lacks XLP_FIRST_IS_CONTRECORD           OK
> ▶ 1/1 - xlp_rem_len bad                                  OK
> ▶ 1/1 - xlp_magic zero (split record header)             OK
> ▶ 1/1 - xlp_pageaddr bad (split record header)           OK
> ▶ 1/1 - xlp_rem_len bad (split record header)            OK
> 1/1 postgresql:recovery / recovery/038_end_of_wal        OK
>     5.79s   13 subtests passed

Nice.

> It took me a while to come up with a workable way to get into the
> record-header-splitting zone.  Based on some of your clues about
> flushing, I eventually realised I needed transactional messages, and I
> built a kind of self-calibrating Rube Goldberg function around that.
> It's terrible, and I'm sure we can do better.

So that's advance_to_record_splitting_zone(), with the idea to loop
across multiple records.  The use of transactional messages ensures
the flush and the stability of the test when the server is stopped,
but it makes the records a bit longer than the non-transactional
messages.

> I wonder what people think about putting internal details of the WAL
> format into a Perl test like this.  Obviously it requires maintenance,
> since it knows the size and layout of a few things.  I guess it'd be
> allowed to fish a couple of those numbers out of the source.

The format of the WAL header does not change a lot across the years,
so I'd be OK with that.  If that's an issue, the scope of the test
could always be restricted a bit.

> Work in progress...  I'm sure more useful checks could be added.  One
> thing that occurred to me while thinking about all this it that the
> 'treat malloc failure as end of WAL' thing you highlighted in another
> thread is indeed completely bananas -- I didn't go digging, but
> perhaps it was an earlier solution to the very same garbage xl_tot_len
> problem, before 70b4f82a4b5 and now this xl_rem_len-based solution?

It is bananas.  Still, the problem of the other thread is similar to
what's here, and different at the same time because it can happen with
a valid record if the host is under memory pressure, and we should
still allow a standby to loop and continue in this case.  Perhaps also
a startup process doing only crash recovery.  This thread's problem is
dependent on this one, which is why this one needs to happen first: we
need to clarify the boundary between a real OOM and garbage indicating
the end of WAL.

+my $RECORD_HEADER_SIZE = 24;
+my $XLP_PAGE_MAGIC = 0xd113;
+my $XLP_FIRST_IS_CONTRECORD = 0x1;

This could use something like check_pg_config() to find the header
contents for the last two ones, but there's no easy way to extract the
size of the header struct, is there?  Or perhaps perl has a tool for
that..  Anything I can read on the matter tells to use pack() to make
the size estimations predictible.

+   my $page_offset = $end_lsn % $WAL_BLOCK_SIZE;
+   while ($page_offset >= $WAL_BLOCK_SIZE - 2000)
+   {
+       $end_lsn = emit_message($node, 2000)

The hardcoded 2000 feels magic here, but I get that you just want to
get away from the page border, as well.  I'd be OK with more comments.

+sub lsn_to_segment_and_offset
+{
+   my $lsn = shift;
+   return ($lsn / $WAL_SEGMENT_SIZE, $lsn % $WAL_SEGMENT_SIZE);
+}

Now the backend also has pg_walfile_name_offset().  I tend to rely on
the backend logic as much as I can for tests.

+sub start_of_page

Not sure this one is needed, only being called in
start_of_next_page().

+sub get_insert_lsn

Name is confusing with Cluster::lsn('insert') available, perhaps this
should be named get_insert_lsn_in_bytes.

As a whole, I find that pretty cool.  Not sure if I would backpatch
the test, so my opinion would be to use that on HEAD and let it
stabilize there.
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Thomas Munro
Date:
Subject: Re: BUG #17928: Standby fails to decode WAL on termination of primary
Next
From: PG Bug reporting form
Date:
Subject: BUG #18058: Http Extension Postgres