Re: BUG #17928: Standby fails to decode WAL on termination of primary - Mailing list pgsql-bugs
From | Michael Paquier |
---|---|
Subject | Re: BUG #17928: Standby fails to decode WAL on termination of primary |
Date | |
Msg-id | ZNxqFgo243DOar4L@paquier.xyz Whole thread Raw |
In response to | Re: BUG #17928: Standby fails to decode WAL on termination of primary (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: BUG #17928: Standby fails to decode WAL on termination of primary
|
List | pgsql-bugs |
On Wed, Aug 16, 2023 at 03:29:49PM +1200, Thomas Munro wrote: > I hacked on this idea for quite a long time yesterday and today and > came up with a set of tests for the main end-of-WAL conditions: > > ▶ 1/1 - xl_tot_len zero OK > ▶ 1/1 - xl_tot_len short OK > ▶ 1/1 - xl_prev bad OK > ▶ 1/1 - xl_crc bad OK > ▶ 1/1 - xlp_magic zero OK > ▶ 1/1 - xlp_magic bad OK > ▶ 1/1 - xlp_pageaddr bad OK > ▶ 1/1 - xlp_info bad OK > ▶ 1/1 - xlp_info lacks XLP_FIRST_IS_CONTRECORD OK > ▶ 1/1 - xlp_rem_len bad OK > ▶ 1/1 - xlp_magic zero (split record header) OK > ▶ 1/1 - xlp_pageaddr bad (split record header) OK > ▶ 1/1 - xlp_rem_len bad (split record header) OK > 1/1 postgresql:recovery / recovery/038_end_of_wal OK > 5.79s 13 subtests passed Nice. > It took me a while to come up with a workable way to get into the > record-header-splitting zone. Based on some of your clues about > flushing, I eventually realised I needed transactional messages, and I > built a kind of self-calibrating Rube Goldberg function around that. > It's terrible, and I'm sure we can do better. So that's advance_to_record_splitting_zone(), with the idea to loop across multiple records. The use of transactional messages ensures the flush and the stability of the test when the server is stopped, but it makes the records a bit longer than the non-transactional messages. > I wonder what people think about putting internal details of the WAL > format into a Perl test like this. Obviously it requires maintenance, > since it knows the size and layout of a few things. I guess it'd be > allowed to fish a couple of those numbers out of the source. The format of the WAL header does not change a lot across the years, so I'd be OK with that. If that's an issue, the scope of the test could always be restricted a bit. > Work in progress... I'm sure more useful checks could be added. One > thing that occurred to me while thinking about all this it that the > 'treat malloc failure as end of WAL' thing you highlighted in another > thread is indeed completely bananas -- I didn't go digging, but > perhaps it was an earlier solution to the very same garbage xl_tot_len > problem, before 70b4f82a4b5 and now this xl_rem_len-based solution? It is bananas. Still, the problem of the other thread is similar to what's here, and different at the same time because it can happen with a valid record if the host is under memory pressure, and we should still allow a standby to loop and continue in this case. Perhaps also a startup process doing only crash recovery. This thread's problem is dependent on this one, which is why this one needs to happen first: we need to clarify the boundary between a real OOM and garbage indicating the end of WAL. +my $RECORD_HEADER_SIZE = 24; +my $XLP_PAGE_MAGIC = 0xd113; +my $XLP_FIRST_IS_CONTRECORD = 0x1; This could use something like check_pg_config() to find the header contents for the last two ones, but there's no easy way to extract the size of the header struct, is there? Or perhaps perl has a tool for that.. Anything I can read on the matter tells to use pack() to make the size estimations predictible. + my $page_offset = $end_lsn % $WAL_BLOCK_SIZE; + while ($page_offset >= $WAL_BLOCK_SIZE - 2000) + { + $end_lsn = emit_message($node, 2000) The hardcoded 2000 feels magic here, but I get that you just want to get away from the page border, as well. I'd be OK with more comments. +sub lsn_to_segment_and_offset +{ + my $lsn = shift; + return ($lsn / $WAL_SEGMENT_SIZE, $lsn % $WAL_SEGMENT_SIZE); +} Now the backend also has pg_walfile_name_offset(). I tend to rely on the backend logic as much as I can for tests. +sub start_of_page Not sure this one is needed, only being called in start_of_next_page(). +sub get_insert_lsn Name is confusing with Cluster::lsn('insert') available, perhaps this should be named get_insert_lsn_in_bytes. As a whole, I find that pretty cool. Not sure if I would backpatch the test, so my opinion would be to use that on HEAD and let it stabilize there. -- Michael
Attachment
pgsql-bugs by date: