Re: Strengthen pg_waldump's --save-fullpage tests - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Strengthen pg_waldump's --save-fullpage tests
Date
Msg-id CALj2ACUx-W07Tf9cV3pdgfd750BNVe3MbuQ2X-TzYE3VJR_kfQ@mail.gmail.com
In response to Re: Strengthen pg_waldump's --save-fullpage tests  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Strengthen pg_waldump's --save-fullpage tests
List pgsql-hackers
On Wed, Jan 11, 2023 at 3:28 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On 1/11/23 5:17 AM, Bharath Rupireddy wrote:
> > On Wed, Jan 11, 2023 at 6:32 AM Michael Paquier <michael@paquier.xyz> wrote:
> >>
> >> On Tue, Jan 10, 2023 at 05:25:44PM +0100, Drouvot, Bertrand wrote:
> >>> I like the idea of comparing the full page (and not just the LSN) but
> >>> I'm not sure that adding the pageinspect dependency is a good thing.
> >>>
> >>> What about extracting the block directly from the relation file and
> >>> comparing it with the one extracted from the WAL? (We'd need to skip the
> >>> first 8 bytes to skip the LSN though).
> >>
> >> Byte-by-byte counting for the page hole?
>
> I've in mind to use diff on the whole page (minus the LSN).
>
> >> The page checksum would
> >> matter as well,
>
> Right, but the TAP test is done without checksum and we could also
> skip the checksum from the page if we really want to.
>
> > Right. LSN of FPI from the WAL record and page from the table won't be
> > the same, essentially FPI LSN <= table page.
>
> Right, that's why I proposed to exclude it for the comparison.
>
> What about something like the attached?

Note that the raw page in the table might differ not just in page LSN
but also in other fields; see heap_mask(), for instance. It masks the
LSN, checksum, hint bits, unused space, etc. before FPI consistency is
verified during recovery in verifyBackupPageConsistency().
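
To give an idea of how much more than the LSN would need masking, here
is a rough sketch in Perl of masking a raw page image before a
comparison. It assumes the standard page header layout (pd_lsn in bytes
0-7, pd_checksum in bytes 8-9, pd_lower and pd_upper as native-endian
16-bit fields at offsets 12 and 14, and a 24-byte header). The helper
name mask_page is made up for this example. It does not touch hint
bits, which heap_mask() handles per tuple, so it is not equivalent to
what the core code does.

```perl
use strict;
use warnings;

# Hypothetical helper: zero the parts of a raw page image that may
# legitimately differ between an FPI and the on-disk page.
# Assumption: standard page header layout; hint bits are NOT masked.
sub mask_page
{
	my ($page) = @_;

	# pd_lsn (bytes 0-7) and pd_checksum (bytes 8-9).
	substr($page, 0, 10) = "\0" x 10;

	# pd_lower and pd_upper, read in native byte order.
	my ($lower, $upper) = unpack('S S', substr($page, 12, 4));

	# Zero the hole between pd_lower and pd_upper if the bounds look sane.
	if ($lower >= 24 && $lower <= $upper && $upper <= length($page))
	{
		substr($page, $lower, $upper - $lower) = "\0" x ($upper - $lower);
	}

	return $page;
}
```

Even this only covers part of what the core masking functions do, which
is the reason for preferring to leave such checks to the core.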

I think the job of verifying the FPI from a WAL record against the page
is better left to the core, via verifyBackupPageConsistency(). Honestly,
the pg_waldump test is fine with what it has currently: LSN checks.

+# Extract the binary data without the LSN from the relation's block
+sysseek($frel, 8, 0); #bypass the LSN
+sysread($frel, $blk, 8184) or die "sysread failed: $!";
+syswrite($blkfrel, $blk) or die "syswrite failed: $!";

I doubt that these tests are portable with hardcoded values such as the
above; 8184 assumes the default 8kB block size.
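
If the approach were kept, one way to avoid the hardcoded length would
be to pass the block size in. The following is only a sketch: the helper
name read_block_without_lsn and its arguments are made up for this
example, and it assumes that the page LSN occupies the first 8 bytes of
each block.

```perl
use strict;
use warnings;

# Hypothetical helper: read one block from a relation file, skipping the
# first 8 bytes (pd_lsn), with the block size passed in by the caller.
sub read_block_without_lsn
{
	my ($path, $blkno, $blcksz) = @_;
	my $lsn_len = 8;
	my $want = $blcksz - $lsn_len;

	open(my $fh, '<:raw', $path) or die "could not open $path: $!";
	sysseek($fh, $blkno * $blcksz + $lsn_len, 0)
	  or die "sysseek failed: $!";
	my $nread = sysread($fh, my $blk, $want);
	die "sysread failed: $!" unless defined $nread;
	die "short read: got $nread bytes, wanted $want" unless $nread == $want;
	close($fh);

	return $blk;
}
```

In a TAP test, the block size could presumably be fetched with
something like $node->safe_psql('postgres', 'SHOW block_size') and
passed as the third argument.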

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


