Re: Corruption during WAL replay - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Corruption during WAL replay
Date
Msg-id 3261706.1648223448@sss.pgh.pa.us
In response to Re: Corruption during WAL replay  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Corruption during WAL replay  (Andres Freund <andres@anarazel.de>)
Re: Corruption during WAL replay  (Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> ... It's not
> like a 16-bit checksum was state-of-the-art even when we introduced
> it. We just did it because we had 2 bytes that we could repurpose
> relatively painlessly, and not any larger number. And that's still the
> case today, so at least in the short term we will have to choose some
> other solution to this problem.

Indeed.  I propose the attached, which also fixes the unsafe use
of seek() alongside syswrite(), directly contrary to what "man perlfunc"
says to do.

            regards, tom lane

diff --git a/src/bin/pg_checksums/t/002_actions.pl b/src/bin/pg_checksums/t/002_actions.pl
index 62c608eaf6..8c70453a45 100644
--- a/src/bin/pg_checksums/t/002_actions.pl
+++ b/src/bin/pg_checksums/t/002_actions.pl
@@ -24,6 +24,7 @@ sub check_relation_corruption
     my $tablespace = shift;
     my $pgdata     = $node->data_dir;

+    # Create table and discover its filesystem location.
     $node->safe_psql(
         'postgres',
         "CREATE TABLE $table AS SELECT a FROM generate_series(1,10000) AS a;
@@ -37,9 +38,6 @@ sub check_relation_corruption
     my $relfilenode_corrupted = $node->safe_psql('postgres',
         "SELECT relfilenode FROM pg_class WHERE relname = '$table';");

-    # Set page header and block size
-    my $pageheader_size = 24;
-    my $block_size = $node->safe_psql('postgres', 'SHOW block_size;');
     $node->stop;

     # Checksums are correct for single relfilenode as the table is not
@@ -55,8 +53,12 @@ sub check_relation_corruption

     # Time to create some corruption
     open my $file, '+<', "$pgdata/$file_corrupted";
-    seek($file, $pageheader_size, SEEK_SET);
-    syswrite($file, "\0\0\0\0\0\0\0\0\0");
+    my $pageheader;
+    sysread($file, $pageheader, 24) or die "sysread failed";
+    # This inverts the pd_checksum field (only); see struct PageHeaderData
+    $pageheader ^= "\0\0\0\0\0\0\0\0\xff\xff";
+    sysseek($file, 0, 0) or die "sysseek failed";
+    syswrite($file, $pageheader) or die "syswrite failed";
     close $file;

     # Checksum checks on single relfilenode fail
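
For readers who want to try the technique outside the test suite, here is a minimal standalone sketch of what the patched lines do. It is not part of the patch. The temporary file, its 24-byte dummy header, and the helper name invert_checksum_field are assumptions made for illustration only; the offsets assume pd_lsn occupies bytes 0-7 and pd_checksum bytes 8-9 of the page header, as the patch comment implies. Inverting the field guarantees the stored value differs from the original, and all I/O on the handle uses the unbuffered sys* calls rather than mixing seek() with syswrite().

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical stand-in for a relation file: a 24-byte dummy "page header"
# (bytes 0x01..0x18) followed by a little payload.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
my $original = pack('C*', 1 .. 24) . "payload";
syswrite($fh, $original) or die "syswrite failed: $!";
close $fh or die "close failed: $!";

# Illustrative helper (not from the patch): flip every bit of bytes 8-9.
sub invert_checksum_field
{
    my ($filename) = @_;

    open my $file, '+<', $filename or die "open failed: $!";
    binmode $file;

    my $pageheader;
    sysread($file, $pageheader, 24) or die "sysread failed: $!";

    # String XOR: the eight \0 bytes leave bytes 0-7 untouched, \xff\xff
    # inverts bytes 8-9, and the shorter operand is zero-extended, so
    # bytes 10-23 are unchanged as well.
    $pageheader ^= "\0\0\0\0\0\0\0\0\xff\xff";

    # sysseek, not seek: the handle is only ever used with unbuffered I/O.
    # At offset 0 sysseek returns "0 but true", so "or die" is safe here.
    sysseek($file, 0, SEEK_SET) or die "sysseek failed: $!";
    syswrite($file, $pageheader) or die "syswrite failed: $!";
    close $file or die "close failed: $!";
    return;
}

invert_checksum_field($path);

# Read the file back to see what changed.
open my $in, '<', $path or die "open failed: $!";
binmode $in;
my $modified = do { local $/; <$in> };
close $in;

print unpack('H*', substr($modified, 8, 2)), "\n";    # prints "f6f5"
```

The dummy header stores 0x09 and 0x0a at offsets 8 and 9, so the inverted bytes come out as 0xf6 and 0xf5, while everything else in the file is byte-for-byte identical.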
