Re: Making pg_rewind faster - Mailing list pgsql-hackers
From | Srinath Reddy Sadipiralla |
---|---|
Subject | Re: Making pg_rewind faster |
Date | |
Msg-id | CAFC+b6ryuc_5uwkNcHDXUYZWZ+zUydTPPTS5xiZY1Z_t-EpkjQ@mail.gmail.com Whole thread Raw |
In response to | Re: Making pg_rewind faster (Michael Paquier <michael@paquier.xyz>) |
List | pgsql-hackers |
Hi Michael,
On Thu, Oct 23, 2025 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Oct 21, 2025 at 04:14:55PM -0700, John H wrote:
> Made the changes and included the documentation update.
I have been looking at this patch, and the first part that stood out
is that the refactoring related to isRelDataFile() can be done as an
independent piece, cutting the total size of the main patch by 30% or
so. So I have extracted this part, simplified it to make it more
consistent and les noisy on HEAD, and applied it as 6ae08d9583e9.
yeah ... that's cleaner, thanks for committing.
+# Sleep to allow mtime to be different
+usleep(1000000);
[...]
+ [qr/WAL segment \"$wal_seg_skipped\" not copied to target/,
+ qr/pg_wal\/$corrupt_wal_seg \(COPY\)/
+ ],
FWIW, I am definitely not a fan of the test relying on the timestamp
of the files based on their retrieved meta-data stats, with the mtime.
I suspect complications on Windows. Worse thing, there is a forced
sleep of 1s added to the test, to make sure that the timestamp of the
files we compare have a different number. But do we really need that
anyway if we have the debug information with the file map printed in
it?
thought it would act like an extra sanity check ,to prove the point that the
pg_rewind is saying through the debug info that it has been really copied
or skipped.
Hence, I think that we should simplify and rework the test a bit,
tweaking a bit how the filemap is printed: now that we can detect if
an operation is happening on a WAL file thanks to file_content_type_t,
let's always print the type of operation done for each WAL file and
remove this "not copied to target" debug log which provides the same
information, then let's compare the debug output with what we want.
i guess this is what you want ,please let me know if it's otherwise,
diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 2d82edec060..2180495d337 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -546,7 +546,7 @@ print_filemap(filemap_t *filemap)
for (i = 0; i < filemap->nentries; i++)
{
entry = filemap->entries[i];
- if (entry->action != FILE_ACTION_NONE ||
+ if (entry->content_type == FILE_CONTENT_TYPE_WAL || entry->action != FILE_ACTION_NONE ||
entry->target_pages_to_overwrite.bitmapsize > 0)
{
pg_log_debug("%s (%s)", entry->path,
@@ -733,7 +733,6 @@ decide_wal_file_action(const char *fname, XLogSegNo last_common_segno,
*/
if (file_segno < last_common_segno && source_size == target_size)
{
- pg_log_debug("WAL segment \"%s\" not copied to target", fname);
return FILE_ACTION_NONE;
}
diff --git a/src/bin/pg_rewind/t/011_wal_copy.pl b/src/bin/pg_rewind/t/011_wal_copy.pl
index b9e24844654..3e12744dfa6 100644
--- a/src/bin/pg_rewind/t/011_wal_copy.pl
+++ b/src/bin/pg_rewind/t/011_wal_copy.pl
@@ -93,7 +93,7 @@ command_checks_all(
0,
[qr//],
[
- qr/WAL segment \"$wal_seg_skipped\" not copied to target/,
+ qr/pg_wal\/$wal_seg_skipped \(NONE\)/,
qr/pg_wal\/$corrupt_wal_seg \(COPY\)/
],
'run pg_rewind');
It seems to me that it would be good for the test to run some sanity
checks on the rewound standby, as well. That would provide more
validation for the case of the "corrupted" segment that gets copied
anyway.
i think these checks help do the same
1) qr/pg_wal\/$corrupt_wal_seg \(COPY\)/
this shows the corrupted segment is copied.
2) ok( $corrupt_wal_seg_stat_before_rewind->size !=
$corrupt_wal_seg_source_stat->size,
"Expected WAL segment $corrupt_wal_seg to have different size in
source vs target before rewind"
);
my $corrupt_wal_seg_stat_after_rewind = stat($corrupt_wal_seg_in_target_path);
ok( $corrupt_wal_seg_stat_after_rewind->size ==
$corrupt_wal_seg_source_stat->size,
"Expected WAL segment $corrupt_wal_seg file sizes to be same between
target and source after rewind as it was copied"
);
These checks show that before the corrupt segment's size on
rewound standby(target) was not the same as source but after
1) qr/pg_wal\/$corrupt_wal_seg \(COPY\)/
this shows the corrupted segment is copied.
2) ok( $corrupt_wal_seg_stat_before_rewind->size !=
$corrupt_wal_seg_source_stat->size,
"Expected WAL segment $corrupt_wal_seg to have different size in
source vs target before rewind"
);
my $corrupt_wal_seg_stat_after_rewind = stat($corrupt_wal_seg_in_target_path);
ok( $corrupt_wal_seg_stat_after_rewind->size ==
$corrupt_wal_seg_source_stat->size,
"Expected WAL segment $corrupt_wal_seg file sizes to be same between
target and source after rewind as it was copied"
);
These checks show that before the corrupt segment's size on
rewound standby(target) was not the same as source but after
rewind it was the same as source,please let me know if i am
missing your point here.
pgsql-hackers by date: