Thank you for your feedback. I've incorporated your suggestions: the test now scans the logs produced by pg_rewind when asserting that certain WAL segment files were not copied over to the target server.
I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.
Thanks,
Justin
From: Michael Paquier
Sent: Tuesday, July 19, 2022 1:36 AM
To: Justin Kwan
Cc: Tom Lane; pgsql-hackers; vignesh; jkwan@cloudflare.com; vignesh ravichandran; hlinnaka@iki.fi
Subject: Re: Making pg_rewind faster
On Mon, Jul 18, 2022 at 05:14:00PM +0000, Justin Kwan wrote:
> Thank you for taking a look at this and that sounds good. I will
> send over a patch compatible with Postgres v16.
+$node_2->psql(
+    'postgres',
+    "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/000000010000000000000003');",
+    stdout => \my $last_common_tli1_wal_last_modified_at);

Please note that you should not rely on the FS-level stats for anything that touches the WAL segments. A rough guess about what you could do here to make sure that only the set of WAL segments you are looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are copied or not, depending on the divergence point (aka the last checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind, checking that only the segments you want are available once the operation completes.
--
Michael
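The first option above (scanning the pg_rewind logs) could be sketched roughly as follows. This is a standalone Python illustration, not the TAP test code from the patch. It assumes the log contains one entry per file in the form `<path> (<ACTION>)`, which is how the debug output is believed to look; that format, the function names, and the sample log are assumptions made for illustration only.

```python
import re

# ASSUMPTION: each file entry in the pg_rewind log looks like
#   "pg_wal/000000010000000000000003 (COPY)"
# Adjust the pattern if the real output differs.
ENTRY_RE = re.compile(r"(pg_wal/[0-9A-F]{24}) \((\w+)\)")


def wal_segment_actions(log_text):
    """Map each WAL segment path found in the log to its reported action."""
    actions = {}
    for line in log_text.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            actions[match.group(1)] = match.group(2)
    return actions


def copied_segments(log_text):
    """Return the sorted WAL segment paths whose reported action is COPY."""
    actions = wal_segment_actions(log_text)
    return sorted(path for path, action in actions.items() if action == "COPY")


if __name__ == "__main__":
    # Hypothetical log excerpt, made up for this example.
    sample_log = (
        "pg_wal/000000010000000000000002 (NONE)\n"
        "pg_wal/000000010000000000000003 (COPY)\n"
        "base/1/1234 (COPY)\n"
    )
    print(copied_segments(sample_log))
```

A test could then assert that segments older than the divergence point are absent from the returned list, which avoids depending on file modification times.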