Re: Making pg_rewind faster - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Making pg_rewind faster
Date
Msg-id CA+TgmobtNkXDMSi+n0TO2O4cpSGZTVn9414xsHOfeiJc_gwPuA@mail.gmail.com
In response to Re: Making pg_rewind faster  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Tue, Oct 28, 2025 at 12:02 AM Michael Paquier <michael@paquier.xyz> wrote:
> I was thinking about this argument over the weekend, and I am
> wondering if we could not do better here to detect if a file should be
> copied or not.  What if we included a checksum of each file if both
> exist on the target and source, and just not copy them if the
> checksums match?

Well, that would require reading the entire file on both sides to
compute the checksum, which sounds pretty expensive. I mean, a copy
would only be a read on one side and a write on the other. Even
granting that writes are more expensive than reads, a read of both
sides would still be a substantial percentage of the total cost, I
think.
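
To put rough numbers on that comparison, here is a minimal cost sketch in Python. The per-byte costs and the function names (checksum_compare_cost, copy_cost) are assumptions made up for illustration; nothing here is measured, and real costs depend on storage, network, and caching.

```python
# Toy cost model; all names and unit costs here are hypothetical.

def checksum_compare_cost(size_bytes, read_cost):
    """Checksumming reads the whole file on both source and target."""
    return 2 * size_bytes * read_cost


def copy_cost(size_bytes, read_cost, write_cost):
    """A plain copy reads the file on one side and writes it on the other."""
    return size_bytes * read_cost + size_bytes * write_cost


def checksum_then_maybe_copy_cost(size_bytes, read_cost, write_cost, matches):
    """Total cost of the checksum approach: the copy is skipped only on a match."""
    cost = checksum_compare_cost(size_bytes, read_cost)
    if not matches:
        cost += copy_cost(size_bytes, read_cost, write_cost)
    return cost


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # one file of 16 MiB, chosen arbitrarily
    # Assume, purely for illustration, that a write costs twice a read.
    ratio = checksum_compare_cost(size, 1.0) / copy_cost(size, 1.0, 2.0)
    print(f"checksum-only cost as a fraction of copy cost: {ratio:.2f}")
```

Under that assumed 2:1 write-to-read cost, checksumming both sides costs about two thirds of a full copy even when the files match, and when they do not match the copy still has to be paid on top.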

Also, I don't think we really want to reinvent a worse version of
rsync. If you want to use checksums or file timestamps to decide what
to copy, there are already good tools for that which probably handle
that task better than our code ever will. What we can bring to the
table is PG-specific logic, where we're able to reason about the
behavior of PG in a way that a general-purpose tool can't. That's why
for example we use the WAL to decide what data blocks need to be
copied, rather than checksums -- it's an optimization that rsync can't
do, and we can. The rule implemented here is similar: rsync can't know
that WAL from before the divergence point should be the same on both
sides, but we can.
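
As an illustration of that rule, here is a minimal Python sketch that decides from a WAL segment file name alone whether the segment ends at or before the divergence point. This is not the patch's code: the function names are hypothetical, the 16MB segment size is the default and is assumed here, and timeline history is deliberately ignored, which a real implementation would have to handle.

```python
# Hypothetical sketch; not taken from pg_rewind or from the proposed patch.

DEFAULT_WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumes the default 16MB segments


def parse_lsn(text):
    """Convert an LSN in the usual 'X/Y' hex notation to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def segment_number(filename, wal_segment_size=DEFAULT_WAL_SEGMENT_SIZE):
    """Decode the segment number from a 24-hex-digit WAL segment file name.

    The name is three 8-digit hex fields: timeline ID, high part, low part.
    The timeline ID is ignored in this sketch.
    """
    if len(filename) != 24:
        raise ValueError(f"not a WAL segment file name: {filename!r}")
    high = int(filename[8:16], 16)
    low = int(filename[16:24], 16)
    segments_per_high = 0x100000000 // wal_segment_size
    return high * segments_per_high + low


def ends_before_divergence(filename, divergence_lsn,
                           wal_segment_size=DEFAULT_WAL_SEGMENT_SIZE):
    """True if every byte of the segment precedes the divergence point."""
    end_lsn = (segment_number(filename, wal_segment_size) + 1) * wal_segment_size
    return end_lsn <= divergence_lsn


if __name__ == "__main__":
    divergence = parse_lsn("0/5000028")  # example value, not from the thread
    for name in ("000000010000000000000003", "000000010000000000000005"):
        verdict = "skip" if ends_before_divergence(name, divergence) else "copy"
        print(name, verdict)
```

The decision needs only the file name and the divergence LSN, so no file contents are read on either side, which is the contrast with a checksum-based comparison.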

Now, of course, if in a specific situation the assumptions on which
pg_rewind relies are not valid, e.g. because manual data directory
modification has occurred, then pg_rewind should not be used. And if
on the other hand we find some flaw that will keep pg_rewind from
delivering correct results even when nothing strange has happened,
then that's a bug or a design problem that we need to fix. But if we
just start second-guessing ourselves by adding overhead to protect
against can't-happen scenarios, we'll end up making pg_rewind useless.

--
Robert Haas
EDB: http://www.enterprisedb.com


