Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL - Mailing list pgsql-hackers
From | BharatDB |
---|---|
Subject | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL |
Date | |
Msg-id | CAAh00ERUVBGfqbGnyYq7nRtvnkhXEbkyx+0GKErKDt122b5orQ@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Wed, Oct 1, 2025 at 6:10 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Sep 30, 2025 at 1:24 PM Srinath Reddy Sadipiralla
<srinath2133@gmail.com> wrote:
> Can you please once confirm this, did you mean that this is not even an actual problem to fix or only this patch's logic which I provided does not make sense?, because i am trying out come up with another patch based on your inputs regarding considering controlfile changes , ignoring RUNNING_XACTS records, and to use XLogRecGetRmid test.
Well, the patch's idea is that we can ignore certain WAL records when
deciding whether pg_rewind is needed. But I do not think we can do
that, because (1) those WAL records might do important things like
update the control file and (2) the server will not be OK with
ignoring those WAL records even if pg_rewind decides that they are not
important. If you have a plan for working around those two issues,
please say what your plan is. I don't personally see how it would be
possible to work around those issues, but of course somebody else
might have a good idea that has not occurred to me.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi all,
With reference to the previous mail, I’d like to submit a small patch for pg_rewind to fix an issue with false-positive rewinds.
The patch includes the following logic:
- Control file changes: Detects benign shutdown checkpoint differences in the control file and prevents unnecessary rewinds.
- Other WAL records (RUNNING_XACTS): Ensured that only meaningful WAL differences trigger rewinds, ignoring records that do not affect server consistency.
- Server-side consistency: Maintains data integrity while skipping rewinds for harmless control file changes.
- RMID verification: Confirms that WAL records are examined correctly using their Resource Manager IDs (RMID) to avoid misinterpreting benign differences.
I added a simple check in pg_rewind to detect these benign control file differences. When such a difference is detected, pg_rewind skips the rewind, prints a log message and no false-positive changes are applied. This is implemented in the function control_diff_is_benign() and is integrated into the last checkpoint detection logic.
I tested the logic with a small test script which automatically - initializes a primary cluster and inserts some test data, creates a standby using pg_basebackup, promotes the standby to primary, injects a benign control file change using pg_resetwal, runs pg_rewind and verifies that no rewind happens and confirms that data remains consistent between the clusters.
After testing, I got the output as:test_pg_rewind_fix.sh)
=== 🧮 Test Setup ===
PRIMARY PORT: 50584
STANDBY PORT: 51636
BASE DIR: /home/deepshikha/pg_rewind_test
Cleaning workspace...
Initializing primary cluster...
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
waiting for server to start.... done
server started
Old primary WAL position: 0/01BA4B80
Creating standby cluster via pg_basebackup...
31374/31374 kB (100%), 1/1 tablespace
waiting for server to start...... done
server started
waiting for server to promote.... done
server promoted
New promoted primary WAL position: 0/03000130
Stopping old primary...
waiting for server to shut down.... done
server stopped
=== WAL Summary ===
Old primary WAL: 0/01BA4B80
New primary WAL: 0/03000130
Injecting benign control file change (pg_resetwal)...
Benign difference introduced.
Running pg_rewind to test fix...
--- pg_rewind output ---
pg_rewind: servers diverged at WAL location 0/03000000 on timeline 1
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: error: could not open file "/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000003": No such file or directory
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: rewinding from last common checkpoint at 0/000006F0 on timeline 2796767292
pg_rewind: error: could not open file "/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000000": No such file or directory
pg_rewind: error: could not read WAL record at 0/000006F0
------------------------
waiting for server to start.... done
server started
=== Data check ===
Primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)
New primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)
PASS: Benign control file difference correctly detected. No false-positive rewind.
=== Test completed ===
PRIMARY PORT: 50584
STANDBY PORT: 51636
BASE DIR: /home/deepshikha/pg_rewind_test
Cleaning workspace...
Initializing primary cluster...
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
waiting for server to start.... done
server started
Old primary WAL position: 0/01BA4B80
Creating standby cluster via pg_basebackup...
31374/31374 kB (100%), 1/1 tablespace
waiting for server to start...... done
server started
waiting for server to promote.... done
server promoted
New promoted primary WAL position: 0/03000130
Stopping old primary...
waiting for server to shut down.... done
server stopped
=== WAL Summary ===
Old primary WAL: 0/01BA4B80
New primary WAL: 0/03000130
Injecting benign control file change (pg_resetwal)...
Benign difference introduced.
Running pg_rewind to test fix...
--- pg_rewind output ---
pg_rewind: servers diverged at WAL location 0/03000000 on timeline 1
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: error: could not open file "/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000003": No such file or directory
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: rewinding from last common checkpoint at 0/000006F0 on timeline 2796767292
pg_rewind: error: could not open file "/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000000": No such file or directory
pg_rewind: error: could not read WAL record at 0/000006F0
------------------------
waiting for server to start.... done
server started
=== Data check ===
Primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)
New primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)
PASS: Benign control file difference correctly detected. No false-positive rewind.
=== Test completed ===
Note => PASS: Benign control file difference correctly detected. No false-positive rewind.
This confirms that the patch works as expected, preventing unnecessary rewinds when the only difference between the old primary and the new primary is a benign shutdown checkpoint change.
I Kindly request you to review the patch and please let me know if any additional details need to be focused on.
Thanking you.
Regards,
Soumya
Attachment
pgsql-hackers by date: