Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done. - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.
Date
Msg-id e08235e3-b216-4bf3-8a9a-8ea819ae105e@iki.fi
In response to BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
On 08/08/2024 10:57, Georgy Shelkovy wrote:
> Unfortunately, the playback is not very stable, but sometimes it shoots. 
> I added some commands to show last WAL rows

Thanks. I still haven't been able to reproduce it, but here's a theory:

When determining whether the target needs rewinding, pg_rewind looks at 
the target's last checkpoint record, or if it's a standby, its 
minRecoveryPoint. It's possible that standby2's minRecoveryPoint is 
indeed before the point of divergence. That means it has replayed the 
340 insert records, but all the changes are still only sitting in the 
shared buffer cache. When you shut it down, those 340 inserts are gone 
on standby2. When you restart it, they will be applied again from the WAL.

In that case, pg_rewind's conclusion that no rewind is needed is 
correct. standby2 is strictly behind standby1, and could catch up 
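directly to it. However, when you restart standby2, it will first replay 
the WAL it had already streamed from master, including the records past 
the point of divergence, so it ends up diverged from standby1 after all.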
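
As a rough illustration of the check described above, here is a simplified sketch in Python. It is not pg_rewind's actual code: the function names and the LSN values are made up for the example, and the only assumption carried over is that the decision compares the target's recorded end position (last checkpoint, or minRecoveryPoint on a standby) against the point of divergence.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' in hex (e.g. '0/3000060') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def rewind_needed(target_end: str, divergence: str) -> bool:
    """Simplified model of the decision (hypothetical helper, not the real API).

    target_end is the position the control file reports for the target:
    the last checkpoint, or minRecoveryPoint if the target is a standby.
    A rewind is considered necessary only if that position lies past the
    point of divergence.
    """
    return parse_lsn(target_end) > parse_lsn(divergence)


# Hypothetical values: minRecoveryPoint was never advanced past the
# divergence point, even though later WAL had been replayed in memory.
MIN_RECOVERY_POINT = "0/3000060"
DIVERGENCE_POINT = "0/3000100"

if __name__ == "__main__":
    print(rewind_needed(MIN_RECOVERY_POINT, DIVERGENCE_POINT))
```

With these made-up values the sketch reports that no rewind is needed, even though WAL beyond the divergence point may still be sitting in the target's pg_wal directory.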

Can you show the full output of pg_controldata on all the servers, 
please? In your latest snippet, you showed just the checkpoint 
locations, but if you just remove the "grep checkpoint | grep location" 
filters, it would print the whole thing. I'm particularly interested in 
the minRecoveryPoint on standby2, in the cases when it works and when it 
doesn't.
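
If it helps when comparing the working and failing runs, the relevant fields can be pulled out of saved pg_controldata output with a few lines of Python. This is only a sketch: the SAMPLE text below is a made-up, heavily shortened excerpt, and the helper name is hypothetical. It assumes the usual "label: value" layout of the output.

```python
def parse_controldata(output: str) -> dict[str, str]:
    """Split pg_controldata-style 'label: value' lines into a dictionary."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, since values (e.g. timestamps)
        # may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Made-up, shortened sample; real output contains many more lines.
SAMPLE = """\
Database cluster state:               shut down in recovery
Latest checkpoint location:           0/3000028
Minimum recovery ending location:     0/3000060
"""

if __name__ == "__main__":
    fields = parse_controldata(SAMPLE)
    for label in ("Database cluster state",
                  "Latest checkpoint location",
                  "Minimum recovery ending location"):
        print(label, "=", fields.get(label))
```

Running this against the output from each server, once for a run that works and once for a run that fails, would make any difference in the minimum recovery location easy to spot.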

I'm not sure what the right behavior would be if that's the issue. 
Perhaps pg_rewind should truncate the WAL in standby2/pg_wal/ in that 
case, so that when you start it up again, it would not replay the local 
WAL but would connect to standby1 directly. Also, perhaps a fast 
shutdown of a standby server should update minRecoveryPoint before exiting.

-- 
Heikki Linnakangas
Neon (https://neon.tech)



