Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly |
Date | |
Msg-id | CAPpHfdvjGWo--xqqjJbyb_amdkhqamnzrwCZWe_hBD-rSTFbBg@mail.gmail.com |
In response to | RE: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses | Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly |
List | pgsql-hackers |
Dear Kuroda-san,

On Thu, Jun 19, 2025 at 2:05 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
> > > Regarding assertion failure, I've found that assert in
> > > PhysicalConfirmReceivedLocation() conflicts with restart_lsn
> > > previously set by ReplicationSlotReserveWal(). As I can see,
> > > ReplicationSlotReserveWal() just picks fresh XLogCtl->RedoRecPtr lsn.
> > > So, it doesn't seems there is a guarantee that restart_lsn never goes
> > > backward. The commit in ReplicationSlotReserveWal() even states there
> > > is a "chance that we have to retry".
> > >
> >
> > I don't see how this theory can lead to a restart_lsn of a slot going
> > backwards. The retry mentioned there is just a retry to reserve the
> > slot's position again if the required WAL is already removed. Such a
> > retry can only get the position later than the previous restart_lsn.
>
> We analyzed the assertion failure happened at pg_basebackup/020_pg_receivewal,
> and confirmed that restart_lsn can go backward. This meant that Assert() added
> by the ca307d5 can cause crash.
>
> Background
> ===========
> When pg_receivewal starts the replication and it uses the replication slot, it
> sets as the beginning of the segment where restart_lsn exists, as the startpoint.
> E.g., if the restart_lsn of the slot is 0/B000D0, pg_receivewal requests WALs
> from 0/B00000.
> More detail of this behavior, see f61e1dd2 and d9bae531.
>
> What happened here
> ==================
> Based on above theory, walsender sent from the beginning of segment (0/B00000).
> When walreceiver receives, it tried to send reply. At that time the flushed
> location of WAL would be 0/B00000. walsender sets the received lsn as restart_lsn
> in PhysicalConfirmReceivedLocation(). Here the restart_lsn went backward (0/B000D0->0/B00000).
>
> The assertion failure could happen if CHECKPOINT happened at that time.
> Attribute last_saved_restart_lsn of the slot was 0/B000D0, but the data.restart_lsn
> was 0/B00000. It could not satisfy the assertion added in InvalidatePossiblyObsoleteSlot().

Thank you for your detailed explanation!

> Note
> ====
> 1.
> In this case, starting from the beginning of the segment is not a problem, because
> the checkpoint process only removes WAL files from segments that precede the
> restart_lsn's wal segment. The current segment (0/B00000) will not be removed,
> so there is no risk of data loss or inconsistency.
>
> 2.
> A similar pattern applies to pg_basebackup. Both use logic that adjusts the
> requested streaming position to the start of the segment, and it replies the
> received LSN as flushed.
>
> 3.
> I considered the theory above, but I could not reproduce 040_standby_failover_slots_sync
> because it is a timing issue. Have someone else reproduced?
>
> We are still investigating failure caused at 040_standby_failover_slots_sync.

I didn't manage to reproduce this either. But as I can see from the logs [1]
on mamba, a START_REPLICATION command was issued just before the assert trap.
Could it be something similar to what I described in [2]? Namely:

1. ReplicationSlotReserveWal() sets restart_lsn for the slot.
2. A concurrent checkpoint flushes that restart_lsn to disk.
3. PhysicalConfirmReceivedLocation() sets restart_lsn of the slot to the
   beginning of the segment.

[1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mamba&dt=2025-06-17%2005%3A10%3A36&stg=recovery-check
[2] https://www.postgresql.org/message-id/CAPpHfdv3UEUBjsLhB_CwJT0xX9LmN6U%2B__myYopq4KcgvCSbTg%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase
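
For illustration only, the three-step sequence suspected in this message can be modeled with a small Python sketch. This is a toy model and not PostgreSQL code: the ToySlot class, its method names, and segment_start() are hypothetical, and the 1 MB WAL segment size is an assumption chosen so that 0/B000D0 falls in the segment starting at 0/B00000, as in the quoted example. The thread does not quote the exact form of the assertion, so went_backward() is only an assumed reading of the condition that fails.

```python
# Toy model of the suspected sequence; not PostgreSQL code.
# Assumption: 1 MB WAL segments, so 0/B000D0 lies in the segment at 0/B00000.
WAL_SEGMENT_SIZE = 0x100000


def segment_start(lsn: int) -> int:
    """Round an LSN down to the first byte of its WAL segment."""
    return lsn - (lsn % WAL_SEGMENT_SIZE)


class ToySlot:
    """Hypothetical stand-in for a physical replication slot."""

    def __init__(self) -> None:
        self.restart_lsn = 0             # in-memory value (data.restart_lsn)
        self.last_saved_restart_lsn = 0  # value last flushed by a checkpoint

    def reserve_wal(self, redo_ptr: int) -> None:
        # Step 1: models ReplicationSlotReserveWal() picking a redo pointer.
        self.restart_lsn = redo_ptr

    def checkpoint_flush(self) -> None:
        # Step 2: models a concurrent checkpoint persisting restart_lsn.
        self.last_saved_restart_lsn = self.restart_lsn

    def confirm_received(self, flushed_lsn: int) -> None:
        # Step 3: models PhysicalConfirmReceivedLocation() adopting the
        # flushed LSN reported by the client.
        self.restart_lsn = flushed_lsn

    def went_backward(self) -> bool:
        # True when the in-memory value is behind the last saved one.
        return self.restart_lsn < self.last_saved_restart_lsn


slot = ToySlot()
slot.reserve_wal(0xB000D0)                      # step 1
slot.checkpoint_flush()                         # step 2
slot.confirm_received(segment_start(0xB000D0))  # step 3: client starts at segment start
print(hex(slot.restart_lsn), slot.went_backward())
```

In this model the slot ends with restart_lsn at the segment start while last_saved_restart_lsn still holds the reserved value. That matches the state described in the quoted analysis, where last_saved_restart_lsn was 0/B000D0 and data.restart_lsn was 0/B00000.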