Re: logical: fix recomputation required LSN on restart_lsn-only advancement - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: logical: fix recomputation required LSN on restart_lsn-only advancement
Date
Msg-id 10069.1776755340@localhost
Whole thread
In response to logical: fix recomputation required LSN on restart_lsn-only advancement  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: logical: fix recomputation required LSN on restart_lsn-only advancement
List pgsql-hackers
Chao Li <li.evan.chao@gmail.com> wrote:

> While reading logical replication code, I found an issue in LogicalConfirmReceivedLocation().
>
> In LogicalConfirmReceivedLocation(), updated_restart is tracked independently from updated_xmin, and the slot is
markeddirty and saved when either one changed. But after that, ReplicationSlotsComputeRequiredLSN() is still only
calledinside "if (updated_xmin)”.  
>
> So for the restart-only case:
>
> * updated_restart = true
> * updated_xmin = false
> * ReplicationSlotSave() runs
> * ReplicationSlotsComputeRequiredLSN() does not run because updated_xmin is false
>
> That means the global retention point managed by XLogSetReplicationSlotMinimumLSN() can stay stale until some later
unrelatedevent recomputes it. Since ReplicationSlotsComputeRequiredLSN() derives the global minimum from slot
restat_lsn,skipping it after a restart-only advance can retain excess WAL and may lead to WAL bloat. 
>
> This patch fixes the problem by moving ReplicationSlotsComputeRequiredLSN() under “if (updated_restart)”.

FYI, this overlaps with another post in the REPACK thread [1].

> Looks like this issue has been there for a long time, so if this analysis is correct, it may also be worth
back-patching.

As REPACK in PG 19 does not let xmin advance (that should be fixed in the
future), I think makes sense to apply [1] to v19. However, during logical
replication, xmin (IMO) gets updated rather often, so the problem should not
be that severe in earlier versions.

[1]
https://www.postgresql.org/message-id/TYRPR01MB14195633567DA00ABD42570B794592%40TYRPR01MB14195.jpnprd01.prod.outlook.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



pgsql-hackers by date:

Previous
From: Yuchen Li
Date:
Subject: Re: repack: fix a bug to reject deferrable primary key fallback for concurrent mode
Next
From: Shruthi Gowda
Date:
Subject: Re: [BUG] CRASH: ECPGprepared_statement() and ECPGdeallocate_all() when connection is NULL