Re: Replication slot WAL reservation - Mailing list pgsql-general

From Christophe Pettus
Subject Re: Replication slot WAL reservation
Date
Msg-id 3CC662DB-55FC-42C5-9068-2365F89229E8@thebuild.com
In response to Re: Replication slot WAL reservation  (Phillip Diffley <phillip6402@gmail.com>)
List pgsql-general

> On Mar 26, 2025, at 07:55, Phillip Diffley <phillip6402@gmail.com> wrote:
> Just to confirm, it sounds like the order messages are sent from the output plugin is what matters here. When you
> update confirmed_flush_lsn to LSN "A", any messages that were sent by the output plugin after the message with LSN "A"
> will be replayable. Any messages sent by the output plugin before the message with LSN "A" will most likely not be
> replayed, since their data is freed for deletion. Is that correct?

The terminology is shifting around a bit here, so to be specific: When the primary (or publisher) receives a message
from the secondary (or replica) that a particular LSN has been flushed, the primary at that point feels free to recycle
any WAL segments that only contain WAL entries whose LSN is less than that flush point (whether or not it actually does
depends on a lot of other factors).  The actual horizon that the primary needs to retain can be farther back than that,
because there's no requirement that the secondary send an LSN as confirmed_flush_lsn that is at a transaction boundary,
so the flush LSN might land in the middle of a transaction.  The actual point before which the primary can recycle WAL
is restart_lsn, which the primary determines based on the flush LSN.
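
To make the "segments entirely before restart_lsn" idea concrete, here is a minimal sketch in Python.  It is only an
illustration of the arithmetic, not what the server actually runs: it assumes the default 16 MiB WAL segment size, and
the helper names (parse_lsn, segment_is_recyclable) are made up for this example.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MiB segment size


def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def segment_is_recyclable(segment_no: int, restart_lsn: int) -> bool:
    """A segment is a candidate only if it ends at or before restart_lsn."""
    segment_end = (segment_no + 1) * WAL_SEGMENT_SIZE
    return segment_end <= restart_lsn


# restart_lsn sits just inside the fourth segment (segment number 3),
# so only the three segments before it are candidates.
restart = parse_lsn("0/3000060")
candidates = [n for n in range(5) if segment_is_recyclable(n, restart)]
```

Whether the primary really removes those segments still depends on the other factors mentioned above.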

When the secondary connects, it provides an LSN from which the primary should start sending WAL (if a binary replica)
or decoded WAL via the plugin (if a logical replica).  For a logical replica, that can be confirmed_flush_lsn or any
point after, but it can't be before.  (Even if the WAL exists, the primary will return an error if the start point
provided in START_REPLICATION is before confirmed_flush_lsn for a logical replication slot.)  Of course, you'll get an
error if START_REPLICATION supplies an LSN that doesn't actually exist yet.
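
On the client side, one way to stay inside that range is to compare the flush point the client has stored with the
slot's confirmed_flush_lsn (as read from pg_replication_slots) and never ask for anything earlier.  A minimal sketch,
with hypothetical helper names; note that the comparison has to be numeric, since comparing LSNs as strings gives the
wrong answer:

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/16B3748' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back to the 'X/Y' hexadecimal form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def choose_start_lsn(client_flush_lsn: str, slot_confirmed_flush_lsn: str) -> str:
    """Return the later of the two LSNs, so the request is never before the slot's flush point."""
    later = max(parse_lsn(client_flush_lsn), parse_lsn(slot_confirmed_flush_lsn))
    return format_lsn(later)


# "0/9" sorts after "0/10" as a string, but it is the earlier LSN.
start = choose_start_lsn("0/9", "0/10")
```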

The behavior that the primary is expecting from the secondary is that the secondary never sends back a
confirmed_flush_lsn until everything up to that point is crash / disconnection-safe.  What "safe" means in this case
depends on the client behavior.  It might be just spooling the incoming stream to disk and processing it later, or it
might be processing it completely on the fly as it comes in.
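
For the spool-to-disk case, the rule comes down to "fsync first, then advance the flush point".  The sketch below is a
hypothetical consumer, not a real replication client: the class name, the integer LSNs, and the one-message-per-line
spool format are all assumptions made for illustration.

```python
import os
import tempfile


class SpoolingConsumer:
    """Hypothetical consumer that spools messages to disk before acknowledging them."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self.received_lsn = 0  # highest LSN seen so far
        self.flushed_lsn = 0   # highest LSN that is durable, safe to report back

    def receive(self, lsn, payload):
        # Written to the file object, but not yet guaranteed to be on disk.
        self._file.write(payload + b"\n")
        self.received_lsn = lsn

    def flush(self):
        # Only after fsync returns may the flush point move forward.
        self._file.flush()
        os.fsync(self._file.fileno())
        self.flushed_lsn = self.received_lsn
        return self.flushed_lsn

    def close(self):
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    consumer = SpoolingConsumer(os.path.join(tmp, "spool.bin"))
    consumer.receive(100, b"BEGIN")
    consumer.receive(200, b"COMMIT")
    lsn_before = consumer.flushed_lsn  # still 0: nothing is durable yet
    lsn_after = consumer.flush()       # 200: safe to send as the flush LSN
    consumer.close()
```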

The most important point here is that the client consuming the logical replication messages must keep track of the
flush point (defined however the client implements processing the messages), and provide the right one back to the
primary when it connects.  (Another option is that the client is written so that each transaction is idempotent,
and even if transactions that it has already processed are sent again, the result is the same.)
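
One simple way to get that idempotent behavior is to remember the commit LSN of the last transaction applied and skip
anything at or before it.  This is a sketch only: the class and its in-memory dict stand in for a real target, and it
assumes transactions arrive in commit order.  In a real client, the applied data and the stored LSN would need to be
written atomically.

```python
class IdempotentApplier:
    """Hypothetical applier that ignores transactions it has already processed."""

    def __init__(self):
        self.last_applied_commit_lsn = 0
        self.rows = {}  # stands in for the real target of the changes

    def apply(self, commit_lsn, changes):
        """Apply one transaction; return False if it was a replay and was skipped."""
        if commit_lsn <= self.last_applied_commit_lsn:
            return False
        self.rows.update(changes)
        self.last_applied_commit_lsn = commit_lsn
        return True


applier = IdempotentApplier()
first = applier.apply(100, {"a": 1})
replayed = applier.apply(100, {"a": 999})  # same transaction sent again
second = applier.apply(200, {"b": 2})
```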

One more note is that if the client supplies an LSN (for logical replication) that lands in the middle of a
transaction, the primary will send over the complete transaction, so the actual start point may be earlier than the
supplied start point.  Generally, this means that the client should respect transaction boundaries, and be able to deal
with getting a partial transaction but discarding it if it doesn't get a commit record for it.
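
Respecting transaction boundaries can be as simple as buffering changes until the commit arrives.  In this sketch the
(kind, lsn, data) tuples are a made-up message shape rather than the actual output plugin format; the point is only
that a transaction with no commit record never reaches the caller.

```python
def committed_transactions(messages):
    """Yield (commit_lsn, changes) for complete transactions only."""
    pending = None  # changes of the transaction currently being received
    for kind, lsn, data in messages:
        if kind == "begin":
            pending = []
        elif kind == "change" and pending is not None:
            pending.append(data)
        elif kind == "commit" and pending is not None:
            yield lsn, pending
            pending = None
    # Anything still pending here never got its commit record and is discarded.


stream = [
    ("begin", 10, None),
    ("change", 11, "insert a"),
    ("commit", 12, None),
    ("begin", 20, None),
    ("change", 21, "insert b"),  # connection drops before the commit
]
complete = list(committed_transactions(stream))
```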

