Re: WAL segments removed from primary despite the fact that logical replication slot needs it. - Mailing list pgsql-bugs

From Masahiko Sawada
Subject Re: WAL segments removed from primary despite the fact that logical replication slot needs it.
Date
Msg-id CAD21AoBVhYnGBuW_o=wEGgTp01qiHNAx1a14b1X9kFXmuBe=sg@mail.gmail.com
Whole thread Raw
In response to Re: WAL segments removed from primary despite the fact that logical replication slot needs it.  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-bugs
On Tue, Feb 7, 2023 at 12:22 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Feb-06, Masahiko Sawada wrote:
>
> > I've also confirmed that this issue is fixed by the attached patch,
> > which clears candidate_restart_lsn and friends during
> > ReplicationSlotRelease().
>
> Hmm, interesting -- I was studying some other bug recently involving the
> xmin of a slot that had been invalidated and I remember wondering if
> these "candidate" fields were being properly ignored when the slot is
> marked not in use; but I didn't check.  Are you sure that resetting them
> when the slot is released is the appropriate thing to do?  I mean,
> shouldn't they be kept set while the slot is in use, and only reset if
> we destroy it?

Thinking about the root cause more, it seems to me that the root cause
is not the fact that candidate_xxx values are not cleared when being
released.

In the scenario I reproduced, after restarting the logical decoding,
the walsender sets the restart_lsn to a candidate_restart_lsn left in
the slot upon receiving the ack from the subscriber. Then, when
decoding a RUNNING_XACTS record, the walsender updates the
restart_lsn, which actually retreats it. The patch, which clears the
candidate_xxx values at releasing, fixes this issue by not updating
the restart_lsn upon receiving the ack in this case, but I think it's
legitimate that the walsender sets the restart_lsn to the
candidate_restart_lsn left in the slot in this case. The root cause
here seems to me that it allows the restart_lsn to retreat. In
LogicalIncreaseRestartDecodingForSlot(), we have the following codes:

/* don't overwrite if have a newer restart lsn */
if (restart_lsn <= slot->data.restart_lsn)
{
}

/*
 * We might have already flushed far enough to directly accept this lsn,
 * in this case there is no need to check for existing candidate LSNs
 */
else if (current_lsn <= slot->data.confirmed_flush)
{
    slot->candidate_restart_valid = current_lsn;
    slot->candidate_restart_lsn = restart_lsn;

    /* our candidate can directly be used */
    updated_lsn = true;
}

/*
 * Only increase if the previous values have been applied, otherwise we
 * might never end up updating if the receiver acks too slowly. A missed
 * value here will just cause some extra effort after reconnecting.
 */
if (slot->candidate_restart_valid == InvalidXLogRecPtr)
{
    slot->candidate_restart_valid = current_lsn;
    slot->candidate_restart_lsn = restart_lsn;
    SpinLockRelease(&slot->mutex);
}

If slot->candidate_restart_valid is InvalidXLogRecPtr, he
candidate_restart_lsn could be set to a LSN lower than the slot's
restart_lsn in spite of the first comment saying "don't overwrite if
have a newer restart lsn". So I think the right fix here is to not
allow the restart_lsn to retreat by changing "if" to "else if" of the
third block.

>
> (But, actually, in that case, maybe we don't need to reset them: instead
> we need to make sure to ignore the values for slots that are not in_use.
> However, I don't know what the semantics are for these "candidate"
> fields, so maybe this is wrong.)

IIUC in that case, the WAL segments are removed while the slot is in_use.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-bugs by date:

Previous
From: "Qu, Mischa, Majorel China"
Date:
Subject: 答复: exceptional result of postres_fdw external table joining local table
Next
From: PG Bug reporting form
Date:
Subject: BUG #17781: Assert in setrefs.c