Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4 - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4
Date
Msg-id CAExHW5sC6qQ=GZHij2o=Kj4gsMnJuE_PExs39Pn3-BqgAL4haA@mail.gmail.com
Whole thread Raw
In response to Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4
List pgsql-hackers
On Sat, Nov 9, 2024 at 5:05 PM Tomas Vondra <tomas@vondra.me> wrote:
>
> On 11/8/24 15:57, Ashutosh Bapat wrote:
> > ...
> >
> > After examining the code before reading [2], I came to the same
> > conclusion as Masahiko-san in [2]. We install candidate_restart_lsn
> > based on the running transaction record whose LSN is between
> > restart_lsn and confirmed_flush_lsn. Since candidate_restart_valid of
> > such candidates would be lesser than any confirmed_flush_lsn received
> > from downstream. I am surprised that the fix suggested by Masahiko-san
> > didn't work though. The fix also fix the asymmetry, between
> > LogicalIncreaseXminForSlot and LogicalIncreaseRestartDecodingForSlot,
> > that you have pointed out in your next email. What behaviour do you
> > see with that fix applied?
> >
> >
> > [1] https://www.postgresql.org/message-id/Yz2hivgyjS1RfMKs%40depesz.com
> > [2] https://www.postgresql.org/message-id/CAD21AoBVhYnGBuW_o%3DwEGgTp01qiHNAx1a14b1X9kFXmuBe%3Dsg%40mail.gmail.com
> >
> >
>
> I read that message (and the earlier discussion multiple times) while
> investigating the issue, and TBH it's not very clear to me what the
> conclusion is :-(
>
> There's some discussion about whether the candidate fields should be
> reset on release or not. There are even claims that it might be
> legitimate to not reset the fields and update the restart_lsn. Using
> such "stale" LSN values seems rather suspicious to me, but I don't have
> a proof that it's incorrect. FWIW this unclarity is what I mentioned the
> policy/contract for candidate fields is not explained anywhere.
>
> That being said, I gave that fix a try - see the attached 0001 patch. It
> tweaks LogicalIncreaseRestartDecodingForSlot (it needs a bit more care
> because of the spinlock), and it adds a couple asserts to make sure the
> data.restart_lsn never moves back.
>
> And indeed, with this my stress test script does not crash anymore.

:)

I think the problem is about processing older running transactions
record and setting data.restart_lsn based on the candidates those
records produce. But what is not clear to me is how come a newer
candidate_restart_lsn is available immediately upon WAL sender
restart. I.e. in the sequence of events you mentioned in your first
email
1.  344.139 LOG:  starting logical decoding for slot "s1"

2. 344.139 DETAIL:  Streaming transactions committing after 1/E97EAC30,
                   reading WAL from 1/E96FB4D0.

 3. 344.140 LOG:  logical decoding found consistent point at 1/E96FB4D0

 4. 344.140 DETAIL:  Logical decoding will begin using saved snapshot.

  5. 344.140 LOG:  LogicalConfirmReceivedLocation 1/E9865398

6.  344.140 LOG:  LogicalConfirmReceivedLocation updating
                   data.restart_lsn to 1/E979D4C8 (from 1/E96FB4D0)
                   candidate_restart_valid 0/0 (from 1/E9865398)
                   candidate_restart_lsn 0/0 (from 1/E979D4C8)

how did candidate_restart_lsn = 1/E979D4C8 and candidate_restart_valid
= 1/E9865398 were set in ReplicationSlot after WAL sender? It means it
must have read and processed running transaction record at 1/E9865398.
If that's true, how come it went back to a running transactions WAL
record at 1/E979D4C8? It should be reading WAL records sequentially,
hence read 1/E979D4C8 first then 1/E9865398.

 344.145 LOG:  LogicalIncreaseRestartDecodingForSlot s1
                   candidate_restart_valid_lsn 1/E979D4C8 (0/0)
                   candidate_restart_lsn 1/E96FB4D0 (0/0)

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Support LIKE with nondeterministic collations
Next
From: Ashutosh Bapat
Date:
Subject: Re: Add html-serve target to autotools and meson