Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)
Date
Msg-id CAA4eK1LrxR3YhE-0ydL7eU1-ZfBBz8VzbiJWyYBkGr9wXVKfiA@mail.gmail.com
Whole thread Raw
In response to Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)
List pgsql-hackers
On Mon, Sep 5, 2022 at 5:24 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 9/5/22 12:12, Amit Kapila wrote:
> > On Mon, Sep 5, 2022 at 12:14 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >
> > It is possible that there is some other problem here that I am
> > missing. But at this stage, I don't see anything wrong other than the
> > assertion you have reported.
> >
>
> I'm not sure I agree with that. I'm not convinced the assert is at
> fault, it might easily be that it hints there's a logic bug somewhere.
>

It is possible but let's try to prove it. I am also keen to know if
this hints at a logic bug somewhere.

> >> I think it's mostly clear we won't output this transaction, because the
> >> restart LSN is half-way through. We can either ignore it at commit time,
> >> and then we have to make everything work in case we miss assignments (or
> >> any other part of the transaction).
> >>
> >
> > Note, traditionally, we only form these assignments at commit time
> > after deciding whether to skip such commits. So, ideally, there
> > shouldn't be any fundamental problem with not making these
> > associations before deciding whether we need to replay (send
> > downstream) any particular transaction.
> >
>
> Isn't that self-contradictory? Either we form these assignments at
> commit time, or we support streaming (in which case it clearly can't
> happen at commit time).
>

I was talking about non-streaming cases which also have this assert
problem as seen in this thread. I am intentionally keeping streaming
cases out of this discussion as it happens without those and by
including streaming in the discussion, we will add another angle to
this problem which may not be required.

> AFAICS that's exactly why we started logging
> (and processing) assignments immediately, no?
>
> >> Or we can ignore stuff early, and not even process some of the changes.
> >> For example in this case do we need to process the NEW_CID contents for
> >> transaction 848? If we can skip that bit, the problem will disappear.
> >>
> >> But maybe this is futile and there are other similar issues ...
> >>
> >>>> However, when processing the NEW_CID record:
> >>>>
> >>>> tx:        849, lsn: 0/01CDCDB0, prev 0/01CDCD70, desc: NEW_CID rel
> >>>> 1663/16384/1249; tid 58/38; cmin: 1, cmax: 14, combo: 6
> >>>>
> >>>> we ultimately do this in SnapBuildProcessNewCid:
> >>>>
> >>>> #1  0x0000005566cccdb4 in ReorderBufferAddNewTupleCids (rb=0x559dd64218,
> >>>> xid=848, lsn=30264752, locator=..., tid=..., cmin=1, cmax=14,
> >>>> combocid=6) at reorderbuffer.c:3218
> >>>> #2  0x0000005566cd1f7c in SnapBuildProcessNewCid (builder=0x559dd6a248,
> >>>> xid=849, lsn=30264752, xlrec=0x559dd6e1e0) at snapbuild.c:818
> >>>>
> >>>> so in fact we *know* 849 is a subxact of 848, but we don't call
> >>>> ReorderBufferAssignChild in this case. In fact we can't even do the
> >>>> assignment easily in this case, because we create the subxact first, so
> >>>> that the crash happens right when we attempt to create the toplevel one,
> >>>> and we never even get a chance to do the assignment:
> >>>>
> >>>> 1) process the NEW_CID record, logged for 849 (subxact)
> >>>> 2) process CIDs in the WAL record, which has topleve_xid 848
> >>>>
> >>>>
> >>>> So IMHO we need to figure out what to do for WAL records that create
> >>>> both the toplevel and subxact - either we need to skip them, or rethink
> >>>> how we create the ReorderBufferTXN structs.
> >>>>
> >>>
> >>> As per my understanding, we can't skip them as they are used to build
> >>> the snapshot.
> >>>
> >>
> >> Don't we know 848 (the top-level xact) won't be decoded? In that case we
> >> won't need the snapshot, so why build it?
> >>
> >
> > But this transaction id can be part of committed.xip array if it has
> > made any catalog changes. We add the transaction/subtransaction to
> > this array before deciding whether to skip decoding/replay of its
> > commit.
> >
>
> Hmm, yeah. It's been a while since I last looked into how we build
> snapshots and how we share them between the transactions :-( If we share
> the snapshots between transactions, you're probably right we can't just
> skip these changes.
>
> However, doesn't that pretty much mean we *have* to do something about
> the assignment? I mean, suppose we miss the assignment (like now), so
> that we end up with two TXNs that we think are top-level. And then we
> get the commit for the actual top-level transaction. AFAICS that won't
> clean-up the subxact, and we end up with a lingering TXN.
>

I think we will clean up such a subxact. Such a xact should be skipped
via DecodeTXNNeedSkip() and then it will call ReorderBufferForget()
for each of the subxacts and that will make sure that we clean up each
of subtxn's.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pg_waldump: add test for coverage
Next
From: Pavel Stehule
Date:
Subject: Re: Modernizing our GUC infrastructure