Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639) - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639) |
Date | |
Msg-id | CAD21AoBmrum63r5anV+Rxo2MyvxehYioRvkKqvENdYzMvE8_7w@mail.gmail.com |
In response to | Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639) (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639) |
List | pgsql-hackers |
On Tue, Oct 18, 2022 at 1:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 18, 2022 at 6:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Oct 17, 2022 at 4:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > IIUC, here you are speaking of three different changes. Change-1: Add
> > > a check in AssertTXNLsnOrder() to skip assert checking till we reach
> > > start_decoding_at. Change-2: Set needs_timetravel to true in one of
> > > the else if branches in SnapBuildCommitTxn(). Change-3: Remove the
> > > call to ReorderBufferAssignChild() from SnapBuildXidSetCatalogChanges
> > > in PG-14/15 as that won't be required after Change-1.
> >
> > Yes.
> >
> > >
> > > AFAIU, Change-1 is required till v10; Change-2 and Change-3 are
> > > required in HEAD/v15/v14 to fix the problem.
> >
> > IIUC Change-2 is required in v16 and HEAD
> >
>
> Why are you referring v16 and HEAD separately?

Sorry, my mistake, I was confused.

>
> > but not mandatory in v15 and
> > v14. The reason why we need Change-2 is that there is a case where we
> > mark only subtransactions as containing catalog change while not doing
> > that for its top-level transaction. In v15 and v14, since we mark both
> > subtransactions and top-level transaction in
> > SnapBuildXidSetCatalogChanges() as containing catalog changes, we
> > don't get the assertion failure at "Assert(!needs_snapshot ||
> > needs_timetravel)".
> >
> > Regarding Change-3, it's required in v15 and v14 but not in HEAD and
> > v16. Since we didn't add SnapBuildXidSetCatalogChanges() to v16 and
> > HEAD, Change-3 cannot be applied to the two branches.
> >
> > > Now, the second and third
> > > changes are not required in branches prior to v14 because we don't
> > > record invalidations via XLOG_XACT_INVALIDATIONS record. However, if
> > > we want, we can even back-patch Change-2 and Change-3 to keep the code
> > > consistent or maybe just Change-3.
> >
> > Right. I don't think it's a good idea to back-patch Change-2 in
> > branches prior to v14 as it's not a relevant issue.
> >
>
> Fair enough but then why to even backpatch it to v15 and v14?

Oops, it's a typo. I wanted to say Change-2 should be back-patched only to HEAD.

>
> > Regarding
> > back-patching Change-3 to branches prior 14, I think it may be okay
> > til v11, but I'd be hesitant for v10 as the final release comes in a
> > month.
> >
>
> So to fix the issue in all branches, what we need to do is to
> backpatch change-1: in all branches till v10, change-2: in HEAD, and
> change-3: in V15 and V14. Additionally, we think, it is okay to
> backpatch change-3 till v11 as it is mainly done to avoid the problem
> fixed by change-1 and it makes code consistent in back branches.

Right.

>
> I think because the test case proposed needs all three changes, we can
> push the change-1 without a test case and then as a second patch have
> change-2 for HEAD and change-3 for back branches with the test case.
> Do you have any other ideas to proceed here?

I found another test case that causes the assertion failure at
"Assert(!needs_snapshot || needs_timetravel);" on all branches. I've
attached the patch for the test case. In this test case, I modified a
user-catalog table instead of a system-catalog table. That way, we don't
generate invalidation messages while generating NEW_CID records. As a
result, we mark only the subtransactions as containing catalog changes
and don't make the association between top-level and sub transactions.
The assertion failure happens on all supported branches. If we need to
fix this (I believe so), Change-2 needs to be back-patched to all
supported branches.

There are three changes as Amit mentioned, and regarding the test
case, we have three test cases I've attached: truncate_testcase.patch,
analyze_testcase.patch, uesr_catalog_testcase.patch. The relationship
between assertion failures and test cases is very complex. I could
not find any test case that causes only one assertion failure on all
branches. One idea to proceed is:

Patch-1 includes Change-1 and is applied to all branches.

Patch-2 includes Change-2 and the user_catalog test case, and is
applied to all branches.

Patch-3 includes Change-3 and the truncate test case (or the analyze
test case), and is applied to v14 and v15 (also till v11 if we
prefer).

Patch-1 doesn't include any test case, but the user_catalog test
case can test both Change-1 and Change-2 on all branches. In v15 and
v14, the analyze test case causes both the assertions at
"Assert(txn->ninvalidations == 0);" and "Assert(prev_first_lsn <
cur_txn->first_lsn);" whereas the truncate test case causes the
assertion only at "Assert(txn->ninvalidations == 0);". Since
patch-2 is applied on top of patch-1, there is no difference in
terms of testing Change-2.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
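For readers following the thread, the idea behind Change-1 (skip the LSN-order check until decoding reaches start_decoding_at) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the patch discussed here: ToyLsn, ToyTxn, and toy_txn_lsn_order_ok are hypothetical names, the function returns a bool instead of asserting, and the exact comparison against start_decoding_at is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a 64-bit WAL position (hypothetical, not XLogRecPtr). */
typedef uint64_t ToyLsn;

/* Hypothetical, simplified transaction entry: only the field the check needs. */
typedef struct ToyTxn
{
    ToyLsn first_lsn;   /* LSN of the first record seen for this transaction */
} ToyTxn;

/*
 * Returns true when the transaction list passes the ordering check.
 *
 * - While current_lsn is still before start_decoding_at, the check is
 *   skipped entirely (the idea described as Change-1; the use of "<" here
 *   is an assumption of this sketch).
 * - Otherwise, first_lsn must be strictly increasing across the list,
 *   mirroring the condition "prev_first_lsn < cur_txn->first_lsn".
 */
static bool
toy_txn_lsn_order_ok(const ToyTxn *txns, size_t ntxns,
                     ToyLsn current_lsn, ToyLsn start_decoding_at)
{
    ToyLsn prev_first_lsn = 0;  /* 0 plays the role of "invalid LSN" */

    if (current_lsn < start_decoding_at)
        return true;            /* not decoding yet: skip verification */

    for (size_t i = 0; i < ntxns; i++)
    {
        if (prev_first_lsn != 0 && !(prev_first_lsn < txns[i].first_lsn))
            return false;       /* ordering violated */
        prev_first_lsn = txns[i].first_lsn;
    }
    return true;
}
```

In this model, two entries that share the same first_lsn are tolerated while current_lsn is below start_decoding_at, and are reported as an ordering violation once that point has been reached.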