Re:Re:BUG #18369: logical decoding core on AssertTXNLsnOrder() - Mailing list pgsql-bugs

From ocean_li_996
Subject Re:Re:BUG #18369: logical decoding core on AssertTXNLsnOrder()
Date
Msg-id 3d696d51.cfa6.18deeccd483.Coremail.ocean_li_996@163.com
In response to Re:BUG #18369: logical decoding core on AssertTXNLsnOrder()  (ocean_li_996 <ocean_li_996@163.com>)
Responses Re: BUG #18369: logical decoding core on AssertTXNLsnOrder()  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-bugs

This issue exists in PG 12 through 15.



At 2024-02-28 15:57:37, "ocean_li_996" <ocean_li_996@163.com> wrote:

At 2024-02-28 15:53:30, "PG Bug reporting form" <noreply@postgresql.org> wrote:

>1) The WAL records from restart_lsn to the corresponding lsn when the issue
>occurred,
>2) personal analysis of the problem,
>3) the steps to reproduce the issue,
>4) personal proposed solution
>will be posted later under this thread.
>

1) The WAL records from restart_lsn to the LSN at which the issue occurred are provided in attachment 1.

2) As the records in 1) show, the top-level xact 19933 generates some invalidation messages. After decoding restarts, these invalidation messages cause the 19933 top-level xact and its subtransaction(s) to be marked as containing catalog changes while its commit record is processed (see SnapBuildXidSetCatalogChanges()). In this step, the subxacts that were never processed before are added to the ReorderBuffer with the same first_lsn as the top-level xact. The check in AssertTXNLsnOrder() then fails if there is more than one such subxact.
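To illustrate why a second entry with the same first_lsn trips the check, here is a simplified model in C. It is not the PostgreSQL source: it assumes the check requires first_lsn to be strictly increasing along the list, and the type name, function name, and LSN values are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;			/* hypothetical stand-in for an LSN value */

/*
 * Model of the ordering check: returns true only when every first_lsn is
 * strictly greater than the one before it. Two neighbouring entries with
 * an equal first_lsn therefore make the check fail.
 */
static bool
first_lsns_strictly_increase(const Lsn *first_lsns, size_t n)
{
	for (size_t i = 1; i < n; i++)
	{
		if (first_lsns[i - 1] >= first_lsns[i])
			return false;
	}
	return true;
}
```

With the made-up values {100, 200} the model check passes. Appending a second entry at 200, giving {100, 200, 200}, makes it fail, which mirrors two newly registered subxacts sharing one first_lsn.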

3) The patch to reproduce the issue is provided in attachment 2. DML on a temporary table consumes an xid but does not log any WAL record, except for the first subxact of a top-level xact, which logs an ASSIGNMENT record. So we use DML on a temporary table to generate two "never processed before" subxacts in one top-level xact.

4) Since an xact is already known to be a subxact before it is added to the ReorderBuffer, I think an appropriate fix is to extend the ReorderBufferXidSetCatalogChanges function with an is_top parameter that indicates whether the xact is a top-level xact.
A subxact would then not be added to the toplevel_by_lsn list and would not undergo the AssertTXNLsnOrder check. Of course, this requires a check for whether a node is actually in the list before removing it from toplevel_by_lsn.
The specific fix patch is provided in attachment 3.
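The idea in 4) can be sketched with a small self-contained model in C. This is not the patch in attachment 3 and it does not use the real ReorderBuffer structures: every name prefixed with Model/model_ is hypothetical, the model always creates a new entry instead of looking one up, and the subxact xids used in the usage note are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_TXNS 8

typedef struct ModelTxn
{
	uint32_t	xid;
	uint64_t	first_lsn;
	bool		in_toplevel_list;	/* is this entry linked into the list? */
} ModelTxn;

typedef struct ModelBuffer
{
	ModelTxn	txns[MODEL_MAX_TXNS];
	size_t		ntxns;
	size_t		ntoplevel;			/* number of entries in the model list */
} ModelBuffer;

/* Register an xact; only top-level xacts join the LSN-ordered list. */
static ModelTxn *
model_set_catalog_changes(ModelBuffer *rb, uint32_t xid, uint64_t lsn,
						  bool is_top)
{
	ModelTxn   *txn;

	if (rb->ntxns >= MODEL_MAX_TXNS)
		return NULL;
	txn = &rb->txns[rb->ntxns++];
	txn->xid = xid;
	txn->first_lsn = lsn;
	txn->in_toplevel_list = is_top;
	if (is_top)
		rb->ntoplevel++;
	return txn;
}

/* Removal must first check that the entry is in the list at all. */
static void
model_remove_from_toplevel(ModelBuffer *rb, ModelTxn *txn)
{
	if (!txn->in_toplevel_list)
		return;
	txn->in_toplevel_list = false;
	rb->ntoplevel--;
}

/* Strict first_lsn ordering check over the entries that are in the list. */
static bool
model_toplevel_order_ok(const ModelBuffer *rb)
{
	bool		have_prev = false;
	uint64_t	prev = 0;

	for (size_t i = 0; i < rb->ntxns; i++)
	{
		if (!rb->txns[i].in_toplevel_list)
			continue;
		if (have_prev && prev >= rb->txns[i].first_lsn)
			return false;
		prev = rb->txns[i].first_lsn;
		have_prev = true;
	}
	return true;
}
```

In this model, registering top-level xact 19933 with is_top = true and two subxacts (say 19934 and 19935) with is_top = false at the same LSN leaves only one entry in the ordered list, so the ordering check passes. Removing a subxact from the list is then a no-op rather than an error.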

Thanks
Haiyang Li
