RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change - Mailing list pgsql-bugs

From Vishal Prasanna
Subject RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change
Date
Msg-id 19ccf4ccdb3.2e31ab735232.2694179375476897261@zohocorp.com
Whole thread Raw
In response to RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change
List pgsql-bugs
Hi Hayato,

>> Found a workload that can reproduce the issue across all supported versions, except PG 14.
>>
>> For PG 14, since row_filter is not supported, so we can go with the 'publication does not exist' error instead.
>> Refer: PG14-Test-specinsert-cleanup-in-ReorderBufferProcessTXN-error-path.patch
>>
>> For PG 15 - 18, using the row_filter option we can cause an error in the logical decoder.
>
> Per my understanding, for PG16-, the provided workload cannot cause an assertion
> failure because it misses the Assert() in the ReorderBufferReturnTXN(), right?
> Adding the line is essential, otherwise the test could pass even without the fix.

Right. Added `Assert(txn->size == 0)` in `ReorderBufferReturnTXN()` for PG14, 15, 16 patches.

> 01.
> I think it is better to combine tests actual code patches into one. Because they
> would be done when patches are committed.

combined the fix and test into a single patch. since some changes are version specific
(like add assert check for PG14-16, different function names from PG17 -> 18, test case)
separate patches are provided per version. PG15 and PG16 share the same patch.

> 02.
> This issue can happen even on HEAD, but PG18-Fix-specinsert... cannot be applied
> atop the branch. Can you create it as well?

created a separate patch for master.

> 03. 100_bugs.pl
> 
> Other tests start from the comment like "The bug...", but it does not follow.
> Can we update?
> 
> 04. 100_bugs.pl
> 
> Can we just rotate a log instead of starting new instance? It might be faster.
> 
> 05. 100_bugs.pl for PG
> 
> ```
> # The publication row filter WHERE ((a / 0) > 0) will trigger a division by zero error.
> ```
> 
> I think the comment can be improved like:
> 
> Create a publication with the zero-division row filter. It always throws an
> ERROR before publishing changes, when the filter is evaluated.
> 
> Please see attached my top-up patch for PG18, it addresses comments 03-05.

Applied your suggestions across all versions. Thanks.

Overall:
PG14: Assert check + fix + test (uses 'pub does not exist' error)
PG15-16: Assert check + fix + test (uses row_filter error)
PG17: fix + test (uses row_filter error)
PG18, master: fix + test (uses row_filter error)

Regards,
Vishal Prasanna
Zoho Corporation
Attachment

pgsql-bugs by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: BUG #19426: pgdump is stuck
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change