RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change - Mailing list pgsql-bugs

From Vishal Prasanna
Subject RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change
Date
Msg-id 19c95a57600.599bb79483132.9142801067257165882@zohocorp.com
Whole thread Raw
In response to RE: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: [BUG] Assert failure in ReorderBufferReturnTXN during logical decoding due to leaked specinsert change
List pgsql-bugs
Hi,

Thanks for the response.

> Anyway, I agreed that the leak could happen across all versions.
> How about fixing the issue as shown below? Thought?
> 1. Add an Assert() for PG16-.

Yeah, Adding `Assert(txn->size == 0)` to PG 14, 15, 16 would ensure everything
is cleaned up before freeing the `ReorderBufferTXN`. Consider adding this assert as well.

> 2. Make sure specinsert is released for all supported versions.

In PG 14 to 17, `ReorderBufferReturnChange()` is used to free the `ReorderBufferChange`.
In PG 18, this function was renamed to `ReorderBufferFreeChange()`.

Refer:
PG 14, 15, 16, 17: PG14-17-Fix-specinsert-leak-in-ReorderBufferProcessTXN-error-path.patch
PG 18: PG18-Fix-specinsert-leak-in-ReorderBufferProcessTXN-error-path.patch

The above changes cover the `specinsert` cleanup in the error path for all supported versions.

> 3. Consider a workload that could reproduce on PG18 and master.
> 3-1. If found, the test code can be added for all versions.
> 3-2. Otherwise, the test code can be added for PG17-.

Found a workload that can reproduce the issue across all supported versions, except PG 14.

For PG 14, since row_filter is not supported, so we can go with the 'publication does not exist' error instead.
Refer: PG14-Test-specinsert-cleanup-in-ReorderBufferProcessTXN-error-path.patch

For PG 15 - 18, using the row_filter option we can cause an error in the logical decoder.
Refer: PG15-18-Test-specinsert-cleanup-in-ReorderBufferProcessTXN-error-path.patch

>> Additional Suggestion:
>> Currently in `PG_CATCH` block, `specinsert` is only freed in the `ERRCODE_TRANSACTION_ROLLBACK` branch
>> for streaming or prepared transactions, via `ReorderBufferResetTXN()` at line 2691.
>> Would it make sense to move the freeing of `specinsert` before the if/else branch,
>> so that it is always freed regardless of the error path? This would avoid duplication and ensure
>> that `specinsert` is always cleaned up.

> It looks OK for me. In this case an argument should be reduced from
> ReorderBufferResetTXN(), right? It is harmless because the function is a static one.

Yes, the `specinsert` is no longer needed in `ReorderBufferResetTXN()`. Updated the patch where `specinsert` cleanup
is now handled in the `PG_CATCH()` block of `ReorderBufferProcessTXN()`, so it is always freed before the if/else branch.
Refer:
PG14-17-Fix-specinsert-leak-in-ReorderBufferProcessTXN-error-path.patch
PG18-Fix-specinsert-leak-in-ReorderBufferProcessTXN-error-path.patch


Regards,
Vishal Prasanna
Zoho Corporation


Attachment

pgsql-bugs by date:

Previous
From: Robert Haas
Date:
Subject: Re: Major Version Upgrade failure due to orphan roles entries in catalog
Next
From: Tom Lane
Date:
Subject: Re: Major Version Upgrade failure due to orphan roles entries in catalog