Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT
Msg-id CAFPTHDagn1PProB2RGM-0tOt2D4BYjpxsoRVO0sn-bLAvXg+mQ@mail.gmail.com
In response to Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT  (ocean_li_996 <ocean_li_996@163.com>)
Responses Re:Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT
List pgsql-hackers
On Sat, Nov 22, 2025 at 8:28 PM ocean_li_996 <ocean_li_996@163.com> wrote:
>
> Hi all,
>
> I would like to share a logical replication bug and some possible fixes. It seems that this bug has existed since
> logical replication was first introduced, so it has been around for quite some time. In fact, the previously
> reported issues [1], [2], [3] were all caused by this bug.
>
> # Problem description
>
> When in the BUILDING_SNAPSHOT state, the snapshot builder does not track the status of any
> transaction. It can lead to missing transaction states when:
> -- The transaction commits before the builder reaches FULL_SNAPSHOT state, and
> -- The transaction's xid is greater than or equal to builder->xmin when the builder reaches
> FULL_SNAPSHOT state.
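
The two conditions quoted above can be written as a small predicate. This is only an illustrative sketch, not code from PostgreSQL or from any of the patches: the type names, the function names, and the use of LSNs to order "commit" against "reaching FULL_SNAPSHOT" are all assumptions made for the example. The xid comparison uses a wraparound-aware signed difference, which is assumed to be adequate for normal xids.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId and XLogRecPtr. */
typedef uint32_t xid_t;
typedef uint64_t lsn_t;

/* Wraparound-aware "a >= b" for normal xids (assumed helper, not a real API). */
static bool
xid_follows_or_equals(xid_t a, xid_t b)
{
    return (int32_t) (a - b) >= 0;
}

/*
 * Returns true when a transaction's state would be missed according to the
 * two conditions in the problem description:
 *   1. it committed before the builder reached FULL_SNAPSHOT, and
 *   2. its xid is >= builder->xmin at the time FULL_SNAPSHOT was reached.
 */
static bool
commit_state_missed(xid_t xid, lsn_t commit_lsn,
                    lsn_t full_snapshot_lsn, xid_t builder_xmin_at_full)
{
    bool committed_before_full = commit_lsn < full_snapshot_lsn;
    bool xid_not_below_xmin = xid_follows_or_equals(xid, builder_xmin_at_full);

    return committed_before_full && xid_not_below_xmin;
}
```

For instance, with a hypothetical builder xmin of 100 at the point FULL_SNAPSHOT is reached, a transaction with xid 105 that committed earlier satisfies both conditions, while one with xid 95 does not.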

> 2) Based on v6-0001, I have provided a minimal fix in v6-0003 (not yet reviewed). AFAICS, it resolves
> the problem, though it records additional useless information in the reorder buffer during BUILDING_SNAPSHOT
> state (which is discarded later). This increases memory usage and slightly impacts performance. But since
> snapshot building is infrequent, I consider this acceptable.
>
> 3) I have also prepared a fix in v6-0004 that is cleaner and more efficient than v6-0003, albeit more
> complex (similar to v6-0001). It is provided as an alternative reference.

Hello Haiyang,

I agree with your analysis and approach, but when I tried out the
patches (0002 for the tests, plus 0004), the tests in
contrib/test_decoding failed.
Applying patches 0002 and 0003 also results in the tests failing,
so I am not sure how your minimal fix resolves the problem. Am
I doing something wrong?
Do patches 0003 and 0004 have to be applied on top of 0001? That
doesn't seem to be the case, as both make the same code change and
don't apply cleanly.

regards,
Ajin Cherian
Fujitsu Australia


