Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: long-standing data loss bug in initial sync of logical replication
Date
Msg-id CAA4eK1KR_h_eettKBV3=cy96AvhVyR8Jw9K=W=3Bt_G+=w-F_A@mail.gmail.com
Whole thread Raw
In response to Re: long-standing data loss bug in initial sync of logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: long-standing data loss bug in initial sync of logical replication
List pgsql-hackers
On Tue, Mar 18, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Mar 17, 2025 at 4:56 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > > Regarding the PG13, it cannot be
> > > applied
> > > as-is thus some adjustments are needed. I will share it in upcoming posts.
> >
> > Here is a patch set for PG13. Apart from PG14-17, the patch could be created as-is,
> > because...
> >
> > 1. WAL record for invalidation messages (XLOG_XACT_INVALIDATIONS) does not exist.
> > 2. Thus the ReorderBufferChange for the invalidation does not exist.
> >    Our patch tries to distribute it but cannot be done as-is.
> > 3. Codes assumed that invalidation messages can be added only once.
> > 4. The timing when invalidation messages are consumed is limited:
> >   a. COMMAND_ID change is poped,
> >   b. start of decoding a transaction, or
> >   c. end of decoding a transaction.
> >
> > Above means that invalidations cannot be executed while being decoded.
> > I created two patch sets to resolve the data loss issue. 0001 has less code
> > changes but could resolve a part of issue, 0002 has huge changes but provides a
> > complete solution.
> >
> > 0001 - mostly same as patches for other versions. ReorderBufferAddInvalidations()
> >        was adjusted to allow being called several times. As I said above,
> >        0001 cannot execute inval messages while decoding the transacitons.
> > 0002 - introduces new ReorderBufferChange type to indicate inval messages.
> >        It would be handled like PG14+.
> >
> > Here is an example. Assuming that the table foo exists on both nodes, a
> > publication "pub" which publishes all tables, and a subscription "sub" which
> > subscribes "pub". What if the workload is executed?
> >
> > ```
> > S1                              S2
> > BEGIN;
> > INSERT INTO foo VALUES (1)
> >                                 ALTER PUBLICATION pub RENAME TO pub_renamed;
> > INSERT INTO foo VALUES (2)
> > COMMIT;
> > LR -> ?
> > ```
> >
> > With 0001, tuples (1) and (2) would be replicated to the subscriber.
> > An error "publication "pub" does not exist" would raise when new changes are done
> > later.
> >
> > 0001+0002 works more aggressively; the error would raise when S1 transaction is decoded.
> > The behavior is same as for patched PG14-PG17.
> >
>
> I understand that with 0001 the fix is partial in the sense that
> because invalidations are processed at the transaction end, the
> changes of concurrent DDL will only be reflected for the next
> transaction. Now, on one hand, it is prudent to not add a new type of
> ReorderBufferChange in the backbranch (v13) but the change is not that
> invasive, so we can go with it as well. My preference would be to go
> with just 0001 for v13 to minimize the risk of adding new bugs or
> breaking something unintentionally.
>
> Sawada-San, and others involved here, do you have any suggestions on
> this matter?
>

Seeing no responses for a long time, I am planning to push the fix
till 14 tomorrow unless there are some opinions on the fix for 13. We
can continue to discuss the scope of the fix for 13.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Enhancing Memory Context Statistics Reporting
Next
From: "Aya Iwata (Fujitsu)"
Date:
Subject: RE: [WIP]Vertical Clustered Index (columnar store extension) - take2