Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: long-standing data loss bug in initial sync of logical replication
Date	March 15 09:24:40
Msg-id	CAA4eK1KKX74B_hitfpK3-R4kAx5qJ6X71OeGxW7cvjO5qOV1ZQ@mail.gmail.com
In response to	RE: long-standing data loss bug in initial sync of logical replication ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses	RE: long-standing data loss bug in initial sync of logical replication
List	pgsql-hackers

On Thu, Mar 13, 2025 at 2:12 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
>
> Workload C. DDL is happening on the publication but on an unrelated table
> ============================================
> We did not run this workload because we expected the same results as for D.
> 588acf6 is needed to optimize the workload.
>
> -----
>
> Workload D. DDL is happening on the related published table,
>                         and one insert is done per invalidation
> =========================================
> This workload had a huge regression, the same as on the master branch. This is
> expected because distributed invalidation messages require all concurrent
> transactions to rebuild their relsync caches.
>
> Concurrent txn     | Head (sec)   | Patch (sec)  | Degradation (%)
> ------------------ | ------------ | ------------ | ----------------
> 50                 | 0.013496     | 0.015588     | 15.5034
> 100                | 0.015112     | 0.018868     | 24.8517
> 500                | 0.018483     | 0.038714     | 109.4536
> 1000               | 0.023402     | 0.063735     | 172.3524
> 2000               | 0.031596     | 0.110860     | 250.8720
>

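For reference, the "Degradation (%)" column in the quoted table appears to be
the relative slowdown of the patched build over HEAD, i.e.
(patch - head) / head * 100. The following is only a minimal sketch under that
assumption; the function name degradation_pct is made up for illustration, and
the timings are copied from the table as printed. The recomputed values differ
from the reported ones in the later decimal places, presumably because the
reported percentages were derived from unrounded timings.

```python
# Timings copied from the quoted table for workload D:
# concurrent txns -> (head_sec, patch_sec, reported_degradation_pct)
ROWS = {
    50: (0.013496, 0.015588, 15.5034),
    100: (0.015112, 0.018868, 24.8517),
    500: (0.018483, 0.038714, 109.4536),
    1000: (0.023402, 0.063735, 172.3524),
    2000: (0.031596, 0.110860, 250.8720),
}


def degradation_pct(head_sec: float, patch_sec: float) -> float:
    """Relative slowdown of patch over head, in percent (assumed formula)."""
    return (patch_sec - head_sec) / head_sec * 100.0


if __name__ == "__main__":
    for txns, (head, patch, reported) in ROWS.items():
        computed = degradation_pct(head, patch)
        print(f"{txns:>5} txns: computed={computed:9.4f}  reported={reported:9.4f}")
```
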
IIUC, workloads C and D will regress in the back branches, while HEAD
will regress only for workload D. We have avoided the workload C
regression in HEAD via commits 7c99dc587a and 3abe9dc188. We can
backpatch those commits if required, but I think it is better not to,
as scenarios C and D won't be that common, and we should go ahead with
the fix as it is. In the future, if we find a way to avoid the
regression in scenario D, we can apply it to the HEAD branch.

Thoughts?

--
With Regards,
Amit Kapila.
