Re: Proposal: Conflict log history table for Logical Replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Proposal: Conflict log history table for Logical Replication
Date
Msg-id CAA4eK1+tW8_LiTt1ZCGpH06fq4SpyUaduqtapAT1PUHVKBGrxg@mail.gmail.com
In response to Re: Proposal: Conflict log history table for Logical Replication  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Proposal: Conflict log history table for Logical Replication
List pgsql-hackers
On Mon, Dec 1, 2025 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > Since there is a concern that multiple rows for
> > > multiple_unique_conflicts can cause data-bloat, it made me rethink
> > > that this is actually more prone to causing data-bloat if it is not
> > > resolved on time, as it seems a far more frequent scenario. So shall
> > > we keep inserting the record or insert it once and avoid inserting it
> > > again based on lsn?  Thoughts?
> >
> > I agree, this is the real problem related to bloat, so maybe we can
> > check whether the same tuple already exists and avoid inserting it
> > again, although I haven't thought about how we would distinguish
> > between a new conflict on the same row and the same conflict being
> > inserted multiple times due to a worker restart.
> >
>
> If there is consensus on this approach, IMO, it appears safe to rely
> on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for
> the given 'conflict_type' before we insert a new record.
>

What happens if, as part of multiple_unique_conflicts, only some of the
rows conflict in the next apply round (say the user has removed a few
conflicting rows in the meantime)? I think the ideal way for users to
avoid such repeated occurrences is to configure the subscription with
disable_on_error. I think we should LOG the errors again on retry, and
it is better to keep this behavior consistent with what we print in LOG,
because in the future we may want to give users an option for where to
log the conflicts (conflict_history_table, LOG, or both).
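
To make the question above concrete, here is a minimal Python sketch of the
proposed comparison key. It is only an illustration, not the patch's code:
the ConflictRecord type, the log_conflicts helper, the conflicting_keys
field, and the origin and LSN values are all hypothetical. It models a retry
of the same remote transaction in which fewer rows conflict than on the
first attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConflictRecord:
    # Hypothetical shape of one conflict-history row (not the real schema).
    conflict_type: str
    remote_origin: str
    remote_commit_lsn: str
    conflicting_keys: frozenset  # local rows that conflicted in this attempt


def dedup_key(rec):
    # Comparison key suggested in the quoted mail: type + origin + commit LSN.
    return (rec.conflict_type, rec.remote_origin, rec.remote_commit_lsn)


def log_conflicts(attempts, dedup):
    """Return the rows that would end up in the history table."""
    table = []
    seen = set()
    for rec in attempts:
        if dedup and dedup_key(rec) in seen:
            continue  # treated as "same conflict", so the retry is not recorded
        seen.add(dedup_key(rec))
        table.append(rec)
    return table


# First apply attempt: three local rows conflict (made-up values).
first = ConflictRecord("multiple_unique_conflicts", "pg_16384", "0/1A2B3C",
                       frozenset({"k1", "k2", "k3"}))
# Retry of the same remote transaction after the user removed two rows.
retry = ConflictRecord("multiple_unique_conflicts", "pg_16384", "0/1A2B3C",
                       frozenset({"k1"}))

with_dedup = log_conflicts([first, retry], dedup=True)
without_dedup = log_conflicts([first, retry], dedup=False)

print(len(with_dedup), sorted(with_dedup[0].conflicting_keys))
print(len(without_dedup), sorted(without_dedup[1].conflicting_keys))
```

Under these assumptions, the deduplicated table keeps only the first attempt
and still lists three conflicting rows, while logging on every retry records
the reduced set as well, which matches what would appear in LOG.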

--
With Regards,
Amit Kapila.


