Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Conflict Detection and Resolution
Date	June 19, 2024 05:53:24
Msg-id	CAA4eK1JHtCc5N2BSrdu=AKzdbSjwW+SGWV7grhj0swPo33vq0g@mail.gmail.com
In response to	RE: Conflict Detection and Resolution ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List	pgsql-hackers

On Tue, Jun 18, 2024 at 7:44 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, June 13, 2024 2:11 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> > On Wed, Jun 5, 2024 at 3:32 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > This time at PGconf.dev[1], we had some discussions regarding this
> > > project. The proposed approach is to split the work into two main
> > > components. The first part focuses on conflict detection, which aims
> > > to identify and report conflicts in logical replication. This feature
> > > will enable users to monitor the unexpected conflicts that may occur.
> > > The second part involves the actual conflict resolution. Here, we will
> > > > provide built-in resolutions for each conflict and allow users to
> > > > choose which resolution is used for which conflict (as described
> > > > in the initial email of this thread).
> >
> > I agree with this direction that we focus on conflict detection (and
> > logging) first and then develop conflict resolution on top of that.
>
> Thanks for your reply!
>
> >
> > >
> > > Of course, we are open to alternative ideas and suggestions, and the
> > > strategy above can be changed based on ongoing discussions and
> > > feedback received.
> > >
> > > > Here is the patch for the first part of the work, which adds a new
> > > > parameter, detect_conflict, to the CREATE and ALTER SUBSCRIPTION
> > > > commands. This parameter decides whether the subscription performs
> > > > conflict detection.
> > > By default, conflict detection will be off for a subscription.
> > >
> > > When conflict detection is enabled, additional logging is triggered in
> > > the following conflict scenarios:
> > >
> > > > * Updating a row that was previously modified by another origin.
> > > * The tuple to be updated is not found.
> > > * The tuple to be deleted is not found.
> > >
> > > While there exist other conflict types in logical replication, such as
> > > an incoming insert conflicting with an existing row due to a primary
> > > key or unique index, these cases already result in constraint violation errors.
> >
> > What does detect_conflict being true actually mean to users? I understand that
> > detect_conflict being true could introduce some overhead to detect conflicts.
> > > But in terms of conflict detection, even if detect_conflict is false, we detect
> > some conflicts such as concurrent inserts with the same key. Once we
> > introduce the complete conflict detection feature, I'm not sure there is a case
> > where a user wants to detect only some particular types of conflict.
> >
> > > Therefore, additional conflict detection for these cases is currently
> > > omitted to minimize potential overhead. However, the pre-detection for
> > > conflict in these error cases is still essential to support automatic
> > > conflict resolution in the future.
> >
> > > I feel that we should log all types of conflict in a uniform way. For example,
> > > with detect_conflict being true, the update_differ conflict is reported as
> > > "conflict %s detected on relation "%s"", whereas concurrent inserts with the
> > > same key are reported as "duplicate key value violates unique constraint "%s"",
> > > which could confuse users.
>
> > Do you mean it's OK to add a pre-check before applying the INSERT, which will
> > verify whether the remote tuple violates any unique constraints, and if it
> > does, log a conflict message? I thought about this but was slightly
> worried about the extra cost it would bring. OTOH, if we think it's acceptable,
> we could do that since the cost is there only when detect_conflict is enabled.
>
> > I also thought of logging such a conflict message in PG_CATCH(), but I think we
> > lack some necessary info (relation, index name, column name) in the catch block.
>

Can't we use/extend existing 'apply_error_callback_arg' for this purpose?
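
To make the idea concrete, here is a rough, standalone sketch of what extending the callback state might look like. It is only an illustration under assumptions: DemoApplyErrorArg, its fields, demo_format_conflict(), and the "(index ...)" suffix are all hypothetical and are not the real apply worker struct or any existing PostgreSQL function. Only the "conflict %s detected on relation" wording comes from the discussion above.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the apply worker's error-callback state.
 * This is NOT the real PostgreSQL struct layout; field names are assumed.
 */
typedef struct DemoApplyErrorArg
{
    const char *relname;        /* relation being applied to */
    const char *conflict_type;  /* assumed addition, e.g. "update_differ" */
    const char *index_name;     /* assumed addition; NULL if no index involved */
} DemoApplyErrorArg;

/*
 * Build one uniform conflict message from the saved state, as a catch
 * block might do. Returns snprintf's result (the length the full
 * message needs, excluding the terminating NUL).
 */
static int
demo_format_conflict(const DemoApplyErrorArg *arg, char *buf, size_t buflen)
{
    /* Without a relation and a conflict type there is nothing to report. */
    if (arg == NULL || arg->relname == NULL || arg->conflict_type == NULL)
        return snprintf(buf, buflen, "conflict context unavailable");

    /* Unique-key style conflicts can also name the index involved. */
    if (arg->index_name != NULL)
        return snprintf(buf, buflen,
                        "conflict %s detected on relation \"%s\" (index \"%s\")",
                        arg->conflict_type, arg->relname, arg->index_name);

    return snprintf(buf, buflen, "conflict %s detected on relation \"%s\"",
                    arg->conflict_type, arg->relname);
}
```

The point is that if the relation and index are recorded in this state before the insert is attempted, the catch block would no longer lack them, and both kinds of conflict could be reported with the same message shape.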

--
With Regards,
Amit Kapila.
