Re: Conflict Detection and Resolution - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Conflict Detection and Resolution |
Date | |
Msg-id | CAA4eK1JHtCc5N2BSrdu=AKzdbSjwW+SGWV7grhj0swPo33vq0g@mail.gmail.com |
In response to | RE: Conflict Detection and Resolution ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
List | pgsql-hackers |
On Tue, Jun 18, 2024 at 7:44 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> On Thursday, June 13, 2024 2:11 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi,
>
> > On Wed, Jun 5, 2024 at 3:32 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> > wrote:
> > >
> > > This time at PGconf.dev[1], we had some discussions regarding this
> > > project. The proposed approach is to split the work into two main
> > > components. The first part focuses on conflict detection, which aims
> > > to identify and report conflicts in logical replication. This feature
> > > will enable users to monitor the unexpected conflicts that may occur.
> > > The second part involves the actual conflict resolution. Here, we will
> > > provide built-in resolutions for each conflict and allow users to
> > > choose which resolution will be used for which conflict (as described
> > > in the initial email of this thread).
> >
> > I agree with this direction that we focus on conflict detection (and
> > logging) first and then develop conflict resolution on top of that.
>
> Thanks for your reply!
>
> > >
> > > Of course, we are open to alternative ideas and suggestions, and the
> > > strategy above can be changed based on ongoing discussions and
> > > feedback received.
> > >
> > > Here is the patch of the first part work, which adds a new parameter
> > > detect_conflict for CREATE and ALTER subscription commands. This new
> > > parameter will decide whether the subscription will go for conflict
> > > detection. By default, conflict detection will be off for a subscription.
> > >
> > > When conflict detection is enabled, additional logging is triggered in
> > > the following conflict scenarios:
> > >
> > > * Updating a row that was previously modified by another origin.
> > > * The tuple to be updated is not found.
> > > * The tuple to be deleted is not found.
> > >
> > > While there exist other conflict types in logical replication, such as
> > > an incoming insert conflicting with an existing row due to a primary
> > > key or unique index, these cases already result in constraint violation errors.
> >
> > What does detect_conflict being true actually mean to users? I understand that
> > detect_conflict being true could introduce some overhead to detect conflicts.
> > But in terms of conflict detection, even if detect_conflict is false, we detect
> > some conflicts such as concurrent inserts with the same key. Once we
> > introduce the complete conflict detection feature, I'm not sure there is a case
> > where a user wants to detect only some particular types of conflict.
> >
> > > Therefore, additional conflict detection for these cases is currently
> > > omitted to minimize potential overhead. However, the pre-detection for
> > > conflict in these error cases is still essential to support automatic
> > > conflict resolution in the future.
> >
> > I feel that we should log all types of conflict in a uniform way. For example,
> > with detect_conflict being true, the update_differ conflict is reported as
> > "conflict %s detected on relation "%s"", whereas concurrent inserts with the
> > same key are reported as "duplicate key value violates unique constraint "%s"",
> > which could confuse users.
>
> Do you mean it's ok to add a pre-check before applying the INSERT, which will
> verify if the remote tuple violates any unique constraints, and if it does,
> then we log a conflict message? I thought about this but was slightly
> worried about the extra cost it would bring. OTOH, if we think it's acceptable,
> we could do that since the cost is there only when detect_conflict is enabled.
>
> I also thought of logging such a conflict message in PG_CATCH(), but I think we
> lack some necessary info (relation, index name, column name) at the catch block.
>

Can't we use/extend existing 'apply_error_callback_arg' for this purpose?

--
With Regards,
Amit Kapila.
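
To illustrate the suggestion above, here is a minimal standalone sketch of the idea, not PostgreSQL code: state recorded before each apply step is still available in the catch block, so a uniform conflict message can be built there. The struct ApplyErrorContext, its fields, the function names, and the conflict type name "insert_exists" are all hypothetical. setjmp/longjmp merely stand in for the server's try/catch mechanism, and only the message format "conflict %s detected on relation "%s"" is taken from the thread.

```c
#include <setjmp.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for the apply worker's error-callback
 * state. The fields are illustrative only, not PostgreSQL's actual layout.
 */
typedef struct ApplyErrorContext
{
    const char *relname;    /* relation being modified */
    const char *indexname;  /* unique index involved, if any */
    const char *keyvalue;   /* textual form of the conflicting key */
} ApplyErrorContext;

static ApplyErrorContext apply_ctx;  /* filled in before each apply step */
static jmp_buf catch_point;          /* plays the role of the catch block */

/* Pretend insert: record the context, then "throw" on a duplicate key. */
static void
apply_insert(const char *rel, const char *idx, const char *key, int duplicate)
{
    apply_ctx.relname = rel;
    apply_ctx.indexname = idx;
    apply_ctx.keyvalue = key;

    if (duplicate)
        longjmp(catch_point, 1);  /* simulated unique-constraint error */
}

/*
 * Apply one insert. If it fails, build a conflict message from the context
 * saved before the error. Returns 1 if a conflict was reported, 0 otherwise.
 */
int
apply_insert_reporting(const char *rel, const char *idx, const char *key,
                       int duplicate, char *msg, size_t msglen)
{
    memset(&apply_ctx, 0, sizeof(apply_ctx));

    if (setjmp(catch_point) == 0)
    {
        apply_insert(rel, idx, key, duplicate);
        return 0;  /* applied without conflict */
    }

    /* "Catch block": relation, index, and key are all still known here. */
    snprintf(msg, msglen,
             "conflict %s detected on relation \"%s\": key %s already exists in index \"%s\"",
             "insert_exists", apply_ctx.relname,
             apply_ctx.keyvalue, apply_ctx.indexname);
    return 1;
}
```

In this sketch the caller passes a buffer and checks the return value. Whether the real callback argument can carry the index and column names at the point of failure is exactly the open question in the thread.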