Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAA4eK1Jq1cvPhRO4nDmCjJsqXbgOYNBp2SoG-YaQdkTwrZmtkw@mail.gmail.com
In response to Re: Conflict Detection and Resolution  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Tue, Jun 18, 2024 at 1:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Jun 18, 2024 at 12:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 18, 2024 at 11:54 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Jun 17, 2024 at 8:51 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > > >
> > > > On Mon, Jun 17, 2024 at 1:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > The difference w.r.t the existing mechanisms for holding deleted data
> > > > > is that we don't know whether we need to hold off the vacuum from
> > > > > cleaning up the rows because we can't say with any certainty whether
> > > > > other nodes will perform any conflicting operations in the future.
> > > > > Using the example we discussed,
> > > > > Node A:
> > > > >   T1: INSERT INTO t (id, value) VALUES (1,1);
> > > > >   T2: DELETE FROM t WHERE id = 1;
> > > > >
> > > > > Node B:
> > > > >   T3: UPDATE t SET value = 2 WHERE id = 1;
> > > > >
> > > > > Say the order of receiving the commands is T1-T2-T3. We can't predict
> > > > > whether we will ever get T3, so on what basis shall we try to prevent
> > > > > vacuum from removing the deleted row?
> > > >
> > > > The problem arises because T2 and T3 might be applied out of order on
> > > > some nodes. Once either one of them has been applied on every node, no
> > > > further conflicts are possible.
> > >
> > > If we decide to skip the update whether the row is missing or deleted,
> > > we indeed reach the same end result regardless of the order of T2, T3,
> > > and Vacuum. Here's how it looks in each case:
> > >
> > > Case 1: T1, T2, Vacuum, T3 -> Skip the update for a non-existing row
> > > -> end result we do not have a row.
> > > Case 2: T1, T2, T3 -> Skip the update for a deleted row -> end result
> > > we do not have a row.
> > > Case 3: T1, T3, T2 -> deleted the row -> end result we do not have a row.
> > >
> >
> > In case 3, how can deletion be successful? The row required to be
> > deleted has already been updated.
>
> Hmm, I was considering this case in the example given by you above[1],
> so we have updated some fields of the row with id=1, isn't this row
> still detectable by the delete because delete will find this by id=1
> as we haven't updated the id?  I was making the point w.r.t. the
> example used above.
>

Your point is correct w.r.t. the example, but I responded with the
general update-delete ordering in mind. BTW, it is not clear to me how
the update_delete conflict will be handled under what Robert and you
are saying. Let me state what I understood. Assume there are two
nodes, A and B, as in the above example, and the DELETE has been
applied on both nodes. Now, say the UPDATE has been performed on node
B; then, irrespective of whether we classify the conflict as
update_delete or update_missing, the data will remain the same on both
nodes. So, in such a case, we don't need to bother differentiating
between those two types of conflicts. Is that what we can interpret
from the above?
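
To make that reading concrete, here is a toy sketch (not PostgreSQL
code) that replays the three orderings listed above against a single
node. The Node class, the tuple encoding of operations, and the
"skip the update" resolution are all assumptions made for
illustration; the only things taken from the thread are the T1/T2/T3
statements and the update_delete / update_missing names. A deleted
row is tracked as "dead" until a vacuum step removes it, which is the
only thing that distinguishes the two conflict types here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical single-table node; not how the apply worker is built."""
    rows: dict = field(default_factory=dict)       # live rows: id -> value
    dead: set = field(default_factory=set)         # deleted, not yet vacuumed
    conflicts: list = field(default_factory=list)  # conflict types observed

    def apply(self, op):
        kind = op[0]
        if kind == "insert":
            _, row_id, value = op
            self.rows[row_id] = value
        elif kind == "delete":
            _, row_id = op
            # The delete locates the row by id only, as in the example.
            if row_id in self.rows:
                del self.rows[row_id]
                self.dead.add(row_id)
        elif kind == "vacuum":
            # Vacuum removes every trace of deleted rows.
            self.dead.clear()
        elif kind == "update":
            _, row_id, value = op
            if row_id in self.rows:
                self.rows[row_id] = value
            elif row_id in self.dead:
                # Row was deleted but is still detectable: skip the update.
                self.conflicts.append("update_delete")
            else:
                # No trace of the row at all: skip the update.
                self.conflicts.append("update_missing")


T1 = ("insert", 1, 1)   # INSERT INTO t (id, value) VALUES (1,1);
T2 = ("delete", 1)      # DELETE FROM t WHERE id = 1;
T3 = ("update", 1, 2)   # UPDATE t SET value = 2 WHERE id = 1;
VACUUM = ("vacuum",)


def replay(ops):
    """Apply the operations in order on a fresh node and return it."""
    node = Node()
    for op in ops:
        node.apply(op)
    return node


case1 = replay([T1, T2, VACUUM, T3])
case2 = replay([T1, T2, T3])
case3 = replay([T1, T3, T2])

for name, node in (("case 1", case1), ("case 2", case2), ("case 3", case3)):
    print(name, "rows:", node.rows, "conflicts:", node.conflicts)
```

In this sketch all three orderings end with no row, and the conflict
label differs only between case 1 and case 2. Case 3 converges only
because the delete finds the row by id, which T3 does not change; the
sketch says nothing about the general update-delete ordering where
the delete may not find the updated row.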

--
With Regards,
Amit Kapila.


