Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAFiTN-saFMD8iY83Fj+=ey2+E+DVS6ZBzeJzkrAXzW7Dpz-fDg@mail.gmail.com
In response to Re: Conflict Detection and Resolution  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Conflict Detection and Resolution
Re: Conflict Detection and Resolution
List pgsql-hackers
On Tue, Jun 11, 2024 at 7:44 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

> > Yes, that's correct. However, many cases could benefit from the
> > update_deleted conflict type if it can be implemented reliably. That's
> > why we wanted to give it a try. But if we can't achieve predictable
> > results with it, I'm fine to drop this approach and conflict_type. We
> > can consider a better design in the future that doesn't depend on
> > non-vacuumed entries and provides a more robust method for identifying
> > deleted rows.
> >
>
> I agree having a separate update_deleted conflict would be beneficial,
> I'm not arguing against that - my point is actually that I think this
> conflict type is required, and that it needs to be detected reliably.
>

When working with a distributed system, we must accept some form of
eventual consistency model. However, its behavior still needs to be
predictable and acceptable. For example, if a change is a
result of a previous operation (such as an update on node B triggered
after observing an operation on node A), we can say that the operation
on node A happened before the operation on node B. Conversely, if
operations on nodes A and B are independent, we consider them
concurrent.

Clock skew is a known issue in distributed systems, so whatever
consistency model we choose must preserve the "happens-before"
relationship even when timestamps disagree. Consider a scenario with three nodes:
NodeA, NodeB, and NodeC. If NodeA sends changes to NodeB, and
subsequently NodeB makes changes, and then both NodeA's and NodeB's
changes are sent to NodeC, the clock skew might make NodeB's changes
appear to have occurred before NodeA's changes. However, we should
maintain data that indicates NodeB's changes were triggered after
NodeA's changes arrived at NodeB. This implies that logically, NodeB's
changes happened after NodeA's changes, despite what the timestamps
suggest.
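
To make the scenario concrete, here is a minimal sketch of how a purely
timestamp-based "last update wins" rule could pick the wrong change on
NodeC. The Change class, the 5-second skew, and the last_update_wins()
helper are all hypothetical names and values chosen for illustration;
they are not taken from any patch in this thread.

```python
from dataclasses import dataclass


@dataclass
class Change:
    origin: str       # node where the change was committed
    value: str        # new row value (illustrative)
    commit_ts: float  # origin's local wall-clock commit time, in seconds


# Assumption: NodeB's clock runs 5 seconds behind NodeA's.
NODE_B_SKEW = -5.0

# True time 100.0: NodeA updates the row.
a_change = Change("NodeA", "from A", commit_ts=100.0)

# True time 102.0: NodeB has already applied NodeA's change and then
# updates the same row, but stamps it with its own skewed clock.
b_change = Change("NodeB", "from B", commit_ts=102.0 + NODE_B_SKEW)


def last_update_wins(x: Change, y: Change) -> Change:
    """Resolve a conflict by keeping the change with the larger timestamp."""
    return x if x.commit_ts >= y.commit_ts else y


# NodeC receives both changes and compares timestamps only.
winner = last_update_wins(a_change, b_change)
print(winner.origin)
```

Here NodeC keeps NodeA's change even though NodeB's change was made
after seeing it, which is exactly the ordering we want to avoid.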

A common method to handle such cases is to use vector clocks for
conflict resolution. Vector clocks track the causal relationships
between changes across nodes, so that we can order events correctly
and resolve conflicts in a manner that respects the "happens-before"
relationship. This helps keep the system consistent and predictable
despite issues like clock skew.
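
As a rough illustration of the idea (not a proposal for how it would be
implemented in PostgreSQL), a vector clock can be modeled as a map from
node name to a counter. The function names tick(), merge() and
compare() below are my own choices for this sketch.

```python
def tick(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum, used when a node applies a remote change."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Return 'before', 'after', 'equal' or 'concurrent' for a relative to b."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


# NodeA makes a change.
a1 = tick({}, "NodeA")

# NodeB applies NodeA's change, then makes its own change.
node_b_state = merge({}, a1)
b1 = tick(node_b_state, "NodeB")

# A change made on NodeB without having seen anything from NodeA.
b_independent = tick({}, "NodeB")

print(compare(a1, b1))             # causally ordered
print(compare(a1, b_independent))  # no causal relationship
```

With this, NodeC can tell that b1 follows a1 regardless of the commit
timestamps, and only needs a tie-breaking rule (such as timestamps) for
changes that compare as concurrent.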

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


