Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAFiTN-uY+FGoj_6gwoMfs5rAVQA25eLq18G4FNHQU4gT9ge4eA@mail.gmail.com
In response to Re: Conflict Detection and Resolution  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Mon, Jun 17, 2024 at 5:38 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>

> > The issue with using commit timestamps is that, when multiple nodes
> > are involved, the commit timestamp won't accurately represent the
> > actual order of operations. There's no reliable way to determine the
> > perfect order of each operation happening on different nodes roughly
> > simultaneously unless we use some globally synchronized counter.
> > Generally, that order might not cause real issues unless one operation
> > is triggered by a previous operation, and relying solely on physical
> > timestamps would not detect that correctly.
> >
> This whole conflict detection / resolution proposal is based on using
> commit timestamps. Aren't you suggesting it can't really work with
> commit timestamps?
>
> FWIW there are ways to build distributed consistency with timestamps,
> as long as it's monotonic - e.g. clock-SI does that. It's not perfect,
> but it shows it's possible.

Hmm, I see that clock-SI does this by delaying the transaction when it
detects clock skew.
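
To make that concrete, here is a minimal sketch of the delay rule as I
understand it: a read that carries a snapshot timestamp ahead of the
local clock waits until the local clock catches up. The function and
parameter names are hypothetical, and this is only an illustration of
the idea, not code from clock-SI or from the patch.

```python
def read_delay_ms(snapshot_ts_ms: int, local_clock_ms: int) -> int:
    """Milliseconds a node waits before serving a read at snapshot_ts_ms.

    Assumption: the snapshot timestamp was assigned by another node whose
    clock may run ahead of ours. Waiting until our clock passes it keeps
    the snapshot from missing commits we have not stamped yet.
    """
    return max(0, snapshot_ts_ms - local_clock_ms)


# The remote clock is 5 ms ahead of ours, so the read waits 5 ms.
skewed_wait = read_delay_ms(snapshot_ts_ms=1005, local_clock_ms=1000)

# Our clock is already past the snapshot timestamp, so no wait is needed.
no_wait = read_delay_ms(snapshot_ts_ms=1000, local_clock_ms=1005)
```

The cost is that the delay grows with the skew, so this only works
well when the clocks are reasonably close.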

> However, I'm not sure we have to go there - it depends on what the goal is.
> For one-directional replication (multiple nodes replicating to the
> same target) it might be sufficient if the conflict resolution is
> "deterministic" (e.g. not dependent on the order in which the changes
> are applied). I'm not sure, but it's why I asked what's the goal in my
> very first message in this thread.

I'm not completely certain about this.  Even in one-directional
replication, if multiple nodes are sending data, how can we guarantee
determinism in the presence of clock skew unless we use some other
mechanism, such as logical counters or something like what clock-SI
does?  I don't want to insist on any specific solution here.
However, I noticed that we haven't addressed how we plan to handle
clock skew, which is my primary concern. I believe that if multiple
nodes with unsynchronized clocks are sending us data, ensuring a
deterministic order among their changes will require some measures
to handle that skew.

> > We need some sort of logical counter, such as a vector clock, which
> > might be an independent counter on each node but can perfectly track
> > the causal order. For example, if NodeA observes an operation from
> > NodeB with a counter value of X, NodeA will adjust its counter to X+1.
> > This ensures that if NodeA has seen an operation from NodeB, its next
> > operation will appear to have occurred after NodeB's operation.
> >
> > I admit that I haven't fully thought through how we could design such
> > version tracking in our logical replication protocol or how it would
> > fit into our system. However, my point is that we need to consider
> > something beyond commit timestamps to achieve reliable ordering.
> >
>
> I can't really respond to this as there's no suggestion how it would be
> implemented in the patch discussed in this thread.
>
No worries, I'll consider whether finding such a solution is feasible
for our situation. Thank you!
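
For reference, the counter scheme quoted above can be sketched roughly
as follows. This is the scalar (Lamport-style) variant, not a full
vector clock: it orders causally related operations correctly but
cannot tell that two operations were concurrent. The class and method
names are hypothetical, and I use max(local, X) + 1 rather than plain
X + 1 so that a node's counter never moves backwards. Nothing here
reflects how it would fit into the logical replication protocol.

```python
class LamportClock:
    """Hypothetical per-node logical counter (a Lamport clock)."""

    def __init__(self) -> None:
        self.counter = 0

    def local_event(self) -> int:
        # Stamp an operation that originates on this node.
        self.counter += 1
        return self.counter

    def observe(self, remote_counter: int) -> int:
        # On seeing a remote operation stamped X, advance past it so that
        # every later local operation is ordered after the remote one.
        self.counter = max(self.counter, remote_counter) + 1
        return self.counter


node_a = LamportClock()
node_b = LamportClock()

x = node_b.local_event()   # NodeB performs an operation stamped X
node_a.observe(x)          # NodeA applies it and advances its counter
y = node_a.local_event()   # NodeA's next operation gets a stamp above X
```

With this, y is always greater than x regardless of how far apart the
physical clocks on the two nodes are.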

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


