Re: Commit Timestamp and LSN Inversion issue - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Commit Timestamp and LSN Inversion issue
Msg-id CAA4eK1+ZWfQ9_cTeCkhxqdEFeS_-H7vAeZSJEWFs9__soLK9pA@mail.gmail.com
In response to Commit Timestamp and LSN Inversion issue  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Tue, Nov 5, 2024 at 7:28 PM Jan Wieck <jan@wi3ck.info> wrote:
>
> >
> > We can't forget CDR completely as this could only be a potential
> > problem in that context. Right now, we don't have any built-in
> > resolution strategies, so this can't impact but if this is a problem
> > then we need to have a solution for it before considering a solution
> > like "last_write_wins" strategy.
>
> I agree that we can't forget about CDR. This is precisely the problem we
> ran into here at pgEdge and why we came up with a solution (attached).
>

I would like to highlight that we need to solve the LSN<->timestamp
inversion issue not only for resolution strategies like
'last_write_wins' but also for conflict detection. In particular,
while implementing and discussing the patch to detect the
update_deleted conflict type, we came across race conditions [1]
where the inversion issue discussed here would lead to removing the
required rows before we could detect the conflict. So, +1 to solving
this issue.

> > Now, instead of discussing LSN<->timestamp inversion issue, you
> > started to discuss "last_write_wins" strategy itself which we have
> > discussed to some extent in the thread [2]. BTW, we are planning to
> > start a separate thread as well just to discuss the clock skew problem
> > w.r.t resolution strategies like "last_write_wins" strategy. So, we
> > can discuss clock skew in that thread and keep the focus of this
> > thread LSN<->timestamp inversion problem.
>
> Fact is that "last_write_wins" together with some implementation of
> Conflict free Replicated Data Types (CRDT) is good enough for many real
> world situations. Anything resembling a TPC-B or TPC-C is quite happy
> with it.
>
> The attached solution is minimally invasive because it doesn't move the
> timestamp generation (clock_gettime() call) into the critical section of
> ReserveXLogInsertLocation() that is protected by a spinlock. Instead it
> keeps track of the last commit-ts written to WAL in shared memory and
> simply bumps that by one microsecond if the next one is below or equal.
> There is one extra condition in that code section plus a function call
> by pointer for every WAL record.
>

I think we avoid calling hook/callback functions while holding a lock
(spinlock or LWLock), as the user may do an expensive operation or
acquire other locks in those functions, which could lead to deadlocks
or hurt concurrency. So, it would be better to directly call an
inline function to perform the required operation.

This sounds like a good idea to solve this problem. Thanks for sharing
the patch.
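
To make the inline-function suggestion concrete, here is a minimal
sketch of the "bump by one microsecond" rule described above. It is
not taken from the attached patch: CommitTs, CommitTsState and
advance_commit_ts() are hypothetical names, and the sketch assumes the
caller already holds whatever lock serializes WAL insert-position
reservation.

```c
#include <stdint.h>

/* Hypothetical stand-in for a commit timestamp in microseconds. */
typedef int64_t CommitTs;

/*
 * Hypothetical stand-in for the shared-memory slot that remembers the
 * last commit timestamp written to WAL.
 */
typedef struct CommitTsState
{
	CommitTs	last_commit_ts;
} CommitTsState;

/*
 * Return a commit timestamp that is strictly greater than the previous
 * one.  If the candidate is below or equal to the last value handed
 * out, use last + 1 microsecond instead.
 *
 * Assumption: the caller holds the lock that orders WAL insertions, so
 * the read-modify-write of last_commit_ts needs no extra protection.
 * Nothing here calls out to user code, so no hook runs under the lock.
 */
static inline CommitTs
advance_commit_ts(CommitTsState *state, CommitTs candidate)
{
	if (candidate <= state->last_commit_ts)
		candidate = state->last_commit_ts + 1;

	state->last_commit_ts = candidate;
	return candidate;
}
```

With something of this shape, timestamps handed out in WAL order never
go backwards even if the clock readings taken outside the lock do, and
the extra work under the lock stays limited to one comparison and one
store.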

[1] - https://www.postgresql.org/message-id/CAA4eK1LKgkjyNKeW5jEhy1%3DuE8z0p7Pdae0rohoJP51eJGd%3Dgg%40mail.gmail.com

--
With Regards,
Amit Kapila.


