Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict Detection and Resolution
Msg-id CAJpy0uA3DAcyTOz--ToM-H-_mS-Pv2bqT2Zu-Mm3gJrw-YTcKQ@mail.gmail.com
In response to Re: Conflict Detection and Resolution  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Jul 3, 2024 at 3:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 3, 2024 at 2:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Jul 3, 2024 at 12:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Jul 3, 2024 at 11:29 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > But waiting after applying the operations and before applying the
> > > commit would mean that we need to wait with the locks held. That could
> > > be a recipe for deadlocks in the system. I see your point related to
> > > performance but as we are not expecting clock skew in normal cases, we
> > > shouldn't be too much bothered on the performance due to this. If
> > > there is clock skew, we expect users to fix it, this is just a
> > > worst-case aid for users.
> >
> > But if we make it wait at the very first operation that means we will
> > not suck more decoded data from the network and wouldn't that make the
> > sender wait for the network buffer to get sucked in by the receiver?
> >
>
> That would be true even if we wait just before applying the commit
> record considering the transaction is small and the wait time is
> large.
>
> > Also, we already have a handling of parallel apply workers so if we do
> > not have an issue of deadlock there or if we can handle those issues
> > there we can do it here as well no?
> >
>
> Parallel apply workers won't wait for a long time. There is some
> similarity and in both cases, deadlock will be detected but chances of
> such implementation-related deadlocks will be higher if we start
> waiting for a random amount of times. The other possibility is that we
> can keep a cap on the max clock skew time above which we will give
> ERROR even if the user has configured wait.

+1. But I think the cap has to be on the wait time rather than on the
clock skew. As an example, let's say the user has configured a 'clock
skew tolerance' of 10 sec while the actual clock skew between nodes is
5 min. That means we will mostly have to wait '5 min - 10 sec' (4 min
50 sec) to bring the clock skew within the tolerable limit, which is a
huge waiting time. We can keep a max limit on this wait time.
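
To make the arithmetic concrete, here is a rough sketch (in Python, for
brevity) of how a cap on the wait time could behave. The names
required_wait, ClockSkewError and max_wait are hypothetical and only
for illustration; nothing like them exists in the patch, and the real
apply-worker logic may well look different.

```python
from datetime import timedelta


class ClockSkewError(Exception):
    """Hypothetical error: the needed wait exceeds the configured cap."""


def required_wait(actual_skew, tolerance, max_wait):
    """Return how long to wait so the remaining skew is within tolerance.

    All three arguments are timedelta values. 'max_wait' is an assumed
    setting that caps the wait itself, not the clock skew.
    """
    wait = actual_skew - tolerance
    if wait <= timedelta(0):
        # Skew is already within the tolerated limit: no wait needed.
        return timedelta(0)
    if wait > max_wait:
        # Waiting this long (possibly with locks held) is not acceptable,
        # so raise an error even though the user has configured 'wait'.
        raise ClockSkewError(
            f"required wait {wait} exceeds max wait {max_wait}"
        )
    return wait


# The example from above: tolerance 10 sec, actual skew 5 min.
wait = required_wait(
    actual_skew=timedelta(minutes=5),
    tolerance=timedelta(seconds=10),
    max_wait=timedelta(minutes=10),
)
```

With a 10 min cap the call above returns a wait of 4 min 50 sec; with a
cap of, say, 1 min the same inputs would raise the error instead of
waiting.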

thanks
Shveta


