Hi,

On 2024-11-12 11:40:39 -0500, Jan Wieck wrote:
> On 11/12/24 10:34, Andres Freund wrote:
> > I have working code - pretty ugly at this stage, but mostly needs a fair bit
> > of elbow grease, not divine inspiration... It's not a trivial change, but
> > entirely doable.
> >
> > The short summary of how it works is that it uses a single 64bit atomic that
> > is internally subdivided into a ringbuffer position in N high bits and an
> > offset from a base LSN in the remaining bits. The insertion sequence is
> >
> > ...
> >
> > This leaves you with a single xadd to a contended cacheline as the contention
> > point (scales far better than cmpxchg and far, far better than
> > cmpxchg16b). There's a bit of contention on ringbuffer[].oldpos being set
> > and read, but only between two backends, not all of them.
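
To make that a bit more concrete, here is a rough sketch of the layout - the
names, the exact bit split and the ringbuffer handling are made up for
illustration, it's not the actual patch, and it glosses over rebasing
base_lsn before the offset bits overflow, waiting for a not-yet-published
oldpos, ringbuffer wraparound etc.:

#include <stdatomic.h>
#include <stdint.h>

/* illustrative bit split: high RB_BITS = ringbuffer position, rest = offset */
#define RB_BITS      10
#define OFFSET_BITS  (64 - RB_BITS)
#define OFFSET_MASK  ((UINT64_C(1) << OFFSET_BITS) - 1)
#define RB_SIZE      (1 << RB_BITS)

typedef struct
{
    _Atomic uint64_t pos_and_offset;    /* the single contended 64-bit atomic */
    uint64_t base_lsn;                  /* low bits are offsets from this LSN */
    struct
    {
        _Atomic uint64_t oldpos;        /* written by one inserter, read by the next */
    } rb[RB_SIZE];
} InsertSketch;

/*
 * Reserve 'size' bytes of insert space.  A single fetch-add (an xadd on
 * x86) bumps the ringbuffer position by 1 and the offset by 'size' in one
 * operation.
 */
static void
reserve_insert_space(InsertSketch *st, uint32_t size,
                     uint64_t *start_lsn, uint64_t *end_lsn, uint32_t *slot)
{
    uint64_t add = (UINT64_C(1) << OFFSET_BITS) + size;
    uint64_t old = atomic_fetch_add(&st->pos_and_offset, add);

    *slot = (uint32_t) (old >> OFFSET_BITS);    /* wraps at RB_SIZE */
    *start_lsn = st->base_lsn + (old & OFFSET_MASK);
    *end_lsn = *start_lsn + size;

    /*
     * Publish our start position in our slot; the inserter that got the
     * next slot reads it to learn where the previous record starts.  The
     * real code has to wait when that value isn't published yet - but
     * that only ever involves two backends, not everybody piling onto one
     * cacheline.
     */
    atomic_store(&st->rb[*slot].oldpos, *start_lsn);
}

I.e. the only thing every inserter contends on is the single fetch-add on
pos_and_offset; the oldpos handoff is strictly between neighbouring slots.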
>
> That sounds rather promising.
>
> Would it be reasonable to have both implementations available at least at
> compile time, if not at runtime?

No, not reasonably.
> Is it possible that we need to do that anyway for some time or are those
> atomic operations available on all supported CPU architectures?

We have a fallback atomics implementation for the uncommon architectures
without 64bit atomics.
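
Conceptually the fallback just protects the value with a lock. A generic
sketch of emulating a 64-bit fetch-add that way - illustration only, not the
actual fallback code:

#include <pthread.h>
#include <stdint.h>

/* emulate a 64-bit atomic fetch-add with a plain lock; illustration only */
typedef struct
{
    pthread_mutex_t lock;
    uint64_t value;
} emulated_atomic_u64;

static uint64_t
emulated_fetch_add_u64(emulated_atomic_u64 *a, uint64_t add)
{
    uint64_t old;

    pthread_mutex_lock(&a->lock);
    old = a->value;
    a->value += add;
    pthread_mutex_unlock(&a->lock);

    return old;
}

The semantics are the same, it's just slower.
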
> In any case, thanks for the input. Looks like in the long run we need to
> come up with a different way to solve the inversion problem.

IMO there's absolutely no way the changes proposed in this thread so far
should get merged.

Greetings,

Andres Freund