On Tue, Aug 26, 2025 at 12:45 PM Andres Freund <andres@anarazel.de> wrote:
> On 2025-08-25 10:43:21 +1200, Thomas Munro wrote:
> > On Mon, Aug 25, 2025 at 6:11 AM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
> > > In theory even replacing the bitfield with an int should not
> > > avoid the race condition, because they still share the same cache line.
> >
> > I'm no expert in this stuff, but that's not my understanding of how it
> > works. Plain stores to normal memory go into the store buffer and are
> > eventually flushed to the memory hierarchy, but all modifications that reach
> > the cache hierarchy have a consistent view of memory created by the cache
> > coherency protocol (in ARM's case MOESI[1]): only one core can change a
> > cache line at a time while it has exclusive access (with some optimisations,
> > owner mode, snooping, etc but AFAIK that doesn't change the basic
> > consistency).
>
> From what I understand that's not quite right - the whole point of the store
> buffer is to avoid the latency hit of having to wait for cacheline
> ownership. Instead the write is done into the store buffer, notably on a
> granularity *smaller* than the cacheline (it has to be smaller, because we
> don't have the contents of the cacheline). The reason that this is somewhat
> OK from a coherency perspective is that this is done only for pure writes, not
> read-modify-write operations. As the write overwrites the prior contents of
> the memory, it is "ok" to do the write without waiting for cacheline ownership
> ahead of time.
*confused* Where's the contradiction?
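
To make the bitfield-vs-int point concrete, here is a minimal sketch, not
taken from any real code: the struct, the function names and the use of
pthreads are all made up for illustration. Two threads each update their own
plain int member of one struct, so the two members almost certainly share a
cache line. In C11 terms they are still distinct memory locations, so neither
thread's stores can clobber the other's. Adjacent bitfields are different:
they count as a single memory location, and updating one means a
read-modify-write of the containing word.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical layout: two plain ints next to each other, so they will
 * almost certainly live in the same cache line.  If these were declared
 * as "unsigned a:1; unsigned b:1;" instead, they would share one memory
 * location and concurrent updates would be a data race.
 */
struct adjacent_ints
{
	volatile int a;
	volatile int b;
};

struct writer_arg
{
	volatile int *target;		/* the member this thread alone modifies */
	int			iterations;
};

/* Each thread only ever loads and stores its own member. */
static void *
writer(void *p)
{
	struct writer_arg *arg = p;

	for (int i = 0; i < arg->iterations; i++)
		*arg->target = *arg->target + 1;
	return NULL;
}

/*
 * Returns 1 if both members end up equal to "iterations", 0 if an update
 * was lost, and -1 if a thread could not be started or joined.
 */
static int
run_adjacent_int_writers(int iterations)
{
	struct adjacent_ints s = {0, 0};
	struct writer_arg wa = {&s.a, iterations};
	struct writer_arg wb = {&s.b, iterations};
	pthread_t	ta;
	pthread_t	tb;

	assert(iterations >= 0);

	if (pthread_create(&ta, NULL, writer, &wa) != 0)
		return -1;
	if (pthread_create(&tb, NULL, writer, &wb) != 0)
	{
		pthread_join(ta, NULL);
		return -1;
	}
	if (pthread_join(ta, NULL) != 0 || pthread_join(tb, NULL) != 0)
		return -1;

	return s.a == iterations && s.b == iterations;
}
```

Calling run_adjacent_int_writers(100000) should return 1 however the two
threads interleave, because cache coherency works on whole lines but never
merges one core's stale bytes over another core's store. The volatile
qualifier is only there to stop the compiler from collapsing the loop into a
single add; it is not what makes the code correct.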