Re: LogwrtResult contended spinlock - Mailing list pgsql-hackers

From Andres Freund
Subject Re: LogwrtResult contended spinlock
Date
Msg-id 20240702121759.jmig2lz6eytpm3lb@alap3.anarazel.de
Whole thread Raw
In response to Re: LogwrtResult contended spinlock  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: LogwrtResult contended spinlock
List pgsql-hackers
Hi,

On 2024-07-01 21:12:25 +0200, Alvaro Herrera wrote:
> On 2024-Jul-01, Andres Freund wrote:
> 
> > On 2024-07-01 11:10:24 +0200, Alvaro Herrera wrote:
> > > In the meantime I noticed that pg_attribute_aligned() is not supported
> > > in every platform/compiler, so for safety sake I think it's better to go
> > > with what we do for PGAlignedBlock: use a union with a double member.
> > > That should be 8-byte aligned on x86 as well, unless I misunderstand.
> > 
> > If a platform wants to support 8 byte atomics, it better provides a way to
> > make variables 8 bytes aligned. We already rely on that, actually. See use of
> > pg_attribute_aligned in e.g. src/include/port/atomics/generic-msvc.h
> 
> Well, pg_atomic_uint64 is a struct. Passing pointers to it is fine,
> which is what non-platform-specific code does; but because the
> declaration of the type is in each platform-specific file, it might not
> work to use it directly in generic code.  I didn't actually try, but it
> seems a bit of a layering violation.  (I didn't find any place where
> the struct is used that way.)
> 
> If that works, then I think we could simply declare currval as a
> pg_atomic_uint64 and it'd be prettier.

I don't think that'd make sense at all. The expected value isn't an atomic, so
it shouldn't be the type of an atomic.


> > > +    /*
> > > +     * On 32-bit machines, declaring a bare uint64 variable doesn't promise
> > > +     * the alignment we need, so coerce the compiler this way.
> > > +     */
> > > +    union
> > > +    {
> > > +        uint64        u64;
> > > +        double        force_align_d;
> > > +    }            currval;
> > 
> > I wonder if we should just relax the alignment requirement for currval. It's
> > crucial that the pointer is atomically aligned (atomic ops across pages are
> > either forbidden or extremely slow), but it's far from obvious that it's
> > crucial for comparator value to be aligned.
> 
> I'm pretty sure the Microsoft docs I linked to are saying it must be
> aligned.

I don't think so:
https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-interlockedcompareexchange64

LONG64 InterlockedCompareExchange64(
  [in, out] LONG64 volatile *Destination,
  [in]      LONG64          ExChange,
  [in]      LONG64          Comperand
);

Note that Destination is the only argument passed by reference (and thus the
caller controls alignment of the in-memory value). ExChange is passed by
value, so we don't control alignment in any way.


> > >  #ifndef PG_HAVE_ATOMIC_U64_SIMULATION
> > >      AssertPointerAlignment(ptr, 8);
> > >  #endif
> > 
> > What's the point of this assert, btw? This stuff is already asserted in lower
> > level routines, so it just seems redundant to have it here?
> 
> There are in some of them, but not in pg_atomic_compare_exchange_u64_impl.

But there's one in pg_atomic_read_u64_impl().  But I actually think it's wrong
for pg_atomic_monotonic_advance_u64() to use _impl(), that's just for the
wrapper functions around the implementation. Wheras
pg_atomic_monotonic_advance_u64() should just use the generic interface.


> 
> > > -    return Max(target_, currval);
> > > +    return Max(target_, currval.u64);
> > 
> > What does the Max() actually achieve here? Shouldn't it be impossible to reach
> > this with  currval < target_?
> 
> When two processes are hitting the cmpxchg concurrently, we need to
> return the highest value that was written, even if it was the other
> process that did it.

Sure. That explains needing to use currval. But not really needing to use
Max(). If cmpxchg succeeds, we need to return target_, if the loop terminates
otherwise we need to return currval. No?



> > And why does target_ end in an underscore?
> 
> Heh, you tell me -- I just copied the style of the functions just above.

IIRC using plain "and" "or" "add" caused conflicts with some headers or such.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: Improving tracking/processing of buildfarm test failures
Next
From: Matthias van de Meent
Date:
Subject: Re: Use generation memory context for tuplestore.c