Re: Suboptimal spinlock code due to volatile - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Suboptimal spinlock code due to volatile
Date
Msg-id b30e197b-f04d-4971-b2e3-f5f56a740f81@iki.fi
Whole thread Raw
In response to Suboptimal spinlock code due to volatile  (Andres Freund <andres@anarazel.de>)
Responses Re: Suboptimal spinlock code due to volatile
List pgsql-hackers
On 29/07/2024 22:59, Andres Freund wrote:
> After being confused for a while, the explanation is fairly simple: We use
> volatile and dereference the address:
> 
> static __inline__ int
> tas(volatile slock_t *lock)
> {
>     slock_t        _res = 1;
> 
>     __asm__ __volatile__(
>         "    lock            \n"
>         "    xchgb    %0,%1    \n"
> :        "+q"(_res), "+m"(*lock)
> :        /* no inputs */
> :        "memory", "cc");
>     return (int) _res;
> }
> 
> (note the (*lock) and the volatile in the signature).
> 
> I think it'd be just as defensible to not emit a separate load here, despite
> the volatile, and indeed clang doesn't emit a separate load. But it also does
> seem defensible to take translate the code very literally, as gcc does.
> 
> 
> If I remove the volatile from the signature or cast it away, gcc indeed
> generates the offset version:
>      4230:    f0 86 82 c0 01 00 00     lock xchg %al,0x1c0(%rdx)

Good catch. Seems safe to just remove the volatile.

> A second, even smaller, issue with the code is that we use "lock xchgb"
> despite xchg having implied lock approximately forever ([2]). That makes the code
> slightly wider than necessary (the lock prefix is one byte).
> 
> 
> I doubt there's a lot of situations where these end up having a meaningful
> performance impact, but it still seems suboptimal.   I may be seeing a *small*
> gain in a workload inserting lots of tiny records, but it's hard to be sure if
> it's above the noise floor.
> 
> 
> I'm wondering in how many places our fairly broad use of volatiles causes
> more substantially worse code being generated.

Aside from performance, I find "volatile" difficult to reason about. I 
feel more comfortable with atomics and memory barriers.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Richard Guo
Date:
Subject: Re: Trivial revise for the check of parameterized partial paths
Next
From: Richard Guo
Date:
Subject: Re: Short-circuit sort_inner_and_outer if there are no mergejoin clauses