Thread: spinlock support on loongarch64
add spinlock support on loongarch64.
Attachment
=?gb2312?B?zuLRx7fJ?= <wuyf41619@hundsun.com> writes: > add spinlock support on loongarch64. I wonder if we shouldn't just do that (ie, try to use __sync_lock_test_and_set) as a generic fallback on any unsupported architecture. We could get rid of the separate stanza for RISC-V that way. The main thing that an arch-specific stanza could bring is knowledge of the best data type width to use for a spinlock; but I don't see a big problem with defaulting to "int". We can always add arch-specific stanzas for any machines where that's shown to be a seriously poor choice. regards, tom lane
Hi, On 2022-11-02 11:37:35 -0400, Tom Lane wrote: > =?gb2312?B?zuLRx7fJ?= <wuyf41619@hundsun.com> writes: > > add spinlock support on loongarch64. > > I wonder if we shouldn't just do that (ie, try to use > __sync_lock_test_and_set) as a generic fallback on any unsupported > architecture. We could get rid of the separate stanza for RISC-V > that way. The main thing that an arch-specific stanza could bring > is knowledge of the best data type width to use for a spinlock; > but I don't see a big problem with defaulting to "int". We can > always add arch-specific stanzas for any machines where that's > shown to be a seriously poor choice. Yes, please. It might not be perfect for all architectures, and it might not be good for some very old architectures. But for anything new it'll be vastly better than not having spinlocks at all. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-11-02 11:37:35 -0400, Tom Lane wrote: >> I wonder if we shouldn't just do that (ie, try to use >> __sync_lock_test_and_set) as a generic fallback on any unsupported >> architecture. We could get rid of the separate stanza for RISC-V >> that way. The main thing that an arch-specific stanza could bring >> is knowledge of the best data type width to use for a spinlock; >> but I don't see a big problem with defaulting to "int". We can >> always add arch-specific stanzas for any machines where that's >> shown to be a seriously poor choice. > Yes, please. It might not be perfect for all architectures, and it might not > be good for some very old architectures. But for anything new it'll be vastly > better than not having spinlocks at all. So about like this, then. regards, tom lane diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 4225d9b7fc..d3972f61fc 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -293,29 +293,6 @@ spin_delay(void) #endif /* __arm__ || __arm || __aarch64__ */ -/* - * RISC-V likewise uses __sync_lock_test_and_set(int *, int) if available. - */ -#if defined(__riscv) -#ifdef HAVE_GCC__SYNC_INT32_TAS -#define HAS_TEST_AND_SET - -#define TAS(lock) tas(lock) - -typedef int slock_t; - -static __inline__ int -tas(volatile slock_t *lock) -{ - return __sync_lock_test_and_set(lock, 1); -} - -#define S_UNLOCK(lock) __sync_lock_release(lock) - -#endif /* HAVE_GCC__SYNC_INT32_TAS */ -#endif /* __riscv */ - - /* S/390 and S/390x Linux (32- and 64-bit zSeries) */ #if defined(__s390__) || defined(__s390x__) #define HAS_TEST_AND_SET @@ -716,6 +693,52 @@ spin_delay(void) #endif /* !defined(HAS_TEST_AND_SET) */ +/* + * Last-ditch try. + * + * If we have no platform-specific knowledge, but we found that the compiler + * provides __sync_lock_test_and_set(), use that. Prefer the int-width + * version over the char-width version if we have both, on the rather dubious + * grounds that that's known to be more likely to work in the ARM ecosystem. + * (But we dealt with ARM above.) + */ +#if !defined(HAS_TEST_AND_SET) + +#if defined(HAVE_GCC__SYNC_INT32_TAS) +#define HAS_TEST_AND_SET + +#define TAS(lock) tas(lock) + +typedef int slock_t; + +static __inline__ int +tas(volatile slock_t *lock) +{ + return __sync_lock_test_and_set(lock, 1); +} + +#define S_UNLOCK(lock) __sync_lock_release(lock) + +#elif defined(HAVE_GCC__SYNC_CHAR_TAS) +#define HAS_TEST_AND_SET + +#define TAS(lock) tas(lock) + +typedef char slock_t; + +static __inline__ int +tas(volatile slock_t *lock) +{ + return __sync_lock_test_and_set(lock, 1); +} + +#define S_UNLOCK(lock) __sync_lock_release(lock) + +#endif /* HAVE_GCC__SYNC_INT32_TAS */ + +#endif /* !defined(HAS_TEST_AND_SET) */ + + /* Blow up if we didn't have any way to do spinlocks */ #ifndef HAS_TEST_AND_SET #error PostgreSQL does not have native spinlock support on this platform. To continue the compilation, rerun configureusing --disable-spinlocks. However, performance will be poor. Please report this to pgsql-bugs@lists.postgresql.org.
I wrote: > So about like this, then. After actually testing (by removing the ARM stanza on a macOS machine), it seems that placement doesn't work, because of the default definition of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff. Putting it inside that test works, and seems like it should be fine, since this is a GCC-ism. regards, tom lane diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index 4225d9b7fc..8b19ab160f 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -293,29 +293,6 @@ spin_delay(void) #endif /* __arm__ || __arm || __aarch64__ */ -/* - * RISC-V likewise uses __sync_lock_test_and_set(int *, int) if available. - */ -#if defined(__riscv) -#ifdef HAVE_GCC__SYNC_INT32_TAS -#define HAS_TEST_AND_SET - -#define TAS(lock) tas(lock) - -typedef int slock_t; - -static __inline__ int -tas(volatile slock_t *lock) -{ - return __sync_lock_test_and_set(lock, 1); -} - -#define S_UNLOCK(lock) __sync_lock_release(lock) - -#endif /* HAVE_GCC__SYNC_INT32_TAS */ -#endif /* __riscv */ - - /* S/390 and S/390x Linux (32- and 64-bit zSeries) */ #if defined(__s390__) || defined(__s390x__) #define HAS_TEST_AND_SET @@ -619,6 +596,50 @@ tas(volatile slock_t *lock) #endif /* __hppa || __hppa__ */ +/* + * If we have no platform-specific knowledge, but we found that the compiler + * provides __sync_lock_test_and_set(), use that. Prefer the int-width + * version over the char-width version if we have both, on the rather dubious + * grounds that that's known to be more likely to work in the ARM ecosystem. + * (But we dealt with ARM above.) + */ +#if !defined(HAS_TEST_AND_SET) + +#if defined(HAVE_GCC__SYNC_INT32_TAS) +#define HAS_TEST_AND_SET + +#define TAS(lock) tas(lock) + +typedef int slock_t; + +static __inline__ int +tas(volatile slock_t *lock) +{ + return __sync_lock_test_and_set(lock, 1); +} + +#define S_UNLOCK(lock) __sync_lock_release(lock) + +#elif defined(HAVE_GCC__SYNC_CHAR_TAS) +#define HAS_TEST_AND_SET + +#define TAS(lock) tas(lock) + +typedef char slock_t; + +static __inline__ int +tas(volatile slock_t *lock) +{ + return __sync_lock_test_and_set(lock, 1); +} + +#define S_UNLOCK(lock) __sync_lock_release(lock) + +#endif /* HAVE_GCC__SYNC_INT32_TAS */ + +#endif /* !defined(HAS_TEST_AND_SET) */ + + /* * Default implementation of S_UNLOCK() for gcc/icc. *
Hi, On 2022-11-02 14:55:04 -0400, Tom Lane wrote: > I wrote: > > So about like this, then. > > After actually testing (by removing the ARM stanza on a macOS machine), > it seems that placement doesn't work, because of the default definition > of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff. Putting > it inside that test works, and seems like it should be fine, since this > is a GCC-ism. Looks reasonable. I tested it on x86-64 by disabling that section and it works. FWIW, In a heavily spinlock-contending workload it's a tad slower, largely due to to loosing spin_delay. If I define that it's very close. Not that it matters hugely, I just thought it'd be good to validate. I wonder if it's worth keeing the full copy of this in the arm section? We could just define SPIN_DELAY() for aarch64? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2022-11-02 14:55:04 -0400, Tom Lane wrote: >> After actually testing (by removing the ARM stanza on a macOS machine), >> it seems that placement doesn't work, because of the default definition >> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff. Putting >> it inside that test works, and seems like it should be fine, since this >> is a GCC-ism. > Looks reasonable. I tested it on x86-64 by disabling that section and it > works. Thanks for looking. > I wonder if it's worth keeing the full copy of this in the arm section? We > could just define SPIN_DELAY() for aarch64? I thought about that, but given the increasing popularity of ARM I bet that that stanza is going to accrete more special-case knowledge over time. It's probably simplest to keep it separate. regards, tom lane
On 2022-11-02 17:37:04 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2022-11-02 14:55:04 -0400, Tom Lane wrote: > >> After actually testing (by removing the ARM stanza on a macOS machine), > >> it seems that placement doesn't work, because of the default definition > >> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff. Putting > >> it inside that test works, and seems like it should be fine, since this > >> is a GCC-ism. > > > Looks reasonable. I tested it on x86-64 by disabling that section and it > > works. > > Thanks for looking. > > > I wonder if it's worth keeing the full copy of this in the arm section? We > > could just define SPIN_DELAY() for aarch64? > > I thought about that, but given the increasing popularity of ARM > I bet that that stanza is going to accrete more special-case knowledge > over time. It's probably simplest to keep it separate. WFM.