Thread: spinlock support on loongarch64

spinlock support on loongarch64

From
吴亚飞
Date:

add spinlock support on loongarch64.

Attachment

Re: spinlock support on loongarch64

From
Tom Lane
Date:
=?gb2312?B?zuLRx7fJ?= <wuyf41619@hundsun.com> writes:
> add spinlock support on loongarch64.

I wonder if we shouldn't just do that (ie, try to use
__sync_lock_test_and_set) as a generic fallback on any unsupported
architecture.  We could get rid of the separate stanza for RISC-V
that way.  The main thing that an arch-specific stanza could bring
is knowledge of the best data type width to use for a spinlock;
but I don't see a big problem with defaulting to "int".  We can
always add arch-specific stanzas for any machines where that's
shown to be a seriously poor choice.

            regards, tom lane



Re: spinlock support on loongarch64

From
Andres Freund
Date:
Hi,

On 2022-11-02 11:37:35 -0400, Tom Lane wrote:
> =?gb2312?B?zuLRx7fJ?= <wuyf41619@hundsun.com> writes:
> > add spinlock support on loongarch64.
> 
> I wonder if we shouldn't just do that (ie, try to use
> __sync_lock_test_and_set) as a generic fallback on any unsupported
> architecture.  We could get rid of the separate stanza for RISC-V
> that way.  The main thing that an arch-specific stanza could bring
> is knowledge of the best data type width to use for a spinlock;
> but I don't see a big problem with defaulting to "int".  We can
> always add arch-specific stanzas for any machines where that's
> shown to be a seriously poor choice.

Yes, please. It might not be perfect for all architectures, and it might not
be good for some very old architectures. But for anything new it'll be vastly
better than not having spinlocks at all.

Greetings,

Andres Freund



Re: spinlock support on loongarch64

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-02 11:37:35 -0400, Tom Lane wrote:
>> I wonder if we shouldn't just do that (ie, try to use
>> __sync_lock_test_and_set) as a generic fallback on any unsupported
>> architecture.  We could get rid of the separate stanza for RISC-V
>> that way.  The main thing that an arch-specific stanza could bring
>> is knowledge of the best data type width to use for a spinlock;
>> but I don't see a big problem with defaulting to "int".  We can
>> always add arch-specific stanzas for any machines where that's
>> shown to be a seriously poor choice.

> Yes, please. It might not be perfect for all architectures, and it might not
> be good for some very old architectures. But for anything new it'll be vastly
> better than not having spinlocks at all.

So about like this, then.

            regards, tom lane

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 4225d9b7fc..d3972f61fc 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -293,29 +293,6 @@ spin_delay(void)
 #endif     /* __arm__ || __arm || __aarch64__ */


-/*
- * RISC-V likewise uses __sync_lock_test_and_set(int *, int) if available.
- */
-#if defined(__riscv)
-#ifdef HAVE_GCC__SYNC_INT32_TAS
-#define HAS_TEST_AND_SET
-
-#define TAS(lock) tas(lock)
-
-typedef int slock_t;
-
-static __inline__ int
-tas(volatile slock_t *lock)
-{
-    return __sync_lock_test_and_set(lock, 1);
-}
-
-#define S_UNLOCK(lock) __sync_lock_release(lock)
-
-#endif     /* HAVE_GCC__SYNC_INT32_TAS */
-#endif     /* __riscv */
-
-
 /* S/390 and S/390x Linux (32- and 64-bit zSeries) */
 #if defined(__s390__) || defined(__s390x__)
 #define HAS_TEST_AND_SET
@@ -716,6 +693,52 @@ spin_delay(void)
 #endif    /* !defined(HAS_TEST_AND_SET) */


+/*
+ * Last-ditch try.
+ *
+ * If we have no platform-specific knowledge, but we found that the compiler
+ * provides __sync_lock_test_and_set(), use that.  Prefer the int-width
+ * version over the char-width version if we have both, on the rather dubious
+ * grounds that that's known to be more likely to work in the ARM ecosystem.
+ * (But we dealt with ARM above.)
+ */
+#if !defined(HAS_TEST_AND_SET)
+
+#if defined(HAVE_GCC__SYNC_INT32_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef int slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+    return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#elif defined(HAVE_GCC__SYNC_CHAR_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef char slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+    return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#endif     /* HAVE_GCC__SYNC_INT32_TAS */
+
+#endif    /* !defined(HAS_TEST_AND_SET) */
+
+
 /* Blow up if we didn't have any way to do spinlocks */
 #ifndef HAS_TEST_AND_SET
 #error PostgreSQL does not have native spinlock support on this platform.  To continue the compilation, rerun
configureusing --disable-spinlocks.  However, performance will be poor.  Please report this to
pgsql-bugs@lists.postgresql.org.

Re: spinlock support on loongarch64

From
Tom Lane
Date:
I wrote:
> So about like this, then.

After actually testing (by removing the ARM stanza on a macOS machine),
it seems that placement doesn't work, because of the default definition
of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
it inside that test works, and seems like it should be fine, since this
is a GCC-ism.

            regards, tom lane

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 4225d9b7fc..8b19ab160f 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -293,29 +293,6 @@ spin_delay(void)
 #endif     /* __arm__ || __arm || __aarch64__ */
 
 
-/*
- * RISC-V likewise uses __sync_lock_test_and_set(int *, int) if available.
- */
-#if defined(__riscv)
-#ifdef HAVE_GCC__SYNC_INT32_TAS
-#define HAS_TEST_AND_SET
-
-#define TAS(lock) tas(lock)
-
-typedef int slock_t;
-
-static __inline__ int
-tas(volatile slock_t *lock)
-{
-    return __sync_lock_test_and_set(lock, 1);
-}
-
-#define S_UNLOCK(lock) __sync_lock_release(lock)
-
-#endif     /* HAVE_GCC__SYNC_INT32_TAS */
-#endif     /* __riscv */
-
-
 /* S/390 and S/390x Linux (32- and 64-bit zSeries) */
 #if defined(__s390__) || defined(__s390x__)
 #define HAS_TEST_AND_SET
@@ -619,6 +596,50 @@ tas(volatile slock_t *lock)
 #endif     /* __hppa || __hppa__ */
 
 
+/*
+ * If we have no platform-specific knowledge, but we found that the compiler
+ * provides __sync_lock_test_and_set(), use that.  Prefer the int-width
+ * version over the char-width version if we have both, on the rather dubious
+ * grounds that that's known to be more likely to work in the ARM ecosystem.
+ * (But we dealt with ARM above.)
+ */
+#if !defined(HAS_TEST_AND_SET)
+
+#if defined(HAVE_GCC__SYNC_INT32_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef int slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+    return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#elif defined(HAVE_GCC__SYNC_CHAR_TAS)
+#define HAS_TEST_AND_SET
+
+#define TAS(lock) tas(lock)
+
+typedef char slock_t;
+
+static __inline__ int
+tas(volatile slock_t *lock)
+{
+    return __sync_lock_test_and_set(lock, 1);
+}
+
+#define S_UNLOCK(lock) __sync_lock_release(lock)
+
+#endif     /* HAVE_GCC__SYNC_INT32_TAS */
+
+#endif    /* !defined(HAS_TEST_AND_SET) */
+
+
 /*
  * Default implementation of S_UNLOCK() for gcc/icc.
  *

Re: spinlock support on loongarch64

From
Andres Freund
Date:
Hi,

On 2022-11-02 14:55:04 -0400, Tom Lane wrote:
> I wrote:
> > So about like this, then.
> 
> After actually testing (by removing the ARM stanza on a macOS machine),
> it seems that placement doesn't work, because of the default definition
> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
> it inside that test works, and seems like it should be fine, since this
> is a GCC-ism.

Looks reasonable. I tested it on x86-64 by disabling that section and it
works.

FWIW, In a heavily spinlock-contending workload it's a tad slower, largely due
to to loosing spin_delay. If I define that it's very close. Not that it
matters hugely, I just thought it'd be good to validate.

I wonder if it's worth keeing the full copy of this in the arm section? We
could just define SPIN_DELAY() for aarch64?

Greetings,

Andres Freund



Re: spinlock support on loongarch64

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-02 14:55:04 -0400, Tom Lane wrote:
>> After actually testing (by removing the ARM stanza on a macOS machine),
>> it seems that placement doesn't work, because of the default definition
>> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
>> it inside that test works, and seems like it should be fine, since this
>> is a GCC-ism.

> Looks reasonable. I tested it on x86-64 by disabling that section and it
> works.

Thanks for looking.

> I wonder if it's worth keeing the full copy of this in the arm section? We
> could just define SPIN_DELAY() for aarch64?

I thought about that, but given the increasing popularity of ARM
I bet that that stanza is going to accrete more special-case knowledge
over time.  It's probably simplest to keep it separate.

            regards, tom lane



Re: spinlock support on loongarch64

From
Andres Freund
Date:
On 2022-11-02 17:37:04 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-11-02 14:55:04 -0400, Tom Lane wrote:
> >> After actually testing (by removing the ARM stanza on a macOS machine),
> >> it seems that placement doesn't work, because of the default definition
> >> of S_UNLOCK at the bottom of the "#if defined(__GNUC__)" stuff.  Putting
> >> it inside that test works, and seems like it should be fine, since this
> >> is a GCC-ism.
> 
> > Looks reasonable. I tested it on x86-64 by disabling that section and it
> > works.
> 
> Thanks for looking.
> 
> > I wonder if it's worth keeing the full copy of this in the arm section? We
> > could just define SPIN_DELAY() for aarch64?
> 
> I thought about that, but given the increasing popularity of ARM
> I bet that that stanza is going to accrete more special-case knowledge
> over time.  It's probably simplest to keep it separate.

WFM.