Thread: Spinlocks and semaphores in 9.2 and 9.3

Spinlocks and semaphores in 9.2 and 9.3

From: Tom Lane
This rabbit hole keeps getting deeper and deeper :-(

I realized a couple days ago that it had been close to three years since
I last tried building the further-back branches on my ancient HPPA box.
Not so surprisingly, things were broken: commits 37de8de9e33606a0 et al
had introduced use of memory barriers into 9.2 and 9.3, which up to that
moment had had none, so we'd never bothered to make barrier.h actually
work in those branches.  I back-patched some fixes in barrier.h, which
got things to compile, and then commit 44cd47c1d49655c5, which got things
to actually work ... on that box anyway.  I now see from the buildfarm
(and can confirm locally) that 44cd47c1d49655c5 breaks builds with
--disable-spinlock, because it introduces an initialization of a spinlock
before the underlying PGSemaphore infrastructure can possibly get
initialized.

In 9.4 and up, this works because of daa7527afc227443, which decoupled
spinlocks from semaphores enough that you can do s_init_lock_sema()
long before you can actually use the spinlock.  Come to think of it,
that commit also means that you can get away with using a spinlock
you've never done initialization on at all, which is not so good.
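
To illustrate: with that commit, the emulated slock_t is just an integer
index into a shared array of semaphores, handed out round-robin at init
time.  In sketch form (simplified from the patch attached downthread):

    typedef int slock_t;        /* index into SpinlockSemaArray */

    void
    s_init_lock_sema(volatile slock_t *lock)
    {
        static int  counter = 0;

        /* round-robin over the fixed pool of semaphores */
        *lock = (++counter) % NUM_SPINLOCK_SEMAPHORES;
    }

Since zero is a perfectly valid array index, a spinlock that never went
through s_init_lock_sema() at all still maps onto semaphore 0 and appears
to work.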

So at this point I'm not sure what to do.  I could back out the back-patch
of 44cd47c1d49655c5, which would mean accepting that 9.2/9.3 are broken
and will never be fixed for HPPA, as well as any other architectures that
use the same fallback memory barrier implementation.  The lack of
complaints from the field suggests that nobody would care.  Or I could
push forward by back-patching daa7527afc227443 (and a couple of minor
follow-on cleanups).  That doesn't seem particularly risky, now that
9.4's been out for a while, but it's kind of a large back-patch to benefit
architectures that apparently no actual users care about.

Thoughts?

Independently of that, I think we should fix the --disable-spinlock
support so that a spinlock containing zero is rejected as not properly
initialized.  A large part of the mess here comes from the fact that HPPA
is our only architecture in which zero is not the correct initialization
state for a spinlock.  I'd like to have some more portable method of
catching failure to initialize a spinlock.  I only propose doing this
part in HEAD, though.
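
For the semaphore emulation, one way to get there might be to hand out
indexes starting at one, so that zero can be treated as "never
initialized".  A sketch only (the spinlock_sema() helper is hypothetical,
not something in the tree):

    void
    s_init_lock_sema(volatile slock_t *lock)
    {
        static int  counter = 0;

        /* valid values are 1..NUM_SPINLOCK_SEMAPHORES; zero stays invalid */
        *lock = ((counter++) % NUM_SPINLOCK_SEMAPHORES) + 1;
    }

    /* hypothetical helper: map a lock to its semaphore, rejecting zero */
    static PGSemaphore
    spinlock_sema(volatile slock_t *lock)
    {
        if (*lock <= 0 || *lock > NUM_SPINLOCK_SEMAPHORES)
            elog(ERROR, "uninitialized or corrupted spinlock");
        return &SpinlockSemaArray[*lock - 1];
    }

With that, a zeroed, never-initialized spinlock would fail the first time
anything tried to lock it.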
        regards, tom lane



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Tom Lane
I wrote:
> So at this point I'm not sure what to do.  I could back out the back-patch
> of 44cd47c1d49655c5, which would mean accepting that 9.2/9.3 are broken
> and will never be fixed for HPPA, as well as any other architectures that
> use the same fallback memory barrier implementation.  The lack of
> complaints from the field suggests that nobody would care.  Or I could
> push forward by back-patching daa7527afc227443 (and a couple of minor
> follow-on cleanups).  That doesn't seem particularly risky, now that
> 9.4's been out for a while, but it's kind of a large back-patch to benefit
> architectures that apparently no actual users care about.

I went ahead and prepared and tested such a patch; the version for 9.3
is attached.  (9.2 is identical modulo some pgindent-induced whitespace
difference.)  This doesn't look too hazardous to me, so I'm thinking
we should apply it.

            regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3808836..fc4abbc 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** typedef struct
*** 500,505 ****
--- 500,508 ----
      slock_t    *ShmemLock;
      VariableCache ShmemVariableCache;
      Backend    *ShmemBackendArray;
+ #ifndef HAVE_SPINLOCKS
+     PGSemaphore    SpinlockSemaArray;
+ #endif
      LWLock       *LWLockArray;
      slock_t    *ProcStructLock;
      PROC_HDR   *ProcGlobal;
*************** save_backend_variables(BackendParameters
*** 6050,6055 ****
--- 6053,6061 ----
      param->ShmemVariableCache = ShmemVariableCache;
      param->ShmemBackendArray = ShmemBackendArray;

+ #ifndef HAVE_SPINLOCKS
+     param->SpinlockSemaArray = SpinlockSemaArray;
+ #endif
      param->LWLockArray = LWLockArray;
      param->ProcStructLock = ProcStructLock;
      param->ProcGlobal = ProcGlobal;
*************** restore_backend_variables(BackendParamet
*** 6278,6283 ****
--- 6284,6292 ----
      ShmemVariableCache = param->ShmemVariableCache;
      ShmemBackendArray = param->ShmemBackendArray;

+ #ifndef HAVE_SPINLOCKS
+     SpinlockSemaArray = param->SpinlockSemaArray;
+ #endif
      LWLockArray = param->LWLockArray;
      ProcStructLock = param->ProcStructLock;
      ProcGlobal = param->ProcGlobal;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 918ac51..ec8e4f6 100644
*** a/src/backend/storage/ipc/ipci.c
--- b/src/backend/storage/ipc/ipci.c
*************** CreateSharedMemoryAndSemaphores(bool mak
*** 103,108 ****
--- 103,109 ----
           * need to be so careful during the actual allocation phase.
           */
          size = 100000;
+         size = add_size(size, SpinlockSemaSize());
          size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
                                                   sizeof(ShmemIndexEnt)));
          size = add_size(size, BufferShmemSize());
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 129d9f8..4194db6 100644
*** a/src/backend/storage/ipc/shmem.c
--- b/src/backend/storage/ipc/shmem.c
*************** InitShmemAllocation(void)
*** 116,124 ****
      Assert(shmhdr != NULL);

      /*
!      * Initialize the spinlock used by ShmemAlloc.  We have to do the space
!      * allocation the hard way, since obviously ShmemAlloc can't be called
!      * yet.
       */
      ShmemLock = (slock_t *) (((char *) shmhdr) + shmhdr->freeoffset);
      shmhdr->freeoffset += MAXALIGN(sizeof(slock_t));
--- 116,139 ----
      Assert(shmhdr != NULL);

      /*
!      * If spinlocks are disabled, initialize emulation layer.  We have to do
!      * the space allocation the hard way, since obviously ShmemAlloc can't be
!      * called yet.
!      */
! #ifndef HAVE_SPINLOCKS
!     {
!         PGSemaphore spinsemas;
!
!         spinsemas = (PGSemaphore) (((char *) shmhdr) + shmhdr->freeoffset);
!         shmhdr->freeoffset += MAXALIGN(SpinlockSemaSize());
!         SpinlockSemaInit(spinsemas);
!         Assert(shmhdr->freeoffset <= shmhdr->totalsize);
!     }
! #endif
!
!     /*
!      * Initialize the spinlock used by ShmemAlloc; we have to do this the hard
!      * way, too, for the same reasons as above.
       */
      ShmemLock = (slock_t *) (((char *) shmhdr) + shmhdr->freeoffset);
      shmhdr->freeoffset += MAXALIGN(sizeof(slock_t));
diff --git a/src/backend/storage/lmgr/spin.c b/src/backend/storage/lmgr/spin.c
index 2864790..874e313 100644
*** a/src/backend/storage/lmgr/spin.c
--- b/src/backend/storage/lmgr/spin.c
***************
*** 25,33 ****
--- 25,48 ----
  #include "miscadmin.h"
  #include "replication/walsender.h"
  #include "storage/lwlock.h"
+ #include "storage/pg_sema.h"
  #include "storage/spin.h"


+ #ifndef HAVE_SPINLOCKS
+ PGSemaphore SpinlockSemaArray;
+ #endif
+
+ /*
+  * Report the amount of shared memory needed to store semaphores for spinlock
+  * support.
+  */
+ Size
+ SpinlockSemaSize(void)
+ {
+     return SpinlockSemas() * sizeof(PGSemaphoreData);
+ }
+
  #ifdef HAVE_SPINLOCKS

  /*
*************** SpinlockSemas(void)
*** 51,71 ****
  int
  SpinlockSemas(void)
  {
!     int            nsemas;

!     /*
!      * It would be cleaner to distribute this logic into the affected modules,
!      * similar to the way shmem space estimation is handled.
!      *
!      * For now, though, there are few enough users of spinlocks that we just
!      * keep the knowledge here.
!      */
!     nsemas = NumLWLocks();        /* one for each lwlock */
!     nsemas += NBuffers;            /* one for each buffer header */
!     nsemas += max_wal_senders;    /* one for each wal sender process */
!     nsemas += 30;                /* plus a bunch for other small-scale use */

!     return nsemas;
  }

  /*
--- 66,85 ----
  int
  SpinlockSemas(void)
  {
!     return NUM_SPINLOCK_SEMAPHORES;
! }

! /*
!  * Initialize semaphores.
!  */
! extern void
! SpinlockSemaInit(PGSemaphore spinsemas)
! {
!     int            i;

!     for (i = 0; i < NUM_SPINLOCK_SEMAPHORES; ++i)
!         PGSemaphoreCreate(&spinsemas[i]);
!     SpinlockSemaArray = spinsemas;
  }

  /*
*************** SpinlockSemas(void)
*** 75,87 ****
  void
  s_init_lock_sema(volatile slock_t *lock)
  {
!     PGSemaphoreCreate((PGSemaphore) lock);
  }

  void
  s_unlock_sema(volatile slock_t *lock)
  {
!     PGSemaphoreUnlock((PGSemaphore) lock);
  }

  bool
--- 89,103 ----
  void
  s_init_lock_sema(volatile slock_t *lock)
  {
!     static int    counter = 0;
!
!     *lock = (++counter) % NUM_SPINLOCK_SEMAPHORES;
  }

  void
  s_unlock_sema(volatile slock_t *lock)
  {
!     PGSemaphoreUnlock(&SpinlockSemaArray[*lock]);
  }

  bool
*************** int
*** 96,102 ****
  tas_sema(volatile slock_t *lock)
  {
      /* Note that TAS macros return 0 if *success* */
!     return !PGSemaphoreTryLock((PGSemaphore) lock);
  }

  #endif   /* !HAVE_SPINLOCKS */
--- 112,118 ----
  tas_sema(volatile slock_t *lock)
  {
      /* Note that TAS macros return 0 if *success* */
!     return !PGSemaphoreTryLock(&SpinlockSemaArray[*lock]);
  }

  #endif   /* !HAVE_SPINLOCKS */
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 24c5069..74d7ba2 100644
*** a/src/include/pg_config_manual.h
--- b/src/include/pg_config_manual.h
***************
*** 57,62 ****
--- 57,70 ----
  #define NUM_USER_DEFINED_LWLOCKS    4

  /*
+  * When we don't have native spinlocks, we use semaphores to simulate them.
+  * Decreasing this value reduces consumption of OS resources; increasing it
+  * may improve performance, but supplying a real spinlock implementation is
+  * probably far better.
+  */
+ #define    NUM_SPINLOCK_SEMAPHORES        1024
+
+ /*
   * Define this if you want to allow the lo_import and lo_export SQL
   * functions to be executed by ordinary users.  By default these
   * functions are only available to the Postgres superuser.  CAUTION:
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index f869cb0..b2605f9 100644
*** a/src/include/storage/s_lock.h
--- b/src/include/storage/s_lock.h
***************
*** 94,104 ****
  #ifndef S_LOCK_H
  #define S_LOCK_H

- #include "storage/pg_sema.h"
-
  #ifdef HAVE_SPINLOCKS    /* skip spinlocks if requested */

-
  #if defined(__GNUC__) || defined(__INTEL_COMPILER)
  /*************************************************************************
   * All the gcc inlines
--- 94,101 ----
*************** spin_delay(void)
*** 1032,1038 ****
   * to fall foul of kernel limits on number of semaphores, so don't use this
   * unless you must!  The subroutines appear in spin.c.
   */
! typedef PGSemaphoreData slock_t;

  extern bool s_lock_free_sema(volatile slock_t *lock);
  extern void s_unlock_sema(volatile slock_t *lock);
--- 1029,1035 ----
   * to fall foul of kernel limits on number of semaphores, so don't use this
   * unless you must!  The subroutines appear in spin.c.
   */
! typedef int slock_t;

  extern bool s_lock_free_sema(volatile slock_t *lock);
  extern void s_unlock_sema(volatile slock_t *lock);
diff --git a/src/include/storage/spin.h b/src/include/storage/spin.h
index f459b90..b2b1339 100644
*** a/src/include/storage/spin.h
--- b/src/include/storage/spin.h
***************
*** 57,62 ****
--- 57,65 ----
  #define SPIN_H

  #include "storage/s_lock.h"
+ #ifndef HAVE_SPINLOCKS
+ #include "storage/pg_sema.h"
+ #endif


  #define SpinLockInit(lock)    S_INIT_LOCK(lock)
***************
*** 69,73 ****
--- 72,82 ----


  extern int    SpinlockSemas(void);
+ extern Size SpinlockSemaSize(void);
+
+ #ifndef HAVE_SPINLOCKS
+ extern void SpinlockSemaInit(PGSemaphore);
+ extern PGSemaphore SpinlockSemaArray;
+ #endif

  #endif   /* SPIN_H */

Re: Spinlocks and semaphores in 9.2 and 9.3

From: Andres Freund

On April 16, 2016 6:02:39 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> So at this point I'm not sure what to do.  I could back out the back-patch
>> of 44cd47c1d49655c5, which would mean accepting that 9.2/9.3 are broken
>> and will never be fixed for HPPA, as well as any other architectures that
>> use the same fallback memory barrier implementation.  The lack of
>> complaints from the field suggests that nobody would care.  Or I could
>> push forward by back-patching daa7527afc227443 (and a couple of minor
>> follow-on cleanups).  That doesn't seem particularly risky, now that
>> 9.4's been out for a while, but it's kind of a large back-patch to benefit
>> architectures that apparently no actual users care about.
>
> I went ahead and prepared and tested such a patch; the version for 9.3
> is attached.  (9.2 is identical modulo some pgindent-induced whitespace
> difference.)  This doesn't look too hazardous to me, so I'm thinking
> we should apply it.

I can't look at the patch just now, but the plan sounds good. If you'd
rather have somebody look at the patch before, I can do so tomorrow
morning.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Tom Lane
Andres Freund <andres@anarazel.de> writes:
> On April 16, 2016 6:02:39 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I went ahead and prepared and tested such a patch; the version for 9.3
>> is attached.  (9.2 is identical modulo some pgindent-induced whitespace
>> difference.)  This doesn't look too hazardous to me, so I'm thinking
>> we should apply it.

> I can't look at the patch just now, but the plan sounds good. If you'd
> rather have somebody look at the patch before, I can do so tomorrow
> morning.

Did you want to actually review this patch, or should I just push it?
        regards, tom lane



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Andres Freund
On 2016-04-18 11:07:08 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On April 16, 2016 6:02:39 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I went ahead and prepared and tested such a patch; the version for 9.3
> >> is attached.  (9.2 is identical modulo some pgindent-induced whitespace
> >> difference.)  This doesn't look too hazardous to me, so I'm thinking
> >> we should apply it.
> 
> > I can't look at the patch just now, but the plan sounds good. If you'd
> > rather have somebody look at the patch before, I can do so tomorrow
> > morning.
> 
> Did you want to actually review this patch, or should I just push it?

No, I'm good, you should push it. I did a quick scan of the patch, and
it looks sane. For a second I was concerned that there might be a
situation in which this patch increases the total number of semaphores
needed, which might make back-patching a bit problematic - but it appears
that that'd be a very absurd configuration.

Greetings,

Andres Freund



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Tom Lane
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-18 11:07:08 -0400, Tom Lane wrote:
>> Did you want to actually review this patch, or should I just push it?

> No, I'm good, you should push it. I did a quick scan of the patch, and
> it looks sane. For a second I was concerned that there might be a
> situation in which this patch increases the total number of semaphores
> needed, which might make back-patching a bit problematic - but it appears
> that that'd be a very absurd configuration.

I was actually wondering whether it'd make sense to cut the number of
underlying semaphores to 128 or 512 or thereabouts.  But I think we had
that discussion when the daa7527afc227443 patch went in to begin with,
and convinced ourselves that 1024 was okay.  Robert, do you recall the
reasoning?

(The thing that gave me pause about this was noticing that I could not
start two such postmasters concurrently on my RHEL6 box, without changing
the default system limits on number of SysV semaphores.)
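
Back-of-the-envelope, assuming sysv_sema.c's sixteen-semaphores-per-set
allocation: 1024 emulation semas plus the usual ~100 comes to about 71
SysV semaphore sets per postmaster, so two postmasters want some 142 sets
against a stock SEMMNI of 128 on that box.  (Exactly which kernel.sem
field trips first will vary, but the margin is clearly thin.)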
        regards, tom lane



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Andres Freund
On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
> (The thing that gave me pause about this was noticing that I could not
> start two such postmasters concurrently on my RHEL6 box, without changing
> the default system limits on number of SysV semaphores.)

Hm, is that different before/after the patch? Because afaics at around
NBuffers = 1000, the number of semaphores should be lower after the
patch?
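
(If I'm reading the old SpinlockSemas() accounting right - NumLWLocks() +
NBuffers + max_wal_senders + 30, with two lwlocks per buffer - then at
NBuffers = 1000 the old scheme already wants 3000-odd semaphores versus a
flat 1024 afterwards; break-even is somewhere around 300 buffers.)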

Andres



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Tom Lane
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
>> (The thing that gave me pause about this was noticing that I could not
>> start two such postmasters concurrently on my RHEL6 box, without changing
>> the default system limits on number of SysV semaphores.)

> Hm, is that different before/after the patch? Because afaics at around
> NBuffers = 1000, the number of semaphores should be lower after the
> patch?

Yeah, that was probably the argument we used before.  But it's true that a
postmaster built with --disable-spinlocks is harder to test than one built
without (because you're going from ~100 semas to ~1100 semas, at default
configuration).  If we believe that the main real use-case for this switch
is testing, that's not a very nice property.

The bottom line IMO is that --disable-spinlocks is actually not that awful
for low-stress, low-concurrency scenarios; for example, it doesn't change
the runtime of make installcheck-parallel very much for me.  On the other
hand, for high-concurrency scenarios you'd darn well better get a real
spinlock implementation.  Given that sort of scope of use, it seems like
a hundred or so underlying semas should be plenty, and it'd be less likely
to cause operational problems than 1024.
        regards, tom lane



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Robert Haas
On Mon, Apr 18, 2016 at 11:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2016-04-18 11:07:08 -0400, Tom Lane wrote:
>>> Did you want to actually review this patch, or should I just push it?
>
>> No, I'm good, you should push it. I did a quick scan of the patch, and
>> it looks sane. For a second I was concerned that there might be a
>> situation in which this patch increases the total number of semaphores
>> needed, which might make back-patching a bit problematic - but it appears
>> that that'd be a very absurd configuration.
>
> I was actually wondering whether it'd make sense to cut the number of
> underlying semaphores to 128 or 512 or thereabouts.  But I think we had
> that discussion when the daa7527afc227443 patch went in to begin with,
> and convinced ourselves that 1024 was okay.  Robert, do you recall the
> reasoning?

I don't recall a specific discussion about it, but the number we would
have needed before was gigantic - one per lwlock, which is typically
going to be far more than 1024.  I mean, at shared_buffers=32MB, there
are 4096 buffers using two lwlocks each... never mind everything else
that uses lwlocks.
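
Spelled out, with 8kB pages: 32MB is 4096 buffers, times two lwlocks each
is 8192 semaphores for the buffer lwlocks alone, plus (per the old
SpinlockSemas() accounting) another 4096 for the buffer-header spinlocks.
So even that near-minimal configuration needed north of 12000 semaphores.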

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Robert Haas
On Mon, Apr 18, 2016 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
>>> (The thing that gave me pause about this was noticing that I could not
>>> start two such postmasters concurrently on my RHEL6 box, without changing
>>> the default system limits on number of SysV semaphores.)
>
>> Hm, is that different before/after the patch? Because afaics at around
>> NBuffers = 1000, the number of semaphores should be lower after the
>> patch?
>
> Yeah, that was probably the argument we used before.  But it's true that a
> postmaster built with --disable-spinlocks is harder to test than one built
> without (because you're going from ~100 semas to ~1100 semas, at default
> configuration).  If we believe that the main real use-case for this switch
> is testing, that's not a very nice property.
>
> The bottom line IMO is that --disable-spinlocks is actually not that awful
> for low-stress, low-concurrency scenarios; for example, it doesn't change
> the runtime of make installcheck-parallel very much for me.  On the other
> hand, for high-concurrency scenarios you'd darn well better get a real
> spinlock implementation.  Given that sort of scope of use, it seems like
> a hundred or so underlying semas should be plenty, and it'd be less likely
> to cause operational problems than 1024.

I suppose that's probably true.  I thought surely any system worth its
salt wouldn't have a problem with 1024 semaphores, but a quick Google
search for "hp-ux semmns" turned up this page:


-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Spinlocks and semaphores in 9.2 and 9.3

From: Robert Haas
On Mon, Apr 18, 2016 at 11:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 18, 2016 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
>>>> (The thing that gave me pause about this was noticing that I could not
>>>> start two such postmasters concurrently on my RHEL6 box, without changing
>>>> the default system limits on number of SysV semaphores.)
>>
>>> Hm, is that different before/after the patch? Because afaics at around
>>> NBuffers = 1000, the number of semaphores should be lower after the
>>> patch?
>>
>> Yeah, that was probably the argument we used before.  But it's true that a
>> postmaster built with --disable-spinlocks is harder to test than one built
>> without (because you're going from ~100 semas to ~1100 semas, at default
>> configuration).  If we believe that the main real use-case for this switch
>> is testing, that's not a very nice property.
>>
>> The bottom line IMO is that --disable-spinlocks is actually not that awful
>> for low-stress, low-concurrency scenarios; for example, it doesn't change
>> the runtime of make installcheck-parallel very much for me.  On the other
>> hand, for high-concurrency scenarios you'd darn well better get a real
>> spinlock implementation.  Given that sort of scope of use, it seems like
>> a hundred or so underlying semas should be plenty, and it'd be less likely
>> to cause operational problems than 1024.
>
> I suppose that's probably true.  I thought surely any system worth its
> salt wouldn't have a problem with 1024 semaphores, but a quick Google
> search for "hp-ux semmns" turned up this page:

Oops, hit send too soon.

http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/semmni.5.html

That's from 2007 and indicates a default maximum of 2048.  So it would
be fairly easy to hit the limit.
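
At the current default, just two clusters need roughly 2 * (1024 + ~100),
call it 2250 semaphores, which blows straight past a 2048 cap.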

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company