Thread: Spinlocks and semaphores in 9.2 and 9.3
This rabbit hole keeps getting deeper and deeper :-(

I realized a couple days ago that it had been close to three years since
I last tried building the further-back branches on my ancient HPPA box.
Not so surprisingly, things were broken: commits 37de8de9e33606a0 et al
had introduced use of memory barriers into 9.2 and 9.3, which up to that
moment had had none, so we'd never bothered to make barrier.h actually
work in those branches.  I back-patched some fixes in barrier.h, which
got things to compile, and then commit 44cd47c1d49655c5, which got things
to actually work ... on that box anyway.

I now see from the buildfarm (and can confirm locally) that
44cd47c1d49655c5 breaks builds with --disable-spinlocks, because it
introduces an initialization of a spinlock before the underlying
PGSemaphore infrastructure can possibly get initialized.  In 9.4 and up,
this works because of daa7527afc227443, which decoupled spinlocks from
semaphores enough that you can do s_init_lock_sema() long before you can
actually use the spinlock.  Come to think of it, that commit also means
that you can get away with using a spinlock you've never initialized at
all, which is not so good.

So at this point I'm not sure what to do.  I could back out the back-patch
of 44cd47c1d49655c5, which would mean accepting that 9.2/9.3 are broken
and will never be fixed for HPPA, as well as any other architectures that
use the same fallback memory barrier implementation.  The lack of
complaints from the field suggests that nobody would care.  Or I could
push forward by back-patching daa7527afc227443 (and a couple of minor
follow-on cleanups).  That doesn't seem particularly risky, now that
9.4's been out for awhile, but it's kind of a large back-patch to benefit
architectures that apparently no actual users care about.

Thoughts?

Independently of that, I think we should fix the --disable-spinlocks
support so that a spinlock containing zero is rejected as not properly
initialized.  A large part of the mess here comes from the fact that HPPA
is our only architecture in which zero is not the correct initialization
state for a spinlock; I'd like to have some more portable method of
catching failure to initialize a spinlock.  I only propose doing this
part in HEAD, though.

			regards, tom lane
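For concreteness, here is a minimal sketch of that zero-rejection idea,
expressed in terms of the semaphore-emulation scheme daa7527afc227443
introduced: hand out semaphore numbers starting at 1, so that a zeroed,
never-initialized lock becomes detectable, and range-check on every use.
The +1 offset, the helper name, and the elog wording are illustrative,
not taken from any committed patch:

void
s_init_lock_sema(volatile slock_t *lock)
{
	static int	counter = 0;

	/* assign 1..NUM_SPINLOCK_SEMAPHORES; 0 now means "never initialized" */
	*lock = ((++counter) % NUM_SPINLOCK_SEMAPHORES) + 1;
}

static PGSemaphore
spinlock_sema(volatile slock_t *lock)
{
	int			lockndx = *lock;

	/* a zeroed (or garbage) spinlock fails loudly instead of silently
	 * mapping to semaphore 0 */
	if (lockndx <= 0 || lockndx > NUM_SPINLOCK_SEMAPHORES)
		elog(ERROR, "invalid spinlock number: %d", lockndx);
	return &SpinlockSemaArray[lockndx - 1];
}

void
s_unlock_sema(volatile slock_t *lock)
{
	PGSemaphoreUnlock(spinlock_sema(lock));
}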
I wrote:
> So at this point I'm not sure what to do.  I could back out the back-patch
> of 44cd47c1d49655c5, which would mean accepting that 9.2/9.3 are broken
> and will never be fixed for HPPA, as well as any other architectures that
> use the same fallback memory barrier implementation.  The lack of
> complaints from the field suggests that nobody would care.  Or I could
> push forward by back-patching daa7527afc227443 (and a couple of minor
> follow-on cleanups).  That doesn't seem particularly risky, now that
> 9.4's been out for awhile, but it's kind of a large back-patch to benefit
> architectures that apparently no actual users care about.

I went ahead and prepared and tested such a patch; the version for 9.3
is attached.  (9.2 is identical modulo some pgindent-induced whitespace
difference.)  This doesn't look too hazardous to me, so I'm thinking
we should apply it.

			regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 3808836..fc4abbc 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** typedef struct
*** 500,505 ****
--- 500,508 ----
  slock_t    *ShmemLock;
  VariableCache ShmemVariableCache;
  Backend    *ShmemBackendArray;
+ #ifndef HAVE_SPINLOCKS
+ PGSemaphore SpinlockSemaArray;
+ #endif
  LWLock	   *LWLockArray;
  slock_t    *ProcStructLock;
  PROC_HDR   *ProcGlobal;
*************** save_backend_variables(BackendParameters
*** 6050,6055 ****
--- 6053,6061 ----
  	param->ShmemVariableCache = ShmemVariableCache;
  	param->ShmemBackendArray = ShmemBackendArray;
+ #ifndef HAVE_SPINLOCKS
+ 	param->SpinlockSemaArray = SpinlockSemaArray;
+ #endif
  	param->LWLockArray = LWLockArray;
  	param->ProcStructLock = ProcStructLock;
  	param->ProcGlobal = ProcGlobal;
*************** restore_backend_variables(BackendParamet
*** 6278,6283 ****
--- 6284,6292 ----
  	ShmemVariableCache = param->ShmemVariableCache;
  	ShmemBackendArray = param->ShmemBackendArray;
+ #ifndef HAVE_SPINLOCKS
+ 	SpinlockSemaArray = param->SpinlockSemaArray;
+ #endif
  	LWLockArray = param->LWLockArray;
  	ProcStructLock = param->ProcStructLock;
  	ProcGlobal = param->ProcGlobal;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 918ac51..ec8e4f6 100644
*** a/src/backend/storage/ipc/ipci.c
--- b/src/backend/storage/ipc/ipci.c
*************** CreateSharedMemoryAndSemaphores(bool mak
*** 103,108 ****
--- 103,109 ----
  		 * need to be so careful during the actual allocation phase.
  		 */
  		size = 100000;
+ 		size = add_size(size, SpinlockSemaSize());
  		size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
  												 sizeof(ShmemIndexEnt)));
  		size = add_size(size, BufferShmemSize());
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 129d9f8..4194db6 100644
*** a/src/backend/storage/ipc/shmem.c
--- b/src/backend/storage/ipc/shmem.c
*************** InitShmemAllocation(void)
*** 116,124 ****
  	Assert(shmhdr != NULL);
  
  	/*
! 	 * Initialize the spinlock used by ShmemAlloc.  We have to do the space
! 	 * allocation the hard way, since obviously ShmemAlloc can't be called
! 	 * yet.
  	 */
  	ShmemLock = (slock_t *) (((char *) shmhdr) + shmhdr->freeoffset);
  	shmhdr->freeoffset += MAXALIGN(sizeof(slock_t));
--- 116,139 ----
  	Assert(shmhdr != NULL);
  
  	/*
! 	 * If spinlocks are disabled, initialize emulation layer.  We have to do
! 	 * the space allocation the hard way, since obviously ShmemAlloc can't be
! 	 * called yet.
! 	 */
! #ifndef HAVE_SPINLOCKS
! 	{
! 		PGSemaphore spinsemas;
! 
! 		spinsemas = (PGSemaphore) (((char *) shmhdr) + shmhdr->freeoffset);
! 		shmhdr->freeoffset += MAXALIGN(SpinlockSemaSize());
! 		SpinlockSemaInit(spinsemas);
! 		Assert(shmhdr->freeoffset <= shmhdr->totalsize);
! 	}
! #endif
! 
! 	/*
! 	 * Initialize the spinlock used by ShmemAlloc; we have to do this the hard
! 	 * way, too, for the same reasons as above.
  	 */
  	ShmemLock = (slock_t *) (((char *) shmhdr) + shmhdr->freeoffset);
  	shmhdr->freeoffset += MAXALIGN(sizeof(slock_t));
diff --git a/src/backend/storage/lmgr/spin.c b/src/backend/storage/lmgr/spin.c
index 2864790..874e313 100644
*** a/src/backend/storage/lmgr/spin.c
--- b/src/backend/storage/lmgr/spin.c
***************
*** 25,33 ****
--- 25,48 ----
  #include "miscadmin.h"
  #include "replication/walsender.h"
  #include "storage/lwlock.h"
+ #include "storage/pg_sema.h"
  #include "storage/spin.h"
  
+ #ifndef HAVE_SPINLOCKS
+ PGSemaphore SpinlockSemaArray;
+ #endif
+ 
+ /*
+  * Report the amount of shared memory needed to store semaphores for spinlock
+  * support.
+  */
+ Size
+ SpinlockSemaSize(void)
+ {
+ 	return SpinlockSemas() * sizeof(PGSemaphoreData);
+ }
+ 
  #ifdef HAVE_SPINLOCKS
  
  /*
*************** SpinlockSemas(void)
*** 51,71 ****
  int
  SpinlockSemas(void)
  {
! 	int			nsemas;
  
! 	/*
! 	 * It would be cleaner to distribute this logic into the affected modules,
! 	 * similar to the way shmem space estimation is handled.
! 	 *
! 	 * For now, though, there are few enough users of spinlocks that we just
! 	 * keep the knowledge here.
! 	 */
! 	nsemas = NumLWLocks();		/* one for each lwlock */
! 	nsemas += NBuffers;			/* one for each buffer header */
! 	nsemas += max_wal_senders;	/* one for each wal sender process */
! 	nsemas += 30;				/* plus a bunch for other small-scale use */
  
! 	return nsemas;
  }
  
  /*
--- 66,85 ----
  int
  SpinlockSemas(void)
  {
! 	return NUM_SPINLOCK_SEMAPHORES;
! }
  
! /*
!  * Initialize semaphores.
!  */
! extern void
! SpinlockSemaInit(PGSemaphore spinsemas)
! {
! 	int			i;
  
! 	for (i = 0; i < NUM_SPINLOCK_SEMAPHORES; ++i)
! 		PGSemaphoreCreate(&spinsemas[i]);
! 	SpinlockSemaArray = spinsemas;
  }
  
  /*
*************** SpinlockSemas(void)
*** 75,87 ****
  void
  s_init_lock_sema(volatile slock_t *lock)
  {
! 	PGSemaphoreCreate((PGSemaphore) lock);
  }
  
  void
  s_unlock_sema(volatile slock_t *lock)
  {
! 	PGSemaphoreUnlock((PGSemaphore) lock);
  }
  
  bool
--- 89,103 ----
  void
  s_init_lock_sema(volatile slock_t *lock)
  {
! 	static int	counter = 0;
! 
! 	*lock = (++counter) % NUM_SPINLOCK_SEMAPHORES;
  }
  
  void
  s_unlock_sema(volatile slock_t *lock)
  {
! 	PGSemaphoreUnlock(&SpinlockSemaArray[*lock]);
  }
  
  bool
*************** int
*** 96,102 ****
  tas_sema(volatile slock_t *lock)
  {
  	/* Note that TAS macros return 0 if *success* */
! 	return !PGSemaphoreTryLock((PGSemaphore) lock);
  }
  
  #endif   /* !HAVE_SPINLOCKS */
--- 112,118 ----
  tas_sema(volatile slock_t *lock)
  {
  	/* Note that TAS macros return 0 if *success* */
! 	return !PGSemaphoreTryLock(&SpinlockSemaArray[*lock]);
  }
  
  #endif   /* !HAVE_SPINLOCKS */
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index 24c5069..74d7ba2 100644
*** a/src/include/pg_config_manual.h
--- b/src/include/pg_config_manual.h
***************
*** 57,62 ****
--- 57,70 ----
  #define NUM_USER_DEFINED_LWLOCKS	4
  
  /*
+  * When we don't have native spinlocks, we use semaphores to simulate them.
+  * Decreasing this value reduces consumption of OS resources; increasing it
+  * may improve performance, but supplying a real spinlock implementation is
+  * probably far better.
+  */
+ #define NUM_SPINLOCK_SEMAPHORES		1024
+ 
+ /*
   * Define this if you want to allow the lo_import and lo_export SQL
   * functions to be executed by ordinary users.  By default these
   * functions are only available to the Postgres superuser.  CAUTION:
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index f869cb0..b2605f9 100644
*** a/src/include/storage/s_lock.h
--- b/src/include/storage/s_lock.h
***************
*** 94,104 ****
  #ifndef S_LOCK_H
  #define S_LOCK_H
  
- #include "storage/pg_sema.h"
- 
  #ifdef HAVE_SPINLOCKS	/* skip spinlocks if requested */
  
- 
  #if defined(__GNUC__) || defined(__INTEL_COMPILER)
  /*************************************************************************
   * All the gcc inlines
--- 94,101 ----
*************** spin_delay(void)
*** 1032,1038 ****
   * to fall foul of kernel limits on number of semaphores, so don't use this
   * unless you must!  The subroutines appear in spin.c.
   */
! typedef PGSemaphoreData slock_t;
  
  extern bool s_lock_free_sema(volatile slock_t *lock);
  extern void s_unlock_sema(volatile slock_t *lock);
--- 1029,1035 ----
   * to fall foul of kernel limits on number of semaphores, so don't use this
   * unless you must!  The subroutines appear in spin.c.
   */
! typedef int slock_t;
  
  extern bool s_lock_free_sema(volatile slock_t *lock);
  extern void s_unlock_sema(volatile slock_t *lock);
diff --git a/src/include/storage/spin.h b/src/include/storage/spin.h
index f459b90..b2b1339 100644
*** a/src/include/storage/spin.h
--- b/src/include/storage/spin.h
***************
*** 57,62 ****
--- 57,65 ----
  #define SPIN_H
  
  #include "storage/s_lock.h"
+ #ifndef HAVE_SPINLOCKS
+ #include "storage/pg_sema.h"
+ #endif
  
  #define SpinLockInit(lock)	S_INIT_LOCK(lock)
***************
*** 69,73 ****
--- 72,82 ----
  
  extern int	SpinlockSemas(void);
+ extern Size SpinlockSemaSize(void);
+ 
+ #ifndef HAVE_SPINLOCKS
+ extern void SpinlockSemaInit(PGSemaphore);
+ extern PGSemaphore SpinlockSemaArray;
+ #endif
  
  #endif	 /* SPIN_H */
On April 16, 2016 6:02:39 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> So at this point I'm not sure what to do.  I could back out the back-patch
>> of 44cd47c1d49655c5, which would mean accepting that 9.2/9.3 are broken
>> and will never be fixed for HPPA, as well as any other architectures that
>> use the same fallback memory barrier implementation.  The lack of
>> complaints from the field suggests that nobody would care.  Or I could
>> push forward by back-patching daa7527afc227443 (and a couple of minor
>> follow-on cleanups).  That doesn't seem particularly risky, now that
>> 9.4's been out for awhile, but it's kind of a large back-patch to benefit
>> architectures that apparently no actual users care about.
>
> I went ahead and prepared and tested such a patch; the version for 9.3
> is attached.  (9.2 is identical modulo some pgindent-induced whitespace
> difference.)  This doesn't look too hazardous to me, so I'm thinking
> we should apply it.

I can't look at the patch just now, but the plan sounds good.  If you'd
rather have somebody look at the patch before, I can do so tomorrow
morning.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres Freund <andres@anarazel.de> writes:
> On April 16, 2016 6:02:39 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I went ahead and prepared and tested such a patch; the version for 9.3
>> is attached.  (9.2 is identical modulo some pgindent-induced whitespace
>> difference.)  This doesn't look too hazardous to me, so I'm thinking
>> we should apply it.

> I can't look at the patch just now, but the plan sounds good.  If you'd
> rather have somebody look at the patch before, I can do so tomorrow
> morning.

Did you want to actually review this patch, or should I just push it?

			regards, tom lane
On 2016-04-18 11:07:08 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On April 16, 2016 6:02:39 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I went ahead and prepared and tested such a patch; the version for 9.3
>>> is attached.  (9.2 is identical modulo some pgindent-induced whitespace
>>> difference.)  This doesn't look too hazardous to me, so I'm thinking
>>> we should apply it.
>
>> I can't look at the patch just now, but the plan sounds good.  If you'd
>> rather have somebody look at the patch before, I can do so tomorrow
>> morning.
>
> Did you want to actually review this patch, or should I just push it?

No, I'm good, you should push it.  I did a quick scan of the patch, and
it looks sane.  For a second I was concerned that there might be a
situation in which this patch increases the total number of semaphores
needed, which might make backpatching a bit problematic - but it appears
that that'd be a very absurd configuration.

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-18 11:07:08 -0400, Tom Lane wrote:
>> Did you want to actually review this patch, or should I just push it?

> No, I'm good, you should push it.  I did a quick scan of the patch, and
> it looks sane.  For a second I was concerned that there might be a
> situation in which this patch increases the total number of semaphores
> needed, which might make backpatching a bit problematic - but it appears
> that that'd be a very absurd configuration.

I was actually wondering whether it'd make sense to cut the number of
underlying semaphores to 128 or 512 or thereabouts.  But I think we had
that discussion when the daa7527afc227443 patch went in to begin with,
and convinced ourselves that 1024 was okay.  Robert, do you recall the
reasoning?

(The thing that gave me pause about this was noticing that I could not
start two such postmasters concurrently on my RHEL6 box, without changing
the default system limits on number of SysV semaphores.)

			regards, tom lane
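A quick way to see where a given box's SysV limits actually bite is a
standalone probe along the following lines (illustrative only, not from
any patch in this thread; the set size of 16 mirrors PostgreSQL's
SEMAS_PER_SET allocation unit, and on Linux the governing values are the
SEMMNS/SEMMNI fields of kernel.sem):

/*
 * probe_semas.c -- grab SysV semaphore sets until the kernel refuses,
 * report how many we got, then release them all.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define SEMAS_PER_SET	16
#define MAX_SETS		4096

int
main(void)
{
	int			semids[MAX_SETS];
	int			nsets;

	for (nsets = 0; nsets < MAX_SETS; nsets++)
	{
		/* IPC_PRIVATE always creates a new set, until SEMMNI/SEMMNS run out */
		semids[nsets] = semget(IPC_PRIVATE, SEMAS_PER_SET, IPC_CREAT | 0600);
		if (semids[nsets] < 0)
			break;
	}

	printf("got %d sets (%d semaphores) before hitting a limit\n",
		   nsets, nsets * SEMAS_PER_SET);

	/* release everything we grabbed */
	while (--nsets >= 0)
		semctl(semids[nsets], 0, IPC_RMID);

	return 0;
}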
On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
> (The thing that gave me pause about this was noticing that I could not
> start two such postmasters concurrently on my RHEL6 box, without changing
> the default system limits on number of SysV semaphores.)

Hm, is that different before/after the patch?  Because afaics at around
NBuffers = 1000, the number of semaphores should be lower after the
patch?

Andres
Andres Freund <andres@anarazel.de> writes:
> On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
>> (The thing that gave me pause about this was noticing that I could not
>> start two such postmasters concurrently on my RHEL6 box, without changing
>> the default system limits on number of SysV semaphores.)

> Hm, is that different before/after the patch?  Because afaics at around
> NBuffers = 1000, the number of semaphores should be lower after the
> patch?

Yeah, that was probably the argument we used before.  But it's true that a
postmaster built with --disable-spinlocks is harder to test than one built
without (because you're going from ~100 semas to ~1100 semas, at default
configuration).  If we believe that the main real use-case for this switch
is testing, that's not a very nice property.

The bottom line IMO is that --disable-spinlocks is actually not that awful
for low-stress, low-concurrency scenarios; for example, it doesn't change
the runtime of make installcheck-parallel very much for me.  On the other
hand, for high-concurrency scenarios you'd darn well better get a real
spinlock implementation.  Given that sort of scope of use, it seems like
a hundred or so underlying semas should be plenty, and it'd be less likely
to cause operational problems than 1024.

			regards, tom lane
On Mon, Apr 18, 2016 at 11:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2016-04-18 11:07:08 -0400, Tom Lane wrote:
>>> Did you want to actually review this patch, or should I just push it?
>
>> No, I'm good, you should push it.  I did a quick scan of the patch, and
>> it looks sane.  For a second I was concerned that there might be a
>> situation in which this patch increases the total number of semaphores
>> needed, which might make backpatching a bit problematic - but it appears
>> that that'd be a very absurd configuration.
>
> I was actually wondering whether it'd make sense to cut the number of
> underlying semaphores to 128 or 512 or thereabouts.  But I think we had
> that discussion when the daa7527afc227443 patch went in to begin with,
> and convinced ourselves that 1024 was okay.  Robert, do you recall the
> reasoning?

I don't recall a specific discussion about it, but the number we would
have needed before was gigantic - one per lwlock, which is typically
going to be far more than 1024.  I mean, at shared_buffers=32MB, there
are 4096 buffers using two lwlocks each... never mind everything else
that uses lwlocks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
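To put rough numbers on that, here is a back-of-the-envelope using the
old SpinlockSemas() formula quoted in the patch upthread; the allowance
for fixed/predefined lwlocks is an approximation, and exact counts
varied by release:

/* Pre-daa7527 semaphore demand at shared_buffers = 32MB (8kB pages) */
int		nbuffers   = (32 * 1024) / 8;		/* = 4096 buffers */
int		nlwlocks   = 2 * nbuffers + 100;	/* content + io lock per buffer,
											 * plus ~100 fixed (approximate) */
int		nsemas_old = nlwlocks + nbuffers + 0 + 30;	/* max_wal_senders
											 * defaults to 0; total ≈ 12400,
											 * vs. a flat 1024 afterwards */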
On Mon, Apr 18, 2016 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
>>> (The thing that gave me pause about this was noticing that I could not
>>> start two such postmasters concurrently on my RHEL6 box, without changing
>>> the default system limits on number of SysV semaphores.)
>
>> Hm, is that different before/after the patch?  Because afaics at around
>> NBuffers = 1000, the number of semaphores should be lower after the
>> patch?
>
> Yeah, that was probably the argument we used before.  But it's true that a
> postmaster built with --disable-spinlocks is harder to test than one built
> without (because you're going from ~100 semas to ~1100 semas, at default
> configuration).  If we believe that the main real use-case for this switch
> is testing, that's not a very nice property.
>
> The bottom line IMO is that --disable-spinlocks is actually not that awful
> for low-stress, low-concurrency scenarios; for example, it doesn't change
> the runtime of make installcheck-parallel very much for me.  On the other
> hand, for high-concurrency scenarios you'd darn well better get a real
> spinlock implementation.  Given that sort of scope of use, it seems like
> a hundred or so underlying semas should be plenty, and it'd be less likely
> to cause operational problems than 1024.

I suppose that's probably true.  I thought surely any system worth its
salt wouldn't have a problem with 1024 semaphores, but a quick Google
search for "hp-ux semmns" turned up this page:

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 18, 2016 at 11:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 18, 2016 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> On 2016-04-18 11:24:07 -0400, Tom Lane wrote:
>>>> (The thing that gave me pause about this was noticing that I could not
>>>> start two such postmasters concurrently on my RHEL6 box, without
>>>> changing the default system limits on number of SysV semaphores.)
>>
>>> Hm, is that different before/after the patch?  Because afaics at around
>>> NBuffers = 1000, the number of semaphores should be lower after the
>>> patch?
>>
>> Yeah, that was probably the argument we used before.  But it's true that a
>> postmaster built with --disable-spinlocks is harder to test than one built
>> without (because you're going from ~100 semas to ~1100 semas, at default
>> configuration).  If we believe that the main real use-case for this switch
>> is testing, that's not a very nice property.
>>
>> The bottom line IMO is that --disable-spinlocks is actually not that awful
>> for low-stress, low-concurrency scenarios; for example, it doesn't change
>> the runtime of make installcheck-parallel very much for me.  On the other
>> hand, for high-concurrency scenarios you'd darn well better get a real
>> spinlock implementation.  Given that sort of scope of use, it seems like
>> a hundred or so underlying semas should be plenty, and it'd be less likely
>> to cause operational problems than 1024.
>
> I suppose that's probably true.  I thought surely any system worth its
> salt wouldn't have a problem with 1024 semaphores, but a quick Google
> search for "hp-ux semmns" turned up this page:

Oops, hit send too soon.

http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/semmni.5.html

That's from 2007 and indicates a default maximum of 2048.  So it would
be fairly easy to hit the limit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company