Thread: Could we replace SysV semaphores with latches?
There has been regular griping in this list about our dependence on SysV
shared memory, but not so much about SysV semaphores, even though the
latter cause their fair share of issues; as seen for example in buildfarm
member spoonbill's recent string of failures:

creating template1 database in /home/pgbuild/pgbuildfarm/HEAD/pgsql.25563/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not create semaphores: No space left on device
DETAIL:  Failed system call was semget(1, 17, 03600).
HINT:  This error does *not* mean that you have run out of disk space.  It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter.
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.
child process exited with exit code 1

It strikes me that we have recently put together an independent but just
about equivalent waiting mechanism in the form of latches.  And not only
that, but there's already a latch for each process.  Could we replace
our usages of SysV semaphores with WaitLatch on the procLatch?  Unlike
the situation with shared memory where we need some secondary features
(mumble shm_nattch mumble), I think we aren't really using anything
interesting about SysV semaphores except for the raw ability to wait for
somebody to signal us.

            regards, tom lane
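For illustration, here is a minimal sketch of the kind of substitution being
proposed, assuming the latch API as it stands in this era (WaitLatch,
ResetLatch, SetLatch, and the per-process MyProc->procLatch).
LatchSemaphoreWait, LatchSemaphoreWake, and wait_condition_satisfied are
made-up names for illustration, not actual PostgreSQL functions:

#include "postgres.h"

#include "storage/latch.h"
#include "storage/proc.h"

/*
 * Hypothetical waiter side, roughly what PGSemaphoreLock() gives us
 * today.  A latch carries no count the way a semaphore does, so the
 * real wait condition has to be re-checked after every wakeup;
 * wait_condition_satisfied() stands in for that check.
 */
static void
LatchSemaphoreWait(void)
{
    for (;;)
    {
        /* Clear the latch before checking, so a SetLatch that arrives
         * after the check but before the sleep is not lost. */
        ResetLatch(&MyProc->procLatch);

        if (wait_condition_satisfied())
            break;

        WaitLatch(&MyProc->procLatch, WL_LATCH_SET, -1L);
    }
}

/* Hypothetical waker side, roughly what PGSemaphoreUnlock() gives us. */
static void
LatchSemaphoreWake(PGPROC *proc)
{
    SetLatch(&proc->procLatch);
}

The reset-check-wait ordering is the important design point: because
wakeups coalesce rather than count, the shared state, not the latch,
has to say whether the waiter may proceed.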
On 07.06.2012 07:09, Tom Lane wrote:
> There has been regular griping in this list about our dependence on SysV
> shared memory, but not so much about SysV semaphores, even though the
> latter cause their fair share of issues; as seen for example in
> buildfarm member spoonbill's recent string of failures:
>
> creating template1 database in /home/pgbuild/pgbuildfarm/HEAD/pgsql.25563/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not create semaphores: No space left on device
> DETAIL:  Failed system call was semget(1, 17, 03600).
> HINT:  This error does *not* mean that you have run out of disk space.  It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter.
> The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.
> child process exited with exit code 1
>
> It strikes me that we have recently put together an independent but just
> about equivalent waiting mechanism in the form of latches.  And not only
> that, but there's already a latch for each process.  Could we replace
> our usages of SysV semaphores with WaitLatch on the procLatch?  Unlike
> the situation with shared memory where we need some secondary features
> (mumble shm_nattch mumble), I think we aren't really using anything
> interesting about SysV semaphores except for the raw ability to wait for
> somebody to signal us.

Would have to performance test that carefully.  We use semaphores in
lwlocks, so it's performance critical.  A latch might well be slower,
especially on platforms where a signal does not interrupt sleep, and we
rely on the signal handler and self-pipe to wake up.

Although perhaps we could improve the latch implementation.  pselect()
might perform better than the self-pipe trick, on platforms where it
works.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
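For readers unfamiliar with the self-pipe trick Heikki refers to, here is a
minimal standalone sketch (not taken from unix_latch.c; all names are made
up for illustration): the signal handler writes a byte into a non-blocking
pipe, and the waiting process select()s on the pipe's read end, so a signal
that arrives before the blocking call simply leaves the pipe readable and
the wakeup is not lost.

#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int  selfpipe[2];        /* [0] = read end, [1] = write end */

/* Signal handler: async-signal-safe, just makes the pipe readable. */
static void
wakeup_handler(int signo)
{
    char        dummy = 0;

    (void) write(selfpipe[1], &dummy, 1);
}

/* One-time setup: non-blocking pipe plus the signal handler. */
static void
selfpipe_init(void)
{
    if (pipe(selfpipe) < 0)
        _exit(1);
    fcntl(selfpipe[0], F_SETFL, O_NONBLOCK);
    fcntl(selfpipe[1], F_SETFL, O_NONBLOCK);
    signal(SIGUSR1, wakeup_handler);
}

/* Block until a wakeup signal arrives, or has already arrived. */
static void
wait_for_wakeup(void)
{
    fd_set      readfds;
    char        buf[16];

    FD_ZERO(&readfds);
    FD_SET(selfpipe[0], &readfds);

    /*
     * If the signal fired before we got here, the pipe is already
     * readable and select() returns immediately: no lost wakeup.
     */
    (void) select(selfpipe[0] + 1, &readfds, NULL, NULL, NULL);

    /* Drain whatever accumulated; the pipe is non-blocking. */
    while (read(selfpipe[0], buf, sizeof(buf)) > 0)
        ;
}

The cost Heikki is pointing at is the extra write/read round trip through
the pipe on every wakeup, on top of the signal delivery itself.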
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 07.06.2012 07:09, Tom Lane wrote:
>> It strikes me that we have recently put together an independent but just
>> about equivalent waiting mechanism in the form of latches.  And not only
>> that, but there's already a latch for each process.  Could we replace
>> our usages of SysV semaphores with WaitLatch on the procLatch?

> Would have to performance test that carefully.  We use semaphores in
> lwlocks, so it's performance critical.  A latch might well be slower,
> especially on platforms where a signal does not interrupt sleep, and we
> rely on the signal handler and self-pipe to wake up.

By the time you've reached the point where you conclude you have to block,
all hope of high performance has gone out the window anyway, so I can't
get terribly excited about that.  But in general, yeah, our current
implementation of latches could very possibly use some more work to
improve performance.  I think we've latch-ified enough code to make that
worth doing already.

> Although perhaps we could improve the latch implementation.  pselect()
> might perform better than the self-pipe trick, on platforms where it
> works.

AFAIK pselect does not fix the basic race condition: what if somebody
else does SetLatch just before you reach the blocking kernel call?  You
still end up needing a self-pipe.

I would be more inclined to look into OS-specific primitives such as
futexes on Linux.  (No idea whether those actually would be suitable,
just pointing out that they exist.)  Our semaphore-based API was always
both overcomplicated and underspecified, but I think we have got latch
semantics nailed down well enough that implementations built on
OS-specific primitives could be a reasonable thing.

            regards, tom lane
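To make the futex idea concrete, here is a hedged, Linux-only sketch of a
latch-like wait/set pair (illustrative names only, not proposed patch
code).  The relevant property is that FUTEX_WAIT re-checks the flag word
atomically inside the kernel, which closes exactly the lost-wakeup race
described above: a set that lands just before the wait makes the syscall
return immediately instead of sleeping.

#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical latch built on a single shared flag word. */
typedef struct
{
    atomic_int  is_set;
} FutexLatch;

static void
futex_latch_wait(FutexLatch *latch)
{
    /*
     * Atomically consume the flag.  If it is still 0, FUTEX_WAIT puts us
     * to sleep only if the word is still 0 when the kernel examines it,
     * so a futex_latch_set() racing with us wakes us right away.
     */
    while (atomic_exchange(&latch->is_set, 0) == 0)
        syscall(SYS_futex, &latch->is_set, FUTEX_WAIT, 0, NULL, NULL, 0);
}

static void
futex_latch_set(FutexLatch *latch)
{
    atomic_store(&latch->is_set, 1);
    /* Wake one sleeping waiter (INT_MAX would wake them all). */
    syscall(SYS_futex, &latch->is_set, FUTEX_WAKE, 1, NULL, NULL, 0);
}

There is no self-pipe and no signal handler in this path at all, which is
presumably where the performance win would come from, at the price of a
per-platform implementation.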
On Thu, Jun 7, 2012 at 10:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I would be more inclined to look into OS-specific primitives such as
> futexes on Linux.  (No idea whether those actually would be suitable,
> just pointing out that they exist.)  Our semaphore-based API was always
> both overcomplicated and underspecified, but I think we have got latch
> semantics nailed down well enough that implementations built on
> OS-specific primitives could be a reasonable thing.

I've been thinking about trying to futex-ify our spinlock implementation,
so that when we detect that the spinlock is contended (or contended
sufficiently badly?) we go into a long kernel sleep (e.g. 10 s) and wait
to be woken up.  This might perform better than our current implementation
in cases where the spinlock is badly contended, since it would avoid
yanking the cache line around between all the CPUs on the box.  But I
haven't yet, because (1) it'll only work on Linux and (2) it's better to
fix the problem that is causing the contention rather than make the
contention less expensive.  Still, it might be worth looking into.

I'm not sure whether there's a sensible way to use this for LWLocks
directly.  It would be nice not to be doing the lock-within-a-lock thing,
but I don't know that I really want to maintain a completely separate
LWLock implementation just for Linux, and it's not obvious how you're
supposed to use a futex to implement a lock with more than one lock mode.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
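As a rough, Linux-only illustration of the "sleep in the kernel when badly
contended" behavior Robert describes, here is a sketch loosely following
the well-known three-state futex mutex (0 = free, 1 = locked, 2 = locked
with waiters).  The names and the spin count are invented for this example
and are not proposed patch code:

#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

typedef struct
{
    atomic_int  state;          /* 0 = free, 1 = locked, 2 = waiters */
} FutexSpinLock;

#define SPINS_BEFORE_SLEEP 100  /* arbitrary tuning knob */

static void
futex_spin_acquire(FutexSpinLock *lock)
{
    int         expected = 0;
    int         i;

    /* Fast path: uncontended acquisition, no kernel involvement. */
    if (atomic_compare_exchange_strong(&lock->state, &expected, 1))
        return;

    /* Spin briefly in the hope that the holder releases soon. */
    for (i = 0; i < SPINS_BEFORE_SLEEP; i++)
    {
        expected = 0;
        if (atomic_compare_exchange_strong(&lock->state, &expected, 1))
            return;
    }

    /*
     * Badly contended: advertise that someone may be sleeping, then let
     * the kernel park us until the holder issues a FUTEX_WAKE.  This is
     * where the cache line stops bouncing between CPUs.
     */
    while (atomic_exchange(&lock->state, 2) != 0)
        syscall(SYS_futex, &lock->state, FUTEX_WAIT, 2, NULL, NULL, 0);
}

static void
futex_spin_release(FutexSpinLock *lock)
{
    /* Only pay for a kernel call if a waiter may actually be asleep. */
    if (atomic_exchange(&lock->state, 0) == 2)
        syscall(SYS_futex, &lock->state, FUTEX_WAKE, 1, NULL, NULL, 0);
}

Note that this sketch still has only one lock mode; extending it to the
shared/exclusive semantics LWLocks need is the open question Robert raises
below it.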
On 06/07/2012 06:09 AM, Tom Lane wrote:
> There has been regular griping in this list about our dependence on SysV
> shared memory, but not so much about SysV semaphores, even though the
> latter cause their fair share of issues; as seen for example in
> buildfarm member spoonbill's recent string of failures:
>
> creating template1 database in /home/pgbuild/pgbuildfarm/HEAD/pgsql.25563/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could not create semaphores: No space left on device
> DETAIL:  Failed system call was semget(1, 17, 03600).
> HINT:  This error does *not* mean that you have run out of disk space.  It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter.
> The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.
> child process exited with exit code 1

Hmm, now that you mention it, I had missed the issue completely.  The
problem here is that spoonbill only has resources for one running
postmaster, and the failure on 24.5.2012 left a stale postmaster instance
behind.  Should be fixed now.

Stefan