Re: BUG #14206: Switch to using POSIX semaphores on FreeBSD - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #14206: Switch to using POSIX semaphores on FreeBSD
Date
Msg-id 30087.1466606930@sss.pgh.pa.us
In response to Re: BUG #14206: Switch to using POSIX semaphores on FreeBSD  (Konstantin Belousov <kostikbel@gmail.com>)
Responses Re: BUG #14206: Switch to using POSIX semaphores on FreeBSD  (Konstantin Belousov <kostikbel@gmail.com>)
List pgsql-bugs

Konstantin Belousov <kostikbel@gmail.com> writes:
> On Tue, Jun 21, 2016 at 04:36:02PM -0400, Tom Lane wrote:
>> If that seems like a competitive alternative for you, it'd be nice to have
>> a platform where we use unnamed POSIX semaphores by default.  I'm a little
>> worried about whether that code has suffered bit-rot, since it's been
>> sitting there basically unused for so long.

> On FreeBSD, there is no practical difference in resource consumption
> between named and unnamed semaphores. I mean that after the sem_open(3)
> call, an open file descriptor is not kept in the process fd table. The
> semaphore is represented by the mmaped page; libc and the kernel operate
> solely on the page content and use umtx(2) to implement a counting
> semaphore.

Is there any kernel-side resource at all?  The thing that concerns me
about the POSIX APIs is that it's not very clear whether anything gets
left behind if the database crashes.  The Linux man page for sem_destroy
says

       An unnamed semaphore should be destroyed with sem_destroy() before  the
       memory  in  which it is located is deallocated.  Failure to do this can
       result in resource leaks on some implementations.

and while they don't say that their own implementation has such a problem,
it's worrisome.  We go to some lengths to ensure that we can recycle SysV
semaphores after a crash, but there's no equivalent logic in the POSIX
semaphore code, and I don't see how it would even be possible to identify
leftover "unnamed" semaphores.

> That said, the problem with the SysV semaphores is that the API allows
> operations on arbitrary sets of semaphores. Unless some unusual
> and complex measures are taken, the implementation has to use a global
> internal lock to synchronize semop(2). This is what I noted in the
> paper.

It's certainly true that semop(2) is more complicated than we need.
But in practice, we only call semop(2) when we need to sleep, or to
awaken a sleeping process, so I'm not sure that its performance
matters a lot to us.
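
As a rough sketch of that usage pattern (again not PostgreSQL's code; the
names sysv_sleep_wake_demo and demo_semun are hypothetical), a single SysV
semaphore can serve purely as a sleep/wakeup channel: one semop(2) with -1
to block, one with +1 to awaken. Note that the set outlives the process
unless it is removed explicitly with IPC_RMID.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the semctl() argument union (laid out like semun). */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical demo: returns 0 on success, -1 on any failure. */
int sysv_sleep_wake_demo(void)
{
    /* A private set containing one semaphore. */
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    /* Start at 0 so that the first decrement blocks. */
    union demo_semun arg;
    arg.val = 0;
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {
        /* Child: awaken the sleeper by incrementing the semaphore. */
        struct sembuf up = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
        _exit(semop(semid, &up, 1) == 0 ? 0 : 1);
    }

    int rc = 0;

    /* Parent: sleep until the count can be decremented; retry on EINTR. */
    struct sembuf down = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    while (semop(semid, &down, 1) != 0) {
        if (errno != EINTR) {
            rc = -1;
            break;
        }
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        rc = -1;

    /* SysV sets persist after exit; remove this one explicitly. */
    if (semctl(semid, 0, IPC_RMID) < 0)
        rc = -1;

    return rc;
}
```

Each semop(2) call here touches only one semaphore, even though the API
would accept an array of operations across the whole set.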

            regards, tom lane
