Thread: OpenBSD versus semaphores

OpenBSD versus semaphores

From
Tom Lane
Date:
I've been toying with OpenBSD lately, and soon noticed a seriously
annoying problem for running Postgres on it: by default, its limits
for SysV semaphores are only SEMMNS=60, SEMMNI=10.  Not only does that
greatly constrain the number of connections for a single installation,
it means that our TAP tests fail because you can't start two postmasters
concurrently (cf [1]).

Raising the annoyance factor considerably, AFAICT the only way to
increase these settings is to build your own custom kernel.
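
As a rough illustration of why the default limits quoted above bite, here is
a small Python sketch of the arithmetic.  It assumes the sizing rule from the
PostgreSQL kernel-resources documentation of this era: one semaphore per
backend-like slot, allocated in sets of 16, plus one extra semaphore per set.
The function names, the default parameter values, and the "extra" constant
are illustrative assumptions, not something stated in this thread.

```python
import math

SEMS_PER_SET = 16  # assumed: PostgreSQL allocates SysV semaphores in sets of 16


def sysv_semaphore_needs(max_connections, autovacuum_max_workers=3,
                         max_worker_processes=8, max_wal_senders=0, extra=5):
    """Return (sets, semaphores) one postmaster would ask the kernel for.

    The defaults and the 'extra' slots are assumptions taken from the
    documented sizing rule; check the docs for the release you run.
    """
    slots = (max_connections + autovacuum_max_workers
             + max_worker_processes + max_wal_senders + extra)
    sets = math.ceil(slots / SEMS_PER_SET)
    # Each set carries one additional semaphore on top of the 16 usable ones.
    return sets, sets * (SEMS_PER_SET + 1)


def fits_limits(n_postmasters, needs, semmni=10, semmns=60):
    """Check n identical postmasters against SEMMNI (sets) and SEMMNS (total)."""
    sets, sems = needs
    return n_postmasters * sets <= semmni and n_postmasters * sems <= semmns


needs = sysv_semaphore_needs(max_connections=20)
print("one postmaster needs", needs)
print("one fits:", fits_limits(1, needs))
print("two fit:", fits_limits(2, needs))
```

Under these assumptions, even a postmaster trimmed to 20 connections asks for
3 sets and 51 semaphores, so a second concurrent postmaster cannot fit under
SEMMNS=60.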

So I looked around for an alternative, and found out that modern
OpenBSD releases support named POSIX semaphores (though not unnamed
ones, at least not shared unnamed ones).  What's more, it appears that
in this implementation, named semaphores don't eat open file descriptors
as they do on macOS, removing our major objection to using them.

I don't have any OpenBSD installation on hardware that I'd take very
seriously for performance testing, but some light testing with
"pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX
has just about the same performance as a build with SysV semaphores.

This all leads to the thought that maybe we should be selecting
PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD.  At the very least,
our docs ought to recommend it as a credible alternative for
people who don't want to get into building custom kernels.

I've checked that this works back to OpenBSD 6.0, and scanning
their man pages suggests that the feature appeared in 5.5.
5.5 isn't that old (2014) so possibly people are still running
older versions, but we could easily put in version-specific
default logic similar to what's in src/template/darwin.
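
The version-specific default described above amounts to a simple comparison
against the release number.  The real template files are shell fragments, so
the following Python is only a sketch of the decision, not the proposed
implementation; the function name and the "SYSV" label for the fallback are
hypothetical.

```python
def preferred_semaphores(openbsd_release: str) -> str:
    """Pick a semaphore flavour from an OpenBSD release string such as '6.4'.

    Assumes named POSIX semaphores are usable from 5.5 onward, as the man
    pages suggest; older releases fall back to SysV semaphores.
    """
    parts = openbsd_release.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if (major, minor) >= (5, 5):
        return "NAMED_POSIX"
    return "SYSV"  # hypothetical label for the existing SysV default


for release in ("4.9", "5.4", "5.5", "6.0", "6.4"):
    print(release, preferred_semaphores(release))
```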

Thoughts?

            regards, tom lane

[1] https://www.postgresql.org/message-id/e6ecf989-9d5a-9dc5-12de-96985b6e5a5f%40mksoft.nu


Re: OpenBSD versus semaphores

From
Thomas Munro
Date:
On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10.  Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
>
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.
>
> So I looked around for an alternative, and found out that modern
> OpenBSD releases support named POSIX semaphores (though not unnamed
> ones, at least not shared unnamed ones).  What's more, it appears that
> in this implementation, named semaphores don't eat open file descriptors
> as they do on macOS, removing our major objection to using them.
>
> I don't have any OpenBSD installation on hardware that I'd take very
> seriously for performance testing, but some light testing with
> "pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX
> has just about the same performance as a build with SysV semaphores.
>
> This all leads to the thought that maybe we should be selecting
> PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD.  At the very least,
> our docs ought to recommend it as a credible alternative for
> people who don't want to get into building custom kernels.
>
> I've checked that this works back to OpenBSD 6.0, and scanning
> their man pages suggests that the feature appeared in 5.5.
> 5.5 isn't that old (2014) so possibly people are still running
> older versions, but we could easily put in version-specific
> default logic similar to what's in src/template/darwin.
>
> Thoughts?

No OpenBSD here, but I was curious enough to peek at their
implementation.  Like others, they create a tiny file under /tmp for
each one, mmap() and close the fd straight away.  Apparently don't
support shared sem_init() yet (EPERM).  So your plan seems good to me.
CC'ing Pierre-Emmanuel (OpenBSD PostgreSQL port maintainer) in case he
is interested.

Wild speculation:  I wouldn't be surprised if POSIX named semas
perform better than SysV semas on a large enough system, since they'll
live on different pages.  At a glance, their sys_semget apparently
allocates arrays of struct sem without padding and I think they
probably get about 4 to a cacheline; see our experience with an 8
socket box leading to commit 2d306759 where we added our own padding.
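
To make the false-sharing concern concrete, here is a small Python sketch of
which semaphores would land on the same cache line for a given array stride.
The 64-byte cache line and the 16-byte unpadded element size are assumptions
chosen to match the "about 4 to a cacheline" estimate above, not measured
values, and the function names are made up for illustration.

```python
def cache_line_of(index, stride, line_size=64):
    """Cache line number holding element 'index' of an array with this stride."""
    return (index * stride) // line_size


def neighbours_on_same_line(index, count, stride, line_size=64):
    """Other array elements that share a cache line with element 'index'."""
    target = cache_line_of(index, stride, line_size)
    return [i for i in range(count)
            if i != index and cache_line_of(i, stride, line_size) == target]


# Unpadded: assumed 16-byte elements packed back to back.
print("unpadded:", neighbours_on_same_line(5, count=16, stride=16))
# Padded: each element rounded up to a full assumed 64-byte cache line.
print("padded:  ", neighbours_on_same_line(5, count=16, stride=64))
```

With the packed layout, a wakeup on one semaphore dirties the line that three
others live on; with one element per line (or one mapping per semaphore, as
with the named POSIX implementation) there are no such neighbours.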

-- 
Thomas Munro
http://www.enterprisedb.com


Re: OpenBSD versus semaphores

From
Tom Lane
Date:
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So I looked around for an alternative, and found out that modern
>> OpenBSD releases support named POSIX semaphores (though not unnamed
>> ones, at least not shared unnamed ones).  What's more, it appears that
>> in this implementation, named semaphores don't eat open file descriptors
>> as they do on macOS, removing our major objection to using them.

> No OpenBSD here, but I was curious enough to peek at their
> implementation.  Like others, they create a tiny file under /tmp for
> each one, mmap() and close the fd straight away.

Oh, yeah, I can see a bunch of tiny mappings with procmap.  I wonder
whether that scales any better than an open FD per semaphore, when
it comes to forking a bunch of child processes that will inherit
all those mappings or FDs.  I've not tried to benchmark child process
launch as such --- as I said, I'm not running this on hardware that
would support serious benchmarking.

BTW, I just finished finding out that recent NetBSD (8.99.25) has
working code paths for *both* named and unnamed POSIX semaphores.
However, it appears that both code paths involve an open FD per
semaphore, so it's likely not something we want to recommend using.

            regards, tom lane


Re: OpenBSD versus semaphores

From
Mikael Kjellström
Date:
On 2019-01-08 07:14, Tom Lane wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10.  Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
> 
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.

You don't need to build your custom kernel to change those settings.

Just add:

kern.seminfo.semmni=20

to /etc/sysctl.conf and reboot

/Mikael


Re: OpenBSD versus semaphores

From
Abel Abraham Camarillo Ojeda
Date:


On Tue, Jan 8, 2019 at 12:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10.  Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
>
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.

This is not accurate: you can change these values via sysctl(1).  The following is extracted from the OpenBSD postgresql port:

Tuning for busy servers
=======================
The default sizes in the GENERIC kernel for SysV semaphores are only
just large enough for a database with the default configuration
(max_connections 40) if no other running processes use semaphores.
In other cases you will need to increase the limits. Adding the
following in /etc/sysctl.conf will be reasonable for many systems:
kern.seminfo.semmni=60
kern.seminfo.semmns=1024

To serve a large number of connections (>250), you may need higher
values for the above.



> So I looked around for an alternative, and found out that modern
> OpenBSD releases support named POSIX semaphores (though not unnamed
> ones, at least not shared unnamed ones).  What's more, it appears that
> in this implementation, named semaphores don't eat open file descriptors
> as they do on macOS, removing our major objection to using them.
>
> I don't have any OpenBSD installation on hardware that I'd take very
> seriously for performance testing, but some light testing with
> "pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX
> has just about the same performance as a build with SysV semaphores.
>
> This all leads to the thought that maybe we should be selecting
> PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD.  At the very least,
> our docs ought to recommend it as a credible alternative for
> people who don't want to get into building custom kernels.
>
> I've checked that this works back to OpenBSD 6.0, and scanning
> their man pages suggests that the feature appeared in 5.5.
> 5.5 isn't that old (2014) so possibly people are still running
> older versions, but we could easily put in version-specific
> default logic similar to what's in src/template/darwin.
>
> Thoughts?
>
>             regards, tom lane
>
> [1] https://www.postgresql.org/message-id/e6ecf989-9d5a-9dc5-12de-96985b6e5a5f%40mksoft.nu

Re: OpenBSD versus semaphores

From
Tom Lane
Date:
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
> On 2019-01-08 07:14, Tom Lane wrote:
>> Raising the annoyance factor considerably, AFAICT the only way to
>> increase these settings is to build your own custom kernel.

> You don't need to build your custom kernel to change those settings.
> Just add:
> kern.seminfo.semmni=20
> to /etc/sysctl.conf and reboot

Hm, I wonder when that came in?  Our documentation doesn't know about it.

            regards, tom lane


Re: OpenBSD versus semaphores

From
Thomas Munro
Date:
On Fri, Apr 2, 2021 at 9:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> So I looked around for an alternative, and found out that modern
> >> OpenBSD releases support named POSIX semaphores (though not unnamed
> >> ones, at least not shared unnamed ones).  What's more, it appears that
> >> in this implementation, named semaphores don't eat open file descriptors
> >> as they do on macOS, removing our major objection to using them.
>
> > No OpenBSD here, but I was curious enough to peek at their
> > implementation.  Like others, they create a tiny file under /tmp for
> > each one, mmap() and close the fd straight away.
>
> Oh, yeah, I can see a bunch of tiny mappings with procmap.  I wonder
> whether that scales any better than an open FD per semaphore, when
> it comes to forking a bunch of child processes that will inherit
> all those mappings or FDs.  I've not tried to benchmark child process
> launch as such --- as I said, I'm not running this on hardware that
> would support serious benchmarking.

I also have no ability to benchmark on a real OpenBSD system, but once
a year or so when I spin up a little OpenBSD VM to test some patch or
other, it annoys me that our tests fail out of the box and then I have
to look up how to change the sysctls, so here's a patch.  I also
checked the release notes to confirm that 5.5 is the right release to
look for[1]; by now that's EOL, so the version test is probably not even
worth bothering with, but it doesn't cost much to be cautious.  4.x is
surely too old to waste electrons on.  I guess the question for
OpenBSD experts is whether having (say) a thousand tiny mappings is
bad.  On the plus side, we know from other OSes that having semas
spread out is good for reducing false sharing on large systems.

[1] https://www.openbsd.org/55.html
