Thread: OpenBSD versus semaphores
I've been toying with OpenBSD lately, and soon noticed a seriously annoying problem for running Postgres on it: by default, its limits for SysV semaphores are only SEMMNS=60, SEMMNI=10. Not only does that greatly constrain the number of connections for a single installation, it means that our TAP tests fail because you can't start two postmasters concurrently (cf [1]).

Raising the annoyance factor considerably, AFAICT the only way to increase these settings is to build your own custom kernel.

So I looked around for an alternative, and found out that modern OpenBSD releases support named POSIX semaphores (though not unnamed ones, at least not shared unnamed ones). What's more, it appears that in this implementation, named semaphores don't eat open file descriptors as they do on macOS, removing our major objection to using them.

I don't have any OpenBSD installation on hardware that I'd take very seriously for performance testing, but some light testing with "pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX has just about the same performance as a build with SysV semaphores.

This all leads to the thought that maybe we should be selecting PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD. At the very least, our docs ought to recommend it as a credible alternative for people who don't want to get into building custom kernels.

I've checked that this works back to OpenBSD 6.0, and scanning their man pages suggests that the feature appeared in 5.5. 5.5 isn't that old (2014) so possibly people are still running older versions, but we could easily put in version-specific default logic similar to what's in src/template/darwin.

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/message-id/e6ecf989-9d5a-9dc5-12de-96985b6e5a5f%40mksoft.nu
On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10. Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
>
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.
>
> So I looked around for an alternative, and found out that modern
> OpenBSD releases support named POSIX semaphores (though not unnamed
> ones, at least not shared unnamed ones). What's more, it appears that
> in this implementation, named semaphores don't eat open file descriptors
> as they do on macOS, removing our major objection to using them.
>
> I don't have any OpenBSD installation on hardware that I'd take very
> seriously for performance testing, but some light testing with
> "pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX
> has just about the same performance as a build with SysV semaphores.
>
> This all leads to the thought that maybe we should be selecting
> PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD. At the very least,
> our docs ought to recommend it as a credible alternative for
> people who don't want to get into building custom kernels.
>
> I've checked that this works back to OpenBSD 6.0, and scanning
> their man pages suggests that the feature appeared in 5.5.
> 5.5 isn't that old (2014) so possibly people are still running
> older versions, but we could easily put in version-specific
> default logic similar to what's in src/template/darwin.
>
> Thoughts?

No OpenBSD here, but I was curious enough to peek at their implementation. Like others, they create a tiny file under /tmp for each one, mmap() and close the fd straight away. Apparently don't support shared sem_init() yet (EPERM). So your plan seems good to me.

CC'ing Pierre-Emmanuel (OpenBSD PostgreSQL port maintainer) in case he is interested.

Wild speculation: I wouldn't be surprised if POSIX named semas perform better than SysV semas on a large enough system, since they'll live on different pages. At a glance, their sys_semget apparently allocates arrays of struct sem without padding and I think they probably get about 4 to a cacheline; see our experience with an 8 socket box leading to commit 2d306759 where we added our own padding.

--
Thomas Munro
http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So I looked around for an alternative, and found out that modern
>> OpenBSD releases support named POSIX semaphores (though not unnamed
>> ones, at least not shared unnamed ones). What's more, it appears that
>> in this implementation, named semaphores don't eat open file descriptors
>> as they do on macOS, removing our major objection to using them.

> No OpenBSD here, but I was curious enough to peek at their
> implementation. Like others, they create a tiny file under /tmp for
> each one, mmap() and close the fd straight away.

Oh, yeah, I can see a bunch of tiny mappings with procmap. I wonder whether that scales any better than an open FD per semaphore, when it comes to forking a bunch of child processes that will inherit all those mappings or FDs. I've not tried to benchmark child process launch as such --- as I said, I'm not running this on hardware that would support serious benchmarking.

BTW, I just finished finding out that recent NetBSD (8.99.25) has working code paths for *both* named and unnamed POSIX semaphores. However, it appears that both code paths involve an open FD per semaphore, so it's likely not something we want to recommend using.

regards, tom lane
On 2019-01-08 07:14, Tom Lane wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10. Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
>
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.

You don't need to build your custom kernel to change those settings.

Just add:

kern.seminfo.semmni=20

to /etc/sysctl.conf and reboot

/Mikael
On Tue, Jan 8, 2019 at 12:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10. Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
>
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.
This is not accurate; you can change these values via sysctl(1). Extracted from the OpenBSD postgresql port:

Tuning for busy servers
=======================

The default sizes in the GENERIC kernel for SysV semaphores are only
just large enough for a database with the default configuration
(max_connections 40) if no other running processes use semaphores.
In other cases you will need to increase the limits. Adding the
following in /etc/sysctl.conf will be reasonable for many systems:

	kern.seminfo.semmni=60
	kern.seminfo.semmns=1024

To serve a large number of connections (>250), you may need higher
values for the above.
> So I looked around for an alternative, and found out that modern
> OpenBSD releases support named POSIX semaphores (though not unnamed
> ones, at least not shared unnamed ones). What's more, it appears that
> in this implementation, named semaphores don't eat open file descriptors
> as they do on macOS, removing our major objection to using them.
>
> I don't have any OpenBSD installation on hardware that I'd take very
> seriously for performance testing, but some light testing with
> "pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX
> has just about the same performance as a build with SysV semaphores.
>
> This all leads to the thought that maybe we should be selecting
> PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD. At the very least,
> our docs ought to recommend it as a credible alternative for
> people who don't want to get into building custom kernels.
>
> I've checked that this works back to OpenBSD 6.0, and scanning
> their man pages suggests that the feature appeared in 5.5.
> 5.5 isn't that old (2014) so possibly people are still running
> older versions, but we could easily put in version-specific
> default logic similar to what's in src/template/darwin.
>
> Thoughts?
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/e6ecf989-9d5a-9dc5-12de-96985b6e5a5f%40mksoft.nu
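[Editorial sketch, not part of the original messages.] To make the SEMMNI/SEMMNS numbers in this exchange concrete, here is a rough calculator. It assumes the sizing rule given in the PostgreSQL "Managing Kernel Resources" documentation of that era: semaphores are allocated in sets of 16, each set carries one extra semaphore, and the process count is max_connections plus the autovacuum, WAL sender and background worker slots plus 5. The function names and the default parameter values (3, 10, 8) are illustrative assumptions, not anything stated in the thread.

```python
import math

# Assumed from the PostgreSQL kernel-resources documentation, not from this thread:
# semaphores come in sets of 16, and each set needs one additional semaphore.
SEMS_PER_SET = 16


def sysv_semaphore_needs(max_connections, autovacuum_max_workers=3,
                         max_wal_senders=10, max_worker_processes=8):
    """Return (sets, semaphores) one postmaster would need under the assumed rule."""
    procs = (max_connections + autovacuum_max_workers
             + max_wal_senders + max_worker_processes + 5)
    sets = math.ceil(procs / SEMS_PER_SET)
    return sets, sets * (SEMS_PER_SET + 1)


def fits(max_connections, semmni, semmns, **settings):
    """True if one postmaster fits within the given SEMMNI/SEMMNS limits."""
    sets, sems = sysv_semaphore_needs(max_connections, **settings)
    return sets <= semmni and sems <= semmns
```

Under these assumptions, `fits(100, semmni=10, semmns=60)` is False for the stock OpenBSD limits, while `fits(100, semmni=60, semmns=1024)` is True for the values suggested in the port README. The rule ignores semaphores used by other applications, so real limits need extra headroom.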
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
> On 2019-01-08 07:14, Tom Lane wrote:
>> Raising the annoyance factor considerably, AFAICT the only way to
>> increase these settings is to build your own custom kernel.

> You don't need to build your custom kernel to change those settings.
> Just add:
> kern.seminfo.semmni=20
> to /etc/sysctl.conf and reboot

Hm, I wonder when that came in? Our documentation doesn't know about it.

regards, tom lane
On Fri, Apr 2, 2021 at 9:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> So I looked around for an alternative, and found out that modern
> >> OpenBSD releases support named POSIX semaphores (though not unnamed
> >> ones, at least not shared unnamed ones). What's more, it appears that
> >> in this implementation, named semaphores don't eat open file descriptors
> >> as they do on macOS, removing our major objection to using them.
> >
> > No OpenBSD here, but I was curious enough to peek at their
> > implementation. Like others, they create a tiny file under /tmp for
> > each one, mmap() and close the fd straight away.
>
> Oh, yeah, I can see a bunch of tiny mappings with procmap. I wonder
> whether that scales any better than an open FD per semaphore, when
> it comes to forking a bunch of child processes that will inherit
> all those mappings or FDs. I've not tried to benchmark child process
> launch as such --- as I said, I'm not running this on hardware that
> would support serious benchmarking.

I also have no ability to benchmark on a real OpenBSD system, but once a year or so when I spin up a little OpenBSD VM to test some patch or other, it annoys me that our tests fail out of the box and then I have to look up how to change the sysctls, so here's a patch. I also checked the release notes to confirm that 5.5 is the right release to look for[1]; by now that's EOL and the test is probably not even worth bothering with, but it doesn't cost much to be cautious about that. 4.x is surely too old to waste electrons on.

I guess the question for OpenBSD experts is whether having (say) a thousand tiny mappings is bad. On the plus side, we know from other OSes that having semas spread out is good for reducing false sharing on large systems.

[1] https://www.openbsd.org/55.html
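[Editorial sketch, not part of the original messages.] The version check discussed above boils down to "named POSIX semaphores from OpenBSD 5.5 on, SysV before that". PostgreSQL's template files are shell fragments and the patch itself is not reproduced in this thread, so the snippet below only illustrates the decision logic in Python. The function name, the "major.minor" input format, and the "SYSV" fallback label are assumptions made for illustration.

```python
def preferred_semaphores(os_release: str) -> str:
    """Pick a semaphore flavor from an OpenBSD release string such as "6.4".

    Hypothetical helper: only mirrors the rule discussed in the thread
    (named POSIX semaphores appeared in OpenBSD 5.5).
    """
    major, minor = (int(part) for part in os_release.split(".")[:2])
    if (major, minor) >= (5, 5):
        return "NAMED_POSIX"
    # Older releases lack sem_open(), so fall back to SysV semaphores.
    return "SYSV"
```

Comparing (major, minor) tuples rather than raw strings avoids ordering surprises such as "5.10" sorting before "5.5".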