pgsql: Don't treat EINVAL from semget() as a hard failure. - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Don't treat EINVAL from semget() as a hard failure.
Date
Msg-id E1umDtf-000NgH-1Z@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Don't treat EINVAL from semget() as a hard failure.

It turns out that on some platforms (at least current macOS, NetBSD,
OpenBSD) semget(2) will return EINVAL if there is a pre-existing
semaphore set with the same key and too few semaphores.  Our code
expects EEXIST in that case and treats EINVAL as a hard failure,
resulting in failure during initdb or postmaster start.

POSIX does document EINVAL for too-few-semaphores-in-set, and is
silent on its priority relative to EEXIST, so this behavior arguably
conforms to spec.  Nonetheless it's quite problematic because EINVAL
is also documented to mean that nsems is greater than the system's
limit on the number of semaphores per set (SEMMSL).  If that is
where the problem lies, retrying would just become an infinite loop.

To resolve this contradiction, retry after EINVAL, but also install a
loop limit that will make us give up regardless of the specific errno
after trying 1000 different keys.  (1000 is a pretty arbitrary number,
but it seems like it should be sufficient.)  I like this better than
the previous infinite-looping behavior, since it will also keep us out
of trouble if (say) we get EACCES due to a system-level permissions
problem rather than anything to do with a specific semaphore set.

This problem has only been observed in the field in PG 17, which uses
a higher nsems value than other branches (cf. 38da05346, 810a8b1c8).
That makes it possible to get the failure if a new v17 postmaster
has a key collision with an existing postmaster of another branch.
In principle though, we might see such a collision against a semaphore
set created by some other application, in which case all branches are
vulnerable on these platforms.  Hence, backpatch.

Reported-by: Gavin Panella <gavinpanella@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALL7chmzY3eXHA7zHnODUVGZLSvK3wYCSP0RmcDFHJY8f28Q3g@mail.gmail.com
Backpatch-through: 13

Branch
------
REL_15_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/f4c0883448d8d2bbae9c104df0c34ccdfaf4db56

Modified Files
--------------
src/backend/port/sysv_sema.c | 46 +++++++++++++++++++++++++++++++-------------
1 file changed, 33 insertions(+), 13 deletions(-)


pgsql-committers by date:

Previous
From: Peter Eisentraut
Date:
Subject: pgsql: Adjust some table column widths in PDF
Next
From: Andres Freund
Date:
Subject: pgsql: Add very basic test for kill_prior_tuples