Re: Changing shared_buffers without restart - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Changing shared_buffers without restart
Date
Msg-id CA+hUKGJWcUAk-7pAypWZ=hXSk09D7MV07r=po75LE+owjp9rdg@mail.gmail.com
In response to Re: Changing shared_buffers without restart  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Apr 18, 2025 at 3:54 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> I contemplated that once before, when I wrote a quick demo patch[1] to
> implement huge_pages=on for FreeBSD (ie explicit rather than
> transparent).  I used a different function, not the Linuxoid one but

Oops, I forgot to supply that link[1].  And by the way all that
technical mumbo jumbo about FreeBSD was just me writing up why I
didn't pull the trigger and add explicit huge_pages support for it.
The short version is: you shouldn't try to use that flag at all on
FreeBSD yet, as it's a separate research project to add that feature.
I care about PostgreSQL/FreeBSD personally and may consider that again
as I learn more about virtual memory topics, but actually its
transparent super pages seem to do a pretty decent job already and
people don't seem to want to turn them off.

For an actionable plan that should be portable everywhere, how about
this: use shm_open(<tempname>, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR)
followed by shm_unlink(<tempname>) to make this work on every Unix
(FreeBSD could use its slightly better SHM_ANON as the name and skip
the unlink), and redirect to memfd inside #ifdef __linux__.  One thing
to consider is that shm_open() descriptors are implicitly set to
FD_CLOEXEC per POSIX, so I think you need to clear that flag with
fcntl() in EXEC_BACKEND builds, and then also set it again in children
so that they don't pass the descriptor to subprograms they run with
system() etc.  memfd_create() needs the same consideration, except its
default is the other way: I think you need to supply the MFD_CLOEXEC
flag explicitly, unless it's an EXEC_BACKEND build, and use the same
fcntl() to set it in children if it is.  To restate that the other
way around, in non-EXEC_BACKEND builds shm_open() already does the
right thing and memfd_create() needs MFD_CLOEXEC, with no extra steps
after that.
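
For illustration, the plan above might be sketched roughly as follows.
This is only a minimal sketch under stated assumptions, not code from
any patch: anon_shmem_fd(), set_cloexec(), the "anon_shmem" name and
the inherit_across_exec parameter (standing in for an EXEC_BACKEND
build) are all hypothetical.

```c
#define _GNU_SOURCE             /* for memfd_create() on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Set or clear FD_CLOEXEC on fd; returns 0 on success, -1 on error. */
static int
set_cloexec(int fd, bool on)
{
    int flags;

    assert(fd >= 0);            /* passing a bad fd is a caller bug */
    flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    flags = on ? (flags | FD_CLOEXEC) : (flags & ~FD_CLOEXEC);
    return fcntl(fd, F_SETFD, flags);
}

/*
 * Hypothetical helper: return a descriptor for nameless shared memory,
 * or -1 on error.  If inherit_across_exec is true (the EXEC_BACKEND
 * case), the descriptor survives exec() and each child is expected to
 * call set_cloexec(fd, true) itself once it has started.
 */
static int
anon_shmem_fd(bool inherit_across_exec)
{
    int fd;

#if defined(__linux__)
    /* memfd_create() leaves close-on-exec off unless asked for it. */
    fd = memfd_create("anon_shmem", inherit_across_exec ? 0 : MFD_CLOEXEC);
#else
#if defined(SHM_ANON)
    /* FreeBSD: anonymous object, so there is nothing to unlink. */
    fd = shm_open(SHM_ANON, O_RDWR, S_IRUSR | S_IWUSR);
#else
    char name[64];

    /* Assumed naming scheme; O_EXCL makes a collision fail, not share. */
    snprintf(name, sizeof(name), "/anon_shmem.%ld", (long) getpid());
    fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd >= 0)
        shm_unlink(name);       /* the descriptor keeps the object alive */
#endif
    /* shm_open() sets FD_CLOEXEC itself; clear it only when needed. */
    if (fd >= 0 && inherit_across_exec && set_cloexec(fd, false) < 0)
    {
        close(fd);
        fd = -1;
    }
#endif
    return fd;
}
```

A caller would then size the object with ftruncate() and mmap() it
with MAP_SHARED.  In the inherit_across_exec case, a child would call
set_cloexec(fd, true) early so that programs started via system() do
not inherit the descriptor.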

The only systems I'm aware of that *don't* have shm_open() are (1)
Android, but it's Linux so I assume it has memfd_create() (just for
fun: you can run PostgreSQL on a phone with termux[2], and you can see
that their package supplies a fake shm_open() that redirects to plain
open(); I guess they didn't realise they could have supplied an ENOSYS
dummy and just set dynamic_shared_memory_type=mmap instead, and we'd
have done that for them!), and (2) the capability-based research OS
projects like Capsicum (and probably the others like it) that rip out
all the global namespace Unix APIs for approximately the same reason
as Android (PostgreSQL can't run under those yet, but just for fun: I
had PostgreSQL mostly working under Capsicum once, and noticed that
the problems to be solved had significant overlap with the
multithreading project: the global namespace stuff like signals/PIDs
and onymous IPC go away, and the only other major thing is absolute
paths, many of which are easily made relative to a pgdata fd and
handled with openat() in fd.c, but I digress...).

[1] https://www.postgresql.org/message-id/CA%2BhUKGLmBWHF6gusP55R7jVS1%3D6T%3DGphbZpUXiOgMMHDUkVCgw%40mail.gmail.com
[2] https://github.com/termux/termux-packages/tree/master/packages/postgresql


