Re: Changing shared_buffers without restart - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Changing shared_buffers without restart
Date
Msg-id CAEze2WiMkmXUWg10y+_oGhJzXirZbYHB5bw0=VWte+YHwSBa=A@mail.gmail.com
In response to Re: Changing shared_buffers without restart  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Changing shared_buffers without restart
Re: Changing shared_buffers without restart
List pgsql-hackers
On Thu, 28 Nov 2024 at 19:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > On Thu, 28 Nov 2024 at 18:19, Robert Haas <robertmhaas@gmail.com> wrote:
> >> [...] It's unclear to me why
> >> operating systems don't offer better primitives for this sort of thing
> >> -- in theory there could be a system call that sets aside a pool of
> >> address space and then other system calls that let you allocate
> >> shared/unshared memory within that space or even at specific
> >> addresses, but actually such things don't exist.
>
> > Isn't that more a stdlib/malloc issue? AFAIK, Linux's mmap(2) syscall
> > allows you to request memory from the OS at arbitrary addresses - it's
> > just that stdlib's malloc doesn't expose the 'alloc at this address'
> > part of that API.
>
> I think what Robert is concerned about is that there is exactly 0
> guarantee that that will succeed, because you have no control over
> system-driven allocations of address space (for example, loading
> of extensions or JIT code).  In fact, given things like ASLR, there
> is pressure on the kernel crew to make that *less* predictable not
> more so.

I see what you mean, but I think that shouldn't be much of an issue.
I'm not a kernel hacker, but I've never heard about anyone arguing to
remove mmap's mapping-overwriting behavior for user-controlled
mappings - it seems too useful as a way to guarantee relative memory
addresses. (Agreed, there is now mseal(2), but that is the user asking
for security on their own mapping; it isn't applied to arbitrary
mappings.)

I mean, we can do the following to get a nice contiguous empty address
space no other mmap(NULL)s will get put into:

    /* reserve size bytes of address space */
    base = mmap(NULL, size, PROT_NONE, ...flags, ...);
    /* use the first small_size bytes of that reservation */
    allocated_in_reserved = mmap(base, small_size,
                                 PROT_READ | PROT_WRITE,
                                 MAP_FIXED | ...flags, ...);

With the PROT_NONE protection option the OS doesn't actually allocate
any backing memory, but it guarantees that no other mmap(NULL, ...)
will be placed so that it overlaps the reservation until the area is
munmap-ed. That lets us reserve a chunk of address space without
actually using (much) memory. Deallocations have to go through
mmap(... PROT_NONE, ...) with MAP_FIXED instead of munmap if we want
to keep the full area reserved, but I think that's not much of an
issue.
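
As a minimal sketch of the reserve / use / give-back cycle described
above: the function names are hypothetical, lengths are assumed to be
multiples of the page size, and MAP_PRIVATE | MAP_ANONYMOUS is an
assumption made only to keep the example self-contained (real shared
buffers would need memory that is shared between backends).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `size` bytes of address space with no usable backing memory.
 * Returns NULL on failure. */
void *
reserve_address_space(size_t size)
{
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return base == MAP_FAILED ? NULL : base;
}

/* Make [addr, addr + len) usable. MAP_FIXED replaces the PROT_NONE
 * placeholder at exactly this address. Returns 0 on success. */
int
commit_range(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == addr ? 0 : -1;
}

/* Drop the pages behind [addr, addr + len) but keep the address range
 * reserved, by mapping PROT_NONE over it instead of calling munmap. */
int
decommit_range(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == addr ? 0 : -1;
}
```

Only commit_range() and decommit_range() on addresses inside the
reservation are safe here: MAP_FIXED silently replaces whatever is
mapped at the target address, so it should never be pointed at memory
the caller did not reserve itself.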

I also highly doubt Linux will remove or otherwise limit the PROT_NONE
option to such a degree that we won't be able to "balloon" the memory
address space for (e.g.) dynamic shared buffer resizing.
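
To illustrate what such "ballooning" could look like, here is a rough
sketch of a region that grows and shrinks inside a fixed PROT_NONE
reservation. ResizableRegion, region_init and region_resize are
hypothetical names, sizes are assumed to be page-size multiples, and
private anonymous memory is used only to keep the sketch short; it
says nothing about how backends would share or coordinate the resize.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

typedef struct ResizableRegion
{
    char   *base;       /* start of the reserved address range */
    size_t  reserved;   /* bytes of address space reserved */
    size_t  used;       /* bytes currently readable and writable */
} ResizableRegion;

/* Reserve max_size bytes of address space; nothing is usable yet. */
int
region_init(ResizableRegion *r, size_t max_size)
{
    void *p = mmap(NULL, max_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->reserved = max_size;
    r->used = 0;
    return 0;
}

/* Grow or shrink the usable prefix; the base address never changes. */
int
region_resize(ResizableRegion *r, size_t new_used)
{
    char   *start;
    size_t  len;
    int     prot;

    if (new_used > r->reserved)
        return -1;              /* cannot grow past the reservation */
    if (new_used == r->used)
        return 0;

    if (new_used > r->used)
    {
        /* grow: map usable memory over the next part of the placeholder */
        start = r->base + r->used;
        len = new_used - r->used;
        prot = PROT_READ | PROT_WRITE;
    }
    else
    {
        /* shrink: turn the tail back into a PROT_NONE placeholder */
        start = r->base + new_used;
        len = r->used - new_used;
        prot = PROT_NONE;
    }

    if (mmap(start, len, prot,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        return -1;
    r->used = new_used;
    return 0;
}
```

Because growth only maps the delta past the current end, data already
in the usable prefix stays where it is, and pointers into it remain
valid across a resize.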

See also: FreeBSD's MAP_GUARD mmap flag, Windows' MEM_RESERVE and
MEM_RESERVE_PLACEHOLDER flags for VirtualAlloc[2][Ex].
See also [0] where PROT_NONE is explicitly called out as a tool for
reserving memory address space.

> So even if we devise a method that seems to work reliably
> today, we could have little faith that it would work with next year's
> kernels.

I really don't think that userspace memory address space reservations
through e.g. PROT_NONE or MEM_RESERVE[_PLACEHOLDER] will be retired
anytime soon, at least not without the relevant kernels also providing
effective alternatives.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://www.gnu.org/software/libc/manual/html_node/Memory-Protection.html


