Re: Changing shared_buffers without restart - Mailing list pgsql-hackers
From | Dmitry Dolgov |
---|---|
Subject | Re: Changing shared_buffers without restart |
Date | |
Msg-id | scor5gscd42d4nwszuwvtwss6e22fg3dnvxmqwrcsdkpyyigny@efjlkj6ccv7u |
In response to | Re: Changing shared_buffers without restart (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Changing shared_buffers without restart
Re: Changing shared_buffers without restart |
List | pgsql-hackers |
> On Wed, Nov 27, 2024 at 04:05:47PM GMT, Robert Haas wrote:
> On Wed, Nov 27, 2024 at 3:48 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > My understanding is that clashing of mappings (either at creation time
> > or when resizing) could happen only within the process address space,
> > and the assumption is that by the time we prepare the mapping layout all
> > the rest of mappings for the process are already done.
>
> I don't think that's correct at all. First, the user could type LOAD
> 'whatever' at any time. But second, even if they don't or you prohibit
> them from doing so, the process could allocate memory for any of a
> million different things, and that could require mapping a new region
> of memory, and the OS could choose to place that just after an
> existing mapping, or at least close enough that we can't expand the
> object size as much as desired.
>
> If we had an upper bound on the size of shared_buffers and could
> reserve that amount of address space at startup time but only actually
> map a portion of it, then we could later remap and expand into the
> reserved space. Without that, I think there's absolutely no guarantee
> that the amount of address space that we need is available when we
> want to extend a mapping.

I've just done a couple of experiments, and I think this could also be
addressed by careful placement of mappings, based on two assumptions:
for a new mapping the kernel always picks the lowest address that allows
enough space, and the maximum amount of allocatable memory for other
mappings can be derived from total available memory. With that in mind,
the shared mapping layout will need a large gap at the start, between
the lowest address and the shared mappings used for buffers and the
rest. This gap is where all the other mappings (allocations, libraries,
madvise, etc.) will land.
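For comparison, the reservation idea from the quoted message (reserve the maximum address range up front, make only part of it usable, expand in place later) could be sketched roughly as below. This is not code from the patch. It assumes 64-bit Linux, calls libc through `ctypes`, and uses a private anonymous `PROT_NONE` mapping plus `mprotect` as a stand-in. The mappings discussed in this thread are memfd-backed shared mappings, where growing would also involve resizing the backing file. The helper names `reserve`, `grow`, and `release` are hypothetical.

```python
import ctypes
import mmap
import os

# Assumption: 64-bit Linux; off_t is passed as a C long.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mprotect.restype = ctypes.c_int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

PROT_NONE = 0                               # not exported by the mmap module
MAP_FAILED = ctypes.c_void_p(-1).value      # (void *) -1 as an integer


def _check(failed):
    """Raise OSError from errno if a libc call reported failure."""
    if failed:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def reserve(max_size):
    """Reserve max_size bytes of address space without making them usable."""
    addr = libc.mmap(None, max_size, PROT_NONE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    _check(addr == MAP_FAILED)
    return addr


def grow(base, new_size):
    """Make the first new_size bytes of the reservation readable and writable."""
    _check(libc.mprotect(base, new_size,
                         mmap.PROT_READ | mmap.PROT_WRITE) != 0)


def release(base, max_size):
    """Drop the whole reservation."""
    _check(libc.munmap(base, max_size) != 0)


if __name__ == "__main__":
    MiB = 1 << 20
    base = reserve(256 * MiB)        # hypothetical upper bound
    grow(base, 1 * MiB)              # initial usable size
    ctypes.memset(base, 0x5A, 1 * MiB)
    grow(base, 4 * MiB)              # expand in place, same base address
    ctypes.memset(base + 1 * MiB, 0x07, 3 * MiB)
    print(hex(base), ctypes.string_at(base, 1), ctypes.string_at(base + 3 * MiB, 1))
    release(base, 256 * MiB)
```

Because the whole range is claimed at startup, later allocations in the same process cannot land inside it, which is the guarantee the quoted message asks for. The cost is that an upper bound has to be chosen in advance.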
This layout is similar to the address space reservation you mentioned
above, will significantly reduce the possibility of clashing, and looks
something like this:

    01339000-0139e000         [heap]
    0139e000-014aa000         [heap]
    7f2dd72f6000-7f2dfbc9c000 /memfd:strategy (deleted)
    7f2e0209c000-7f2e269b0000 /memfd:checkpoint (deleted)
    7f2e2cdb0000-7f2e516b4000 /memfd:iocv (deleted)
    7f2e57ab4000-7f2e7c478000 /memfd:descriptors (deleted)
    7f2ebc478000-7f2ee8d3c000 /memfd:buffers (deleted)
    ^ note the distance between two mappings,
      which is intended for resize
    7f3168d3c000-7f318d600000 /memfd:main (deleted)
    ^ here is where the gap starts
    7f4194c00000-7f4194e7d000
    ^ this one is an anonymous mapping created due to large
      memory allocation after shared mappings were created
    7f4195000000-7f419527d000
    7f41952dc000-7f4195416000
    7f4195416000-7f4195600000 /dev/shm/PostgreSQL.2529797530
    7f4195600000-7f41a311d000 /usr/lib/locale/locale-archive
    7f41a317f000-7f41a3200000
    7f41a3200000-7f41a3201000 /usr/lib64/libicudata.so.74.2

The assumption about picking the lowest address is just how it works
right now on Linux; this fact is already used in the patch. The idea
that we could put an upper bound on the size of other mappings based on
total available memory comes from the fact that anonymous mappings that
are much larger than memory will fail without overcommit. With
overcommit it is different, but if allocations are hitting that limit, I
can imagine there are bigger problems than shared buffer resize.

This approach follows the same ideas already used in the patch and has
the same trade-offs: no address changes, but questions about
portability.
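To make the distances in the listing concrete, here is a small sketch that parses address ranges in that format and computes the free space between each mapping and the next one. The ranges are copied from the message. The function names `parse_mappings` and `gaps_after` are hypothetical, and lines that are not address ranges (such as the `^` annotations) are simply skipped.

```python
import re

# Address ranges copied from the message (hexadecimal, start-end).
LISTING = """\
7f2dd72f6000-7f2dfbc9c000 /memfd:strategy (deleted)
7f2e0209c000-7f2e269b0000 /memfd:checkpoint (deleted)
7f2e2cdb0000-7f2e516b4000 /memfd:iocv (deleted)
7f2e57ab4000-7f2e7c478000 /memfd:descriptors (deleted)
7f2ebc478000-7f2ee8d3c000 /memfd:buffers (deleted)
7f3168d3c000-7f318d600000 /memfd:main (deleted)
7f4194c00000-7f4194e7d000
"""

LINE_RE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s*(.*)$")


def parse_mappings(text):
    """Return a sorted list of (start, end, name) tuples."""
    mappings = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # skip annotation lines and anything else that is not a range
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            mappings.append((start, end, m.group(3) or "[anon]"))
    return sorted(mappings)


def gaps_after(mappings):
    """Map each mapping's name to the free bytes before the next mapping."""
    return {name: nxt[0] - end
            for (_start, end, name), nxt in zip(mappings, mappings[1:])}


if __name__ == "__main__":
    for name, gap in gaps_after(parse_mappings(LISTING)).items():
        print(f"{name:32} {gap / 2**20:10.0f} MiB free after it")
```

For the ranges above, this gives 1 GiB of free address space before `/memfd:buffers`, 10 GiB after it, and a little over 64 GiB between `/memfd:main` and the anonymous mapping that was created later.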