Re: Changing shared_buffers without restart - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Changing shared_buffers without restart
Msg-id: 19820592-cddd-4ffc-8c7c-c182342e7f91@vondra.me
In response to: Re: Changing shared_buffers without restart (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses: Re: Changing shared_buffers without restart
List: pgsql-hackers
On 7/5/25 12:35, Dmitry Dolgov wrote:
>> On Fri, Jul 04, 2025 at 05:23:29PM +0200, Tomas Vondra wrote:
>>>> 2) pending GUC changes
>>>>
>>>> Perhaps this should be a separate utility command, or maybe even just
>>>> a new ALTER SYSTEM variant? Or even just a function, similar to what
>>>> the "online checksums" patch did, possibly combined with a bgworker
>>>> (but probably not needed, there are no db-specific tasks to do).
>>>
>>> This is one topic we still actively discuss, but haven't had much
>>> feedback otherwise. The pros and cons seem to be clear:
>>>
>>> * Utilizing the existing GUC mechanism would allow treating
>>> shared_buffers as any other configuration option, meaning that
>>> potential users of this feature don't have to do anything new to use
>>> it -- they can still use whatever method they prefer to apply new
>>> configuration (pg_reload_conf, pg_ctl reload, maybe even sending
>>> SIGHUP directly).
>>>
>>> I'm also wondering if it's only shared_buffers, or whether some other
>>> options could use a similar approach.
>>>
>>
>> I don't know. What are the "potential users" of this feature? I don't
>> recall any, but there may be some. How do we know the new pending flag
>> will work for them too?
>
> It could be potentially useful for any GUC that controls a resource
> shared between backends and requires a restart. To make such a GUC
> changeable online, every backend has to perform some action, and they
> have to coordinate to make sure things are consistent -- exactly the
> use case we're trying to address; shared_buffers just happens to be one
> such resource. While I agree that the currently implemented interface
> is wrong (e.g. it doesn't prevent pending GUCs from being stored in
> PG_AUTOCONF_FILENAME; this has to happen only when the new value is
> actually applied), it still makes sense to me to allow a more flexible
> lifecycle for certain GUCs.
>
> An example I could think of is shared_preload_libraries. If we ever
> want to do a hot reload of libraries, this will follow the procedure
> above: every backend has to do something like dlclose / dlopen and make
> sure that other backends have the same version of the library. Another,
> maybe less far-fetched, example is max_worker_processes, which AFAICT
> is mostly used to control the number of slots in shared memory
> (although it's also stored in the control file, which makes things more
> complicated).
>

Not sure. My concern is that the config reload / GUC assign hook was not
designed with this use case in mind, and we'll run into issues. I also
dislike the "async" nature of this, which makes it harder to e.g. abort
the change, etc.

>>> * Having a separate utility command is a mighty simplification, which
>>> helps avoid the problems you've described above.
>>>
>>> So far we've got two against one in favour of a simple utility
>>> command, so we may as well go with that.
>>>
>>
>> Not sure voting is a good way to make design decisions ...
>
> I'm somewhat torn between those two options myself. The more I think
> about this topic, the more I'm convinced that pending GUCs make sense,
> but also the more work I see needed to implement them. Maybe a good
> middle ground is to go with a simple utility command, as Ashutosh was
> suggesting, and keep the pending GUC infrastructure on top of that as
> an optional patch.
>

What about a simple function? Probably not as clean as a proper utility
command, and it implies a transaction - not sure if that could be a
problem for some part of this.
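For illustration only, a function-based interface could be as small as
the sketch below; the name pg_resize_shared_buffers, its signature and
the validation are assumptions made up for this sketch, not something
taken from the patch (or from the online-checksums work):

    #include "postgres.h"
    #include "fmgr.h"

    PG_FUNCTION_INFO_V1(pg_resize_shared_buffers);

    /*
     * Hypothetical SQL-callable entry point: validate the requested
     * number of buffers up front, then kick off the coordinated resize.
     */
    Datum
    pg_resize_shared_buffers(PG_FUNCTION_ARGS)
    {
        int32   new_buffers = PG_GETARG_INT32(0);

        if (new_buffers <= 0)
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                     errmsg("number of shared buffers must be positive")));

        /* ... signal all backends and wait until every mapping is resized ... */

        PG_RETURN_BOOL(true);
    }

Whether the implicit transaction is a problem for the resize itself is
exactly the open question above.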
>>>> 3) max_available_memory
>>>>
>>>> I think the GUC should specify the maximum shared_buffers we want to
>>>> allow, and then we'd work out the total to pre-allocate? Considering
>>>> we're only allowing shared_buffers to be resized, that should be
>>>> pretty trivial. Yes, it might happen that the "total limit" happens
>>>> to exceed the available memory or something, but we already have
>>>> that problem with shared_buffers. Seems fine if we explain this in
>>>> the docs, and perhaps print the calculated memory limit on start.
>>>
>>> Somehow I'm not following what you suggest here. You mean having the
>>> maximum shared_buffers specified, but not as a separate GUC?
>>
>> My suggestion was to have a GUC max_shared_buffers. Based on that you
>> can easily calculate the size of all other segments dependent on
>> NBuffers, and reserve memory for that.
>
> Got it, ok.
>
>>>> Moreover, all of the above is for mappings sized based on NBuffers.
>>>> But if we allocate 10% for MAIN_SHMEM_SEGMENT, won't that be a
>>>> problem the moment someone increases max_connections,
>>>> max_locks_per_transaction and possibly some other stuff?
>>>
>>> Can you elaborate, what do you mean by that? Increasing
>>> max_connections, etc. leads to increased memory consumption in the
>>> MAIN_SHMEM_SEGMENT, but the ratio is for memory reservation only.
>>>
>>
>> Stuff like PGPROC, fast-path locks etc. is allocated as part of
>> MAIN_SHMEM_SEGMENT, right? Yet the ratio assigns 10% of the maximum
>> space for that. If I significantly increase GUCs like max_connections
>> or max_locks_per_transaction, how do you know it didn't exceed the 10%?
>
> Still don't see the problem. The 10% we're talking about is the
> reserved space, thus it affects only the shared memory resizing
> operation and nothing else. The real memory allocated is less than or
> equal to the reserved size, but it is allocated and managed completely
> in the same way as without the patch, including the size calculations.
> If some GUCs are increased and drive real memory usage higher, it will
> be handled as before. Are we on the same page about this?
>

How do you know reserving 10% is sufficient? Imagine I set

    max_available_memory = '256MB'
    max_connections = 1000000
    max_locks_per_transaction = 10000

How do you know it's not more than 10% of the available memory?

FWIW, if I add a simple assert to CreateAnonymousSegment

    Assert(mapping->shmem_reserved >= allocsize);

it crashes even with just max_available_memory = '256MB':

    #4  0x0000000000b74fbd in ExceptionalCondition
        (conditionName=0xe25920 "mapping->shmem_reserved >= allocsize",
         fileName=0xe251e7 "pg_shmem.c", lineNumber=878) at assert.c:66

because we happen to execute it with this:

    mapping->shmem_reserved    26845184
    allocsize                 125042688

I think I mentioned a similar crash earlier, not sure if that's the same
issue or a different one.
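To put numbers on that crash: 10% of max_available_memory = '256MB' is
about 26.8 million bytes, which matches the reserved 26845184 bytes in
the backtrace, while the computed allocation was 125042688 bytes
(~119MB), so the reservation was already exceeded several times over.
Below is a sketch of a softer failure mode than the Assert, assuming the
field and variable names from the quoted assert (the real patch may well
differ):

    #include "postgres.h"

    /*
     * Hypothetical check: fail with a regular error instead of tripping
     * an Assert when a segment's computed size does not fit into the
     * space reserved for it.
     */
    static void
    check_segment_reservation(Size reserved, Size allocsize)
    {
        if (allocsize > reserved)
            ereport(FATAL,
                    (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
                     errmsg("shared memory segment requires %zu bytes, but only %zu bytes are reserved",
                            allocsize, reserved),
                     errhint("Consider increasing max_available_memory.")));
    }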
>>>> 11) Actually, what's the difference between the contents of Mappings
>>>> and Segments? Isn't that the same thing, indexed in the same way?
>>>> Or could it be unified? Or are they conceptually different things?
>>>
>>> Unless I'm mixing something up badly, the content is the same. The
>>> relation is that a segment, as a structure, "contains" a mapping.
>>>
>>
>> Then why do we need to track it in two places? Doesn't it just
>> increase the likelihood that someone misses updating one of them?
>
> To clarify, under "contents" I mean the shared memory content (the
> actual data) behind both the "segment" and the "mapping"; maybe you had
> something else in mind.
>
> On the surface of it those are two different data structures that have
> mostly different, but related, fields: a shared memory segment contains
> the stuff needed for working with the memory (header, base, end, lock),
> while a mapping has more lower-level details, e.g. reserved space, fd,
> IPC key. The only common fields are size and address; maybe I can
> factor those out to avoid the repetition.
>

OK, I think I'm just confused by the ambiguous definitions of
segment/mapping. It'd be good to document/explain this in a comment
somewhere.

>>>> 2) In fact, what happens if the user tries to resize to a value that
>>>> is too large for one of the segments? How would the system know
>>>> before starting the resize (and failing)?
>>>
>>> This type of situation is handled (doing a hard stop) in the latest
>>> version, because all the necessary information is present in the
>>> mapping structure.
>>>
>>
>> I don't know, but crashing the instance (I assume that's what you mean
>> by hard stop) does not seem like something we want to do. AFAIK the
>> GUC hook should be able to determine if the value is too large, and
>> reject it at that point. Not proceed and crash everything.
>
> I see, you're pointing out that it would be good to have more
> validation at the GUC level, right?

Well, that'd be a starting point. We definitely should not allow setting
a value that ends up crashing the instance (it doesn't matter whether
that's because of a FATAL or hitting a segfault/sigbus somewhere).
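That GUC-level validation might start out roughly like the check hook
sketched below; reserved_size_for_buffers() is a hypothetical helper
standing in for whatever the patch actually exposes, and the size
calculation is deliberately simplistic (it ignores buffer descriptors and
the other NBuffers-sized structures):

    #include "postgres.h"
    #include "utils/guc.h"

    extern Size reserved_size_for_buffers(void);    /* hypothetical */

    /*
     * Hypothetical check hook for shared_buffers: reject values whose
     * resulting segment size cannot fit into the reserved space, instead
     * of letting the resize start and fail (or crash) later.
     */
    static bool
    check_shared_buffers(int *newval, void **extra, GucSource source)
    {
        Size    required = (Size) *newval * BLCKSZ;
        Size    reserved = reserved_size_for_buffers();

        if (required > reserved)
        {
            GUC_check_errdetail("The requested shared_buffers needs %zu bytes, but only %zu bytes are reserved.",
                                required, reserved);
            return false;
        }

        return true;
    }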
cheers

--
Tomas Vondra