Re: Changing shared_buffers without restart - Mailing list pgsql-hackers
| From | Robert Haas | 
|---|---|
| Subject | Re: Changing shared_buffers without restart | 
| Date | |
| Msg-id | CA+TgmoZFfn0E+EkUAjnv_QM_00eUJPkgCJKzm3n1G4itJKMSsA@mail.gmail.com | 
| Responses | Re: Changing shared_buffers without restart | 
| List | pgsql-hackers | 
On Fri, Oct 18, 2024 at 3:21 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via
> changing shared memory mapping layout. Any feedback is appreciated.

A lot of people would like to have this feature, so I hope this proposal works out. Thanks for working on it.

I think the idea of having multiple shared memory segments is interesting and makes sense, but I would prefer to see them called "segments" rather than "slots", just as we do for DSMs. The name "slot" is somewhat overused, and invites confusion with replication slots, inter alia.

I think it's possible that having multiple fixed shared memory segments will spell trouble on Windows, where we already need to use a retry loop to try to get the main shared memory segment mapped at the correct address. If there are multiple segments and we need whatever ASLR stuff happens on Windows to not place anything else overlapping with any of them, that means there are more chances for stuff to fail than if we just need one address range to be free. Granted, the individual ranges are smaller, so maybe it's fine? But I don't know.

The big thing that worries me is synchronization, and while I've only looked at the patch set briefly, it doesn't look to me as though there's enough machinery here to make that work correctly. Suppose that shared_buffers=8GB (a million buffers) and I change it to shared_buffers=16GB (2 million buffers). As soon as any one backend has seen that change and expanded shared_buffers, there's a possibility that some other backend which has not yet seen the change might see a buffer number greater than a million. If it tries to use that buffer number before it absorbs the change, something bad will happen. The most obvious way for it to see such a buffer number - and possibly the only one - is to do a lookup in the buffer mapping table and find a buffer ID there that was inserted by some other backend that has already seen the change.
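To make the numbers in the example above concrete, here is a minimal standalone sketch of the arithmetic. It assumes the default 8kB block size and models the buffer-address calculation as plain offset math. The helper names (buffers_for, block_offset, covered_by_local_view) are hypothetical and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLCKSZ 8192u	/* assumed: PostgreSQL's default block size, in bytes */

/* Number of buffers that fit in a shared_buffers setting given in bytes. */
static uint64_t
buffers_for(uint64_t shared_buffers_bytes)
{
	return shared_buffers_bytes / BLCKSZ;
}

/*
 * Byte offset of a 1-based buffer ID from the start of the buffer pool.
 * This is pure arithmetic with no upper bounds check.
 */
static uint64_t
block_offset(uint64_t buffer_id)
{
	assert(buffer_id >= 1);
	return (buffer_id - 1) * (uint64_t) BLCKSZ;
}

/* Hypothetical helper: does this backend's current mapping cover buffer_id? */
static bool
covered_by_local_view(uint64_t buffer_id, uint64_t local_nbuffers)
{
	return buffer_id >= 1 && buffer_id <= local_nbuffers;
}
```

With these definitions, buffers_for(8GB) is 1,048,576 and buffers_for(16GB) is 2,097,152. Buffer number 1,048,577 has offset exactly 8GB, which is the first byte past the old mapping, so a backend that still believes there are 1,048,576 buffers would compute an address it has not mapped.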
Fixing this seems tricky. My understanding is that BufferGetBlock() is extremely performance-critical, so having to do a bounds check there to make sure that a given buffer number is in range would probably be bad for performance. Also, even if the overhead weren't prohibitive, I don't think we can safely stick code that unmaps and remaps shared memory segments into a function that currently just does math, because we've probably got places where we assume this operation can't fail -- as well as places where we assume that if we call BufferGetBlock(i) and then BufferGetBlock(j), the second call won't change the answer to the first.

It seems to me that it's probably only safe to swap out a backend's notion of where shared_buffers is located when the backend holds no buffer pins, and maybe not even all such places, because it would be a problem if a backend looks up the address of a buffer before actually pinning it, on the assumption that the answer can't change. I don't know if that ever happens, but it would be a legal coding pattern today. Doing it between statements seems safe as long as there are no cursors holding pins. Doing it in the middle of a statement is probably possible if we can verify that we're at a "safe" point in the code, but I'm not sure exactly which points are safe. If we have no code anywhere that assumes the address of an unpinned buffer can't change before we pin it, then I guess the check for pins is the only thing we need, but I don't know that to be the case.

I guess I would have imagined that a change like this would have to be done in phases. In phase 1, we'd tell all of the backends that shared_buffers had expanded to some new, larger value; but the new buffers wouldn't be usable for anything yet. Then, once we confirmed that everyone had the memo, we'd tell all the backends that those buffers are now available for use.
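The expansion handshake described above can be sketched as a small single-process simulation. This is only an illustration under stated assumptions: the ResizeState and Backend structs, the function names, and MAX_BACKENDS are all hypothetical, there is no real shared memory or locking, and "holds no pins" stands in for whatever the real safe-point test would turn out to be.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BACKENDS 4		/* illustrative only, not a PostgreSQL constant */

/* Coordinator state (hypothetical); would live in shared memory for real. */
typedef struct ResizeState
{
	int			usable_nbuffers;	/* buffers allowed in the mapping table */
	int			pending_nbuffers;	/* announced size, not yet usable */
	bool		acked[MAX_BACKENDS];
} ResizeState;

/* Per-backend view (hypothetical). */
typedef struct Backend
{
	int			id;
	int			mapped_nbuffers;	/* buffers this backend has mapped */
	int			pins_held;
} Backend;

/* Phase 1: announce the larger size; the new buffers are not usable yet. */
static void
announce_expansion(ResizeState *st, int new_nbuffers)
{
	assert(new_nbuffers > st->usable_nbuffers);
	st->pending_nbuffers = new_nbuffers;
	for (int i = 0; i < MAX_BACKENDS; i++)
		st->acked[i] = false;
}

/*
 * A backend absorbs the change only at a safe point, modelled here as
 * "holds no buffer pins". Returns true if it remapped and acknowledged.
 */
static bool
absorb_at_safe_point(ResizeState *st, Backend *be)
{
	if (be->pins_held > 0)
		return false;
	be->mapped_nbuffers = st->pending_nbuffers;
	st->acked[be->id] = true;
	return true;
}

/* Phase 2: once every backend has acknowledged, make the new buffers usable. */
static bool
try_enable_new_buffers(ResizeState *st)
{
	for (int i = 0; i < MAX_BACKENDS; i++)
	{
		if (!st->acked[i])
			return false;
	}
	st->usable_nbuffers = st->pending_nbuffers;
	return true;
}
```

The property this sketch tries to preserve is that usable_nbuffers never exceeds any backend's mapped_nbuffers, so no backend can find a buffer ID in the mapping table that it has not yet mapped. A backend that is still holding a pin simply delays phase 2 until it reaches a safe point.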
If shared_buffers were contracted, phase 1 would tell all of the backends that shared_buffers had contracted to some new, smaller value. Once a particular backend learns about that, it will refuse to put any new pages into those high-numbered buffers, but the existing contents would still be valid. Once everyone has been told about this, we can go through and evict all of those buffers, and then let everyone know that's done. Then they shrink their mappings.

It looks to me like the patch doesn't expand the buffer mapping table, which seems essential. But maybe I missed that.

--
Robert Haas
EDB: http://www.enterprisedb.com