Re: DSA_ALLOC_NO_OOM doesn't work - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: DSA_ALLOC_NO_OOM doesn't work |
Date | |
Msg-id | cf35fccb-957e-48de-8ea0-a083e9d7843d@iki.fi Whole thread Raw |
In response to | Re: DSA_ALLOC_NO_OOM doesn't work (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: DSA_ALLOC_NO_OOM doesn't work
Re: DSA_ALLOC_NO_OOM doesn't work |
List | pgsql-hackers |
On 14/02/2024 09:23, Robert Haas wrote: > On Tue, Feb 13, 2024 at 7:53 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> However, I must say that the dsm_impl_op() interface is absolutely >> insane. It's like someone looked at ioctl() and thought, "hey that's a >> great idea!". > > As the person who wrote that code, this made me laugh. > > I agree it's not the prettiest interface, but I thought that was OK > considering that it should only ever have a very limited number of > callers. I believe I did it this way in the interest of code > compactness. Since there are four DSM implementations, I wanted the > implementation-specific code to be short and all in one place, and > jamming it all into one function served that purpose. Also, there's a > bunch of logic that is shared by multiple operations - detach and > destroy tend to be similar, and so do create and attach, and there are > even things that are shared across all operations, like the snprintf > at the top of dsm_impl_posix() or the slightly larger amount of > boilerplate at the top of dsm_impl_sysv(). > > I'm not particularly opposed to refactoring this to make it nicer, but > my judgement was that splitting it up into one function per operation > per implementation, say, would have involved a lot of duplication of > small bits of code that might then get out of sync with each other > over time. By doing it this way, the logic is a bit tangled -- or > maybe more than a bit -- but there's very little duplication because > each implementation gets jammed into the smallest possible box. I'm OK > with somebody deciding that I got the trade-offs wrong there, but I > will be interested to see the number of lines added vs. removed in any > future refactoring patch. That's fair, I can see those reasons. Nevertheless, I do think it was a bad tradeoff. A little bit of repetition would be better here, or we can extract the common parts to smaller functions. I came up with the attached: 25 files changed, 1710 insertions(+), 1113 deletions(-) So yeah, it's more code, and there's some repetition, but I think this is more readable. Some of that is extra boilerplate because I split the implementations to separate files, and I also added tests. I'm not 100% wedded on all of the decisions here, but I think this is the right direction overall. For example, I decided to separate dsm_handle used by the high-level interface in dsm.c and the dsm_impl_handle used by the low-level interace in dsm_impl_*.c (more on that below). That feels slightly better to me, but could be left out if we don't want that. Notable changes: - Split the single multiplexed dsm_impl_op() function into multiple functions for different operations. This allows more natural function signatures for the different operations. - The create() function is now responsible for generating the handle, instead of having the caller generate it. Those implementations that need to generate a random handle and retry if it's already in use, now do that retry within the implementation. - The destroy() function no longer detaches the segment; you must call detach() first if the segment is still attached. This avoids having to pass "junk" values when destroying a segment that's not attached, and in case of error, makes it more clear what failed. - Separate dsm_handle, used by backend code to interact with the high level interface in dsm.c, from dsm_impl_handle, which is used to interact with the low-level functions in dsm_impl.c. This gets rid of the convention in dsm.c of reserving odd numbers for DSM segments stored in the main shmem area. There is now an explicit flag for that the control slot. For generating dsm_handles, we now use the same scheme we used to use for main-area shm segments for all DSM segments, which includes the slot number in the dsm_handle. The implementations use their own mechanisms for generating the low-level dsm_impl_handles (all but the SysV implementation generate a random handle and retry on collision). - Use IPC_PRIVATE in the SysV implementation to have the OS create a unique identifier for us. Use the shmid directly as the (low-level) handle, so that we don't need to use shmget() to convert a key to shmid, and don't need the "cache" for that. - create() no longer returns the mapped_size. The old Windows implementation had some code to read the actual mapped size after creating the mapping, and returned that in *mapped_size. Others just returned the requested size. In principle reading the actual size might be useful; the caller might be able to make use of the whole mapped size when it's larger than requested. In practice, the callers didn't do that. Also, POSIX shmem on FreeBSD has similar round-up-to-page-size behavior but the implementation did not query the actual mapped size after creating the segment, so you could not rely on it. - Added a test that exercises basic create, detach, attach functionality using all the different implementations supported on the current platform. - Change datatype of the opaque types in dsm_impl.c from "void *" to typedefs over uintptr_t. It's easy to make mistakes with "void *", as you can pass any pointer without getting warnings from the compiler. Dedicated typedefs give a bit more type checking. (This is in the first commit, all the other changes are bundled together in the second commit.) Overall, I don't think this is backpatchable. The handle changes and use of IPC_PRIVATE in particular: they could lead to failure to clean up old segments if you upgraded the binary without a clean shutdown. A slightly different version of this possibly would be, but I'd like to focus on what's best for master for now. -- Heikki Linnakangas Neon (https://neon.tech)
Attachment
pgsql-hackers by date: