Re: Better shared data structure management and resizable shared data structures - Mailing list pgsql-hackers
| From | Ashutosh Bapat |
|---|---|
| Subject | Re: Better shared data structure management and resizable shared data structures |
| Date | |
| Msg-id | CAExHW5sYgt=XekOoAE-Tu_Dv8obWZCjCqAPT9vtN2D4m=M=drQ@mail.gmail.com Whole thread |
| In response to | Re: Better shared data structure management and resizable shared data structures (Heikki Linnakangas <hlinnaka@iki.fi>) |
| Responses |
Re: Better shared data structure management and resizable shared data structures
|
| List | pgsql-hackers |
On Wed, Apr 1, 2026 at 11:47 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > Yet another version attached (also available at: > https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-9). The > main change is the shape of the ShmemRequest*() calls: > > On 27/03/2026 02:51, Heikki Linnakangas wrote: > > Another idea is to use a macro to hide that from pgindent, which would > > make the calls little less verbose anyway: > > > > #define ShmemRequestStruct(desc, ...) ShmemRequestStructWithOpts(desc, > > &(ShmemRequestStructOpts) { __VA_ARGS__ }) > > > > Then the call would be simply: > > > > ShmemRequestStruct(&pgssSharedStateDesc, > > .name = "pg_stat_statements", > > .size = sizeof(pgssSharedState), > > .ptr = (void **) &pgss, > > ); > > I went with that approach. We're already doing something similar with > XL_ROUTINE in xlogreader.h: > > #define XL_ROUTINE(...) &(XLogReaderRoutine){__VA_ARGS__} > > The calls look like this: > > xlogreader = > XLogReaderAllocate(wal_segment_size, NULL, > XL_ROUTINE(.page_read = &XLogPageRead, > .segment_open = NULL, > .segment_close = wal_segment_close), > private); > > If we followed that example, ShmemRequestStruct() calls would look like > this: > > ShmemRequestStruct(&pgssSharedStateDesc, > SHMEM_STRUCT_OPTS(.name = "pg_stat_statements", > .size = sizeof(pgssSharedState), > .ptr = (void **) &pgss, > ); > > However, I don't like the deep indentation, it feels like the important > stuff is buried to the right. And pgindent insists on that. So I went > with the proposal I quoted above, turning ShmemRequestStruct(...) itself > into a macro. If you need more complex options setup, you can set up the > struct without the macro and call ShmemRequestStructWithOpts() directly, > but so far all of the callers can use the macro. > I like this. I have tried it only for the resizable_shmem structure which is not complex. > > Ashutosh, I think I've addressed most of your comments so far. I'm > replying to just a few of them here that might need more discussion: > Thanks. > > > > +} shmem_startup_state; > > > > This isn't just startup state since the backend can toggle between > > DONE and LATE_ATTACH_OR_INIT states after the startup. Probably > > "shmem_state" would be a better name. > > Renamed to "shmem_request_state". And renamed "LATE_ATTACH_OR_INIT" to > "AFTER_STARTUP_ATTACH_OR_INIT" to match the terminology I used elsewhere. > > I'm still not entirely happy with this state machine. It seems useful to > have it for sanity checking, but it still feels a little unclear what > state you're in at different points in the code, and as an aesthetic > thing, the whole enum feels too prominent given that it's just for > sanity checks. I am ok even if it is used just for sanity checks - but with the shared structure requests coming at any time during the life of a server, it would be easy to get lost without those sanity checks. I also see it being used in RegisterShmemCallbacks(), so it's not for just sanity checks, right? > > > + ShmemStructDesc *desc = area->desc; > > + > > + AttachOrInit(desc, false, true); > > + } > > + list_free(requested_shmem_areas); > > + requested_shmem_areas = NIL; > > > > If we pop all the nodes from the list, then the list should be NIL > > right? Why do we need to free it? > > > > + else if (!init_allowed) > > + { > > > > For the sake of documentation and sanity, I would add > > Assert(!index_entry) here, possibly with a comment. Otherwise it feels > > like we might be leaving a half-initialized entry in the hash table. > > > > What if attach_allowed is false and the entry is not found? Should we > > throw an error in that case too? It would be foolish to call > > AttachOrInit with both init_allowed and attach_allowed set to false, > > but the API allows it and we should check for that. > > > > It feels like we should do something about the arguments. The function > > is hard to read. init_allowed is actually the action the caller wants > > to take if the entry is not found, and attach_allowed is the action > > the caller wants to take if the entry is found. > > > > Also explain in the comment what does attach mean here especially in > > case of fixed sized structures. > > I renamed it to AttachOrInitShmemIndexEntry, and the args to 'may_init' > and 'may_attach'. But more importantly I added comments to explain the > different usages. Hope that helps.. The explanation in the prologue looks good. But the function is still confusing. Instead of if ... else fi ... chain, I feel organizing this as below would make it more readable. (this was part of one of my earlier edit patches). if (found) ... else { if (!may_init) error if (!index_entry) error ... rest of the code to initialize and attach } But other than that I don't have any other brilliant ideas. > > On 01/04/2026 14:59, Ashutosh Bapat wrote: > > 0008 > > ------ > > - LWLockRelease(AddinShmemInitLock); > > + /* The hash table must be initialized already */ > > + Assert(pgss_hash != NULL); > > > > Does it make sense to also Assert(pgss)? A broader question is do we > > want to make it a pattern that every user of ShmemRequest*() also > > Assert()s that the pointer is non-NULL in the init callback? It is a > > test that the ShmemRequest*(), which is far from, init_fn is working > > correctly. > > The function does a lot of accesses of 'pgss' so if that's NULL you'll > get a crash pretty quickly. I'm not sure if the Assert(pgss_hash != > NULL) is really needed either, but I'm inclined to keep it, as pgss_hash > might not otherwise be accessed in the function, and there are runtime > checks for it in the other functions, so if it's not initialized for > some reason, things might still appear to work to some extent. I don't > think I want to have that as a broader pattern though. In Assert build, an Assert() at least appears in the server log file, that gives a good direction to start investigation. Without Assert, it gives segmentation faults without any idea where it came from. That's a mild benefit of assert. > > > + /* > > + * Extra space to reserve in the shared memory segment, but it's not part > > + * of the struct itself. This is used for shared memory hash tables that > > + * can grow beyond the initial size when more buckets are allocated. > > + */ > > + size_t extra_size; > > > > When we introduce resizable structures (where even the hash table > > directly itself could be resizable), we will introduce a new field > > max_size which is easy to get confused with extra_size. Maybe we can > > rename extra_size to something like "auxilliary_size" to mean size of > > the auxiliary parts of the structure which are not part of the main > > struct itself. > > > > + /* > > + * max_size is the estimated maximum number of hashtable entries. This is > > + * not a hard limit, but the access efficiency will degrade if it is > > + * exceeded substantially (since it's used to compute directory size and > > + * the hash table buckets will get overfull). > > + */ > > + size_t max_size; > > + > > + /* > > + * init_size is the number of hashtable entries to preallocate. For a > > + * table whose maximum size is certain, this should be equal to max_size; > > + * that ensures that no run-time out-of-shared-memory failures can occur. > > + */ > > + size_t init_size; > > > > Everytime I look at these two fields, I question whether those are the > > number of entries (i.e. size of the hash table) or number of bytes > > (size of the memory). I know it's the former, but it indicates that > > something needs to be changed here, like changing the names to have > > _entries instead of _size, or changing the type to int64 or some such. > > Renaming to _entries would conflict with dynahash APIs since they use > > _size, so maybe the latter? > > I hear you, but I didn't change these yet. If we go with the patches > from the "Shared hash table allocations" thread, max_size and init_size > will be merged into one. I'll try to settle that thread before making > changes here. Will review those patches next. > > > -void > > -InitProcGlobal(void) > > +static void > > +ProcGlobalShmemInit(void *arg) > > { > > I'm not sure what you meant to say here, but I did notice that there > were a bunch of references to InitProcGlobal() left over in comments. > Fixed those. Oh, I just wanted to say that the new version reads much better than the old version, which had ShmemStructInit() sprinkled at seemingly random places. I missed writing that. Nothing serious there. I also rebased my resizable shmem patch on v9. Attached here. I have addressed the following open items from the list at [1] 1. The test is stable now. I found a way to make (roughly) sure that we are not allocating more than required memory for a resizable structure. 2. Disable the feature on platforms that do not have MADV_POPULATE_WRITE and MADV_REMOVE. The feature is also disabled for EXEC_BACKEND case. I have tested the EXEC_BACKEND case, but I have not tested platforms which do not have those constants defined or on Windows. The first two items from [1] need some discussion still. [1] https://www.postgresql.org/message-id/CAExHW5so6VSxBC-1V=35229Z1+dw5vhw8HxHg9ry7UzceKcXzA@mail.gmail.com -- Best Wishes, Ashutosh Bapat
Attachment
pgsql-hackers by date: