Re: Better shared data structure management and resizable shared data structures - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Better shared data structure management and resizable shared data structures
Date
Msg-id 113724ab-0028-493f-9605-6e8570f0939f@iki.fi
Whole thread Raw
In response to Re: Better shared data structure management and resizable shared data structures  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Better shared data structure management and resizable shared data structures
Re: Better shared data structure management and resizable shared data structures
List pgsql-hackers
On 19/03/2026 17:17, Robert Haas wrote:
> On Thu, Mar 19, 2026 at 10:34 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> I suspect
>>> that issue is also why pg_stat_statements has the
>>> process_shared_preload_libraries_in_progress check at the top, because
>>> it looks to me like everything else that the function does would be
>>> completely fine to do later.
>>
>> I think a bigger problem is loading and saving the statistics file. The
>> file needs to be saved on postmaster exit, where do you do that if the
>> library was not in shared_preload_libraries?
> 
> Well, there's no way to install a hook in the postmaster in that case,
> so you can't. But I'm not sure that justifies skipping everything
> _PG_init() would have done. A problem with the status quo is that
> every module author makes their own decision about how to handle the
> s_p_l problem, and they don't all decide differently, even in our own
> tree. For example, autoprewarm chooses to register the GUC that it can
> while skipping the other one, while pg_stat_statements skips
> everything, including hook installation. Maybe that's properly
> considered flexibility that should be left to each individual author,
> but to me it seems more like happenstance than anything else. I'd
> favor a design that emphasizes severability - i.e. you always try to
> do as much as possible, and defer errors until later. So you always
> create the GUCs but then restrict setting them to values that you
> can't support without a restart, instead of not creating them. You
> install the hooks and then maybe they will have to no-op out. It's
> just weird if you load a library and the GUCs aren't defined and
> there's not even a diagnostic telling you why.
> 
>> If you squint a little, this is pretty much the same as my descriptor
>> design. Let's start from your DefineShmemRegion function, but in order
>> to have some flexibility to add optional optional in the future, without
>> having to create DefineShmemRegionNew(), DefineShmemRegionExt() or
>> similar functions, let's pass the arguments in a struct. So instead of:
>>
>> DefineShmemRegion("pg_stat_statements", sizeof(pgssSharedState), &pgss,
>> &pgss,pgss_shmem_init, 0);
>>
>> you would call it like this:
>>
>> DefineShmemRegion(&(ShmemStructDesc) {
>>       .name = "pg_stat_statements",
>>       .size = sizeof(pgssSharedState),
>>       .ptr = (void **) &pgss,
>>       .init_fn = pgss_shmem_init,
>>       .flags = 0,
>> });
>>
>> This flexibility will come handy as soon as we add the ability to resize.
> 
> I see your point. I'm not really convinced, though. In practice,
> what's now going to happen is that you're probably going to move that
> struct out of _PG_init() and define it elsewhere, so the logic is
> getting split between multiple places, which does not improve
> readability, and it also becomes much more worrying to what degree the
> struct needs to be const, whereas if you just pass a bunch of
> parameters by value you kind of understand what has to be happening.
> Also, what flexibility have you really purchased? Sure, now you can
> add arguments to the function call without breaking existing call
> sites, but (1) there are other ways to do that, like by creating an
> object first and then using methods to assign properties to it
> afterward (i.e. RegisterCallbackOnShmemRegion) and (2) adding
> arguments without breaking existing callers is not an unmixed blessing
> in the first place. I know the world won't end if you go with this
> style, but I guess I'm not much of a fan. I find this sort of thing
> hard to read.

I see your point too, the options will indeed easily start to split 
between constant parts and parts that are set later.

To avoid that, perhaps it's best to adopt the same style we use for 
HASHCTL structs:

static ShmemHashDesc pgssSharedHashDesc = { };

pgssSharedHashDesc.name = "pg_stat_statements hash";
pgssSharedHashDesc.ptr = &pgss_hash;
pgssSharedHashDesc.hash_info.keysize = sizeof(pgssHashKey);
pgssSharedHashDesc.hash_info.entrysize = sizeof(pgssEntry);
pgssSharedHashDesc.hash_flags = HASH_ELEM | HASH_BLOBS;
pgssSharedHashDesc.init_size = pgss_max;
pgssSharedHashDesc.max_size = pgss_max;
ShmemRequestHash(&pgssSharedHashDesc);

I don't like the idea of just using the string name as the handle. I 
think it will come handy to have a backend-private descriptor or handle 
to the shared memory region. But perhaps the "name" and "ptr" should be 
regular arguments, for example. If some parameters are passed as 
arguments and some in the struct, that also splits things in an awkward 
way, though.

>> Ok, we could add GetShmemRegion() in either design. Do we have any place
>> where we'd use that though, instead the backend-private pointer global
>> variable? I can't think of any examples where we currently call
>> ShmemInitStruct() to get a pointer "on demand" like that.
>>
>> In pg_stat_statements, this would replace these tests:
>>
>>       if (!pgss || !pgss_hash)
>>           ereport(ERROR,
>>                   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>                    errmsg("pg_stat_statements must be loaded via
>> \"shared_preload_libraries\"")));
>>
>> But I don't think pg_stat_statements could still allocate the region
>> after postmaster startup. GetShmemRegion() would just be a different way
>> of throwing that error.
> 
> In that case, yes. But something like autoprewarm only needs a small
> amount of shared memory, and can potentially initialize itself on the
> fly. The not-yet-committed pg_collect_advice module has similar needs,
> which it currently satisfies using GetNamedDSMSegment(); see in
> particular pg_collect_advice_attach() if you feel like wandering over
> to the pg_plan_advice thread.

I wonder if we should set aside a small amount of memory, like 10-20 kB, 
in the fixed shmem segment for small structs like that. Currently, such 
allocations will usually succeed because we leave some wiggle room, but 
the  can also be consumed by other things like locks. But we could 
reserve a small amount of memory specifically for small allocations in 
extensions like this.

I didn't implement that, but I improved the documentation on how 
after-startup allocations work with the new facility.

Or if we get the feature to resize allocations, we could set aside a few 
pages of *address space* for them without allocating the actual memory 
until it's needed.



Attachd is new version with lots of changes again. I've experimented 
with different ways that the interface could look like, like with the 
"adjust" callback we discussed earlier. This is the version I like best 
so far:

Shmem callbacks
---------------

I separated the request/init/fn callbacks from the structs. There's now 
a concept of "shmem callbacks", which you register in _PG_init(). For 
example:

static void pgss_shmem_request(void *arg);
static void pgss_shmem_init(void *arg);

static const ShmemCallbacks pgss_shmem_callbacks = {
     .request_fn = pgss_shmem_request,
     .init_fn = pgss_shmem_init,
     .attach_fn = NULL, /* no special attach actions needed */
};

void
_PG_init(void)
{
     ...

     RegisterShmemCallbacks(&pgss_shmem_callbacks);
}

This is a similar to the old shmem_request and shmem_startup hooks. It's 
not a one-to-one replacement though, the callbacks are called at 
different stages than the old hooks (to make things convenient with the 
new facility):

* The request_fn callback is called in postmaster startup, at the same 
stage as the old shmem_request callback was. But in EXEC_BACKEND mode, 
it's *also* called in each backend.

* The init_fn callback is only called in postmaster startup, when it's 
time to initialize the area. (Ignoring the special "after startup" 
allocations for now).

Shmem requests
--------------

To register a shmem area, you call ShmemRequestStruct() or 
ShmmeRequestHash() from the request callback function. For example:

static void
pgss_shmem_request(void *arg)
{
     static ShmemHashDesc pgssSharedHashDesc = {
        .name = "pg_stat_statements hash",
         .ptr = &pgss_hash,
         .hash_info.keysize = sizeof(pgssHashKey),
         .hash_info.entrysize = sizeof(pgssEntry),
         .hash_flags = HASH_ELEM | HASH_BLOBS,
     };
     static ShmemStructDesc pgssSharedStateDesc = {
         .name = "pg_stat_statements",
         .size = sizeof(pgssSharedState),
         .ptr = (void **) &pgss,
     };

     pgssSharedHashDesc.init_size = pgss_max;
     pgssSharedHashDesc.max_size = pgss_max;
     ShmemRequestHash(&pgssSharedHashDesc);
     ShmemRequestStruct(&pgssSharedStateDesc);
}

Initialization happens in the init callback, which is called after all 
the pointers (pgss, pgss_hash) have been set.


Built-in subsystems
-------------------

I refactored all the core subsystems to also use these callbacks. The 
built-in callbacks are listed in one big array, see the new 
"subsystemlist.h" header file.

SLRUs
-----

I also refactored the SLRU initialization to use the same principle: You 
call SimpleLruRequest() in the shmem request callback, and it calculates 
and requests the required shmem size for you.


I split this into more incremental patches. The first few are just 
refactorings that are probably useful on their own.

- Heikki

Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: another autovacuum scheduling thread
Next
From: Bharath Rupireddy
Date:
Subject: Re: log XLogPrefetch stats at end of recovery