Re: Better shared data structure management and resizable shared data structures - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Better shared data structure management and resizable shared data structures
Date
Msg-id 1c3a07a7-158d-4800-927c-2641c73277d8@iki.fi
Whole thread Raw
In response to Re: Better shared data structure management and resizable shared data structures  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Better shared data structure management and resizable shared data structures
Re: Better shared data structure management and resizable shared data structures
List pgsql-hackers
On 04/04/2026 15:00, Matthias van de Meent wrote:
> On Sat, 4 Apr 2026 at 02:45, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>>> I don't understand the use of ShmemStructDesc. They generally/always
>>>> are private to request_fn(), and their fields are used exclusively
>>>> inside the shmem mechanisms, with no reads of its fields that can't
>>>> already be deduced from context. Why do we need that struct
>>>> everywhere?
>>>
>>> My resizable shared memory structure patches use it as a handle to the
>>> structure to be resized.
>>
>> Right. And hash tables and SLRUs use a desc-like object already, so for
>> symmetry it feels natural to have it for plain structs too.
>> I wonder if we should make it optional though, for the common case that
>> you have no intention of doing anything more with the shmem region that
>> you'd need a desc for. I'm thinking you could just pass NULL for the
>> desc pointer:
>>
>>       ShmemRequestStruct(NULL,
>>           .name = "pg_stat_statements",
>>           .size = sizeof(pgssSharedState),
>>           .ptr = (void **) &pgss,
>>       };
> 
> That would help, though I'd still wonder why we'd have separate Opts
> and Desc structs. IIUC, they generally carry (exactly) the same data.
> 
> Maybe moving it into a `.handle` or `.desc` field in Shmem*Opts could
> make that part of the code a bit cleaner; as it'd further clarify that
> it's very much an optional field.

Yeah. OTOH, I'd like to separate the options from what's effectively a 
return value. But maybe you're right and it's nevertheless better that way.

Some options on this:

a) What's in the patch now

     static ShmemStructDesc pgssSharedStateDesc;

     ShmemRequestStruct(&pgssSharedStateDesc,
                        .name = "pg_stat_statements",
                        .size = sizeof(pgssSharedState),
                        .ptr = (void **) &pgss);

b) Allow passing NULL for the desc

     ShmemRequestStruct(NULL,
                        .name = "pg_stat_statements",
                        .size = sizeof(pgssSharedState),
                        .ptr = (void **) &pgss);

c) Return the Desc as a return value

     static ShmemStructDesc *pgssSharedStateDesc;

     pgssSharedStateDesc =
         ShmemRequestStruct(.name = "pg_stat_statements",
                            .size = sizeof(pgssSharedState),
                            .ptr = (void **) &pgss);

In option c) you can just throw away the result if you don't need it. I 
kind of like this as a notational thing. However it has some downsides:

This changes the return value to be a pointer. I'm thinking that 
ShmemRequestStruct() palloc's the descriptor struct in TopMemoryContext. 
This is a little ugly because the descriptor struct is leaked if the 
caller throws it away. It's not a lot of memory, but still.

I'm also not sure how well this fits in with the SLRU code. On 'master', 
you already have SlruCtlData which is like the "desc" struct. Would we 
turn that into a pointer too, adding one indirection to all the SLRU 
calls. It's probably fine from a performance point of view, but it feels 
like it's going in the wrong direction.

d) Make it part of Opts, as you suggested

     static ShmemStructDesc pgssSharedStateDesc;

     ShmemRequestStruct(.name = "pg_stat_statements",
                        .size = sizeof(pgssSharedState),
                        .ptr = (void **) &pgss,
                        .desc = &pgssSharedStateDesc);

In the attached new version, though, I stepped back and decided to 
remove the whole ShmemStructDesc after all. I still think having a 
handle like that is a good idea, and the follow-up patches for resizing 
need it. However, with option d) it can easily be added later. With 
option d), it seems silly to have it be part of the patch now, when the 
desc struct doesn't really do anything. SLRU's still have a similar 
SlruDesc struct, however. For SLRUs it's essentially the same as the old 
SlruCtlData struct before these patches.

The Desc structs were being used for one thing though: I used the 'size' 
from the Desc struct in ProcGlobalShmemInit() to get the allocated size 
of each shmem area. The size computation there is complicated enough 
that I'd rather not repeat it, and avoiding the repeated size 
calculation was the raison d'être for these patches. I replaced it with 
global variables to hold the sizes from the ShmemRequest() step to 
ShmemInit(). But that would be one case where having the desc would 
already be useful. Then again, I'm not sure we want to expose the 'size' 
in the descriptor like that anyway, because as soon as we make shmem 
regions resizable, we might not be able to keep the size in the 
descriptor up-to-date. The size of these structs won't change, but we 
might not want to expose the information because it would be confusing 
for other structs where it can change to show outdated information.

On a related note, when we add back the ".desc" concept later, is 
".desc" a good name, or ".handle" as you also suggested? More widely, do 
we call the concept and the struct a "handle" or "descriptor" or what? 
Or if we follow the precedence with the existing SlruCtlData struct, it 
could be ".ctl". I'm not a fan of the "Ctl" naming though, because we 
already have a lot of structs with "Ctl" in the name and it's not always 
clear whether a "Ctl" struct refers to the shared memory parts or the 
handle to it. Now that the "desc" structs are not part of these patches 
anymore, however, we can punt on that decision.

On 02/04/2026 09:58, Ashutosh Bapat wrote:
>>
>> I renamed it to AttachOrInitShmemIndexEntry, and the args to 'may_init'
>> and 'may_attach'. But more importantly I added comments to explain the
>> different usages. Hope that helps..
> 
> The explanation in the prologue looks good. But the function is still
> confusing. Instead of if ... else fi ... chain, I feel organizing this
> as below would make it more readable. (this was part of one of my
> earlier edit patches).
> if (found)
> ...
> else
> {
>  if (!may_init)
>  error
> if (!index_entry)
>  error
> 
> ... rest of the code to initialize and attach
> }
> 
> But other than that I don't have any other brilliant ideas.

I did another refactoring in this area: I split 
AttachOrInitShmemIndexEntry() into separate AttachShmemIndexEntry() and 
InitShmemIndexEntry functions again. There's a little bit of repetition 
that way, but IMO it makes it much clearer overall.

Other changes in this patch version:

- I moved some of the stuff from shmem.h to a new shmem_internal.h 
header. The idea is that what remains in shmem.h provides the public API 
for allocating shared memory.

- I refactored the "after-startup request" code. It now detects the case 
that some of the shmem areas, but not all, have already been initialized 
and throws an error.

Still processing the rest of the feedback from the past days. This patch 
version is also available at 
https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-11.

- Heikki
Attachment

pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Add pg_stat_autovacuum_priority
Next
From: Sami Imseih
Date:
Subject: Re: Add pg_stat_autovacuum_priority