Thread: introduce dynamic shared memory registry

introduce dynamic shared memory registry

From
Nathan Bossart
Date:
Every once in a while, I find myself wanting to use shared memory in a
loadable module without requiring it to be loaded at server start via
shared_preload_libraries.  The DSM API offers a nice way to create and
manage dynamic shared memory segments, so creating a segment after server
start is easy enough.  However, AFAICT there's no easy way to teach other
backends about the segment without storing the handles in shared memory,
which puts us right back at square one.

The attached 0001 introduces a "DSM registry" to solve this problem.  The
API provides an easy way to allocate/initialize a segment or to attach to
an existing one.  The registry itself is just a dshash table that stores
the handles keyed by a module-specified string.  0002 adds a test for the
registry that demonstrates basic usage.

I don't presently have any concrete plans to use this for anything, but I
thought it might be useful for extensions for caching, etc. and wanted to
see whether there was any interest in the feature.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Joe Conway
Date:
On 12/4/23 22:46, Nathan Bossart wrote:
> Every once in a while, I find myself wanting to use shared memory in a
> loadable module without requiring it to be loaded at server start via
> shared_preload_libraries.  The DSM API offers a nice way to create and
> manage dynamic shared memory segments, so creating a segment after server
> start is easy enough.  However, AFAICT there's no easy way to teach other
> backends about the segment without storing the handles in shared memory,
> which puts us right back at square one.
> 
> The attached 0001 introduces a "DSM registry" to solve this problem.  The
> API provides an easy way to allocate/initialize a segment or to attach to
> an existing one.  The registry itself is just a dshash table that stores
> the handles keyed by a module-specified string.  0002 adds a test for the
> registry that demonstrates basic usage.
> 
> I don't presently have any concrete plans to use this for anything, but I
> thought it might be useful for extensions for caching, etc. and wanted to
> see whether there was any interest in the feature.

Notwithstanding any dragons there may be, and not having actually looked 
at the the patches, I love the concept! +<many>

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: introduce dynamic shared memory registry

From
Robert Haas
Date:
On Tue, Dec 5, 2023 at 10:35 AM Joe Conway <mail@joeconway.com> wrote:
> Notwithstanding any dragons there may be, and not having actually looked
> at the the patches, I love the concept! +<many>

Seems fine to me too. I haven't looked at the patches or searched for
dragons either, though.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: introduce dynamic shared memory registry

From
Michael Paquier
Date:
On Mon, Dec 04, 2023 at 09:46:47PM -0600, Nathan Bossart wrote:
> The attached 0001 introduces a "DSM registry" to solve this problem.  The
> API provides an easy way to allocate/initialize a segment or to attach to
> an existing one.  The registry itself is just a dshash table that stores
> the handles keyed by a module-specified string.  0002 adds a test for the
> registry that demonstrates basic usage.
>
> I don't presently have any concrete plans to use this for anything, but I
> thought it might be useful for extensions for caching, etc. and wanted to
> see whether there was any interest in the feature.

Yes, tracking that in a more central way can have many usages, so your
patch sounds like a good idea.  Note that we have one case in core
that be improved and make use of what you have here: autoprewarm.c.

The module supports the launch of dynamic workers but the library may
not be loaded with shared_preload_libraries, meaning that it can
allocate a chunk of shared memory worth a size of
AutoPrewarmSharedState without having requested it in a
shmem_request_hook.  AutoPrewarmSharedState could be moved to a DSM
and tracked with the shared hash table introduced by the patch instead
of acquiring AddinShmemInitLock while eating the plate of other
facilities that asked for a chunk of shmem, leaving any conflict
handling to dsm_registry_table.

+dsm_registry_init_or_attach(const char *key, void **ptr, size_t size,
+                           void (*init_callback) (void *ptr))

This is shaped around dshash_find_or_insert(), but it looks like you'd
want an equivalent for dshash_find(), as well.
--
Michael

Attachment

Re: introduce dynamic shared memory registry

From
Andrei Lepikhov
Date:
On 5/12/2023 10:46, Nathan Bossart wrote:
> I don't presently have any concrete plans to use this for anything, but I
> thought it might be useful for extensions for caching, etc. and wanted to
> see whether there was any interest in the feature.

I am delighted that you commenced this thread.
Designing extensions, every time I feel pain introducing one shared 
value or some global stat, the extension must be required to be loadable 
on startup only. It reduces the flexibility of even very lightweight 
extensions, which look harmful to use in a cloud.

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: introduce dynamic shared memory registry

From
Andrei Lepikhov
Date:
On 18/12/2023 13:39, Andrei Lepikhov wrote:
> On 5/12/2023 10:46, Nathan Bossart wrote:
>> I don't presently have any concrete plans to use this for anything, but I
>> thought it might be useful for extensions for caching, etc. and wanted to
>> see whether there was any interest in the feature.
> 
> I am delighted that you commenced this thread.
> Designing extensions, every time I feel pain introducing one shared 
> value or some global stat, the extension must be required to be loadable 
> on startup only. It reduces the flexibility of even very lightweight 
> extensions, which look harmful to use in a cloud.

After looking into the code, I have some comments:
1. The code looks good; I didn't find possible mishaps. Some proposed 
changes are in the attachment.
2. I think a separate file for this feature looks too expensive. 
According to the gist of that code, it is a part of the DSA module.
3. The dsm_registry_init_or_attach routine allocates a DSM segment. Why 
not create dsa_area for a requestor and return it?

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachment

Re: introduce dynamic shared memory registry

From
Nikita Malakhov
Date:
Hi!

This patch looks like a good solution for a pain in the ass, I'm too for this patch to be committed.
Have looked through the code and agree with Andrei, the code looks good.
Just a suggestion - maybe it is worth adding a function for detaching the segment,
for cases when we unload and/or re-load the extension?

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company

Re: introduce dynamic shared memory registry

From
Robert Haas
Date:
On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> 2. I think a separate file for this feature looks too expensive.
> According to the gist of that code, it is a part of the DSA module.

-1. I think this is a totally different thing than DSA. More files
aren't nearly as expensive as the confusion that comes from smushing
unrelated things together.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Mon, Dec 18, 2023 at 03:32:08PM +0700, Andrei Lepikhov wrote:
> 3. The dsm_registry_init_or_attach routine allocates a DSM segment. Why not
> create dsa_area for a requestor and return it?

My assumption is that most modules just need a fixed-size segment, and if
they really needed a DSA segment, the handle, tranche ID, etc. could just
be stored in the DSM segment.  Maybe that assumption is wrong, though...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Mon, Dec 18, 2023 at 12:05:28PM +0300, Nikita Malakhov wrote:
> Just a suggestion - maybe it is worth adding a function for detaching the
> segment,
> for cases when we unload and/or re-load the extension?

Hm.  We don't presently have a good way to unload a library, but you can
certainly DROP EXTENSION, in which case you might expect the segment to go
away or at least be reset.  But even today, once a preloaded library is
loaded, it stays loaded and its shared memory remains regardless of whether
you CREATE/DROP extension.  Can you think of problems with keeping the
segment attached?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Tue, Dec 19, 2023 at 10:49:23AM -0500, Robert Haas wrote:
> On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> 2. I think a separate file for this feature looks too expensive.
>> According to the gist of that code, it is a part of the DSA module.
> 
> -1. I think this is a totally different thing than DSA. More files
> aren't nearly as expensive as the confusion that comes from smushing
> unrelated things together.

Agreed.  I think there's a decent chance that more functionality will be
added to this registry down the line, in which case it will be even more
important that this stuff stays separate from the tools it is built with.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Fri, Dec 08, 2023 at 04:36:52PM +0900, Michael Paquier wrote:
> Yes, tracking that in a more central way can have many usages, so your
> patch sounds like a good idea.  Note that we have one case in core
> that be improved and make use of what you have here: autoprewarm.c.

I'll add a patch for autoprewarm.c.  Even if we don't proceed with that
change, it'll be a good demonstration.

> +dsm_registry_init_or_attach(const char *key, void **ptr, size_t size,
> +                           void (*init_callback) (void *ptr))
> 
> This is shaped around dshash_find_or_insert(), but it looks like you'd
> want an equivalent for dshash_find(), as well.

What is the use-case for only verifying the existence of a segment?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Michael Paquier
Date:
On Tue, Dec 19, 2023 at 10:19:11AM -0600, Nathan Bossart wrote:
> On Fri, Dec 08, 2023 at 04:36:52PM +0900, Michael Paquier wrote:
>> Yes, tracking that in a more central way can have many usages, so your
>> patch sounds like a good idea.  Note that we have one case in core
>> that be improved and make use of what you have here: autoprewarm.c.
>
> I'll add a patch for autoprewarm.c.  Even if we don't proceed with that
> change, it'll be a good demonstration.

Cool, thanks.  It could just be a separate change on top of the main
one.

> > +dsm_registry_init_or_attach(const char *key, void **ptr, size_t size,
> > +                           void (*init_callback) (void *ptr))
> >
> > This is shaped around dshash_find_or_insert(), but it looks like you'd
> > want an equivalent for dshash_find(), as well.
>
> What is the use-case for only verifying the existence of a segment?

One case I was thinking about is parallel aggregates that can define
combining and serial/deserial functions, where some of the operations
could happen in shared memory, requiring a DSM, and where each process
doing some aggregate combining would expect a DSM to exist before
making use of it.  So a registry wrapper for dshash_find() could be
used as a way to perform sanity checks with what's stored in the
registry.
--
Michael

Attachment

Re: introduce dynamic shared memory registry

From
Michael Paquier
Date:
On Tue, Dec 19, 2023 at 10:14:44AM -0600, Nathan Bossart wrote:
> On Tue, Dec 19, 2023 at 10:49:23AM -0500, Robert Haas wrote:
>> On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov
>> <a.lepikhov@postgrespro.ru> wrote:
>>> 2. I think a separate file for this feature looks too expensive.
>>> According to the gist of that code, it is a part of the DSA module.
>>
>> -1. I think this is a totally different thing than DSA. More files
>> aren't nearly as expensive as the confusion that comes from smushing
>> unrelated things together.
>
> Agreed.  I think there's a decent chance that more functionality will be
> added to this registry down the line, in which case it will be even more
> important that this stuff stays separate from the tools it is built with.

+1 for keeping a clean separation between both.
--
Michael

Attachment

Re: introduce dynamic shared memory registry

From
Andrei Lepikhov
Date:
On 20/12/2023 07:04, Michael Paquier wrote:
> On Tue, Dec 19, 2023 at 10:14:44AM -0600, Nathan Bossart wrote:
>> On Tue, Dec 19, 2023 at 10:49:23AM -0500, Robert Haas wrote:
>>> On Mon, Dec 18, 2023 at 3:32 AM Andrei Lepikhov
>>> <a.lepikhov@postgrespro.ru> wrote:
>>>> 2. I think a separate file for this feature looks too expensive.
>>>> According to the gist of that code, it is a part of the DSA module.
>>>
>>> -1. I think this is a totally different thing than DSA. More files
>>> aren't nearly as expensive as the confusion that comes from smushing
>>> unrelated things together.
>>
>> Agreed.  I think there's a decent chance that more functionality will be
>> added to this registry down the line, in which case it will be even more
>> important that this stuff stays separate from the tools it is built with.
> 
> +1 for keeping a clean separation between both.

Thanks, I got the reason.
In that case, maybe change the test case to make it closer to real-life 
usage - with locks and concurrent access (See attachment)?

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Tue, Dec 5, 2023 at 9:17 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Every once in a while, I find myself wanting to use shared memory in a
> loadable module without requiring it to be loaded at server start via
> shared_preload_libraries.  The DSM API offers a nice way to create and
> manage dynamic shared memory segments, so creating a segment after server
> start is easy enough.  However, AFAICT there's no easy way to teach other
> backends about the segment without storing the handles in shared memory,
> which puts us right back at square one.
>
> The attached 0001 introduces a "DSM registry" to solve this problem.  The
> API provides an easy way to allocate/initialize a segment or to attach to
> an existing one.  The registry itself is just a dshash table that stores
> the handles keyed by a module-specified string.  0002 adds a test for the
> registry that demonstrates basic usage.

+1 for something like this.

> I don't presently have any concrete plans to use this for anything, but I
> thought it might be useful for extensions for caching, etc. and wanted to
> see whether there was any interest in the feature.

Isn't the worker_spi best place to show the use of the DSM registry
instead of a separate test extension? Note the custom wait event
feature that added its usage code to worker_spi. Since worker_spi
demonstrates typical coding patterns, having just set_val_in_shmem()
and get_val_in_shmem() in there makes this patch simple and shaves
some code off.

Comments on the v1 patch set:

1. IIUC, this feature lets external modules create as many DSM
segments as possible with different keys right? If yes, is capping the
max number of DSMs a good idea?

2. Why does this feature have to deal with DSMs? Why not DSAs? With
DSA and an API that gives the DSA handle to the external modules, the
modules can dsa_allocate and dsa_free right? Do you see any problem
with it?

3.
+typedef struct DSMRegistryEntry
+{
+    char        key[256];

Key length 256 feels too much, can it be capped at NAMEDATALEN 64
bytes (similar to some of the key lengths for hash_create()) to start
with?

4. Do we need on_dsm_detach for each DSM created?
dsm_backend_shutdown

5.
+ *
+ * *ptr should initially be set to NULL.  If it is not NULL, this routine will
+ * assume that the segment has already been attached to the current session.
+ * Otherwise, this routine will set *ptr appropriately.

+    /* Quick exit if the value is already set. */
+    if (*ptr)
+        return;

Instead of the above assumption and quick exit condition, can it be
something like if (dsm_find_mapping(dsm_segment_handle(*ptr)) != NULL)
return;?

6.
+static pg_atomic_uint32 *val;

Any specific reason for it to be an atomic variable?

7.
+static pg_atomic_uint32 *val;

Instead of a run-of-the-mill example with just an integer val that
gets stored in shared memory, can it be something more realistic, a
struct with 2 or more variables or a struct to store linked list
(slist_head or dlist_head) in shared memory or such?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Wed, Dec 20, 2023 at 11:02:58AM +0200, Andrei Lepikhov wrote:
> In that case, maybe change the test case to make it closer to real-life
> usage - with locks and concurrent access (See attachment)?

I'm not following why we should make this test case more complicated.  It
is only intended to test the DSM registry machinery, and setting/retrieving
an atomic variable seems like a realistic use-case to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Zhang Mingli
Date:
Hi, all

I see most xxxShmemInit functions have the logic to handle IsUnderPostmaster env.
Do we need to consider it in DSMRegistryShmemInit() too? For example, add some assertions.
Others LGTM.


Zhang Mingli
www.hashdata.xyz
On Dec 5, 2023 at 11:47 +0800, Nathan Bossart <nathandbossart@gmail.com>, wrote:
Every once in a while, I find myself wanting to use shared memory in a
loadable module without requiring it to be loaded at server start via
shared_preload_libraries. The DSM API offers a nice way to create and
manage dynamic shared memory segments, so creating a segment after server
start is easy enough. However, AFAICT there's no easy way to teach other
backends about the segment without storing the handles in shared memory,
which puts us right back at square one.

The attached 0001 introduces a "DSM registry" to solve this problem. The
API provides an easy way to allocate/initialize a segment or to attach to
an existing one. The registry itself is just a dshash table that stores
the handles keyed by a module-specified string. 0002 adds a test for the
registry that demonstrates basic usage.

I don't presently have any concrete plans to use this for anything, but I
thought it might be useful for extensions for caching, etc. and wanted to
see whether there was any interest in the feature.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Wed, Dec 20, 2023 at 03:28:38PM +0530, Bharath Rupireddy wrote:
> Isn't the worker_spi best place to show the use of the DSM registry
> instead of a separate test extension? Note the custom wait event
> feature that added its usage code to worker_spi. Since worker_spi
> demonstrates typical coding patterns, having just set_val_in_shmem()
> and get_val_in_shmem() in there makes this patch simple and shaves
> some code off.

I don't agree.  The test case really isn't that complicated, and I'd rather
have a dedicated test suite for this feature that we can build on instead
of trying to squeeze it into something unrelated.

> 1. IIUC, this feature lets external modules create as many DSM
> segments as possible with different keys right? If yes, is capping the
> max number of DSMs a good idea?

Why?  Even if it is a good idea, what limit could we choose that wouldn't
be arbitrary and eventually cause problems down the road?

> 2. Why does this feature have to deal with DSMs? Why not DSAs? With
> DSA and an API that gives the DSA handle to the external modules, the
> modules can dsa_allocate and dsa_free right? Do you see any problem
> with it?

Please see upthread discussion [0].

> +typedef struct DSMRegistryEntry
> +{
> +    char        key[256];
> 
> Key length 256 feels too much, can it be capped at NAMEDATALEN 64
> bytes (similar to some of the key lengths for hash_create()) to start
> with?

Why is it too much?

> 4. Do we need on_dsm_detach for each DSM created?

Presently, I've designed this such that the DSM remains attached for the
lifetime of a session (and stays present even if all attached sessions go
away) to mimic what you get when you allocate shared memory during startup.
Perhaps there's a use-case for having backends do some cleanup before
exiting, in which case a detach_cb might be useful.  IMHO we should wait
for a concrete use-case before adding too many bells and whistles, though.

> + * *ptr should initially be set to NULL.  If it is not NULL, this routine will
> + * assume that the segment has already been attached to the current session.
> + * Otherwise, this routine will set *ptr appropriately.
> 
> +    /* Quick exit if the value is already set. */
> +    if (*ptr)
> +        return;
> 
> Instead of the above assumption and quick exit condition, can it be
> something like if (dsm_find_mapping(dsm_segment_handle(*ptr)) != NULL)
> return;?

Yeah, I think something like that could be better.  One of the things I
dislike about the v1 API is that it depends a little too much on the caller
doing exactly the right things, and I think it's possible to make it a
little more robust.

> +static pg_atomic_uint32 *val;
> 
> Any specific reason for it to be an atomic variable?

A regular integer would probably be fine for testing, but I figured we
might as well ensure correctness for when this code is inevitably
copy/pasted somewhere.

> +static pg_atomic_uint32 *val;
> 
> Instead of a run-of-the-mill example with just an integer val that
> gets stored in shared memory, can it be something more realistic, a
> struct with 2 or more variables or a struct to store linked list
> (slist_head or dlist_head) in shared memory or such?

This is the second time this has come up [1].  The intent of this test is
to verify the DSM registry behavior, not how folks are going to use the
shared memory it manages, so I'm really not inclined to make this more
complicated without a good reason.  I don't mind changing this if I'm
outvoted on this one, though.

[0] https://postgr.es/m/20231219160117.GB831499%40nathanxps13
[1] https://postgr.es/m/20231220153342.GA833819%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Thu, Dec 21, 2023 at 12:03:18AM +0800, Zhang Mingli wrote:
> I see most xxxShmemInit functions have the logic to handle IsUnderPostmaster env.
> Do we need to consider it in DSMRegistryShmemInit() too? For example, add some assertions.
> Others LGTM.

Good point.  I _think_ the registry is safe to set up and use in
single-user mode but not in a regular postmaster process.  It'd probably be
wise to add some assertions along those lines, but even if we didn't, I
think the DSM code has existing assertions that will catch it.  In any
case, I'd like to avoid requiring folks to add special
single-user-mode-only logic if we can avoid it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Andrei Lepikhov
Date:
On 20/12/2023 17:33, Nathan Bossart wrote:
> On Wed, Dec 20, 2023 at 11:02:58AM +0200, Andrei Lepikhov wrote:
>> In that case, maybe change the test case to make it closer to real-life
>> usage - with locks and concurrent access (See attachment)?
> 
> I'm not following why we should make this test case more complicated.  It
> is only intended to test the DSM registry machinery, and setting/retrieving
> an atomic variable seems like a realistic use-case to me.

I could provide you at least two reasons here:
1. A More complicated example would be a tutorial on using the feature 
correctly. It will reduce the number of questions in mailing lists.
2. Looking into existing extensions, I see that the most common case of 
using a shared memory segment is maintaining some hash table or state 
structure that needs at least one lock.

Try to rewrite the pg_prewarm according to this new feature, and you 
will realize how difficult it is.

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
Here is a new version of the patch.  In addition to various small changes,
I've rewritten the test suite to use an integer and a lock, added a
dsm_registry_find() function, and adjusted autoprewarm to use the registry.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Wed, Dec 27, 2023 at 01:53:27PM -0600, Nathan Bossart wrote:
> Here is a new version of the patch.  In addition to various small changes,
> I've rewritten the test suite to use an integer and a lock, added a
> dsm_registry_find() function, and adjusted autoprewarm to use the registry.

Here's a v3 that fixes a silly mistake in the test.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Wed, Dec 20, 2023 at 9:33 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Dec 20, 2023 at 03:28:38PM +0530, Bharath Rupireddy wrote:
> > Isn't the worker_spi best place to show the use of the DSM registry
> > instead of a separate test extension? Note the custom wait event
> > feature that added its usage code to worker_spi. Since worker_spi
> > demonstrates typical coding patterns, having just set_val_in_shmem()
> > and get_val_in_shmem() in there makes this patch simple and shaves
> > some code off.
>
> I don't agree.  The test case really isn't that complicated, and I'd rather
> have a dedicated test suite for this feature that we can build on instead
> of trying to squeeze it into something unrelated.

With the use of dsm registry for pg_prewarm, do we need this
test_dsm_registry module at all? Because 0002 patch pretty much
demonstrates how to use the DSM registry. With this comment and my
earlier comment on incorporating the use of dsm registry in
worker_spi, I'm trying to make a point to reduce the code for this
feature. However, if others have different opinions, I'm okay with
having a separate test module.

> > 1. IIUC, this feature lets external modules create as many DSM
> > segments as possible with different keys right? If yes, is capping the
> > max number of DSMs a good idea?
>
> Why?  Even if it is a good idea, what limit could we choose that wouldn't
> be arbitrary and eventually cause problems down the road?

I've tried with a shared memory structure size of 10GB on my
development machine and I have seen an intermittent crash (I haven't
got a chance to dive deep into it). I think the shared memory that a
named-DSM segment can get via this DSM registry feature depends on the
total shared memory area that a typical DSM client can allocate today.
I'm not sure it's worth it to limit the shared memory for this feature
given that we don't limit the shared memory via startup hook.

> > 2. Why does this feature have to deal with DSMs? Why not DSAs? With
> > DSA and an API that gives the DSA handle to the external modules, the
> > modules can dsa_allocate and dsa_free right? Do you see any problem
> > with it?
>
> Please see upthread discussion [0].

Per my understanding, this feature allows one to define and manage
named-DSM segments.

> > +typedef struct DSMRegistryEntry
> > +{
> > +    char        key[256];
> >
> > Key length 256 feels too much, can it be capped at NAMEDATALEN 64
> > bytes (similar to some of the key lengths for hash_create()) to start
> > with?
>
> Why is it too much?

Are we expecting, for instance, a 128-bit UUID being used as a key and
hence limiting it to a higher value 256 instead of just NAMEDATALEN?
My thoughts were around saving a few bytes of shared memory space that
can get higher when multiple modules using a DSM registry with
multiple DSM segments.

> > 4. Do we need on_dsm_detach for each DSM created?
>
> Presently, I've designed this such that the DSM remains attached for the
> lifetime of a session (and stays present even if all attached sessions go
> away) to mimic what you get when you allocate shared memory during startup.
> Perhaps there's a use-case for having backends do some cleanup before
> exiting, in which case a detach_cb might be useful.  IMHO we should wait
> for a concrete use-case before adding too many bells and whistles, though.

On Thu, Dec 28, 2023 at 9:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Dec 27, 2023 at 01:53:27PM -0600, Nathan Bossart wrote:
> > Here is a new version of the patch.  In addition to various small changes,
> > I've rewritten the test suite to use an integer and a lock, added a
> > dsm_registry_find() function, and adjusted autoprewarm to use the registry.
>
> Here's a v3 that fixes a silly mistake in the test.

Some comments on the v3 patch set:

1. Typo: missing "an" before "already-attached".
+        /* Return address of already-attached DSM registry entry. */

2. Do you see any immediate uses of dsm_registry_find()? Especially
given that dsm_registry_init_or_attach() does the necessary work
(returns the DSM address if DSM already exists for a given key) for a
postgres process. If there is no immediate use (the argument to remove
this function goes similar to detach_cb above), I'm sure we can
introduce it when there's one.

3. I think we don't need miscadmin.h inclusion in autoprewarm.c, do we?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Fri, Dec 29, 2023 at 08:53:54PM +0530, Bharath Rupireddy wrote:
> With the use of dsm registry for pg_prewarm, do we need this
> test_dsm_registry module at all? Because 0002 patch pretty much
> demonstrates how to use the DSM registry. With this comment and my
> earlier comment on incorporating the use of dsm registry in
> worker_spi, I'm trying to make a point to reduce the code for this
> feature. However, if others have different opinions, I'm okay with
> having a separate test module.

I don't have a strong opinion here, but I lean towards still having a
dedicated test suite, if for no other reason that to guarantee some
coverage even if the other in-tree uses disappear.

> I've tried with a shared memory structure size of 10GB on my
> development machine and I have seen an intermittent crash (I haven't
> got a chance to dive deep into it). I think the shared memory that a
> named-DSM segment can get via this DSM registry feature depends on the
> total shared memory area that a typical DSM client can allocate today.
> I'm not sure it's worth it to limit the shared memory for this feature
> given that we don't limit the shared memory via startup hook.

I would be interested to see more details about the crashes you are seeing.
Is this unique to the registry or an existing problem with DSM segments?

> Are we expecting, for instance, a 128-bit UUID being used as a key and
> hence limiting it to a higher value 256 instead of just NAMEDATALEN?
> My thoughts were around saving a few bytes of shared memory space that
> can get higher when multiple modules using a DSM registry with
> multiple DSM segments.

I'm not really expecting folks to use more than, say, 16 characters for the
key, but I intentionally set it much higher in case someone did have a
reason to use longer keys.  I'll lower it to 64 in the next revision unless
anyone else objects.

> 2. Do you see any immediate uses of dsm_registry_find()? Especially
> given that dsm_registry_init_or_attach() does the necessary work
> (returns the DSM address if DSM already exists for a given key) for a
> postgres process. If there is no immediate use (the argument to remove
> this function goes similar to detach_cb above), I'm sure we can
> introduce it when there's one.

See [0].  FWIW I tend to agree that we could leave this one out for now.

> 3. I think we don't need miscadmin.h inclusion in autoprewarm.c, do we?

I'll take a look at this one.

[0] https://postgr.es/m/ZYIu_JL-usgd3E1q%40paquier.xyz

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Robert Haas
Date:
On Tue, Jan 2, 2024 at 11:21 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> > Are we expecting, for instance, a 128-bit UUID being used as a key and
> > hence limiting it to a higher value 256 instead of just NAMEDATALEN?
> > My thoughts were around saving a few bytes of shared memory space that
> > can get higher when multiple modules using a DSM registry with
> > multiple DSM segments.
>
> I'm not really expecting folks to use more than, say, 16 characters for the
> key, but I intentionally set it much higher in case someone did have a
> reason to use longer keys.  I'll lower it to 64 in the next revision unless
> anyone else objects.

This surely doesn't matter either way. We're not expecting this hash
table to have more than a handful of entries; the difference between
256, 64, and NAMEDATALEN won't even add up to kilobytes in any
realistic scenario, let along MB or GB.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
Here's a new version of the patch set with Bharath's feedback addressed.

On Tue, Jan 02, 2024 at 11:31:14AM -0500, Robert Haas wrote:
> On Tue, Jan 2, 2024 at 11:21 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> > Are we expecting, for instance, a 128-bit UUID being used as a key and
>> > hence limiting it to a higher value 256 instead of just NAMEDATALEN?
>> > My thoughts were around saving a few bytes of shared memory space that
>> > can get higher when multiple modules using a DSM registry with
>> > multiple DSM segments.
>>
>> I'm not really expecting folks to use more than, say, 16 characters for the
>> key, but I intentionally set it much higher in case someone did have a
>> reason to use longer keys.  I'll lower it to 64 in the next revision unless
>> anyone else objects.
> 
> This surely doesn't matter either way. We're not expecting this hash
> table to have more than a handful of entries; the difference between
> 256, 64, and NAMEDATALEN won't even add up to kilobytes in any
> realistic scenario, let along MB or GB.

Right.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Wed, Jan 3, 2024 at 4:19 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Here's a new version of the patch set with Bharath's feedback addressed.

Thanks. The v4 patches look good to me except for a few minor
comments. I've marked it as RfC in CF.

1. Update all the copyright to the new year. A run of
src/tools/copyright.pl on the source tree will take care of it at some
point, but still it's good if we can update while we are here.
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+# Copyright (c) 2023, PostgreSQL Global Development Group
+ * Copyright (c) 2023, PostgreSQL Global Development Group

2. Typo: missing "an" before "already-attached".
+        /* Return address of already-attached DSM registry entry. */

3. Use NAMEDATALEN instead of 64?
+    char        key[64];

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Sat, Jan 06, 2024 at 07:34:15PM +0530, Bharath Rupireddy wrote:
> 1. Update all the copyright to the new year. A run of
> src/tools/copyright.pl on the source tree will take care of it at some
> point, but still it's good if we can update while we are here.

Done.

> 2. Typo: missing "an" before "already-attached".
> +        /* Return address of already-attached DSM registry entry. */

Done.

> 3. Use NAMEDATALEN instead of 64?
> +    char        key[64];

I kept this the same, as I didn't see any need to tie the key size to
NAMEDATALEN.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Sat, Jan 6, 2024 at 10:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> I kept this the same, as I didn't see any need to tie the key size to
> NAMEDATALEN.

Thanks. A fresh look at the v5 patches left me with the following thoughts:

1. I think we need to add some notes about this new way of getting
shared memory for external modules in the <title>Shared Memory and
LWLocks</title> section in xfunc.sgml? This will at least tell there's
another way for external modules to get shared memory, not just with
the shmem_request_hook and shmem_startup_hook. What do you think?

2. FWIW, I'd like to call this whole feature "Support for named DSM
segments in Postgres". Do you see anything wrong with this?

3. IIUC, this feature eventually makes both shmem_request_hook and
shmem_startup_hook pointless, no? Or put another way, what's the
significance of shmem request and startup hooks in lieu of this new
feature? I think it's quite possible to get rid of the shmem request
and startup hooks (of course, not now but at some point in future to
not break the external modules), because all the external modules can
allocate and initialize the same shared memory via
dsm_registry_init_or_attach and its init_callback. All the external
modules will then need to call dsm_registry_init_or_attach in their
_PG_init callbacks and/or in their bg worker's main functions in case
the modules intend to start up bg workers. Am I right?

4. With the understanding in comment #3, can pg_stat_statements and
test_slru.c get rid of its shmem_request_hook and shmem_startup_hook
and use dsm_registry_init_or_attach? It's not that this patch need to
remove them now, but just asking if there's something in there that
makes this new feature unusable.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Amul Sul
Date:


On Mon, Jan 8, 2024 at 10:53 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Jan 6, 2024 at 10:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> I kept this the same, as I didn't see any need to tie the key size to
> NAMEDATALEN.

Thanks. A fresh look at the v5 patches left me with the following thoughts:

1. I think we need to add some notes about this new way of getting
shared memory for external modules in the <title>Shared Memory and
LWLocks</title> section in xfunc.sgml? This will at least tell there's
another way for external modules to get shared memory, not just with
the shmem_request_hook and shmem_startup_hook. What do you think?

2. FWIW, I'd like to call this whole feature "Support for named DSM
segments in Postgres". Do you see anything wrong with this?

3. IIUC, this feature eventually makes both shmem_request_hook and
shmem_startup_hook pointless, no? Or put another way, what's the
significance of shmem request and startup hooks in lieu of this new
feature? I think it's quite possible to get rid of the shmem request
and startup hooks (of course, not now but at some point in future to
not break the external modules), because all the external modules can
allocate and initialize the same shared memory via
dsm_registry_init_or_attach and its init_callback. All the external
modules will then need to call dsm_registry_init_or_attach in their
_PG_init callbacks and/or in their bg worker's main functions in case
the modules intend to start up bg workers. Am I right?

4. With the understanding in comment #3, can pg_stat_statements and
test_slru.c get rid of its shmem_request_hook and shmem_startup_hook
and use dsm_registry_init_or_attach? It's not that this patch need to
remove them now, but just asking if there's something in there that
makes this new feature unusable.
 
+1, since doing for pg_prewarm, better to do for these modules as well.

A minor comment for v5:

+void *
+dsm_registry_init_or_attach(const char *key, size_t size,

I think the name could be simple as dsm_registry_init() like we use
elsewhere e.g. ShmemInitHash() which doesn't say attach explicitly.

Similarly, I think dshash_find_or_insert() can be as simple as dshash_search() and
accept HASHACTION like hash_search().

Regards,
Amul


Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Mon, Jan 08, 2024 at 10:53:17AM +0530, Bharath Rupireddy wrote:
> 1. I think we need to add some notes about this new way of getting
> shared memory for external modules in the <title>Shared Memory and
> LWLocks</title> section in xfunc.sgml? This will at least tell there's
> another way for external modules to get shared memory, not just with
> the shmem_request_hook and shmem_startup_hook. What do you think?

Good call.  I definitely think this stuff ought to be documented.  After a
quick read, I also wonder if it'd be worth spending some time refining that
section.

> 2. FWIW, I'd like to call this whole feature "Support for named DSM
> segments in Postgres". Do you see anything wrong with this?

Why do you feel it should be renamed?  I don't see anything wrong with it,
but I also don't see any particular advantage with that name compared to
"dynamic shared memory registry."

> 3. IIUC, this feature eventually makes both shmem_request_hook and
> shmem_startup_hook pointless, no? Or put another way, what's the
> significance of shmem request and startup hooks in lieu of this new
> feature? I think it's quite possible to get rid of the shmem request
> and startup hooks (of course, not now but at some point in future to
> not break the external modules), because all the external modules can
> allocate and initialize the same shared memory via
> dsm_registry_init_or_attach and its init_callback. All the external
> modules will then need to call dsm_registry_init_or_attach in their
> _PG_init callbacks and/or in their bg worker's main functions in case
> the modules intend to start up bg workers. Am I right?

Well, modules might need to do a number of other things (e.g., adding
hooks) that can presently only be done when preloaded, in which case I
doubt there's much benefit from switching to the DSM registry.  I don't
really intend for it to replace the existing request/startup hooks, but
you're probably right that most, if not all, could use the registry
instead.  IMHO this is well beyond the scope of this thread, though.

> 4. With the understanding in comment #3, can pg_stat_statements and
> test_slru.c get rid of its shmem_request_hook and shmem_startup_hook
> and use dsm_registry_init_or_attach? It's not that this patch need to
> remove them now, but just asking if there's something in there that
> makes this new feature unusable.

It might be possible, but IIUC you'd still need a way to know whether the
library was preloaded, i.e., all the other necessary hooks were in place.
It's convenient to just be able to check whether the shared memory was set
up for this purpose.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Mon, Jan 08, 2024 at 11:13:42AM +0530, Amul Sul wrote:
> +void *
> +dsm_registry_init_or_attach(const char *key, size_t size,
> 
> I think the name could be simple as dsm_registry_init() like we use
> elsewhere e.g. ShmemInitHash() which doesn't say attach explicitly.

That seems reasonable to me.

> Similarly, I think dshash_find_or_insert() can be as simple as
> dshash_search() and
> accept HASHACTION like hash_search().

I'm not totally sure what you mean here.  If you mean changing the dshash
API, I'd argue that's a topic for another thread.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Amul Sul
Date:

On Mon, Jan 8, 2024 at 10:48 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Jan 08, 2024 at 11:13:42AM +0530, Amul Sul wrote:
> +void *
> +dsm_registry_init_or_attach(const char *key, size_t size,
>
> I think the name could be simple as dsm_registry_init() like we use
> elsewhere e.g. ShmemInitHash() which doesn't say attach explicitly.

That seems reasonable to me.

> Similarly, I think dshash_find_or_insert() can be as simple as
> dshash_search() and
> accept HASHACTION like hash_search().

I'm not totally sure what you mean here.  If you mean changing the dshash
API, I'd argue that's a topic for another thread.
 
Yes, you are correct. I didn't realize that existing code -- now sure, why
wouldn't we implemented as the dynahash. Sorry for the noise.

Regards,
Amul

Re: introduce dynamic shared memory registry

From
Andrei Lepikhov
Date:
On 9/1/2024 00:16, Nathan Bossart wrote:
> On Mon, Jan 08, 2024 at 10:53:17AM +0530, Bharath Rupireddy wrote:
>> 1. I think we need to add some notes about this new way of getting
>> shared memory for external modules in the <title>Shared Memory and
>> LWLocks</title> section in xfunc.sgml? This will at least tell there's
>> another way for external modules to get shared memory, not just with
>> the shmem_request_hook and shmem_startup_hook. What do you think?
+1. Maybe even more - in the section related to extensions, this 
approach to using shared data can be mentioned, too.

>> 2. FWIW, I'd like to call this whole feature "Support for named DSM
>> segments in Postgres". Do you see anything wrong with this?
> 
> Why do you feel it should be renamed?  I don't see anything wrong with it,
> but I also don't see any particular advantage with that name compared to
> "dynamic shared memory registry."
It is not a big issue, I suppose. But for me personally (as not a native 
English speaker), the label "Named DSM segments" seems more 
straightforward to understand.
> 
>> 3. IIUC, this feature eventually makes both shmem_request_hook and
>> shmem_startup_hook pointless, no? Or put another way, what's the
>> significance of shmem request and startup hooks in lieu of this new
>> feature? I think it's quite possible to get rid of the shmem request
>> and startup hooks (of course, not now but at some point in future to
>> not break the external modules), because all the external modules can
>> allocate and initialize the same shared memory via
>> dsm_registry_init_or_attach and its init_callback. All the external
>> modules will then need to call dsm_registry_init_or_attach in their
>> _PG_init callbacks and/or in their bg worker's main functions in case
>> the modules intend to start up bg workers. Am I right?
> 
> Well, modules might need to do a number of other things (e.g., adding
> hooks) that can presently only be done when preloaded, in which case I
> doubt there's much benefit from switching to the DSM registry.  I don't
> really intend for it to replace the existing request/startup hooks, but
> you're probably right that most, if not all, could use the registry
> instead.  IMHO this is well beyond the scope of this thread, though.
+1, it may be a many reasons to use these hooks.

 >> 3. Use NAMEDATALEN instead of 64?
 >> +    char        key[64];
 > I kept this the same, as I didn't see any need to tie the key size to
 > NAMEDATALEN.
IMO, we should avoid magic numbers whenever possible. Current logic 
according to which first users of this feature will be extensions 
naturally bonds this size to the size of the 'name' type.

And one more point. I think the commit already deserves a more detailed 
commit message.

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Thu, Jan 11, 2024 at 09:50:19AM +0700, Andrei Lepikhov wrote:
> On 9/1/2024 00:16, Nathan Bossart wrote:
>> On Mon, Jan 08, 2024 at 10:53:17AM +0530, Bharath Rupireddy wrote:
>> > 2. FWIW, I'd like to call this whole feature "Support for named DSM
>> > segments in Postgres". Do you see anything wrong with this?
>> 
>> Why do you feel it should be renamed?  I don't see anything wrong with it,
>> but I also don't see any particular advantage with that name compared to
>> "dynamic shared memory registry."
> It is not a big issue, I suppose. But for me personally (as not a native
> English speaker), the label "Named DSM segments" seems more straightforward
> to understand.

That is good to know, thanks.  I see that it would also align better with
RequestNamedLWLockTranche/GetNamedLWLockTranche.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Michael Paquier
Date:
On Mon, Jan 08, 2024 at 11:16:27AM -0600, Nathan Bossart wrote:
> On Mon, Jan 08, 2024 at 10:53:17AM +0530, Bharath Rupireddy wrote:
>> 1. I think we need to add some notes about this new way of getting
>> shared memory for external modules in the <title>Shared Memory and
>> LWLocks</title> section in xfunc.sgml? This will at least tell there's
>> another way for external modules to get shared memory, not just with
>> the shmem_request_hook and shmem_startup_hook. What do you think?
>
> Good call.  I definitely think this stuff ought to be documented.  After a
> quick read, I also wonder if it'd be worth spending some time refining that
> section.

+1.  It would be a second thing to point at autoprewarm.c in the docs
as an extra pointer that can be fed to users reading the docs.

>> 3. IIUC, this feature eventually makes both shmem_request_hook and
>> shmem_startup_hook pointless, no? Or put another way, what's the
>> significance of shmem request and startup hooks in lieu of this new
>> feature? I think it's quite possible to get rid of the shmem request
>> and startup hooks (of course, not now but at some point in future to
>> not break the external modules), because all the external modules can
>> allocate and initialize the same shared memory via
>> dsm_registry_init_or_attach and its init_callback. All the external
>> modules will then need to call dsm_registry_init_or_attach in their
>> _PG_init callbacks and/or in their bg worker's main functions in case
>> the modules intend to start up bg workers. Am I right?
>
> Well, modules might need to do a number of other things (e.g., adding
> hooks) that can presently only be done when preloaded, in which case I
> doubt there's much benefit from switching to the DSM registry.  I don't
> really intend for it to replace the existing request/startup hooks, but
> you're probably right that most, if not all, could use the registry
> instead.  IMHO this is well beyond the scope of this thread, though.

Even if that's not in the scope of this thread, just removing these
hooks would break a lot of out-of-core things, and they still have a
lot of value when extensions expect to always be loaded with shared.
They don't cost in maintenance at this stage.
--
Michael

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Thu, Jan 11, 2024 at 10:42 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> >> 3. IIUC, this feature eventually makes both shmem_request_hook and
> >> shmem_startup_hook pointless, no? Or put another way, what's the
> >> significance of shmem request and startup hooks in lieu of this new
> >> feature? I think it's quite possible to get rid of the shmem request
> >> and startup hooks (of course, not now but at some point in future to
> >> not break the external modules), because all the external modules can
> >> allocate and initialize the same shared memory via
> >> dsm_registry_init_or_attach and its init_callback. All the external
> >> modules will then need to call dsm_registry_init_or_attach in their
> >> _PG_init callbacks and/or in their bg worker's main functions in case
> >> the modules intend to start up bg workers. Am I right?
> >
> > Well, modules might need to do a number of other things (e.g., adding
> > hooks) that can presently only be done when preloaded, in which case I
> > doubt there's much benefit from switching to the DSM registry.  I don't
> > really intend for it to replace the existing request/startup hooks, but
> > you're probably right that most, if not all, could use the registry
> > instead.  IMHO this is well beyond the scope of this thread, though.
>
> Even if that's not in the scope of this thread, just removing these
> hooks would break a lot of out-of-core things, and they still have a
> lot of value when extensions expect to always be loaded with shared.
> They don't cost in maintenance at this stage.

Adding some notes in the docs on when exactly one needs to use shmem
hooks and the named DSM segments can help greatly.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
Here's a new version of the patch set in which I've attempted to address
the feedback in this thread.  Note that 0001 is being tracked in a separate
thread [0], but it is basically a prerequisite for adding the documentation
for this feature, so that's why I've also added it here.

[0] https://postgr.es/m/20240112041430.GA3557928%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Abhijit Menon-Sen
Date:
At 2024-01-12 11:21:52 -0600, nathandbossart@gmail.com wrote:
>
> From: Nathan Bossart <nathan@postgresql.org>
> Date: Thu, 11 Jan 2024 21:55:25 -0600
> Subject: [PATCH v6 1/3] reorganize shared memory and lwlocks documentation
> 
> ---
>  doc/src/sgml/xfunc.sgml | 182 +++++++++++++++++++++++++---------------
>  1 file changed, 114 insertions(+), 68 deletions(-)
> 
> diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
> index 89116ae74c..0ba52b41d4 100644
> --- a/doc/src/sgml/xfunc.sgml
> +++ b/doc/src/sgml/xfunc.sgml
> @@ -3397,90 +3397,136 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
>     </sect2>
>  
> […]
> -     from your <literal>shmem_request_hook</literal>.
> -    </para>
> -    <para>
> -     LWLocks are reserved by calling:
> +      Each backend sould obtain a pointer to the reserved shared memory by

sould → should

> +      Add-ins can reserve LWLocks on server startup.  Like with shared memory,

(Would "As with shared memory" read better? Maybe, but then again maybe
it should be left alone because you also write "Unlike with" elsewhere.)

-- Abhijit



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Fri, Jan 12, 2024 at 11:13:46PM +0530, Abhijit Menon-Sen wrote:
> At 2024-01-12 11:21:52 -0600, nathandbossart@gmail.com wrote:
>> +      Each backend sould obtain a pointer to the reserved shared memory by
> 
> sould → should

D'oh.  Thanks.

>> +      Add-ins can reserve LWLocks on server startup.  Like with shared memory,
> 
> (Would "As with shared memory" read better? Maybe, but then again maybe
> it should be left alone because you also write "Unlike with" elsewhere.)

I think "As with shared memory..." sounds better here.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Fri, Jan 12, 2024 at 01:45:55PM -0600, Nathan Bossart wrote:
> On Fri, Jan 12, 2024 at 11:13:46PM +0530, Abhijit Menon-Sen wrote:
>> At 2024-01-12 11:21:52 -0600, nathandbossart@gmail.com wrote:
>>> +      Each backend sould obtain a pointer to the reserved shared memory by
>> 
>> sould → should
> 
> D'oh.  Thanks.
> 
>>> +      Add-ins can reserve LWLocks on server startup.  Like with shared memory,
>> 
>> (Would "As with shared memory" read better? Maybe, but then again maybe
>> it should be left alone because you also write "Unlike with" elsewhere.)
> 
> I think "As with shared memory..." sounds better here.

Here is a new version of the patch set with these changes.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Sun, Jan 14, 2024 at 3:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Here is a new version of the patch set with these changes.

Thanks. Here are some comments on v7-0002.

1.
+GetNamedDSMSegment(const char *name, size_t size,
+                   void (*init_callback) (void *ptr), bool *found)
+{
+
+    Assert(name);
+    Assert(size);
+    Assert(found);

I've done some input validation to GetNamedDSMSegment():

With an empty key name (""), it works, but do we need that in
practice? Can't we error out saying the name can't be empty?

With a size 0, an assertion is fine, but in production (without the
assertion), I'm seeing the following errors.

2024-01-16 04:49:28.961 UTC client backend[864369]
pg_regress/test_dsm_registry ERROR:  could not resize shared memory
segment "/PostgreSQL.3701090278" to 0 bytes: Invalid argument
2024-01-16 04:49:29.264 UTC postmaster[864357] LOG:  server process
(PID 864370) was terminated by signal 11: Segmentation fault

I think it's better for GetNamedDSMSegment() to error out on empty
'name' and size 0. This makes the user-facing function
GetNamedDSMSegment more concrete.

2.
+void *
+GetNamedDSMSegment(const char *name, size_t size,
+                   void (*init_callback) (void *ptr), bool *found)

+    Assert(found);

Why is input parameter 'found' necessary to be passed by the caller?
Neither the test module added, nor the pg_prewarm is using the found
variable. The function will anyway create the DSM segment if one with
the given name isn't found. IMO, found is an optional parameter for
the caller. So, the assert(found) isn't necessary.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Tue, Jan 16, 2024 at 10:28:29AM +0530, Bharath Rupireddy wrote:
> I think it's better for GetNamedDSMSegment() to error out on empty
> 'name' and size 0. This makes the user-facing function
> GetNamedDSMSegment more concrete.

Agreed, thanks for the suggestion.

> +void *
> +GetNamedDSMSegment(const char *name, size_t size,
> +                   void (*init_callback) (void *ptr), bool *found)
> 
> +    Assert(found);
> 
> Why is input parameter 'found' necessary to be passed by the caller?
> Neither the test module added, nor the pg_prewarm is using the found
> variable. The function will anyway create the DSM segment if one with
> the given name isn't found. IMO, found is an optional parameter for
> the caller. So, the assert(found) isn't necessary.

The autoprewarm change (0003) does use this variable.  I considered making
it optional (i.e., you could pass in NULL if you didn't want it), but I
didn't feel like the extra code in GetNamedDSMSegment() to allow this was
worth it so that callers could avoid creating a single bool.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Tue, Jan 16, 2024 at 9:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> The autoprewarm change (0003) does use this variable.  I considered making
> it optional (i.e., you could pass in NULL if you didn't want it), but I
> didn't feel like the extra code in GetNamedDSMSegment() to allow this was
> worth it so that callers could avoid creating a single bool.

I'm okay with it.

The v8 patches look good to me.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Wed, Jan 17, 2024 at 08:00:00AM +0530, Bharath Rupireddy wrote:
> The v8 patches look good to me.

Committed.  Thanks everyone for reviewing!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> Committed.  Thanks everyone for reviewing!

Coverity complained about this:

*** CID 1586660:  Null pointer dereferences  (NULL_RETURNS)
/srv/coverity/git/pgsql-git/postgresql/src/backend/storage/ipc/dsm_registry.c: 185 in GetNamedDSMSegment()
179         }
180         else if (!dsm_find_mapping(entry->handle))
181         {
182             /* Attach to existing segment. */
183             dsm_segment *seg = dsm_attach(entry->handle);
184
>>>     CID 1586660:  Null pointer dereferences  (NULL_RETURNS)
>>>     Dereferencing a pointer that might be "NULL" "seg" when calling "dsm_pin_mapping".
185             dsm_pin_mapping(seg);
186             ret = dsm_segment_address(seg);
187         }
188         else
189         {
190             /* Return address of an already-attached segment. */

I think it's right --- the comments for dsm_attach explicitly
point out that a NULL return is possible.  You need to handle
that scenario in some way other than SIGSEGV.

            regards, tom lane



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Sun, Jan 21, 2024 at 11:21:46AM -0500, Tom Lane wrote:
> Coverity complained about this:
> 
> *** CID 1586660:  Null pointer dereferences  (NULL_RETURNS)
> /srv/coverity/git/pgsql-git/postgresql/src/backend/storage/ipc/dsm_registry.c: 185 in GetNamedDSMSegment()
> 179         }
> 180         else if (!dsm_find_mapping(entry->handle))
> 181         {
> 182             /* Attach to existing segment. */
> 183             dsm_segment *seg = dsm_attach(entry->handle);
> 184     
>>>>     CID 1586660:  Null pointer dereferences  (NULL_RETURNS)
>>>>     Dereferencing a pointer that might be "NULL" "seg" when calling "dsm_pin_mapping".
> 185             dsm_pin_mapping(seg);
> 186             ret = dsm_segment_address(seg);
> 187         }
> 188         else
> 189         {
> 190             /* Return address of an already-attached segment. */
> 
> I think it's right --- the comments for dsm_attach explicitly
> point out that a NULL return is possible.  You need to handle
> that scenario in some way other than SIGSEGV.

Oops.  I've attached an attempt at fixing this.  I took the opportunity to
clean up the surrounding code a bit.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: introduce dynamic shared memory registry

From
Michael Paquier
Date:
On Sun, Jan 21, 2024 at 04:13:20PM -0600, Nathan Bossart wrote:
> Oops.  I've attached an attempt at fixing this.  I took the opportunity to
> clean up the surrounding code a bit.

Thanks for the patch.  Your proposed attempt looks correct to me with
an ERROR when no segments are found..
--
Michael

Attachment

Re: introduce dynamic shared memory registry

From
Bharath Rupireddy
Date:
On Mon, Jan 22, 2024 at 3:43 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> Oops.  I've attached an attempt at fixing this.  I took the opportunity to
> clean up the surrounding code a bit.

The code looks cleaner and readable with the patch. All the call sites
are taking care of dsm_attach returning NULL value. So, the attached
patch looks good to me.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: introduce dynamic shared memory registry

From
Nathan Bossart
Date:
On Mon, Jan 22, 2024 at 05:00:48PM +0530, Bharath Rupireddy wrote:
> On Mon, Jan 22, 2024 at 3:43 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> Oops.  I've attached an attempt at fixing this.  I took the opportunity to
>> clean up the surrounding code a bit.
> 
> The code looks cleaner and readable with the patch. All the call sites
> are taking care of dsm_attach returning NULL value. So, the attached
> patch looks good to me.

Committed.  Thanks for the report and the reviews.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com