Re: Support to define custom wait events for extensions - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Support to define custom wait events for extensions
Date
Msg-id ZNQlJ8pQIIWjv/he@paquier.xyz
Whole thread Raw
In response to Re: Support to define custom wait events for extensions  (Masahiro Ikeda <ikedamsh@oss.nttdata.com>)
Responses Re: Support to define custom wait events for extensions
List pgsql-hackers
On Wed, Aug 09, 2023 at 08:10:42PM +0900, Masahiro Ikeda wrote:
> * create second hash table to find a event id from a name to
>   identify uniquness. It enable extensions which don't use share
>   memory for their use to define custom wait events because
>   WaitEventExtensionNew() will not allocate duplicate wait events.

Okay, a second hash table to check if events are registered works for
me.

> * create PoC patch to show that extensions, which don't use shared
>   memory for their use, can define custom wait events.
>  (v2-0002-poc-custom-wait-event-for-dblink.patch)
>
> I'm worrying about
> * Is 512(wee_hash_max_size) the maximum number of the custom wait
>   events sufficient?

Thanks for sending a patch!

I'm OK to start with that.  This could always be revisited later, but
even for a server loaded with a bunch of extensions that looks more
than enough to me.

> * Is there any way to not force extensions that don't use shared
>   memory for their use like dblink to acquire AddinShmemInitLock?;

Yes, they don't need it at all as the dynahashes are protected with
their own LWLocks.

+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock                    44
 # 45 was XactTruncationLock until removal of BackendRandomLock
 WrapLimitsVacuumLock               46
 NotifyQueueTailLock                    47
+WaitEventExtensionLock              48

This new LWLock needs to be added to wait_event_names.txt, or it won't
be reported to pg_stat_activity and it would not be documented when
the sgml docs are generated from the txt data.

-extern uint32 WaitEventExtensionNew(void);
-extern void WaitEventExtensionRegisterName(uint32 wait_event_info,
-                                          const char *wait_event_name);
+extern uint32 WaitEventExtensionNew(const char *wait_event_name);
Looks about right, and the docs are refreshed.

+static const int wee_hash_init_size = 128;
+static const int wee_hash_max_size = 512;
I would use a few #defines with upper-case characters here instead as
these are constants for us.

Now that it is possible to rely on LWLocks for the hash tables, more
cleanup is possible in worker_spi, with the removal of
worker_spi_state, the shmem hooks and their routines.  The only thing
that should be needed is something like that at the start of
worker_spi_main() (same position as worker_spi_shmem_init now):
+static uint32 wait_event = 0;
[...]
+   if (wait_event == 0)
+       wait_event = WaitEventExtensionNew("worker_spi_main");

The updates in 001_worker_spi.pl look OK.

+    * The entry must be stored because it's registered in
+    * WaitEventExtensionNew().
     */
-   eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION;
+   if (!entry)
+       ereport(ERROR,
+               errmsg("could not find the name for custom wait event ID %u", eventId));
Yeah, I think that's better than just falling back to "extension".  An
ID reported in pg_stat_activity should always have an entry, or we
have race conditions.  This should be an elog(ERROR), as in
this-error-shall-never-happen.  No need to translate the error string,
as well (the docs have been updated with this change. thanks!).

Additionally, LWLockHeldByMeInMode(AddinShmemInitLock) in
WaitEventExtensionNew() should not be needed, thanks to
WaitEventExtensionLock.

+ * WaitEventExtensionNameHash is used to find the name from a event id.
+ * It enables all backends look up them without additional processing
+ * per backend like LWLockRegisterTranche().

It does not seem necessary to mention LWLockRegisterTranche().

+ * WaitEventExtensionIdHash is used to find the event id from a name.
+ * Since it can identify uniquness by the names, extensions that do not
+ * use shared memory also be able to define custom wait events without
+ * defining duplicate wait events.

Perhaps this could just say that this table is necessary to ensure
that we don't have duplicated entries when registering new strings
with their IDs?  s/uniquness/uniqueness/.  The second part of the
sentence about shared memory does not seem necessary now.

+   sz = add_size(sz, hash_estimate_size(wee_hash_init_size,
+                                       sizeof(WaitEventExtensionNameEntry)));
+   sz = add_size(sz, hash_estimate_size(wee_hash_init_size,
+                                       sizeof(WaitEventExtensionIdEntry)));

Err, this should use the max size, and not the init size for the size
estimation, no?

+   if (strlen(wait_event_name) >= NAMEDATALEN)
+       ereport(ERROR,
+               errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+               errmsg("wait event name is too long"));
This could just be an elog(ERROR), I assume, as that could only be
reached by developers.  The string needs to be rewritten, like "cannot
use custom wait event string longer than %u characters", or something
like that.

+       if (wait_event_info == NULL)
+       {
+           wait_event_info = (uint32 *) MemoryContextAlloc(TopMemoryContext, sizeof(uint32));
+           LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+           *wait_event_info = WaitEventExtensionNew("dblink_get_con");
+           LWLockRelease(AddinShmemInitLock);
+       }
+       conn = libpqsrv_connect(connstr, *wait_event_info)

In 0002.  Caching the value statically in the backend is what you
should do, but a pointer, an allocation to the TopMemoryContext and a
dependency to AddinShmemInitLock should not be necessary when dealing
with a uint32.  You could use an initial value of 0, for example, or
just PG_WAIT_EXTENSION but the latter is not really necessary and
would bypass the sanity checks.

+   /* Register the new custom wait event in the shared hash table */
+   LWLockAcquire(WaitEventExtensionLock, LW_EXCLUSIVE);
+
+   name_entry = (WaitEventExtensionNameEntry *)
+               hash_search(WaitEventExtensionNameHash, &eventId, HASH_ENTER, &found);
+   Assert(!found);
+   strlcpy(name_entry->wait_event_name, wait_event_name, sizeof(name_entry->wait_event_name));
+
+   id_entry = (WaitEventExtensionIdEntry *)
+               hash_search(WaitEventExtensionIdHash, &wait_event_name, HASH_ENTER, &found);
+   Assert(!found);
+   id_entry->event_id = eventId;
+
+   LWLockRelease(WaitEventExtensionLock);

The logic added to WaitEventExtensionNew() is a bit racy, where it
would be possible with the same entry to be added multiple times.
Imagine for example the following:
- Process 1 does WaitEventExtensionNew("foo1"), does not find the
entry by name in hash_search, gets an eventId of 1, releases the
spinlock.
- Process 2 calls as well WaitEventExtensionNew("foo1"), does not find
the entry by name because it has not been added by process 1 yet,
allocates an eventId of 2
- Process 2 takes first WaitEventExtensionLock in LW_EXCLUSIVE to add
entry "foo1", there is no entry by name, so one is added for the ID.
WaitEventExtensionLock is released
- Process 1, that was waiting on WaitEventExtensionLock, can now take
it in exclusive mode.  It finds an entry by name for "foo1", fails the
assertion because an entry is found.

I think that the ordering of WaitEventExtensionNew() should be
reworked a bit.  This order should be safer.
- Take WaitEventExtensionLock in shared mode, look if there's an entry
by name, release the lock.  The patch does that.
- If an entry is found, return, we're OK.  The patch does that.
- Take again WaitEventExtensionLock in exclusive mode.
- Look again at the hash table with the name given, in case somebody
has inserted an equivalent entry in the short window where the lock
was not held.
-- If an entry is found, release the lock and leave, we're OK.
-- If an entry is not found, keep the lock.
- Acquire the spinlock, and get a new event ID.  Release spinlock.
- Add the new entries to both tables, both assertions on found are OK
to have.
- Release LWLock and leave.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: pg_dump needs SELECT privileges on irrelevant extension table
Next
From: Jimmy Yih
Date:
Subject: Should the archiver process always make sure that the timeline history files exist in the archive?