Re: Adding facility for injection points (or probe points?) for more advanced tests - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Adding facility for injection points (or probe points?) for more advanced tests
Date
Msg-id CAExHW5us3HWEcP0Vx0gLumfGNyk5XfxqwG6nY7A3FurfrkNOAg@mail.gmail.com
Whole thread Raw
In response to Re: Adding facility for injection points (or probe points?) for more advanced tests  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Adding facility for injection points (or probe points?) for more advanced tests
List pgsql-hackers
On Tue, Dec 12, 2023 at 4:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Dec 12, 2023 at 10:27:09AM +0530, Dilip Kumar wrote:
> > Oops, I only included the code changes where I am adding injection
> > points and some comments to verify that, but missed the actual test
> > file. Attaching it here.
>
> I see.  Interesting that this requires persistent connections to work.
> That's something I've found clunky to rely on when the scenarios a
> test needs to deal with are rather complex.  That's an area that could
> be made easier to use outside of this patch..  Something got proposed
> by Andrew Dunstan to make the libpq routines usable through a perl
> module, for example.
>
> > Note:  I think the latest patches are conflicting with the head, can you rebase?
>
> Indeed, as per the recent manipulations in ipci.c for the shmem
> initialization areas.  Here goes a v6.

Sorry for replying late here. Another minor conflict has risen again.
It's minor enough to be ignored for a review.

On Tue, Nov 21, 2023 at 6:56 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Nov 20, 2023 at 04:53:45PM +0530, Ashutosh Bapat wrote:
> > On Wed, Oct 25, 2023 at 9:43 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> I have added some documentation to explain that, as well.  I am not
> >> wedded to the name proposed in the patch, so if you feel there is
> >> better, feel free to propose ideas.
> >
> > Actually with Attach and Detach terminology, INJECTION_POINT becomes
> > the place where we "declare" the injection point. So the documentation
> > needs to first explain INJECTION_POINT and then explain the other
> > operations.
>
> Sure.

This discussion has not been addressed in v6. I think the interface
needs to be documented in the order below
INJECTION_POINT - this declares an injection point - i.e. a place in
code where an external code can be injected (and run).
InjectionPointAttach() - this is used to associate ("attach") external
code to an injection point.
InjectionPointDetach() - this is used to disassociate ("detach")
external code from an injection point.

Specifying that InjectionPointAttach() "defines" an injection point
gives an impression that the injection point will be "somehow" added
to the code by calling InjectionPointAttach() which is not true. For
InjectionPointAttach() to be useful, the first argument to it should
be something already "declared" in the code using INJECTION_POINT().
Hence INJECTION_POINT needs to be mentioned in the documentation
first, followed by Attach and detach. The documentation needs to be
rephrased to use terms "declare", "attach" and "detach" instead of
"define", "run". The first set is aligned with the functionality
whereas the second set is aligned with the implementation.

Even if an INJECTION_POINT is not "declared" attach would succeed but
doesn't do anything. I think this needs to be made clear in the
documentation. Better if we could somehow make Attach() fail if the
specified injection point is not "declared" using INJECTION_POINT. Of
course we don't want to bloat the hash table with all "declared"
injection points even if they aren't being attached to and hence not
used. I think, exposing the current injection point strings as
#defines and encouraging users to use these macros instead of string
literals will be a good start.

With the current implementation it's possible to "declare" injection
point with same name at multiple places. It's useful but is it
intended?

/* Field sizes */
#define INJ_NAME_MAXLEN 64
#define INJ_LIB_MAXLEN 128
#define INJ_FUNC_MAXLEN 128
I think these limits should be either documented or specified in the
error messages for users to fix their code in case of
errors/unexpected behaviour.

Here are some code level comments on 0001

+typedef struct InjectionPointArrayEntry

This is not an array entry anymore. I think we should rename
InjectionPointEntry as SharedInjectionPointEntry and InjectionPointArrayEntry
as LocalInjectionPointEntry.

+/* utilities to handle the local array cache */
+static void
+injection_point_cache_add(const char *name,
+ InjectionPointCallback callback)
+{
... snip ...
+
+ entry = (InjectionPointCacheEntry *)
+ hash_search(InjectionPointCache, name, HASH_ENTER, &found);
+
+ if (!found)

The function is called only when the injection point is not found in the local
cache. So this condition will always be true. An Assert will help to make it
clear and also prevent an unintended callback replacement.

+#ifdef USE_INJECTION_POINTS
+static bool
+file_exists(const char *name)

There's similar function in jit.c and dfmgr.c. Can we not reuse that code?

+ /* Save the entry */
+ memcpy(entry_by_name->name, name, sizeof(entry_by_name->name));
+ entry_by_name->name[INJ_NAME_MAXLEN - 1] = '\0';
+ memcpy(entry_by_name->library, library, sizeof(entry_by_name->library));
+ entry_by_name->library[INJ_LIB_MAXLEN - 1] = '\0';
+ memcpy(entry_by_name->function, function, sizeof(entry_by_name->function));
+ entry_by_name->function[INJ_FUNC_MAXLEN - 1] = '\0';

Most of the code is using strncpy instead of memcpy. Why is this code different?

+ injection_callback = injection_point_cache_get(name);
+ if (injection_callback == NULL)
+ {
+ char path[MAXPGPATH];
+
+ /* Found, so just run the callback registered */

The condition indicates that the callback was not found. Comment looks wrong.

+ snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
+ entry_by_name->library, DLSUFFIX);
+
+ if (!file_exists(path))
+ elog(ERROR, "could not find injection library \"%s\"", path);
+
+ injection_callback = (InjectionPointCallback)
+ load_external_function(path, entry_by_name->function, true, NULL);
+
+ /* add it to the local cache when found */
+ injection_point_cache_add(name, injection_callback);
+ }
+

Consider case

Backend 2
InjectionPointAttach("xyz", "abc", "pqr");

Backend 1
INJECTION_POINT("xyz");

Backend 2
InjectionPointDetach("xyz");
InjectionPointAttach("xyz", "uvw", "lmn");

Backend 1
INJECTION_POINT("xyz");

IIUC, the last INJECTION_POINT would run abc.pqr instead of uvw.lmn.
Am I correct?

To fix this, we have to a. either save qualified name of the function in local
cache too OR resolve the function name every time INJECTION_POINT is invoked
and is found in the shared hash table. The first one option is cheaper I think.
But it will be good if we can invalidate the local entry when the global entry
changes. To keep code simple, we may choose to ignore close race conditions
where INJECTION_POINT is run while InjectionPointAttach or InjectionPointDetach
is happening. But this way we don't have to look up shared hash table every
time INJECTION_POINT is invoked thus improving performance.

I will look at 0002 next.

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: [17] CREATE SUBSCRIPTION ... SERVER
Next
From: Magnus Hagander
Date:
Subject: Re: Commitfest manager January 2024