
Injection point locking

From: Heikki Linnakangas
InjectionPointRun() acquires InjectionPointLock, looks up the hash 
entry, and releases the lock:

>     LWLockAcquire(InjectionPointLock, LW_SHARED);
>     entry_by_name = (InjectionPointEntry *)
>         hash_search(InjectionPointHash, name,
>                     HASH_FIND, &found);
>     LWLockRelease(InjectionPointLock);

Later, it reads fields from the entry it looked up:

>         /* not found in local cache, so load and register */
>         snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
>                  entry_by_name->library, DLSUFFIX);

Isn't that a straightforward race condition, if the injection point is 
detached in between?


Another thing:

I wanted to use injection points to inject an error early at backend 
startup, to write a test case for the scenarios that Jacob pointed out at 
https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com. 
But I can't do that, because InjectionPointRun() requires a PGPROC 
entry, because it uses an LWLock. That also makes it impossible to use 
injection points in the postmaster. Any chance we could allow injection 
points to be triggered without a PGPROC entry? Could we use a simple 
spinlock instead? With a fast path for the case that no injection points 
are attached or something?

-- 
Heikki Linnakangas
Neon (https://neon.tech)



Re: Injection point locking

From: Tom Lane
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> ... I can't do that, because InjectionPointRun() requires a PGPROC 
> entry, because it uses an LWLock. That also makes it impossible to use 
> injection points in the postmaster. Any chance we could allow injection 
> points to be triggered without a PGPROC entry? Could we use a simple 
> spinlock instead? With a fast path for the case that no injection points 
> are attached or something?

Even taking a spinlock in the postmaster is contrary to project
policy.  Maybe we could look the other way for debug-only code,
but it seems like a pretty horrible precedent.

Given your point that the existing locking is just a fig leaf
anyway, maybe we could simply not have any?  Then it's on the
test author to be sure that the injection point won't be
getting invoked when it's about to be removed.  Or just rip
out the removal capability, and say that once installed an
injection point is there until postmaster shutdown (or till
shared memory reinitialization, anyway).

            regards, tom lane



Re: Injection point locking

From: Michael Paquier
On Mon, Jun 24, 2024 at 01:29:38PM +0300, Heikki Linnakangas wrote:
> InjectionPointRun() acquires InjectionPointLock, looks up the hash entry,
> and releases the lock:
>
> >     LWLockAcquire(InjectionPointLock, LW_SHARED);
> >     entry_by_name = (InjectionPointEntry *)
> >         hash_search(InjectionPointHash, name,
> >                     HASH_FIND, &found);
> >     LWLockRelease(InjectionPointLock);
>
> Later, it reads fields from the entry it looked up:
>
> >         /* not found in local cache, so load and register */
> >         snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
> >                  entry_by_name->library, DLSUFFIX);
>
> Isn't that a straightforward race condition, if the injection point is
> detached in between?

This is a feature, not a bug :)

Jokes apart, this is behavior that Noah was looking for: being able to
detach a point emulates what a debugger does with a breakpoint, for
some of his tests with concurrent DDL bugs, so not taking a lock while
running a point is important.  It's true, though, that we could always
delay the LWLock release until the local cache is loaded, but would it
really matter?
--
Michael


Re: Injection point locking

From: Noah Misch
On Mon, Jun 24, 2024 at 11:03:09AM -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
> > ... I can't do that, because InjectionPointRun() requires a PGPROC 
> > entry, because it uses an LWLock. That also makes it impossible to use 
> > injection points in the postmaster. Any chance we could allow injection 
> > points to be triggered without a PGPROC entry? Could we use a simple 
> > spinlock instead?

That sounds fine to me.  If calling hash_search() with a spinlock feels too
awful, a list to linear-search could work.

> > With a fast path for the case that no injection points 
> > are attached or something?
> 
> Even taking a spinlock in the postmaster is contrary to project
> policy.  Maybe we could look the other way for debug-only code,
> but it seems like a pretty horrible precedent.

If you're actually using an injection point in the postmaster, that would be
the least of concerns.  It is something of a concern for running an injection
point build while not attaching any injection point.  One solution could be a
GUC to control whether the postmaster participates in injection points.
Another could be to make the data structure readable with atomics only.

> Given your point that the existing locking is just a fig leaf
> anyway, maybe we could simply not have any?  Then it's on the
> test author to be sure that the injection point won't be
> getting invoked when it's about to be removed.

That's tricky with injection_points_set_local() plus an injection point at a
frequently-reached location.  It's one thing to control when the
injection_points_set_local() process reaches the injection point.  It's too
hard to control all the other processes that just reach the injection point
and conclude it's not for their PID.

> Or just rip
> out the removal capability, and say that once installed an
> injection point is there until postmaster shutdown (or till
> shared memory reinitialization, anyway).

That could work.  Tests do need a way to soft-disable, but it's okay with me
if nothing can reclaim the resources.



Re: Injection point locking

From: Michael Paquier
On Mon, Jun 24, 2024 at 11:03:09AM -0400, Tom Lane wrote:
> Given your point that the existing locking is just a fig leaf
> anyway, maybe we could simply not have any?  Then it's on the
> test author to be sure that the injection point won't be
> getting invoked when it's about to be removed.

Ripping out the LWLock and putting the responsibility on the test
author would work for me.  I kept wondering when somebody would come up
with a case for triggering a point in the postmaster, without ever
pinning one down myself, so that was my original reasoning for the
lock.

Looking at all the points currently in the tree, nothing cares about
the concurrent locking when attaching or detaching a point, so perhaps,
based on these experiences, it is a good thing to just let this LWLock
go.  This should not impact the availability of the tests, either.

> Or just rip
> out the removal capability, and say that once installed an
> injection point is there until postmaster shutdown (or till
> shared memory reinitialization, anyway).

But not that.  Being able to remove points on the fly can be important
in some cases, for example where you'd still want to issue an ERROR
(partial write path is one case) in a SQL test, then remove it in a
follow-up SQL query to not trigger the same ERROR.
--
Michael


Re: Injection point locking

From: Noah Misch
On Tue, Jun 25, 2024 at 11:14:57AM +0900, Michael Paquier wrote:
> On Mon, Jun 24, 2024 at 01:29:38PM +0300, Heikki Linnakangas wrote:
> > InjectionPointRun() acquires InjectionPointLock, looks up the hash entry,
> > and releases the lock:
> > 
> > >     LWLockAcquire(InjectionPointLock, LW_SHARED);
> > >     entry_by_name = (InjectionPointEntry *)
> > >         hash_search(InjectionPointHash, name,
> > >                     HASH_FIND, &found);
> > >     LWLockRelease(InjectionPointLock);
> > 
> > Later, it reads fields from the entry it looked up:
> > 
> > >         /* not found in local cache, so load and register */
> > >         snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
> > >                  entry_by_name->library, DLSUFFIX);
> > 
> > Isn't that a straightforward race condition, if the injection point is
> > detached in between?
> 
> This is a feature, not a bug :)
> 
> Jokes apart, this is a behavior that Noah was looking for so as it is
> possible to detach a point to emulate what a debugger would do with a
> breakpoint for some of his tests with concurrent DDL bugs, so not
> taking a lock while running a point is important.  It's true, though,
> that we could always delay the LWLock release once the local cache is
> loaded, but would it really matter?

I think your last sentence is what Heikki is saying should happen, and I
agree.  Yes, it matters.  As written, InjectionPointRun() could cache an
entry_by_name->function belonging to a different injection point.



Re: Injection point locking

From: Michael Paquier
On Tue, Jun 25, 2024 at 09:10:06AM -0700, Noah Misch wrote:
> I think your last sentence is what Heikki is saying should happen, and I
> agree.  Yes, it matters.  As written, InjectionPointRun() could cache an
> entry_by_name->function belonging to a different injection point.

That's true, we could delay the release of the lock to happen just
before a callback is run.

Now, how much do people wish to see for the postmaster bits mentioned
upthread?  Holding a spinlock for that long is not going to work, so we
could just remove the lock, let developers deal with the consequences,
and use the flexibility gained from the removal to allow this stuff in
more areas.  All the existing tests are OK with that, and I think the
same holds for what you have proposed for the concurrency issues with
in-place updates of catalogs.  Or we could live with a lock-free path
only when running in the postmaster, but that's a bit weird.

Note that with the current callbacks in the module, assuming that a
point is added within BackendStartup() in the postmaster like the
attached, an ERROR is promoted to a FATAL, taking down the cluster.  A
NOTICE of course works fine.  Waits with condition variables are not
really OK.  How much are you looking for here?

The shmem state being initialized in the DSM registry is not something
that's going to work in the context of the postmaster, but we could
tweak the module so that it can be loaded, initializing the shared state
with the shmem hooks and falling back to a DSM registry when the
library is not loaded with shared_preload_libraries.  For example, see
the POC attached, where I've played with injection points in
BackendStartup(), which is the area I'm guessing Heikki was looking
at.
--
Michael


Re: Injection point locking

From: Michael Paquier
On Wed, Jun 26, 2024 at 10:56:12AM +0900, Michael Paquier wrote:
> That's true, we could delay the release of the lock to happen just
> before a callback is run.

I am not sure what else we can do for the postmaster case for now, so
I've addressed the concern about the lock release delay when running a
point, and pushed a patch for it.
--
Michael


Re: Injection point locking

From: Heikki Linnakangas
On 25/06/2024 05:25, Noah Misch wrote:
> On Mon, Jun 24, 2024 at 11:03:09AM -0400, Tom Lane wrote:
>> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>>> ... I can't do that, because InjectionPointRun() requires a PGPROC
>>> entry, because it uses an LWLock. That also makes it impossible to use
>>> injection points in the postmaster. Any chance we could allow injection
>>> points to be triggered without a PGPROC entry? Could we use a simple
>>> spinlock instead?
> 
> That sounds fine to me.  If calling hash_search() with a spinlock feels too
> awful, a list to linear-search could work.

>>> With a fast path for the case that no injection points
>>> are attached or something?
>>
>> Even taking a spinlock in the postmaster is contrary to project
>> policy.  Maybe we could look the other way for debug-only code,
>> but it seems like a pretty horrible precedent.
> 
> If you're actually using an injection point in the postmaster, that would be
> the least of concerns.  It is something of a concern for running an injection
> point build while not attaching any injection point.  One solution could be a
> GUC to control whether the postmaster participates in injection points.
> Another could be to make the data structure readable with atomics only.

I came up with the attached. It replaces the shmem hash table with an 
array that's scanned linearly. On each slot in the array, there's a 
generation number that indicates whether the slot is in use, and allows 
detecting concurrent modifications without locks. The attach/detach 
operations still hold the LWLock, but InjectionPointRun() is now 
lock-free, so it can be used without a PGPROC entry.

It's now usable from postmaster too. However, it's theoretically 
possible that if shared memory is overwritten with garbage, the garbage 
looks like a valid injection point with a name that matches one of the 
injection points that postmaster looks at. That seems unlikely enough 
that I think we can accept the risk. To close that gap 100% I think a 
GUC is the only solution.

Note that until we actually add an injection point to a function that 
runs in the postmaster, there's no risk. If we're uneasy about that, we 
could add an assertion to InjectionPointRun() to prevent it from running 
in the postmaster, so that we don't cross that line inadvertently.

Thoughts?

-- 
Heikki Linnakangas
Neon (https://neon.tech)


Re: Injection point locking

From: Tom Lane
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> Note that until we actually add an injection point to a function that 
> runs in the postmaster, there's no risk. If we're uneasy about that, we 
> could add an assertion to InjectionPointRun() to prevent it from running 
> in the postmaster, so that we don't cross that line inadvertently.

As long as we consider injection points to be a debug/test feature
only, I think it's a net positive that one can be set in the
postmaster.  I'd be considerably more uncomfortable if somebody
wanted to do that in production, but maybe it'd be fine even then.

            regards, tom lane



Re: Injection point locking

From: Michael Paquier
On Mon, Jul 08, 2024 at 10:17:49AM -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> Note that until we actually add an injection point to a function that
>> runs in the postmaster, there's no risk. If we're uneasy about that, we
>> could add an assertion to InjectionPointRun() to prevent it from running
>> in the postmaster, so that we don't cross that line inadvertently.

AFAIU, you want to be able to do that to enforce some protocol checks.
That's a fine goal.

> As long as we consider injection points to be a debug/test feature
> only, I think it's a net positive that one can be set in the
> postmaster. I'd be considerably more uncomfortable if somebody
> wanted to do that in production, but maybe it'd be fine even then.

This is documented as a developer feature for tests, the docs are
clear about that.
--
Michael


Re: Injection point locking

From: Michael Paquier
On Mon, Jul 08, 2024 at 04:21:37PM +0300, Heikki Linnakangas wrote:
> I came up with the attached. It replaces the shmem hash table with an array
> that's scanned linearly. On each slot in the array, there's a generation
> number that indicates whether the slot is in use, and allows detecting
> concurrent modifications without locks. The attach/detach operations still
> hold the LWLock, but InjectionPointRun() is now lock-free, so it can be used
> without a PGPROC entry.

Okay, noted.

> It's now usable from postmaster too. However, it's theoretically possible
> that if shared memory is overwritten with garbage, the garbage looks like a
> valid injection point with a name that matches one of the injection points
> that postmaster looks at. That seems unlikely enough that I think we can
> accept the risk. To close that gap 100% I think a GUC is the only solution.

This does not worry me much, FWIW.

+ * optimization to.avoid scanning through the whole entry, in the common case

s/to.avoid/to avoid/

+     * generation counter on each entry to to allow safe, lock-free reading.
s/to to/to/

+     * we're looking for is concurrently added or remoed, we might or might
s/remoed/removed/

+    if (max_inuse == 0)
+    {
+        if (InjectionPointCache)
+        {
+            hash_destroy(InjectionPointCache);
+            InjectionPointCache = NULL;
+        }
+        return false;

In InjectionPointCacheRefresh(), this points to nothing in the cache,
so it should return NULL, not false, even if both are 0.

 typedef struct InjectionPointCacheEntry
 {
     char        name[INJ_NAME_MAXLEN];
+    int            slot_idx;
+    uint64        generation;
     char        private_data[INJ_PRIVATE_MAXLEN];
     InjectionPointCallback callback;
 } InjectionPointCacheEntry;

It may be worth mentioning that generation is a copy of
InjectionPointEntry's generation, cross-checked at runtime with the
shmem entry to see whether the cache is consistent with shmem under the
same point name.

+        generation = pg_atomic_read_u64(&entry->generation);
+        if (generation % 2 == 0)
+            continue;
In the loops of InjectionPointCacheRefresh() and
InjectionPointDetach(), perhaps a comment should say that the slot is
unused, hence skipped, when the generation is even.

InjectionPointDetach() has this code block at its end:
    if (!found)
        return false;
    return true;

Not the fault of this patch, but this can just return "found".

The tricks with max_inuse to make the shmem lookups cheaper are
interesting.

+        pg_read_barrier();
+        if (memcmp(entry->name, name, namelen + 1) != 0)
+            continue;
Why this barrier when checking the name of a shmem entry before
reloading it in the local cache?  Perhaps the reason should be
commented?

+        pg_read_barrier();
+        if (pg_atomic_read_u64(&entry->generation) != generation)
+            continue;            /* was detached concurrently */
+
+        return injection_point_cache_load(&local_copy, idx, generation);

So, in InjectionPointCacheRefresh(), when a point is loaded into the
local cache for the first time, the read of "generation" is the
tipping point: it is possible to take a breakpoint at the beginning of
injection_point_cache_load(), detach then attach the point.  What
matters is that we are going to use the data in local_copy, even if
shmem may have something entirely different.  Hmm.  Okay.  It is a bit
annoying that the entry is just discarded and ignored if the local
copy and shmem generations don't match?  Could it be more
user-friendly to go back to the beginning of ActiveInjectionPoints and
re-check the whole rather than return a NULL callback?

-   if (private_data != NULL)
-       memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
+   memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);

private_data could be NULL, so why make the memcpy() unconditional?
--
Michael


Re: Injection point locking

From: Heikki Linnakangas
On 09/07/2024 08:16, Michael Paquier wrote:
>   typedef struct InjectionPointCacheEntry
>   {
>       char        name[INJ_NAME_MAXLEN];
> +    int            slot_idx;
> +    uint64        generation;
>       char        private_data[INJ_PRIVATE_MAXLEN];
>       InjectionPointCallback callback;
>   } InjectionPointCacheEntry;
> 
> May be worth mentioning that generation is a copy of
> InjectionPointEntry's generation cross-checked at runtime with the
> shmem entry to see if we have a cache consistent with shmem under the
> same point name.

Added a comment.

> +        generation = pg_atomic_read_u64(&entry->generation);
> +        if (generation % 2 == 0)
> +            continue;
> In the loops of InjectionPointCacheRefresh() and
> InjectionPointDetach(), perhaps this should say that the slot is not
> used hence skipped when generation is even.

Added a brief "/* empty slot */" comment

> InjectionPointDetach() has this code block at its end:
>      if (!found)
>          return false;
>      return true;
> 
> Not the fault of this patch, but this can just return "found".

Done.

> The tricks with max_inuse to make the shmem lookups cheaper are
> interesting.
> 
> +        pg_read_barrier();
> +        if (memcmp(entry->name, name, namelen + 1) != 0)
> +            continue;
> Why this barrier when checking the name of a shmem entry before
> reloading it in the local cache?  Perhaps the reason should be
> commented?

Added a comment.

> +        pg_read_barrier();
> +        if (pg_atomic_read_u64(&entry->generation) != generation)
> +            continue;            /* was detached concurrently */
> +
> +        return injection_point_cache_load(&local_copy, idx, generation);
> 
> So, in InjectionPointCacheRefresh(), when a point is loaded into the
> local cache for the first time, the read of "generation" is the
> tipping point: it is possible to take a breakpoint at the beginning of
> injection_point_cache_load(), detach then attach the point.  What
> matters is that we are going to use the data in local_copy, even if
> shmem may have something entirely different.  Hmm.  Okay.  It is a bit
> annoying that the entry is just discarded and ignored if the local
> copy and shmem generations don't match?  Could it be more
> user-friendly to go back to the beginning of ActiveInjectionPoints and
> re-check the whole rather than return a NULL callback?

I thought about it, but no. If the generation number doesn't match, 
there are a few possibilities:

1. The entry was what we were looking for, but it was concurrently 
detached. Return NULL is correct in that case.

2. The entry was what we were looking for, but it was concurrently 
detached, and was then immediately reattached. NULL is a fine return 
value in that case too. When Run runs concurrently with Detach+Attach, 
you don't get any guarantee whether the actual apparent order is 
"Detach, Attach, Run", "Detach, Run, Attach", or "Run, Detach, Attach". 
NULL result corresponds to the "Detach, Run, Attach" ordering.

3. The entry was not actually what we were looking for. The name 
comparison falsely matched just because the slot was concurrently 
detached and recycled for a different injection point. We must continue 
the search in that case.

I added a comment to the top of the loop to explain scenario 2. And a 
comment to the "continue" to explain scenario 3, because that's a bit subtle.

> -   if (private_data != NULL)
> -       memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
> +   memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
> 
> private_data could be NULL, hence why the memcpy()?

It cannot be NULL.  You can pass NULL or a shorter length to 
InjectionPointAttach(), but we don't store the length in shared memory.

Attached is a new version. No other changes except for fixes for the 
things you pointed out and comments.

-- 
Heikki Linnakangas
Neon (https://neon.tech)


Re: Injection point locking

From: Michael Paquier
On Tue, Jul 09, 2024 at 12:12:04PM +0300, Heikki Linnakangas wrote:
> I thought about it, but no. If the generation number doesn't match, there
> are a few possibilities:
>
> 1. The entry was what we were looking for, but it was concurrently detached.
> Return NULL is correct in that case.
>
> 2. The entry was what we were looking for, but it was concurrently detached,
> and was then immediately reattached. NULL is a fine return value in that
> case too. When Run runs concurrently with Detach+Attach, you don't get any
> guarantee whether the actual apparent order is "Detach, Attach, Run",
> "Detach, Run, Attach", or "Run, Detach, Attach". NULL result corresponds to
> the "Detach, Run, Attach" ordering.
>
> 3. The entry was not actually what we were looking for. The name comparison
> falsely matched just because the slot was concurrently detached and recycled
> for a different injection point. We must continue the search in that case.
>
> I added a comment to the top of the loop to explain scenario 2. And a
> comment to the "continue" to explain scenario 3, because that's a bit subtle.

Okay.  I am fine with your arguments here.  There is still an argument,
imo, for looping back to the beginning of the ActiveInjectionPoints
entries if we find an entry with a matching name whose generation does
not match the local copy, for the concurrent detach-attach case, but
just moving on to the following entries is OK by me as well.

The new comments in InjectionPointCacheRefresh() are nice
improvements.  Thanks for that.
--
Michael


Re: Injection point locking

From: Heikki Linnakangas
On 10/07/2024 06:44, Michael Paquier wrote:
> On Tue, Jul 09, 2024 at 12:12:04PM +0300, Heikki Linnakangas wrote:
>> I thought about it, but no. If the generation number doesn't match, there
>> are a few possibilities:
>>
>> 1. The entry was what we were looking for, but it was concurrently detached.
>> Return NULL is correct in that case.
>>
>> 2. The entry was what we were looking for, but it was concurrently detached,
>> and was then immediately reattached. NULL is a fine return value in that
>> case too. When Run runs concurrently with Detach+Attach, you don't get any
>> guarantee whether the actual apparent order is "Detach, Attach, Run",
>> "Detach, Run, Attach", or "Run, Detach, Attach". NULL result corresponds to
>> the "Detach, Run, Attach" ordering.
>>
>> 3. The entry was not actually what we were looking for. The name comparison
>> falsely matched just because the slot was concurrently detached and recycled
>> for a different injection point. We must continue the search in that case.
>>
>> I added a comment to the top of the loop to explain scenario 2. And a
>> comment to the "continue" to explain scenario 3, because that's a bit subtle.
> 
> Okay.  I am fine with your arguments here.  There is still an argument
> imo about looping back at the beginning of ActiveInjectionPoints
> entries if we find an entry with a matching name but the generation
> does not match with the local copy for the detach-attach concurrent
> case, but just moving on with the follow-up entries is also OK by me,
> as well.
> 
> The new comments in InjectionPointCacheRefresh() are nice
> improvements.  Thanks for that.

Ok, committed this.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: Injection point locking

From: Michael Paquier
On Mon, Jul 15, 2024 at 10:55:26AM +0300, Heikki Linnakangas wrote:
> Ok, committed this.

Okidoki.  Thanks!
--
Michael
