Thread: Injection point locking
InjectionPointRun() acquires InjectionPointLock, looks up the hash entry, and
releases the lock:

>     LWLockAcquire(InjectionPointLock, LW_SHARED);
>     entry_by_name = (InjectionPointEntry *)
>         hash_search(InjectionPointHash, name,
>                     HASH_FIND, &found);
>     LWLockRelease(InjectionPointLock);

Later, it reads fields from the entry it looked up:

>     /* not found in local cache, so load and register */
>     snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
>              entry_by_name->library, DLSUFFIX);

Isn't that a straightforward race condition, if the injection point is
detached in between?

Another thing: I wanted to use injection points to inject an error early at
backend startup, to write a test case for the scenarios that Jacob pointed
out at
https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com.
But I can't do that, because InjectionPointRun() requires a PGPROC entry,
because it uses an LWLock. That also makes it impossible to use injection
points in the postmaster.

Any chance we could allow injection points to be triggered without a PGPROC
entry? Could we use a simple spinlock instead? With a fast path for the case
that no injection points are attached or something?

--
Heikki Linnakangas
Neon (https://neon.tech)
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> ... I can't do that, because InjectionPointRun() requires a PGPROC
> entry, because it uses an LWLock. That also makes it impossible to use
> injection points in the postmaster. Any chance we could allow injection
> points to be triggered without a PGPROC entry? Could we use a simple
> spinlock instead? With a fast path for the case that no injection points
> are attached or something?

Even taking a spinlock in the postmaster is contrary to project policy.
Maybe we could look the other way for debug-only code, but it seems like a
pretty horrible precedent.

Given your point that the existing locking is just a fig leaf anyway, maybe
we could simply not have any? Then it's on the test author to be sure that
the injection point won't be getting invoked when it's about to be removed.
Or just rip out the removal capability, and say that once installed, an
injection point is there until postmaster shutdown (or till shared memory
reinitialization, anyway).

			regards, tom lane
On Mon, Jun 24, 2024 at 01:29:38PM +0300, Heikki Linnakangas wrote:
> InjectionPointRun() acquires InjectionPointLock, looks up the hash entry,
> and releases the lock:
>
> >     LWLockAcquire(InjectionPointLock, LW_SHARED);
> >     entry_by_name = (InjectionPointEntry *)
> >         hash_search(InjectionPointHash, name,
> >                     HASH_FIND, &found);
> >     LWLockRelease(InjectionPointLock);
>
> Later, it reads fields from the entry it looked up:
>
> >     /* not found in local cache, so load and register */
> >     snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
> >              entry_by_name->library, DLSUFFIX);
>
> Isn't that a straightforward race condition, if the injection point is
> detached in between?

This is a feature, not a bug :)

Jokes apart, this is a behavior that Noah was looking for, so that it is
possible to detach a point to emulate what a debugger would do with a
breakpoint, for some of his tests with concurrent DDL bugs; hence not
taking a lock while running a point is important. It's true, though, that
we could always delay the LWLock release until the local cache is loaded,
but would it really matter?
--
Michael
On Mon, Jun 24, 2024 at 11:03:09AM -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> ... I can't do that, because InjectionPointRun() requires a PGPROC
>> entry, because it uses an LWLock. That also makes it impossible to use
>> injection points in the postmaster. Any chance we could allow injection
>> points to be triggered without a PGPROC entry? Could we use a simple
>> spinlock instead?

That sounds fine to me. If calling hash_search() with a spinlock feels too
awful, a list to linear-search could work.

>> With a fast path for the case that no injection points
>> are attached or something?
>
> Even taking a spinlock in the postmaster is contrary to project
> policy. Maybe we could look the other way for debug-only code,
> but it seems like a pretty horrible precedent.

If you're actually using an injection point in the postmaster, that would
be the least of concerns. It is something of a concern for running an
injection point build while not attaching any injection point. One
solution could be a GUC to control whether the postmaster participates in
injection points. Another could be to make the data structure readable
with atomics only.

> Given your point that the existing locking is just a fig leaf
> anyway, maybe we could simply not have any? Then it's on the
> test author to be sure that the injection point won't be
> getting invoked when it's about to be removed.

That's tricky with injection_points_set_local() plus an injection point at
a frequently-reached location. It's one thing to control when the
injection_points_set_local() process reaches the injection point. It's too
hard to control all the other processes that just reach the injection
point and conclude it's not for their PID.

> Or just rip
> out the removal capability, and say that once installed an
> injection point is there until postmaster shutdown (or till
> shared memory reinitialization, anyway).

That could work. Tests do need a way to soft-disable, but it's okay with
me if nothing can reclaim the resources.
On Mon, Jun 24, 2024 at 11:03:09AM -0400, Tom Lane wrote:
> Given your point that the existing locking is just a fig leaf
> anyway, maybe we could simply not have any? Then it's on the
> test author to be sure that the injection point won't be
> getting invoked when it's about to be removed.

That would work for me: put the responsibility on the test author and rip
out the LWLock. I was waiting for somebody to come up with a case where
they'd want an injection point in the postmaster; without a concrete case,
the lock stayed, so there was that from my side originally. Looking at all
the points currently in the tree, nothing cares about the concurrent
locking when attaching or detaching a point, so perhaps it is a good
thing, based on these experiences, to just let this LWLock go. This should
not impact the availability of the tests, either.

> Or just rip
> out the removal capability, and say that once installed an
> injection point is there until postmaster shutdown (or till
> shared memory reinitialization, anyway).

But not that. Being able to remove points on the fly can be important in
some cases, for example where you'd still want to issue an ERROR (the
partial write path is one case) in a SQL test, then remove the point in a
follow-up SQL query so as not to trigger the same ERROR again.
--
Michael
On Tue, Jun 25, 2024 at 11:14:57AM +0900, Michael Paquier wrote:
> On Mon, Jun 24, 2024 at 01:29:38PM +0300, Heikki Linnakangas wrote:
>> InjectionPointRun() acquires InjectionPointLock, looks up the hash entry,
>> and releases the lock:
>>
>> >     LWLockAcquire(InjectionPointLock, LW_SHARED);
>> >     entry_by_name = (InjectionPointEntry *)
>> >         hash_search(InjectionPointHash, name,
>> >                     HASH_FIND, &found);
>> >     LWLockRelease(InjectionPointLock);
>>
>> Later, it reads fields from the entry it looked up:
>>
>> >     /* not found in local cache, so load and register */
>> >     snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
>> >              entry_by_name->library, DLSUFFIX);
>>
>> Isn't that a straightforward race condition, if the injection point is
>> detached in between?
>
> This is a feature, not a bug :)
>
> Jokes apart, this is a behavior that Noah was looking for so as it is
> possible to detach a point to emulate what a debugger would do with a
> breakpoint for some of his tests with concurrent DDL bugs, so not
> taking a lock while running a point is important. It's true, though,
> that we could always delay the LWLock release once the local cache is
> loaded, but would it really matter?

I think your last sentence is what Heikki is saying should happen, and I
agree. Yes, it matters. As written, InjectionPointRun() could cache an
entry_by_name->function belonging to a different injection point.
On Tue, Jun 25, 2024 at 09:10:06AM -0700, Noah Misch wrote:
> I think your last sentence is what Heikki is saying should happen, and I
> agree. Yes, it matters. As written, InjectionPointRun() could cache an
> entry_by_name->function belonging to a different injection point.

That's true, we could delay the release of the lock to happen just before
a callback is run.

Now, how much do people wish to see for the postmaster bits mentioned
upthread? Taking a spinlock for that long is not going to work, so we
could just remove the lock, let developers deal with that, and use the
flexibility of the lock removal to allow this stuff in more areas. All the
existing tests are OK with that, and I think that's also the case for what
you have proposed for the concurrency issues with in-place updates of
catalogs. Or we could live with a no-lock path only when going through
this code in the postmaster, but that's a bit weird.

Note that with the current callbacks in the module, assuming that a point
is added within BackendStartup() in the postmaster like the attached, an
ERROR is promoted to a FATAL, taking down the cluster. A NOTICE of course
works fine. Waits with condition variables are not really OK.

How much are you looking for here? The shmem state being initialized in
the DSM registry is not something that's going to work in the context of
the postmaster, but we could tweak the module so that it can be loaded,
initializing the shared state with the shmem hooks and falling back to a
DSM registry when the library is not loaded with
shared_preload_libraries. For example, see the POC attached, where I've
played with injection points in BackendStartup(), which is the area I'm
guessing Heikki was looking at.
--
Michael
On Wed, Jun 26, 2024 at 10:56:12AM +0900, Michael Paquier wrote:
> That's true, we could delay the release of the lock to happen just
> before a callback is run.

I am not sure what else we can do for the postmaster case for now, so I've
moved ahead with the concern about delaying the lock release when running
a point, and pushed a patch for it.
--
Michael
On 25/06/2024 05:25, Noah Misch wrote:
> On Mon, Jun 24, 2024 at 11:03:09AM -0400, Tom Lane wrote:
>> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>>> ... I can't do that, because InjectionPointRun() requires a PGPROC
>>> entry, because it uses an LWLock. That also makes it impossible to use
>>> injection points in the postmaster. Any chance we could allow injection
>>> points to be triggered without a PGPROC entry? Could we use a simple
>>> spinlock instead?
>
> That sounds fine to me. If calling hash_search() with a spinlock feels too
> awful, a list to linear-search could work.
>
>>> With a fast path for the case that no injection points
>>> are attached or something?
>>
>> Even taking a spinlock in the postmaster is contrary to project
>> policy. Maybe we could look the other way for debug-only code,
>> but it seems like a pretty horrible precedent.
>
> If you're actually using an injection point in the postmaster, that would
> be the least of concerns. It is something of a concern for running an
> injection point build while not attaching any injection point. One
> solution could be a GUC to control whether the postmaster participates in
> injection points. Another could be to make the data structure readable
> with atomics only.

I came up with the attached. It replaces the shmem hash table with an
array that's scanned linearly. On each slot in the array, there's a
generation number that indicates whether the slot is in use, and allows
detecting concurrent modifications without locks. The attach/detach
operations still hold the LWLock, but InjectionPointRun() is now
lock-free, so it can be used without a PGPROC entry.

It's now usable from the postmaster too. However, it's theoretically
possible that if shared memory is overwritten with garbage, the garbage
looks like a valid injection point with a name that matches one of the
injection points that the postmaster looks at. That seems unlikely enough
that I think we can accept the risk. To close that gap 100%, I think a GUC
is the only solution.

Note that until we actually add an injection point to a function that runs
in the postmaster, there's no risk. If we're uneasy about that, we could
add an assertion to InjectionPointRun() to prevent it from running in the
postmaster, so that we don't cross that line inadvertently.

Thoughts?

--
Heikki Linnakangas
Neon (https://neon.tech)
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> Note that until we actually add an injection point to a function that
> runs in the postmaster, there's no risk. If we're uneasy about that, we
> could add an assertion to InjectionPointRun() to prevent it from running
> in the postmaster, so that we don't cross that line inadvertently.

As long as we consider injection points to be a debug/test feature only, I
think it's a net positive that one can be set in the postmaster. I'd be
considerably more uncomfortable if somebody wanted to do that in
production, but maybe it'd be fine even then.

			regards, tom lane
On Mon, Jul 08, 2024 at 10:17:49AM -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> Note that until we actually add an injection point to a function that
>> runs in the postmaster, there's no risk. If we're uneasy about that, we
>> could add an assertion to InjectionPointRun() to prevent it from running
>> in the postmaster, so that we don't cross that line inadvertently.

AFAIU, you want to be able to do that to enforce some protocol checks.
That's a fine goal.

> As long as we consider injection points to be a debug/test feature
> only, I think it's a net positive that one can be set in the
> postmaster. I'd be considerably more uncomfortable if somebody
> wanted to do that in production, but maybe it'd be fine even then.

This is documented as a developer feature for tests; the docs are clear
about that.
--
Michael
On Mon, Jul 08, 2024 at 04:21:37PM +0300, Heikki Linnakangas wrote:
> I came up with the attached. It replaces the shmem hash table with an array
> that's scanned linearly. On each slot in the array, there's a generation
> number that indicates whether the slot is in use, and allows detecting
> concurrent modifications without locks. The attach/detach operations still
> hold the LWLock, but InjectionPointRun() is now lock-free, so it can be used
> without a PGPROC entry.

Okay, noted.

> It's now usable from postmaster too. However, it's theoretically possible
> that if shared memory is overwritten with garbage, the garbage looks like a
> valid injection point with a name that matches one of the injection points
> that postmaster looks at. That seems unlikely enough that I think we can
> accept the risk. To close that gap 100% I think a GUC is the only solution.

This does not worry me much, FWIW.

+ * optimization to.avoid scanning through the whole entry, in the common case

s/to.avoid/to avoid/

+ * generation counter on each entry to to allow safe, lock-free reading.

s/to to/to/

+ * we're looking for is concurrently added or remoed, we might or might

s/remoed/removed/

+	if (max_inuse == 0)
+	{
+		if (InjectionPointCache)
+		{
+			hash_destroy(InjectionPointCache);
+			InjectionPointCache = NULL;
+		}
+		return false;

In InjectionPointCacheRefresh(), this points to nothing in the cache, so
it should return NULL, not false, even if both are 0.

 typedef struct InjectionPointCacheEntry
 {
 	char		name[INJ_NAME_MAXLEN];
+	int			slot_idx;
+	uint64		generation;
 	char		private_data[INJ_PRIVATE_MAXLEN];
 	InjectionPointCallback callback;
 } InjectionPointCacheEntry;

It may be worth mentioning that "generation" is a copy of
InjectionPointEntry's generation, cross-checked at runtime with the shmem
entry to see if we have a cache consistent with shmem under the same point
name.

+		generation = pg_atomic_read_u64(&entry->generation);
+		if (generation % 2 == 0)
+			continue;

In the loops of InjectionPointCacheRefresh() and InjectionPointDetach(),
perhaps this should say that the slot is not used, hence skipped, when the
generation is even.

InjectionPointDetach() has this code block at its end:

	if (!found)
		return false;
	return true;

Not the fault of this patch, but this can just "return found".

The tricks with max_inuse to make the shmem lookups cheaper are
interesting.

+		pg_read_barrier();
+		if (memcmp(entry->name, name, namelen + 1) != 0)
+			continue;

Why this barrier when checking the name of a shmem entry before reloading
it in the local cache? Perhaps the reason should be commented?

+		pg_read_barrier();
+		if (pg_atomic_read_u64(&entry->generation) != generation)
+			continue;			/* was detached concurrently */
+
+		return injection_point_cache_load(&local_copy, idx, generation);

So, in InjectionPointCacheRefresh(), when a point is loaded into the local
cache for the first time, the read of "generation" is the tipping point:
it is possible to take a breakpoint at the beginning of
injection_point_cache_load(), detach, then attach the point. What matters
is that we are going to use the data in local_copy, even if shmem may have
something entirely different. Hmm. Okay.

It is a bit annoying that the entry is just discarded and ignored if the
local copy and shmem generations don't match. Could it be more
user-friendly to go back to the beginning of ActiveInjectionPoints and
re-check the whole array rather than return a NULL callback?

-	if (private_data != NULL)
-		memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
+	memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);

private_data could be NULL, hence why the memcpy()?
--
Michael
On 09/07/2024 08:16, Michael Paquier wrote:
> typedef struct InjectionPointCacheEntry
> {
> 	char		name[INJ_NAME_MAXLEN];
> +	int			slot_idx;
> +	uint64		generation;
> 	char		private_data[INJ_PRIVATE_MAXLEN];
> 	InjectionPointCallback callback;
> } InjectionPointCacheEntry;
>
> May be worth mentioning that generation is a copy of
> InjectionPointEntry's generation cross-checked at runtime with the
> shmem entry to see if we have a cache consistent with shmem under the
> same point name.

Added a comment.

> +		generation = pg_atomic_read_u64(&entry->generation);
> +		if (generation % 2 == 0)
> +			continue;
>
> In the loops of InjectionPointCacheRefresh() and
> InjectionPointDetach(), perhaps this should say that the slot is not
> used hence skipped when generation is even.

Added a brief "/* empty slot */" comment.

> InjectionPointDetach() has this code block at its end:
>
> 	if (!found)
> 		return false;
> 	return true;
>
> Not the fault of this patch, but this can just return "found".

Done.

> +		pg_read_barrier();
> +		if (memcmp(entry->name, name, namelen + 1) != 0)
> +			continue;
>
> Why this barrier when checking the name of a shmem entry before
> reloading it in the local cache? Perhaps the reason should be
> commented?

Added a comment.

> +		pg_read_barrier();
> +		if (pg_atomic_read_u64(&entry->generation) != generation)
> +			continue;			/* was detached concurrently */
> +
> +		return injection_point_cache_load(&local_copy, idx, generation);
>
> It is a bit annoying that the entry is just discarded and ignored if the
> local copy and shmem generations don't match? Could it be more
> user-friendly to go back to the beginning of ActiveInjectionPoints and
> re-check the whole rather than return a NULL callback?

I thought about it, but no. If the generation number doesn't match, there
are a few possibilities:

1. The entry was what we were looking for, but it was concurrently
detached. Returning NULL is correct in that case.

2. The entry was what we were looking for, but it was concurrently
detached and then immediately reattached. NULL is a fine return value in
that case too. When Run runs concurrently with Detach+Attach, you don't
get any guarantee whether the actual apparent order is "Detach, Attach,
Run", "Detach, Run, Attach", or "Run, Detach, Attach". The NULL result
corresponds to the "Detach, Run, Attach" ordering.

3. The entry was not actually what we were looking for. The name
comparison falsely matched just because the slot was concurrently detached
and recycled for a different injection point. We must continue the search
in that case.

I added a comment to the top of the loop to explain scenario 2, and a
comment to the "continue" to explain scenario 3, because that's a bit
subtle.

> -	if (private_data != NULL)
> -		memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
> +	memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
>
> private_data could be NULL, hence why the memcpy()?

It cannot be NULL. You can pass NULL, or a shorter length, to
InjectionPointAttach(), but we don't store the length in shared memory.

Attached is a new version. No other changes except for fixes for the
things you pointed out, and comments.

--
Heikki Linnakangas
Neon (https://neon.tech)
On Tue, Jul 09, 2024 at 12:12:04PM +0300, Heikki Linnakangas wrote:
> I thought about it, but no. If the generation number doesn't match, there
> are a few possibilities:
>
> 1. The entry was what we were looking for, but it was concurrently
> detached. Return NULL is correct in that case.
>
> 2. The entry was what we were looking for, but it was concurrently
> detached, and was then immediately reattached. NULL is a fine return
> value in that case too. When Run runs concurrently with Detach+Attach,
> you don't get any guarantee whether the actual apparent order is "Detach,
> Attach, Run", "Detach, Run, Attach", or "Run, Detach, Attach". NULL
> result corresponds to the "Detach, Run, Attach" ordering.
>
> 3. The entry was not actually what we were looking for. The name
> comparison falsely matched just because the slot was concurrently
> detached and recycled for a different injection point. We must continue
> the search in that case.
>
> I added a comment to the top of the loop to explain scenario 2. And a
> comment to the "continue" to explain scenario 3, because that's a bit
> subtle.

Okay, I am fine with your arguments here. There is still an argument, imo,
for looping back to the beginning of the ActiveInjectionPoints entries if
we find an entry whose name matches but whose generation does not match
the local copy, for the concurrent detach-attach case, but just moving on
to the follow-up entries is OK by me as well.

The new comments in InjectionPointCacheRefresh() are nice improvements.
Thanks for that.
--
Michael
On 10/07/2024 06:44, Michael Paquier wrote:
> On Tue, Jul 09, 2024 at 12:12:04PM +0300, Heikki Linnakangas wrote:
>> I thought about it, but no. If the generation number doesn't match,
>> there are a few possibilities:
>> [...]
>
> Okay. I am fine with your arguments here. There is still an argument
> imo about looping back at the beginning of ActiveInjectionPoints
> entries if we find an entry with a matching name but the generation
> does not match with the local copy for the detach-attach concurrent
> case, but just moving on with the follow-up entries is also OK by me,
> as well.
>
> The new comments in InjectionPointCacheRefresh() are nice
> improvements. Thanks for that.

Ok, committed this.

--
Heikki Linnakangas
Neon (https://neon.tech)
On Mon, Jul 15, 2024 at 10:55:26AM +0300, Heikki Linnakangas wrote:
> Ok, committed this.

Okidoki. Thanks!
--
Michael