Re: Injection point locking - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Injection point locking |
Date | |
Msg-id | ZozHnQ5QZSjI4CPH@paquier.xyz Whole thread Raw |
In response to | Re: Injection point locking (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: Injection point locking
|
List | pgsql-hackers |
On Mon, Jul 08, 2024 at 04:21:37PM +0300, Heikki Linnakangas wrote: > I came up with the attached. It replaces the shmem hash table with an array > that's scanned linearly. On each slot in the array, there's a generation > number that indicates whether the slot is in use, and allows detecting > concurrent modifications without locks. The attach/detach operations still > hold the LWLock, but InjectionPointRun() is now lock-free, so it can be used > without a PGPROC entry. Okay, noted. > It's now usable from postmaster too. However, it's theoretically possible > that if shared memory is overwritten with garbage, the garbage looks like a > valid injection point with a name that matches one of the injection points > that postmaster looks at. That seems unlikely enough that I think we can > accept the risk. To close that gap 100% I think a GUC is the only solution. This does not worry me much, FWIW. + * optimization to.avoid scanning through the whole entry, in the common case s/to.avoid/to avoid/ + * generation counter on each entry to to allow safe, lock-free reading. s/to to/to/ + * we're looking for is concurrently added or remoed, we might or might s/remoed/removed/ + if (max_inuse == 0) + { + if (InjectionPointCache) + { + hash_destroy(InjectionPointCache); + InjectionPointCache = NULL; + } + return false; In InjectionPointCacheRefresh(), this points to nothing in the cache, so it should return NULL not false, even both are 0. typedef struct InjectionPointCacheEntry { char name[INJ_NAME_MAXLEN]; + int slot_idx; + uint64 generation; char private_data[INJ_PRIVATE_MAXLEN]; InjectionPointCallback callback; } InjectionPointCacheEntry; May be worth mentioning that generation is a copy of InjectionPointEntry's generation cross-checked at runtime with the shmem entry to see if we have a cache consistent with shmem under the same point name. + generation = pg_atomic_read_u64(&entry->generation); + if (generation % 2 == 0) + continue; In the loops of InjectionPointCacheRefresh() and InjectionPointDetach(), perhaps this should say that the slot is not used hence skipped when generation is even. InjectionPointDetach() has this code block at its end: if (!found) return false; return true; Not the fault of this patch, but this can just return "found". The tricks with max_inuse to make the shmem lookups cheaper are interesting. + pg_read_barrier(); + if (memcmp(entry->name, name, namelen + 1) != 0) + continue; Why this barrier when checking the name of a shmem entry before reloading it in the local cache? Perhaps the reason should be commented? + pg_read_barrier(); + if (pg_atomic_read_u64(&entry->generation) != generation) + continue; /* was detached concurrently */ + + return injection_point_cache_load(&local_copy, idx, generation); So, in InjectionPointCacheRefresh(), when a point is loaded into the local cache for the first time, the read of "generation" is the tipping point: it is possible to take a breakpoint at the beginning of injection_point_cache_load(), detach then attach the point. What matters is that we are going to use the data in local_copy, even if shmem may have something entirely different. Hmm. Okay. It is a bit annoying that the entry is just discarded and ignored if the local copy and shmem generations don't match? Could it be more user-friendly to go back to the beginning of ActiveInjectionPoints and re-check the whole rather than return a NULL callback? - if (private_data != NULL) - memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN); + memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN); private_data could be NULL, hence why the memcpy()? -- Michael
Attachment
pgsql-hackers by date: