Re: Weird test mixup - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Weird test mixup |
Date | |
Msg-id | Zjwa4lX6wADRAili@paquier.xyz Whole thread Raw |
In response to | Re: Weird test mixup (Noah Misch <noah@leadboat.com>) |
Responses |
Re: Weird test mixup
|
List | pgsql-hackers |
On Tue, May 07, 2024 at 11:53:10AM -0700, Noah Misch wrote: > On Tue, May 07, 2024 at 10:17:49AM +0900, Michael Paquier wrote: >> Always resetting >> condition->name when detaching a point is a simpler flow and saner >> IMO. >> >> Overall, this switches from one detach behavior to a different one, > > Can you say more about that? The only behavior change known to me is that a > given injection point workload uses more of INJ_MAX_CONDITION. If there's > another behavior change, it was likely unintended. As far as I read the previous patch, the conditions stored in InjectionPointSharedState would never be really gone, even if the points are removed from InjectionPointHash. > The problem I'm trying to tackle in this thread is to make > src/test/modules/gin installcheck-safe. $SUBJECT's commit 5105c90 started > that work, having seen the intarray test suite break when run concurrently > with the injection_points test suite. That combination still does break at > the exit-time race condition. To reproduce, apply this attachment to add > sleeps, and run: > > make -C src/test/modules/gin installcheck USE_MODULE_DB=1 & sleep 2; make -C contrib/intarray installcheck USE_MODULE_DB=1 Thanks for that. I am not really sure how to protect that without a small cost in flexibility for the cases of detach vs run paths. This comes down to the fact that a custom routine could be run while it could be detached concurrently, removing any stuff a callback could depend on in the module. It was mentioned upthread to add to InjectionPointCacheEntry a fixed area of memory that modules could use to store some "status" data, but it would not close the run/detach race because a backend could still hold a pointer to a callback, with concurrent backends playing with the contents of InjectionPointCacheEntry (concurrent detaches and attaches that would cause the same entries to be reused). One way to close entirely the window would be to hold InjectionPointLock longer in InjectionPointRun() until the callback finishes or until it triggers an ERROR. This would mean that the case you've mentioned in [1] would change, by blocking the detach() of s3 until the callback of s2 finishes. We don't have tests in the tree that do any of that, so holding InjectionPointLock longer would not break anything on HEAD. A detach being possible while the callback is run is something I've considered as valid in d86d20f0ba79, but perhaps that was too cute of me, even more with the use case of local points. > Separately, I see injection_points_attach() populates InjectionPointCondition > after InjectionPointAttach(). Shouldn't InjectionPointAttach() come last, to > avoid the same sort of race? I've not tried to reproduce that one. Good point. You could run into the case of a concurrent backend running an injection point that should be local if waiting between InjectionPointAttach() and the condition getting registered in injection_points_attach(). That should be reversed. [1] https://www.postgresql.org/message-id/20240501231214.40@rfd.leadboat.com -- Michael
Attachment
pgsql-hackers by date: