Thread: patch: fix race in SSI's CheckTargetForConflictsIn
While running some benchmarks to test SSI performance, I found a race condition that's capable of causing a segfault. A patch is attached. The bug is in CheckTargetForConflictsIn, which scans the list of SIREAD locks on a lock target when it's modified. There's an optimization in there where the writing transaction will remove a SIREAD lock that it holds itself, because it's being replaced with a (stronger) write lock. To do that, it needs to drop its shared lwlocks and reacquire them in exclusive mode. The existing code deals with concurrent modifications in that interval by redoing checks. However, it misses the case where some other transaction removes all remaining locks on the target, and proceeds to remove the lock target itself. The attached patch fixes this by deferring the SIREAD lock removal until the end of the function. At that point, there isn't any need to worry about concurrent updates to the target's lock list. The resulting code is also simpler. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Attachment
Dan Ports <drkp@csail.mit.edu> wrote: > While running some benchmarks to test SSI performance, I found a > race condition that's capable of causing a segfault. A patch is > attached. > > The bug is in CheckTargetForConflictsIn, which scans the list of > SIREAD locks on a lock target when it's modified. There's an > optimization in there where the writing transaction will remove a > SIREAD lock that it holds itself, because it's being replaced with > a (stronger) write lock. To do that, it needs to drop its shared > lwlocks and reacquire them in exclusive mode. The existing code > deals with concurrent modifications in that interval by redoing > checks. However, it misses the case where some other transaction > removes all remaining locks on the target, and proceeds to remove > the lock target itself. > > The attached patch fixes this by deferring the SIREAD lock removal > until the end of the function. At that point, there isn't any need > to worry about concurrent updates to the target's lock list. The > resulting code is also simpler. I just want to confirm that this addresses a real (although very narrow) race condition. It results from code used to implement a valuable optimization described in Cahill's thesis: don't get or keep an SIREAD lock on a row which has a write lock. The write lock is stronger and will cause a write/write conflict with any overlapping transactions which would care about a read/write conflict. The pattern of reading a row and then updating or deleting it is so common that this optimization does a lot to avoid promotion of predicate locks to coarser granularity, and thereby helps avoid false positives. While the optimization is valuable, the code used to implement it was pretty horrid. (And that was me that wrote it.) It has already been the cause of several other fixes since the main patch went in. What Dan has done here is move the optimization out of the middle of the loop which is doing the conflict detection, and in doing so has reduced the number of lines of code needed, reduced the amount of fiddling with LW locks, and all around made the code more robust and more understandable. I've reviewed the patch and it looks good to me. Dan has beat up on it with the same DBT-2 run which exposed the race condition without seeing a problem. Although a much smaller patch could address the immediate problem, I strongly feel that Dan has taken the right approach by refactoring this bit to something fundamentally cleaner and less fragile. -Kevin
On Thu, May 5, 2011 at 1:43 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Dan Ports <drkp@csail.mit.edu> wrote: >> While running some benchmarks to test SSI performance, I found a >> race condition that's capable of causing a segfault. A patch is >> attached. >> >> The bug is in CheckTargetForConflictsIn, which scans the list of >> SIREAD locks on a lock target when it's modified. There's an >> optimization in there where the writing transaction will remove a >> SIREAD lock that it holds itself, because it's being replaced with >> a (stronger) write lock. To do that, it needs to drop its shared >> lwlocks and reacquire them in exclusive mode. The existing code >> deals with concurrent modifications in that interval by redoing >> checks. However, it misses the case where some other transaction >> removes all remaining locks on the target, and proceeds to remove >> the lock target itself. >> >> The attached patch fixes this by deferring the SIREAD lock removal >> until the end of the function. At that point, there isn't any need >> to worry about concurrent updates to the target's lock list. The >> resulting code is also simpler. > > I just want to confirm that this addresses a real (although very > narrow) race condition. It results from code used to implement a > valuable optimization described in Cahill's thesis: don't get or > keep an SIREAD lock on a row which has a write lock. The write lock > is stronger and will cause a write/write conflict with any > overlapping transactions which would care about a read/write > conflict. The pattern of reading a row and then updating or > deleting it is so common that this optimization does a lot to avoid > promotion of predicate locks to coarser granularity, and thereby > helps avoid false positives. > > While the optimization is valuable, the code used to implement it > was pretty horrid. (And that was me that wrote it.) It has already > been the cause of several other fixes since the main patch went in. > What Dan has done here is move the optimization out of the middle of > the loop which is doing the conflict detection, and in doing so has > reduced the number of lines of code needed, reduced the amount of > fiddling with LW locks, and all around made the code more robust and > more understandable. > > I've reviewed the patch and it looks good to me. Dan has beat up on > it with the same DBT-2 run which exposed the race condition without > seeing a problem. Although a much smaller patch could address the > immediate problem, I strongly feel that Dan has taken the right > approach by refactoring this bit to something fundamentally cleaner > and less fragile. Why does this HASH_FIND the applicable hash table entries and then HASH_REMOVE it as a separate step, instead of just HASH_REMOVE-ing it in one go? Doesn't this fail to release the locks if rmpredlock == NULL? I believe it's project style to test (rmpredlock != NULL) rather than just (rmpredlock). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, May 06, 2011 at 09:35:39PM -0400, Robert Haas wrote: > Why does this HASH_FIND the applicable hash table entries and then > HASH_REMOVE it as a separate step, instead of just HASH_REMOVE-ing it > in one go? For PredicateLockHash, we need to find the lock entry first so that we can call SHMQueueDelete on its targetLink and xactLink fields. For LocalPredicateLockHash, we check the resulting entry to decide whether to remove it. Having looked at the code some more, however, we do always remove it because we only apply this optimization to heap tuple locks, which are not parents of other locks. So we can simplify this down to a HASH_REMOVE. > Doesn't this fail to release the locks if rmpredlock == NULL? Yikes. Indeed it does. > I believe it's project style to test (rmpredlock != NULL) rather than > just (rmpredlock). That works for me (I prefer the != NULL myself). I believe I've seen both elsewhere, though... Will update the patch. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On Fri, May 06, 2011 at 10:49:22PM -0400, Dan Ports wrote: > Will update the patch. Updated patch (in response to Robert's comments) attached. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Attachment
On Sun, May 8, 2011 at 8:31 PM, Dan Ports <drkp@csail.mit.edu> wrote: > On Fri, May 06, 2011 at 10:49:22PM -0400, Dan Ports wrote: >> Will update the patch. > > Updated patch (in response to Robert's comments) attached. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company