Thread: lwlocks and starvation
LWLockRelease() currently does something like (simplifying a lot):

    acquire lwlock spinlock
    decrement lock count
    if lock is free
        if first waiter in queue is waiting for exclusive lock,
            awaken him; else, walk through the queue and awaken
            all the shared waiters until we reach an exclusive waiter
    end if
    release lwlock spinlock

This has the nice property that locks are granted in FIFO order. Is it
essential that we maintain that property? If not, we could instead walk
through the wait queue and awaken *all* the shared waiters, and get a
small improvement in throughput.

I can see that this might starve exclusive waiters; however, we allow
the following:

    Proc A => LWLockAcquire(lock, LW_SHARED);    -- succeeds
    Proc B => LWLockAcquire(lock, LW_EXCLUSIVE); -- blocks
    Proc C => LWLockAcquire(lock, LW_SHARED);    -- succeeds

i.e. we don't *really* follow strict FIFO order anyway.

-Neil
Neil Conway wrote:
> LWLockRelease() currently does something like (simplifying a lot):
>
>     acquire lwlock spinlock
>     decrement lock count
>     if lock is free
>         if first waiter in queue is waiting for exclusive lock,
>             awaken him; else, walk through the queue and awaken
>             all the shared waiters until we reach an exclusive waiter
>     end if
>     release lwlock spinlock
>
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property? If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.
>
> I can see that this might starve exclusive waiters; however, we allow
> the following:
>
>     Proc A => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>     Proc B => LWLockAcquire(lock, LW_EXCLUSIVE); -- blocks
>     Proc C => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>
> i.e. we don't *really* follow strict FIFO order anyway.

My guess is the existing behavior was designed to allow waking of
multiple waiters _sometimes_ without starving exclusive waiters. There
should be a comment in the code explaining this usage, and I bet it was
intentional.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian wrote:
> My guess is the existing behavior was designed to allow waking of
> multiple waiters _sometimes_ without starving exclusive waiters.

Well, I think the current algorithm *does* allow starvation, at least in
some situations. Consider a workload in which a new shared reader
arrives every 50 ms, and holds the lock for, say, 500 ms. If an
exclusive waiter arrives, it will starve with the current algorithm.

> There should be a comment in the code explaining this usage and I bet it
> was intentional.

Oh, I bet it was intentional as well :) I'm mostly curious to see
exactly what the reasoning was, and whether it is necessary that we
preserve the FIFO behavior while considering optimizations.

-Neil
Neil Conway wrote:
> Bruce Momjian wrote:
> > My guess is the existing behavior was designed to allow waking of
> > multiple waiters _sometimes_ without starving exclusive waiters.
>
> Well, I think the current algorithm *does* allow starvation, at least in
> some situations. Consider a workload in which a new shared reader
> arrives every 50 ms, and holds the lock for, say, 500 ms. If an
> exclusive waiter arrives, it will starve with the current algorithm.

I thought the new readers will sit after the writer in the FIFO queue,
so the writer will not starve.
Bruce Momjian wrote:
> I thought the new readers will sit after the writer in the FIFO queue,
> so the writer will not starve.

AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
mode, we check if "exclusive" is zero; if so, we acquire the lock
(increment the shared lock count and do not block). And "exclusive" is
set non-zero only when we _acquire_ a lock in exclusive mode, not when
we add an exclusive waiter to the wait queue.

(Speaking of which, the "exclusive" field is declared as a "char"; I
wonder if it wouldn't be more clear to declare it as "bool", and treat
it as a boolean field. The storage/alignment requirements should be the
same (bool is a typedef for char, at least with a C compiler), but IMHO
it would be more logical.)

-Neil
Neil Conway <neilc@samurai.com> writes:
> LWLockRelease() currently does something like (simplifying a lot):
> ...
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property?

Not really, although avoiding indefinite starvation is important.

> If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.

I think this would increase the odds of starvation for exclusive waiters
significantly. It would also complicate the list manipulation in
LWLockRelease, and the net loss of cycles to that might outweigh any
possible speedup anyway.

> I can see that this might starve exclusive waiters; however, we allow
> the following:
>     Proc A => LWLockAcquire(lock, LW_SHARED);    -- succeeds
>     Proc B => LWLockAcquire(lock, LW_EXCLUSIVE); -- blocks
>     Proc C => LWLockAcquire(lock, LW_SHARED);    -- succeeds
> i.e. we don't *really* follow strict FIFO order anyway.

Uh, no; proc C will queue up behind proc B. See LWLockAcquire. Were this
not so, again we'd be risking starving proc B, since there could easily
be an indefinitely long series of people acquiring and releasing the
lock in shared mode.

			regards, tom lane
Neil Conway <neilc@samurai.com> writes:
> AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
> mode, we check if "exclusive" is zero; if so, we acquire the lock
> (increment the shared lock count and do not block). And "exclusive" is
> set non-zero only when we _acquire_ a lock in exclusive mode, not when
> we add an exclusive waiter to the wait queue.

Ooh, you are right. I think that may qualify as a bug ...

			regards, tom lane
Neil Conway <neilc@samurai.com> writes:
> (Speaking of which, the "exclusive" field is declared as a "char"; I
> wonder if it wouldn't be more clear to declare it as "bool", and treat
> it as a boolean field.

I coded it that way because I was thinking of it as a count (0 or 1),
for symmetry with the count of shared holders. You could argue it
either way, I suppose.

			regards, tom lane
I wrote:
> Neil Conway <neilc@samurai.com> writes:
>> AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
>> mode, we check if "exclusive" is zero; if so, we acquire the lock
>> (increment the shared lock count and do not block). And "exclusive" is
>> set non-zero only when we _acquire_ a lock in exclusive mode, not when
>> we add an exclusive waiter to the wait queue.

> Ooh, you are right. I think that may qualify as a bug ...

I am thinking of addressing this issue by applying the attached patch.
The patch prevents a would-be shared locker from cutting in front of a
waiting exclusive locker. It is not a 100% solution because it does not
cover the case where a waiting exclusive locker is released, then a new
shared locker arrives at the lock before the exclusive locker is given
any cycles to acquire the lock. However, I don't see any cure for the
latter problem that's not worse than the disease (granting the lock to
the released waiter immediately is definitely not better).

On the other hand, we might consider that this isn't a big problem and
just leave things as they are. We haven't seen any indication that
starvation is a real problem in practice, and so it might be better to
avoid extra trips through the kernel scheduler. In particular, I think
the existing behavior may fit in better with the idea that a shared
locker should be able to acquire and release the lock multiple times
during his timeslice, regardless of whether there is contention.

And on the third hand, I think the heavily-contended LWLocks are only
taken in exclusive mode anyway, so this may be mostly a moot point.

Comments?

			regards, tom lane

*** src/backend/storage/lmgr/lwlock.c.orig	Sun Aug 29 01:06:48 2004
--- src/backend/storage/lmgr/lwlock.c	Wed Nov 24 23:13:19 2004
***************
*** 261,267 ****
  	}
  	else
  	{
! 		if (lock->exclusive == 0)
  		{
  			lock->shared++;
  			mustwait = false;
--- 261,273 ----
  	}
  	else
  	{
! 		/*
! 		 * If there is anyone waiting for the lock, we don't take it.
! 		 * This could only happen if an exclusive locker is blocked
! 		 * by existing shared lockers; we want to queue up behind him
! 		 * rather than risk starving him indefinitely.
! 		 */
! 		if (lock->exclusive == 0 && lock->head == NULL)
  		{
  			lock->shared++;
  			mustwait = false;
***************
*** 376,382 ****
  	}
  	else
  	{
! 		if (lock->exclusive == 0)
  		{
  			lock->shared++;
  			mustwait = false;
--- 382,394 ----
  	}
  	else
  	{
! 		/*
! 		 * If there is anyone waiting for the lock, we don't take it.
! 		 * This could only happen if an exclusive locker is blocked
! 		 * by existing shared lockers; we want to queue up behind him
! 		 * rather than risk starving him indefinitely.
! 		 */
! 		if (lock->exclusive == 0 && lock->head == NULL)
  		{
  			lock->shared++;
  			mustwait = false;
On Wed, 2004-11-24 at 23:30 -0500, Tom Lane wrote:
> It is not a 100% solution because it does not
> cover the case where a waiting exclusive locker is released, then a new
> shared locker arrives at the lock before the exclusive locker is given
> any cycles to acquire the lock. However I don't see any cure for the
> latter problem that's not worse than the disease

Yeah, I don't think this is a problem -- eventually the exclusive waiter
will win the coin flip anyway.

> On the other hand we might consider that this isn't a big problem and
> just leave things as they are. We haven't seen any indication that
> starvation is a real problem in practice, and so it might be better to
> avoid extra trips through the kernel scheduler.

Yes, I'm a little concerned about applying a patch to address what is,
so far, an entirely academic concern -- especially if it might hurt
performance.

-Neil
On Wed, 2004-11-24 at 12:52, Neil Conway wrote:
> Bruce Momjian wrote:
> > I thought the new readers will sit after the writer in the FIFO queue,
> > so the writer will not starve.
>
> AFAICS, that is not the case. See lwlock.c, circa line 264: in LW_SHARED
> mode, we check if "exclusive" is zero; if so, we acquire the lock
> (increment the shared lock count and do not block). And "exclusive" is
> set non-zero only when we _acquire_ a lock in exclusive mode, not when
> we add an exclusive waiter to the wait queue.

Wow... well spotted. That could explain many recent performance results.

On Wed, 2004-11-24 at 08:23, Neil Conway wrote:
> LWLockRelease() currently does something like (simplifying a lot):
>
>     acquire lwlock spinlock
>     decrement lock count
>     if lock is free
>         if first waiter in queue is waiting for exclusive lock,
>             awaken him; else, walk through the queue and awaken
>             all the shared waiters until we reach an exclusive waiter
>     end if
>     release lwlock spinlock
>
> This has the nice property that locks are granted in FIFO order. Is it
> essential that we maintain that property? If not, we could instead walk
> through the wait queue and awaken *all* the shared waiters, and get a
> small improvement in throughput.

I'd been thinking about lock release order also, thinking that this
could be related to the CS storms observed earlier and the apparent
lock-step behaviour commented upon previously.

FIFO is the most easily theoretically predictable, but others are
possible. ISTM that waking shared waiters in xid order would bring the
most benefit and minimise any data issues. Readers waiting behind an
exclusive waiter, where the reader has a lower xid, might reasonably be
woken without a problem, since they will never see the changes made by
the exclusive waiter anyway. That probably needs to be within a limited
window of inspection beyond the exclusive waiter to limit the
complexity, say 4-8 places beyond the exclusive waiter.
Exactly what we do from here is going to dramatically affect performance
in various situations, so I think trying a few different algorithms
should help the understanding.

IMHO a concern remains that oprofile is not good enough instrumentation
to spot this kind of issue. Instrumentation at the lwlock level *would*
have spotted this and other issues too, and will also help us determine
what the differences are between the various ways forward for (possibly)
changing the current behaviour.

--
Best Regards, Simon Riggs
Simon Riggs wrote:
> I'd been thinking about lock release order also, thinking that this
> could be related to the CS storms observed earlier and the apparent
> lock-step behaviour commented upon previously.

As much as I would love to believe this (because it would mean we could
probably solve the problem pretty easily), I don't think lock release
order is the (primary) culprit behind the CS storm issue. As I
understand it, the heavily contended lock in those situations is the
BufMgrLock, and that is _always_ acquired in exclusive mode.

> ISTM that waking
> shared waiters in xid order would bring the most benefit and minimise
> any data issues. Readers waiting behind an exclusive waiter, where the
> reader has a lower xid, might reasonably be woken without a problem,
> since they will never see the changes made by the exclusive waiter
> anyway.

I'm not sure I understand. What "data issues" are you referring to?
LWLocks are used to protect non-transactional resources (such as shared
memory data structures), so the relative xids of two waiter processes
don't affect whether they can see each other's modifications (i.e.
because they always can).

In any case, the idea of considering information such as the xid when
deciding which waiters to release is interesting. It's not immediately
apparent to me quite *how* to make use of that info, but it's definitely
something to consider...

-Neil
On Thu, 2004-11-25 at 11:28, Neil Conway wrote:
> Simon Riggs wrote:
> > I'd been thinking about lock release order also, thinking that this
> > could be related to the CS storms observed earlier and the apparent
> > lock-step behaviour commented upon previously.
>
> As much as I would love to believe this (because it would mean we could
> probably solve the problem pretty easily), I don't think lock release
> order is the (primary) culprit behind the CS storm issue.

Wish we had a way to tell... though I think you are right.

> As I
> understand it, the heavily contended lock in those situations is the
> BufMgrLock, and that is _always_ acquired in exclusive mode.

So you're saying this doesn't affect BufMgrLock contention? Oh.

> > ISTM that waking
> > shared waiters in xid order would bring the most benefit and minimise
> > any data issues. Readers waiting behind an exclusive waiter, where the
> > reader has a lower xid might reasonably be woken without a problem since
> > they will never see the changes made by the exclusive waiter anyway.
>
> I'm not sure I understand. What "data issues" are you referring to?
> LWLocks are used to protect non-transactional resources (such as shared
> memory data structures), so the relative xids of two waiter processes
> don't affect whether they can see each other's modifications (i.e.
> because they always can).

Yes, I understand the difference. But when we check for tuple
visibility, the changes would be ignored, hence it is OK to re-sort the
list from FIFO to xid order, but...

> In any case, the idea of considering information such as the xid when
> deciding which waiters to release is interesting. It's not immediately
> apparent to me quite *how* to make use of that info, but it's definitely
> something to consider...

Agreed. It's a wacky idea, and I'm not embarrassed about having them.
The day I stop having them will be the day I stop having good ideas too.
But please don't let this distract from your own good idea: well done,
Neil.

--
Best Regards, Simon Riggs
Neil, where are we on this? Should we add comments? Add a TODO? A patch?

---------------------------------------------------------------------------

Neil Conway wrote:
> On Wed, 2004-11-24 at 23:30 -0500, Tom Lane wrote:
> > It is not a 100% solution because it does not
> > cover the case where a waiting exclusive locker is released, then a new
> > shared locker arrives at the lock before the exclusive locker is given
> > any cycles to acquire the lock. However I don't see any cure for the
> > latter problem that's not worse than the disease
>
> Yeah, I don't think this is a problem -- eventually the exclusive waiter
> will win the coin flip anyway.
>
> > On the other hand we might consider that this isn't a big problem and
> > just leave things as they are. We haven't seen any indication that
> > starvation is a real problem in practice, and so it might be better to
> > avoid extra trips through the kernel scheduler.
>
> Yes, I'm a little concerned about applying a patch to address what is,
> so far, an entirely academic concern -- especially if it might hurt
> performance.
>
> -Neil
On Wed, 2004-12-01 at 21:51 -0500, Bruce Momjian wrote:
> Neil, where are we on this? Should we add comments? Add a TODO? A patch?

I'm not sure what the right resolution is. As I said, I don't think it's
wise to apply a patch that could have a significant impact on
performance without (a) testing its performance effect and/or (b) having
any evidence that the problem it addresses actually affects anyone in
the real world. I'll try to run some benchmarks when I get a chance.

I wrote up most of a patch to implement the "wake up all shared waiters
on LWLockRelease()" behavior to see how that would change performance,
but the patch has a subtle bug in it that I can't seem to find (I've
attached it -- comments welcome).

Certainly, if we decide to leave things as they are, I think we ought to
document why the behavior is intentional, but I don't think we have
enough data to make that decision yet.

-Neil
OK, either you have to own the issue or I have to add something to the
TODO list.

---------------------------------------------------------------------------

Neil Conway wrote:
> On Wed, 2004-12-01 at 21:51 -0500, Bruce Momjian wrote:
> > Neil, where are we on this? Should we add comments? Add a TODO? A patch?
>
> I'm not sure what the right resolution is. As I said, I don't think it's
> wise to apply a patch that could have a significant impact on
> performance without (a) testing its performance effect and/or (b) having
> any evidence that the problem it addresses actually affects anyone in
> the real world. I'll try to run some benchmarks when I get a chance.
>
> I wrote up most of a patch to implement the "wake up all shared waiters
> on LWLockRelease()" behavior to see how that would change performance,
> but the patch has a subtle bug in it that I can't seem to find (I've
> attached it -- comments welcome).
>
> Certainly if we decide to leave things as they are I think we ought to
> document why the behavior is intentional, but I don't think we have
> enough data to make that decision yet.
>
> -Neil
>
> [ Attachment, skipping... ]
On Thu, 2004-12-02 at 09:59 -0500, Bruce Momjian wrote: > OK, either you have to own the issue or I have to add something to the > TODO list. Can you add it to the TODO list and assign it to me? -Neil
Neil Conway wrote:
> On Thu, 2004-12-02 at 09:59 -0500, Bruce Momjian wrote:
> > OK, either you have to own the issue or I have to add something to the
> > TODO list.
>
> Can you add it to the TODO list and assign it to me?

Done:

	* Fix priority ordering of read and write light-weight locks (Neil)