Thread: TODO: fix priority of ordering of read and write light-weight locks
The wiki todo has the topic "Fix priority ordering of read and write
light-weight locks" and references
http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php
(lwlocks and starvation).

Having read the referenced discussion, I'm trying to figure out what
remains to be done.

Tom proposed a patch back in 2004, which still seems rather applicable
today, and which would correct the "new Shared request trumps queued
Exclusive waiter" problem except during one race condition.  It seems
that this patch was not pursued because it was not evident that it
would actually make things better (is Exclusive-waiter starvation
actually a demonstrable problem?), while the ways it could make things
worse are obvious (adding context switches, often to little end).

So what remains to be done?  Do we want a fix that doesn't suffer from
the race condition (a waiter has been removed from the queue and
signaled, but has not yet been dispatched to the CPU)?  It is hard to
imagine such a fix that is not worse than the disease.  Or are we
looking for empirical evidence that the proposed patch is actually an
improvement for at least one plausible workload?  In that case, does
anyone have suggestions about what such a workload might look like?

Since lwlock covers a rather heterogeneous bunch of lock purposes, it
seems unlikely to me that any one strategy is going to be applicable
to all of those purposes if extreme optimization is what we are after.
How much are we willing to sacrifice modularity and abstraction in
order to get a little extra performance out of things protected by
lwlock?

Cheers,

Jeff
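P.S. For anyone who hasn't reread the old thread, here is a toy model
of the policy at issue.  To be clear, this is not the actual lwlock.c
code; the struct and names below are only my illustration of its
logic.  The point is that an arriving shared requestor checks only for
an exclusive *holder*; exclusive *waiters* already sleeping in the
queue are invisible to it, which is what lets an unbroken stream of
shared requests starve them:

    #include <stdbool.h>

    /* Toy stand-in for the real LWLock struct. */
    typedef struct
    {
        int exclusive;  /* # of exclusive holders (0 or 1) */
        int shared;     /* # of shared holders */
        int nwaiters;   /* # of queued waiters (head != NULL in the
                         * real code) */
    } ToyLWLock;

    /*
     * Current behavior: a shared request succeeds whenever there is
     * no exclusive holder, no matter how many exclusive waiters are
     * queued behind the current shared holders.
     */
    static bool
    shared_can_acquire(const ToyLWLock *lock)
    {
        return lock->exclusive == 0;
    }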
Jeff Janes <jeff.janes@gmail.com> writes:
> The wiki todo has the topic "Fix priority ordering of read and write
> light-weight locks" and references
> http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php
> (lwlocks and starvation)

> Having read the referenced discussion, I'm trying to figure out what
> remains to be done.

AFAIR that patch hasn't been applied because nobody's demonstrated an
overall win from changing it.  It's not only a matter of showing an
improvement for some one workload, but providing some confidence that
no other case gets a lot worse.  If you go back to circa 2000 or 2001
to see the previous iteration of tweaking the lock algorithms, you'll
see that we found that it's important to avoid unnecessary trips
through the kernel scheduler.  If we force an incoming shared
requestor to block, there's a very real chance that the overhead of
descheduling and later rescheduling him will mean a net performance
degradation regardless of any other benefits.  As against this there's
the risk of long-term starvation of an exclusive requestor --- but
there is little if any evidence that that's a serious problem in
practice.

Just to make things more interesting, the context has changed a lot
since 2000-2001 --- far more people have multi-CPU machines now.  So
it's possible the tradeoffs have changed.

> Since lwlock covers a rather heterogeneous bunch of lock purposes, it
> seems unlikely to me that any one strategy is going to be applicable
> to all of those purposes if extreme optimization is what we are
> after.  How much are we willing to sacrifice modularity and
> abstraction in order to get a little extra performance out of things
> protected by lwlock?

My answer is "not a lot, unless it's a *lot* of extra performance".
We've talked sometimes about having more than one type of LWLock to
address different scheduling needs, but nobody's come up with evidence
that that'd really be helpful.

Also, did you see this thread:
http://archives.postgresql.org/pgsql-performance/2009-03/msg00104.php
That proposal was bounced because it seemed likely to hurt in a much
wider set of cases than it helped (see the extra-scheduling-overhead
argument).  But I'd still be interested to see an unbiased analysis of
what was going on in the case where it seemed to help.

			regards, tom lane
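P.S. To make the tradeoff concrete: as far as I remember, the 2004
patch roughly amounted to this change in the shared-acquire test
(reusing the toy sketch from Jeff's mail; again, this is only an
illustration, not the patch itself nor the real lwlock.c code):

    /*
     * Proposed behavior: a shared request also yields to anyone
     * already sleeping in the queue, so queued exclusive waiters can
     * no longer be starved by a stream of new shared requests.
     */
    static bool
    shared_can_acquire_proposed(const ToyLWLock *lock)
    {
        return lock->exclusive == 0 && lock->nwaiters == 0;
    }

Note the race this still doesn't close: LWLockRelease takes a waiter
off the queue and signals it before that waiter actually gets
scheduled, so in that window nwaiters can read as zero again and a
newly arriving shared requestor slips in ahead of the just-woken
exclusive waiter anyway.  And every acquisition that the new test
turns from "grant" into "wait" is another round trip through the
kernel scheduler, which is exactly the cost I'm worried about.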
Sorry, I screwed up.  The below was supposed to go to the list, not to
Tom personally.

---------- Forwarded message ----------
From: Jeff Janes <jeff.janes@gmail.com>
Date: Thu, Aug 13, 2009 at 9:32 PM
Subject: Re: [HACKERS] TODO: fix priority of ordering of read and write light-weight locks
To: Tom Lane <tgl@sss.pgh.pa.us>

On Tue, Aug 11, 2009 at 9:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> The wiki todo has the topic "Fix priority ordering of read and write
>> light-weight locks" and references
>> http://archives.postgresql.org/pgsql-hackers/2004-11/msg00893.php
>> (lwlocks and starvation)
>
>> Having read the referenced discussion, I'm trying to figure out what
>> remains to be done.
>
> AFAIR that patch hasn't been applied because nobody's demonstrated an
> overall win from changing it.  It's not only a matter of showing an
> improvement for some one workload, but providing some confidence that
> no other case gets a lot worse.  If you go back to circa 2000 or 2001
> to see the previous iteration of tweaking the lock algorithms, you'll
> see that we found that it's important to avoid unnecessary trips
> through the kernel scheduler.  If we force an incoming shared
> requestor to block, there's a very real chance that the overhead of
> descheduling and later rescheduling him will mean a net performance
> degradation regardless of any other benefits.  As against this
> there's the risk of long-term starvation of an exclusive requestor
> --- but there is little if any evidence that that's a serious problem
> in practice.

Yes, I agree with all that.  That is why I was surprised to find it on
the todo list--I thought it had been thoroughly considered and judged
not worth pursuing, so I didn't know what more there was to be done.

...

> Also, did you see this thread:
> http://archives.postgresql.org/pgsql-performance/2009-03/msg00104.php
> That proposal was bounced because it seemed likely to hurt in a much
> wider set of cases than it helped (see the extra-scheduling-overhead
> argument).  But I'd still be interested to see an unbiased analysis
> of what was going on in the case where it seemed to help.

I'm afraid I can't help much here.  I can't reproduce his results, but
then I have neither his hardware (the biggest thing I could test is
8-way x86_64) nor his benchmark/simulation code.

On the 8-way system, with fsync=off to roughly simulate SSD, using
pgbench -s 50, I saturate the CPUs at about -c 8, which is what you
would expect, and going above that doesn't improve performance.  Using
the wake-all variant didn't improve things (but also didn't make them
worse).

I wonder if the original poster of that message would be willing to
try using pgbench on his system to see if he can get the same results
under it?  pgbench might not be the best load-testing tool ever
created, but it is one that everyone has ready access to, which
certainly counts for something.

Cheers,

Jeff
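P.S. By "the wake-all variant" I mean, roughly, what was proposed in
the thread Tom cites: on release, detach the entire wait queue and
wake everybody, letting them re-contend for the lock, instead of
waking only the first waiter (plus, if it wants shared, the
consecutive shared waiters behind it) as I understand the stock code
to do.  Another toy sketch, with the same caveat that this is my
illustration and not the real lwlock.c code:

    #include <stddef.h>

    typedef struct ToyWaiter
    {
        struct ToyWaiter *next;    /* proc->lwWaitLink in the real
                                    * code */
        int               waiting; /* proc->lwWaiting */
    } ToyWaiter;

    /*
     * "Wake-all" release: hand the whole queue off for wakeup.  In
     * the real code, the queue would be detached while still holding
     * the lock's spinlock, and clearing "waiting" would be followed
     * by a PGSemaphoreUnlock() on the waiter's semaphore.
     */
    static void
    release_wake_all(ToyWaiter **queue_head)
    {
        ToyWaiter *w = *queue_head;

        *queue_head = NULL;        /* detach the whole queue */
        while (w != NULL)
        {
            ToyWaiter *next = w->next;

            w->next = NULL;
            w->waiting = 0;        /* wake it */
            w = next;
        }
    }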