Thread: Re: Re: Shared row locking
Tom Lane <tgl@sss.pgh.pa.us> wrote on 20.12.2004, 19:34:21: > Alvaro Herrera writes: > > To solve the problem I want to solve, we have three orthogonal > > possibilities: > > > 1. implement shared row locking using the ideas outlined in the mail > > starting this thread (pg_clog-like seems to be the winner, details TBD). > > > 2. implement shared lock table spill-to-disk mechanism. > > > 3. implement lock escalation. > > Check. > > > - 2 could have a performance impact, and we don't even know how to > > start. For example, what would be an algorithm to decide what locks > > to send to disk? > > LRU, perhaps? That's all open for investigation still. > > #1 could have a pretty serious performance impact, too. For small > numbers of FOR UPDATE locks (too few to force spill to disk) I would > expect #2 to substantially beat #1. #1 essentially imposes the worst > case performance at all times, whereas #2 degrades (at a currently > unknown rate) when there are lots and lots of FOR UPDATE locks. Agreed. [My gut feeling would be against another permanent on-disk structure, since this is one more thing for a user to delete "to save space" etc...] > Most of the applications I've seen don't take out that many FOR UPDATE > locks at once, so I'm unclear on the rationale for choosing a fixed-but- > poor performance curve over one that is fast for few locks and degrades > for many locks. Especially when the value of "many" is > user-configurable. > > Furthermore, we have also seen issues with too many locks on ordinary > objects, which #2 would solve simultaneously. > > So I feel that #2 is clearly the approach to try first. If we find that > we can't do spill-to-disk without serious performance degradation, then I agree. We need to understand what type of application logic we are coding for. In general, I agree with Tom: I haven't seen many programs that use extended SELECT FOR UPDATE logic. However, the ones I have seen have been batch style programs written using a whole-table cursor - these latter ones have been designed for the cursor stability approach. It would be much better to analyze one or more representative application suites to understand which option to pick. Without a set of programs, or some driving force, the wrong one could be picked. Spill-to-disk would not be that bad, since WARNINGs could appear in the log. That's much better than doing a lock escalation, definitely. Best Regards, Simon Riggs
On Mon, 20 Dec 2004 21:44:01 +0100, <simon@2ndquadrant.com> wrote: >Tom Lane <tgl@sss.pgh.pa.us> wrote on 20.12.2004, 19:34:21: >> #1 could have a pretty serious performance impact, too. For small >> numbers of FOR UPDATE locks (too few to force spill to disk) I would >> expect #2 to substantially beat #1. #1 essentially imposes the worst >> case performance at all times, whereas #2 degrades (at a currently >> unknown rate) when there are lots and lots of FOR UPDATE locks. I don't see too much of a difference between #1 (an on-disk structure buffered in shared memory) and #2 (a shared memory structure spilling to disk). As long as we can avoid unnecessary writes, #1 has the nice property that it automatically adapts to different usage patterns because it uses the normal shared buffer pool and cache replacement strategy. >[My gut feeling would be against another permanent on-disk structure, >since this is one more thing for a user to delete "to save space" >etc...] Irrelevant, IMHO. Whoever manually deletes any file under PGDATA deserves whatever this may cause. >I haven't seen many programs that use >extended SELECT FOR UPDATE logic. AFAICS, SELECT FOR UPDATE is not the primary issue here, because it locks rows exclusively. This thread is about shared locks, which should be used for foreign key checks. ServusManfred
Manfred Koizar <mkoi-pg@aon.at> writes: > I don't see too much of a difference between #1 (an on-disk structure > buffered in shared memory) and #2 (a shared memory structure spilling > to disk). If you stand back that far, maybe you can't see a difference ;-) ... but the proposal on the table was for an extremely special-purpose on-disk structure. I'd prefer to see first if we can solve a more general problem, namely the fixed size of the existing lock table. It's certainly possible that that idea won't work out, but if it can be done for a similar time investment as the special-purpose log would take, it'd be a win. regards, tom lane
On Wed, 29 Dec 2004 19:57:15 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Manfred Koizar <mkoi-pg@aon.at> writes: >> I don't see too much of a difference between #1 (an on-disk structure >> buffered in shared memory) and #2 (a shared memory structure spilling >> to disk). > >If you stand back that far, maybe you can't see a difference ;-) ... Well, I tried to step back a bit to see the whole picture. You think I went too far this time? :-) >but the proposal on the table was for an extremely special-purpose >on-disk structure. I'd prefer to see first if we can solve a more >general problem, namely the fixed size of the existing lock table. I haven't been digging into the code for some time (:-() -- but isn't this basically a key/value mapping with random access? My concern is that as soon as you start to push entries out of the memory structure you have to implement some kind of on-disk structure to support fast lookups, which sooner or later might lead to something that looks suspiciously like the existing hash or btree code. So the question is whether starting by making nbtree more flexible isn't the lower hanging fruit... ServusManfred
Manfred Koizar <mkoi-pg@aon.at> writes: > So the question is whether starting by making nbtree more flexible isn't > the lower hanging fruit... Certainly not; indexes depend on locks, not vice versa. You'd not be able to do that without introducing an infinite recursion into the system design. In any case nbtree is much more heavyweight than we need for this --- the lock table does not want WAL support for example, nor REINDEX/VACUUM, nor support for arbitrary index lookup conditions, nor even multiple datatypes or multiple index columns. regards, tom lane
On Thu, 30 Dec 2004 13:36:53 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Certainly not; indexes depend on locks, not vice versa. You'd not be >able to do that without introducing an infinite recursion into the >system design. Wouldn't you have to face the same sort of problems if you spill part of the lock table to disk? While you do I/O you have to hold some lock. In either case there has to be a special class of locks that are pinned in memory. > In any case nbtree is much more heavyweight than we need >for this Having funcionality we don't need is not a showstopper ... unless heavyweight implies slow, which I have to admit may well be the case. ServusManfred
Manfred Koizar <mkoi-pg@aon.at> writes: > On Thu, 30 Dec 2004 13:36:53 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Certainly not; indexes depend on locks, not vice versa. You'd not be >> able to do that without introducing an infinite recursion into the >> system design. > Wouldn't you have to face the same sort of problems if you spill part of > the lock table to disk? While you do I/O you have to hold some lock. See LWLocks ... or spinlocks underneath those. But (some) operations on tables and indexes make use of heavyweight locks. regards, tom lane