Re: Improve heavyweight locks instead of building new lock managers? - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: Improve heavyweight locks instead of building new lock managers? |
| Date | |
| Msg-id | CA+Tgmob_=KRPoCyEHziLzq0Q8fSX3JgLX8WzKWP3Sm1m78yd7g@mail.gmail.com |
| In response to | Improve heavyweight locks instead of building new lock managers? (Andres Freund <andres@anarazel.de>) |
| List | pgsql-hackers |
On Mon, Feb 10, 2020 at 11:22 PM Andres Freund <andres@anarazel.de> wrote:
> To me this seems to go in the direction of having multiple bespoke lock
> managers with slightly different feature sets, different debugging /
> logging / monitoring support, with separate locking code each. That's
> bad for maintainability.

I think it would help a lot if the lock manager were not such a monolithic thing. For instance, suppose that instead of having the deadlock detector be part of the lock manager, it were a separate thing that integrates with the lock manager. Instead of only knowing about waits for heavyweight locks, it would know about whatever you want to tell it about. Then, for instance, you could tell it that process A is waiting for a cleanup lock while process B holds a pin, possibly detecting a deadlock we can't notice today. There are likely other cases as well.

> E.g. we could:
>
> - Have an extended LockAcquire API where the caller provides a stack
>   variable for caching information until the LockRelease call, avoiding
>   a separate LockMethodLocalHash lookup for release.

Maybe. It seems like it might make things a little clunky for the caller, though. If you just kept a stack of locks acquired and checked the lock being released against the top of the stack, you'd probably catch a large percentage of cases. Or the caller could pass a flag indicating whether they intend to release the lock prior to the end of the transaction (which could be cross-checked by assertions); any lock that you intend to release early would go on a stack.

> - Instead of always implementing HASH_REMOVE as a completely fresh
>   lookup in dynahash, we should provide an API for removing an object we
>   *know* is in the hashtable at a specific pointer. We're obviously
>   already relying on that address to stay constant, so we'd not lose
>   flexibility. With this separate API we can avoid the bucket lookup,
>   walking through each element in that bucket, and having to compare the
>   hash key every time (which is quite expensive for a full LOCKTAG).

That makes a lot of sense, although I wonder if we should go further and replace dynahash entirely.

> - We should try to figure out whether we can avoid needing the separate
>   lock table for PROCLOCKs. As far as I can tell there's not a single
>   reason for it to be a hashtable, rather than something like
>   NUM_LOCK_PARTITIONS lists of free PROCLOCKs.

I agree that the PROCLOCK table seems ugly and inefficient, but I'm not sure that a bunch of lists is the best answer, though it might be. One case to consider is when a lock is initially acquired via the fast path and then, because of a conflicting lock acquisition, transferred by *some other process* to the main lock table. If that happens, the original locker must still be able to find its PROCLOCK. It doesn't have to be crazy efficient, because it shouldn't happen very often, but it shouldn't suck too much.
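To make the shape of the "NUM_LOCK_PARTITIONS lists of free PROCLOCKs" idea concrete, here is a rough sketch of per-partition free lists of holder entries. The names (HolderEntry, LockPartition, ENTRIES_PER_PARTITION, the alloc/free helpers) are made up for illustration and this is not PostgreSQL's actual PROCLOCK machinery; the partition LWLock that would protect each list is only indicated in comments.

```c
/*
 * Sketch only: per-partition free lists replacing a second hash table for
 * per-lock-holder entries.  Allocation and release are pointer pushes/pops
 * done while holding the owning partition's lock; no hashing or key
 * comparison is needed because the caller already has the entry's address.
 */
#include <stdint.h>

#define NUM_LOCK_PARTITIONS     16
#define ENTRIES_PER_PARTITION  128

typedef struct HolderEntry          /* hypothetical stand-in for PROCLOCK */
{
    struct HolderEntry *next_free;  /* link while on the partition free list */
    void               *lock;       /* which lock this holder entry belongs to */
    int                 holder_pid; /* which backend holds or awaits it */
} HolderEntry;

typedef struct LockPartition
{
    /* a partition LWLock would protect everything below */
    HolderEntry *free_head;
    HolderEntry  entries[ENTRIES_PER_PARTITION];
} LockPartition;

static LockPartition partitions[NUM_LOCK_PARTITIONS];

/* Put every entry of every partition onto its partition's free list. */
void
init_partitions(void)
{
    for (int p = 0; p < NUM_LOCK_PARTITIONS; p++)
    {
        partitions[p].free_head = NULL;
        for (int i = 0; i < ENTRIES_PER_PARTITION; i++)
        {
            partitions[p].entries[i].next_free = partitions[p].free_head;
            partitions[p].free_head = &partitions[p].entries[i];
        }
    }
}

/*
 * Grab a holder entry for a lock that hashes into 'partition'.  The caller
 * is assumed to hold that partition's lock, so this is just a list pop.
 */
HolderEntry *
alloc_holder_entry(uint32_t partition, void *lock, int holder_pid)
{
    LockPartition *part = &partitions[partition % NUM_LOCK_PARTITIONS];
    HolderEntry   *entry = part->free_head;

    if (entry == NULL)
        return NULL;                /* out of entries; real code would error out */
    part->free_head = entry->next_free;
    entry->lock = lock;
    entry->holder_pid = holder_pid;
    return entry;
}

/* Release an entry we already hold a pointer to: push it back, no lookup. */
void
free_holder_entry(uint32_t partition, HolderEntry *entry)
{
    LockPartition *part = &partitions[partition % NUM_LOCK_PARTITIONS];

    entry->next_free = part->free_head;
    part->free_head = entry;
}
```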
> - Whenever we do a relation extension, we'd better already hold a normal
>   relation lock. We don't actually need an entirely distinct type of lock
>   for extensions, one that could theoretically support N extension locks
>   for each relation. The fact that the two are associated could be
>   utilized in different ways:
>
>   - We could define relation extension as a separate lock level that
>     conflicts only with itself (and perhaps treat AELs as including it).
>     That'd avoid needing separate shared-memory state in many cases.
>
>     This might be too problematic, because extension locks don't have the
>     same desired behaviour around group locking (and a small army of
>     other issues).

Yeah, I don't think that's likely to work out very nicely.

>   - We could keep a separate extension lock cached inside the relation
>     lock. The first time a transaction needs to extend, it does the
>     normal work, and after that it stores the PROCLOCK in the LOCALLOCK
>     (or something like that). Once extension is done, don't release the
>     lock entirely, but instead just reduce the lock level to a new,
>     non-conflicting lock level.
>
>     Alternatively, we could implement something very similar outside of
>     lock.c, e.g. by caching the LOCALLOCK* (possibly identified by an
>     integer or such) in RelationData. That'd require enough support from
>     lock.c to make that really cheap.

I'm not sure I quite see what you mean here.

> We could, e.g., split the maintenance of the hashtable(s) from
> protecting the state of individual locks. The locks protecting the
> hashtable would just be held long enough to change a "pin count" on each
> lock, and then a per-LOCK LWLock would protect each heavyweight lock's
> state. We'd not need to take the partition locks for quite a few cases,
> because there are many locks in a loaded system that always have
> lockers. There'd be a cost to that, needing more atomic ops in some
> cases, but I think it'd reduce contention tremendously, and it'd open up
> a lot more optimization potential. It seems plausible that we could even
> work, as a followup, toward not needing the partition locks for some
> lock releases (e.g. when there are other holders), and we might even be
> able to avoid them for acquisitions, by caching the LOCK inside the
> LOCALLOCK and re-verifying its identity.

I agree; this seems worth exploring. The idea of caching the probable location of the lock and re-pinning it to check whether it's the one you expected seems like it would avoid false sharing in a lot of practical cases.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
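As an illustrative footnote to the final exchange above: a minimal sketch of the "cache the lock's probable shared-memory location, pin it, then re-verify its identity" pattern. Every name here (LockTag, SharedLock, LocalLock, pin_lock, lookup_or_create_lock) is hypothetical, the slow path is only a stub standing in for the partition-locked hash lookup, and safe recycling of entries (the ABA problem) is glossed over.

```c
/*
 * Sketch only: optimistic fast path for finding a heavyweight lock's shared
 * entry.  The backend pins its cached candidate first, then checks that the
 * candidate still carries the expected tag; only on a mismatch does it fall
 * back to the (partition-locked) lookup.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct LockTag          /* hypothetical stand-in for LOCKTAG */
{
    uint32_t dbid;
    uint32_t relid;
} LockTag;

typedef struct SharedLock
{
    LockTag     tag;            /* identity of the lock this entry represents */
    atomic_uint pincount;       /* holders, waiters, and lookups in progress */
    /* grant/wait state would live here, protected by a per-lock LWLock */
} SharedLock;

typedef struct LocalLock        /* backend-local cache, akin to LOCALLOCK */
{
    LockTag     tag;
    SharedLock *cached;         /* last known location in shared memory */
} LocalLock;

static bool
tags_equal(const LockTag *a, const LockTag *b)
{
    return a->dbid == b->dbid && a->relid == b->relid;
}

/*
 * Slow-path stub: a real implementation would take the partition lock and
 * search or insert into the shared hash table, returning a pinned entry.
 */
#define TABLE_SIZE 64
static SharedLock shared_table[TABLE_SIZE];

static SharedLock *
lookup_or_create_lock(const LockTag *tag)
{
    SharedLock *lock = &shared_table[(tag->dbid ^ tag->relid) % TABLE_SIZE];

    lock->tag = *tag;                       /* pretend we found/created it */
    atomic_fetch_add(&lock->pincount, 1);
    return lock;
}

/*
 * Pin the lock for 'local', preferring the cached location and verifying it
 * after the pin is taken, so the partition lock is skipped on the hot path.
 */
SharedLock *
pin_lock(LocalLock *local)
{
    SharedLock *lock = local->cached;

    if (lock != NULL)
    {
        atomic_fetch_add(&lock->pincount, 1);   /* keep it from going away */
        if (tags_equal(&lock->tag, &local->tag))
            return lock;                        /* cache hit, identity re-verified */
        atomic_fetch_sub(&lock->pincount, 1);   /* stale cache entry */
    }

    lock = lookup_or_create_lock(&local->tag);  /* slow path */
    local->cached = lock;
    return lock;
}
```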