Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
Date | |
Msg-id | CAA4eK1LEwuEP=JqCNjwjqna7hZ=1HSyhjFmJ++Q66wY6B7rstA@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager (Mahendra Singh Thalor <mahi6run@gmail.com>) |
Responses |
Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
|
List | pgsql-hackers |
On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor <mahi6run@gmail.com> wrote: > > On Mon, 24 Feb 2020 at 15:39, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund <andres@anarazel.de> wrote: > > > > > > What I'm advocating is that extension locks should continue to go > > > through lock.c. And yes, that requires some changes to group locking, > > > but I still don't see why they'd be complicated. > > > > > > > Fair position, as per initial analysis, I think if we do below three > > things, it should work out without changing to a new way of locking > > for relation extension or page type locks. > > a. As per the discussion above, ensure in code we will never try to > > acquire another heavy-weight lock after acquiring relation extension > > or page type locks (probably by having Asserts in code or maybe some > > other way). > > b. Change lock.c so that group locking is not considered for these two > > lock types. For ex. in LockCheckConflicts, along with the check (if > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)), > > we also check lock->tag and call it a conflict for these two locks. > > c. The deadlock detector can ignore checking these two types of locks > > because point (a) ensures that those won't lead to deadlock. One idea > > could be that FindLockCycleRecurseMember just ignores these two types > > of locks by checking the lock tag. > > Thanks Amit for summary. > > Based on above 3 points, here attaching 2 patches for review. > > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip Kumar) > Basically this patch is for point b and c. > > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch > (Patch by me) > This patch is for point a. > > After applying both the patches, make check-world is passing. > > We are testing both the patches and will post results. > I think we need to do detailed code review in the places where we are taking Relation Extension Lock and see whether we are acquiring another heavy-weight lock after that. It seems to me that in brin_getinsertbuffer, after acquiring Relation Extension Lock, we might again try to acquire the same lock. See brin_initialize_empty_new_buffer which is called after acquiring Relation Extension Lock, in that function, we call RecordPageWithFreeSpace and that can again try to acquire the same lock if it needs to perform fsm_extend. I think there will be similar instances in the code. I think it is fine if we again try to acquire it, but the current assertion in your patch needs to be adjusted for that. Few other minor comments on v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any: 1. Ideally, this should be the first patch as we first need to ensure that we don't take any heavy-weight locks after acquiring a relation extension lock. 2. I think it is better to add an Assert after initial error checks (after RecoveryInProgress().. check) 3. + Assert (locallock->tag.lock.locktag_type != LOCKTAG_RELATION_EXTEND || + locallock->nLocks == 0); I think it is not possible that we have an entry in LockMethodLocalHash and its value is zero. Do you see any such possibility, if not, then we might want to remove it? 4. We already have a macro for LOCALLOCK_LOCKMETHOD, can we write another one tag type? This will make the check look a bit cleaner and probably if we need to extend it in future for Page type locks, then also it will be good. 5. I have also tried to think of another way to check if we already hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a cheaper way than this. Basically, I think if we traverse the MyProc->myProcLocks queue, we will get this information, but that doesn't seem much cheaper than this. 6. Another thing that could be possible is to make this a test and elog so that it can hit in production scenarios, but I think the cost of that will be high unless we have a very simple way to write this test condition. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: