Thread: Resource Owner reassign Locks
As discussed in several different email threads here and on performance , when using pg_dump a on large number of objects, the server has a quadratic behavior in LockReassignCurrentOwner where it has to dig through the entire local lock table to push one or two locks up from the portal being dropped to its parent. The attached patch fixes that by remembering up to 10 local locks, and pushing them up specifically rather than digging through the entire lock table. If the list overflows, then it reverts to the old behavior of digging through the entire local lock table. The same change was made to the case where the locks are being released, rather than reassigned (i.e. subtransaction abort rather than commit). I have no evidence that digging through the local lock table during bulk release was ever a bottleneck, but it seemed inconsistent not to make that change as well. When it needs to forget a lock, it searches backwards in the list of released lock and then just moves the last lock into the place of the one to be forgotten. Other ResourceOwner Forget functions slide the entire list down to close the gap, rather than using the selection-sort-like method. I don't understand why they do that. If Forgets are strictly LIFO, then it would not make a difference. If they are mostly LIFO but occasionally not, maybe the insertion method would win over the selection method. From what I can tell, Locks are mostly released in bulk anyway at transaction end, and rarely released explicitly. This patch reduces the time needed to dump 20,000 simple empty tables from 3m50.903s to 20.695s, and of course larger gains at larger numbers. dropdb test; createdb test for f in `seq 1 10000 4000000` ; do perl -le "print qq{create table foo\$_ (k integer , v integer);} foreach ($f..$f+9999)" | psql test>& /dev/null time pg_dump test|wc -c done For degrading performance in other cases, I think the best test case is "pgbench -P" (implemented in another patch in this commitfest) which has a loop which pushes one or two locks up from a portal to the parent (which already owns them, due to previous rounds of the same loop) very frequently. There might be a performance degradation of 0.5% or so, but it is less than the run to run variation. I plan to run some longer test to get a better estimate. If there is a degradation in that range, how important is that? I wanted to test a real worst case, which would be a malicious ordering of lock releases (take 10 locks, release the first lock taken, then release the other 9 in reverse order), but with the absence of UNLOCK TABLE command, I can't figure out how to engineer that. Cheers, Jeff
Attachment
I have few doubts regarding logic of ResourceOwnerRememberLock() and ResourceOwnerForgetLock(): 1. In function ResourceOwnerRememberLock(), when lock count is MAX_RESOWNER_LOCKS, it will not add the lock to lock array but increment the count to make it 11. Now in ResourceOwnerForgetLock(), it cannot enter the if loop until the count is <= MAX_RESOWNER_LOCKS. So how it will forget the lock. 2. ResourceOwnerForgetLock(), it decrements the lock count before removing, so incase it doesn't find the lock in lockarray, the count will be decremented inspite the array still contains the same number of locks. -----Original Message----- From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jeff Janes Sent: Monday, June 11, 2012 2:10 AM To: pgsql-hackers Subject: [HACKERS] Resource Owner reassign Locks As discussed in several different email threads here and on performance , when using pg_dump a on large number of objects, the server has a quadratic behavior in LockReassignCurrentOwner where it has to dig through the entire local lock table to push one or two locks up from the portal being dropped to its parent. The attached patch fixes that by remembering up to 10 local locks, and pushing them up specifically rather than digging through the entire lock table. If the list overflows, then it reverts to the old behavior of digging through the entire local lock table. The same change was made to the case where the locks are being released, rather than reassigned (i.e. subtransaction abort rather than commit). I have no evidence that digging through the local lock table during bulk release was ever a bottleneck, but it seemed inconsistent not to make that change as well. When it needs to forget a lock, it searches backwards in the list of released lock and then just moves the last lock into the place of the one to be forgotten. Other ResourceOwner Forget functions slide the entire list down to close the gap, rather than using the selection-sort-like method. I don't understand why they do that. If Forgets are strictly LIFO, then it would not make a difference. If they are mostly LIFO but occasionally not, maybe the insertion method would win over the selection method. From what I can tell, Locks are mostly released in bulk anyway at transaction end, and rarely released explicitly. This patch reduces the time needed to dump 20,000 simple empty tables from 3m50.903s to 20.695s, and of course larger gains at larger numbers. dropdb test; createdb test for f in `seq 1 10000 4000000` ; do perl -le "print qq{create table foo\$_ (k integer , v integer);} foreach ($f..$f+9999)" | psql test>& /dev/null time pg_dump test|wc -c done For degrading performance in other cases, I think the best test case is "pgbench -P" (implemented in another patch in this commitfest) which has a loop which pushes one or two locks up from a portal to the parent (which already owns them, due to previous rounds of the same loop) very frequently. There might be a performance degradation of 0.5% or so, but it is less than the run to run variation. I plan to run some longer test to get a better estimate. If there is a degradation in that range, how important is that? I wanted to test a real worst case, which would be a malicious ordering of lock releases (take 10 locks, release the first lock taken, then release the other 9 in reverse order), but with the absence of UNLOCK TABLE command, I can't figure out how to engineer that. Cheers, Jeff
On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila <amit.kapila@huawei.com> wrote: > I have few doubts regarding logic of ResourceOwnerRememberLock() and > ResourceOwnerForgetLock(): > 1. In function ResourceOwnerRememberLock(), when lock count is > MAX_RESOWNER_LOCKS, it will not add the lock to lock array but increment the > count to make it 11. Yes, that means the list has over-flowed. Once it is over-flowed, it is now invalid for the reminder of the life of the resource owner. At one time I used simpler logic that stored the last lock even though it could never be accessed, but I didn't like that and changed it to three-valued logic: already over-flowed, just about to over-flow (do not store, but still increment), and normal (store and increment). I guess I could add a macro to test for overflow, rather than explicitly comparing nlocks to MAX_RESOWNER_LOCKS, but I would either need another macro for the "Not yet overflowed, but soon to be", or would still need to do a manual test in that one spot. > Now in ResourceOwnerForgetLock(), it cannot enter the if loop until the > count is <= MAX_RESOWNER_LOCKS. So how it will forget the lock. When it comes time to release or reassign, it will fall back to the original behavior of looking through the local lock table. > > 2. ResourceOwnerForgetLock(), it decrements the lock count before removing, > so incase it doesn't find the lock in lockarray, the count will be > decremented inspite the array still contains the same number of locks. Should it emit a FATAL rather than an ERROR? I thought ERROR was sufficient to make the backend quit, as it is not clear how it could meaningfully recover. Cheers, Jeff
> Yes, that means the list has over-flowed. Once it is over-flowed, it > is now invalid for the reminder of the life of the resource owner. Don't we need any logic to clear the reference of locallock in owner->locks array. MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any specific reason for 10. > Should it emit a FATAL rather than an ERROR? I thought ERROR was > sufficient to make the backend quit, as it is not clear how it could > meaningfully recover. I am not able to visualize any valid scenario in which it can happen unless some corruption happens. If this happens, user can close all statements and abort its transactions. According to me ERROR is okay. However in the message "Can't find lock to remove", it could be better, if there is information about resource owner and lock. -----Original Message----- From: Jeff Janes [mailto:jeff.janes@gmail.com] Sent: Monday, June 11, 2012 8:52 PM To: Amit Kapila Cc: pgsql-hackers Subject: Re: [HACKERS] Resource Owner reassign Locks On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila <amit.kapila@huawei.com> wrote: > I have few doubts regarding logic of ResourceOwnerRememberLock() and > ResourceOwnerForgetLock(): > 1. In function ResourceOwnerRememberLock(), when lock count is > MAX_RESOWNER_LOCKS, it will not add the lock to lock array but increment the > count to make it 11. Yes, that means the list has over-flowed. Once it is over-flowed, it is now invalid for the reminder of the life of the resource owner. At one time I used simpler logic that stored the last lock even though it could never be accessed, but I didn't like that and changed it to three-valued logic: already over-flowed, just about to over-flow (do not store, but still increment), and normal (store and increment). I guess I could add a macro to test for overflow, rather than explicitly comparing nlocks to MAX_RESOWNER_LOCKS, but I would either need another macro for the "Not yet overflowed, but soon to be", or would still need to do a manual test in that one spot. > Now in ResourceOwnerForgetLock(), it cannot enter the if loop until the > count is <= MAX_RESOWNER_LOCKS. So how it will forget the lock. When it comes time to release or reassign, it will fall back to the original behavior of looking through the local lock table. > > 2. ResourceOwnerForgetLock(), it decrements the lock count before removing, > so incase it doesn't find the lock in lockarray, the count will be > decremented inspite the array still contains the same number of locks. Should it emit a FATAL rather than an ERROR? I thought ERROR was sufficient to make the backend quit, as it is not clear how it could meaningfully recover. Cheers, Jeff
On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote: >> Yes, that means the list has over-flowed. Once it is over-flowed, it >> is now invalid for the reminder of the life of the resource owner. > Don't we need any logic to clear the reference of locallock in owner->locks > array. I don't think so. C doesn't ref count its pointers. > MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any > specific reason for 10. I instrumented the code to record the maximum number of locks held by a resource owner, and report the max when it was destroyed. (That code is not in this patch). During a large pg_dump, the vast majority of the resource owners had maximum locks of 2, with some more at 4 and 6. Then there was one resource owner, for the top-level transaction, at tens or hundreds of thousands (basically one for every lockable object). There was little between 6 and this top-level number, so I thought 10 was a good compromise, safely above 6 but not so large that searching through the list itself was likely to bog down. Also, Tom independently suggested the same number. >> Should it emit a FATAL rather than an ERROR? I thought ERROR was >> sufficient to make the backend quit, as it is not clear how it could >> meaningfully recover. > > I am not able to visualize any valid scenario in which it can happen unless > some corruption happens. > If this happens, user can close all statements and abort its transactions. > According to me ERROR is okay. However in the message "Can't find lock to > remove", it could be better, > if there is information about resource owner and lock. I think we might end up changing that entirely once someone more familiar with the error handling mechanisms takes a look at it. I don't think that lock tags have good human readable formats, and just a pointer dump probably wouldn't be much use when something that can never happen has happened. But I'll at least add a reference to the resource owner if this stays in. Thanks, Jeff
Jeff Janes <jeff.janes@gmail.com> writes: > On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote: >> MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any >> specific reason for 10. > I instrumented the code to record the maximum number of locks held by > a resource owner, and report the max when it was destroyed. (That > code is not in this patch). During a large pg_dump, the vast majority > of the resource owners had maximum locks of 2, with some more at 4 > and 6. Then there was one resource owner, for the top-level > transaction, at tens or hundreds of thousands (basically one for every > lockable object). There was little between 6 and this top-level > number, so I thought 10 was a good compromise, safely above 6 but not > so large that searching through the list itself was likely to bog > down. > Also, Tom independently suggested the same number. FYI, I had likewise suggested 10 on the basis of examining pg_dump's behavior. It might be a good idea to examine a few other use-cases before settling on a value. regards, tom lane
On Fri, Jun 15, 2012 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jeff Janes <jeff.janes@gmail.com> writes: >> On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote: >>> MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any >>> specific reason for 10. > >> I instrumented the code to record the maximum number of locks held by >> a resource owner, and report the max when it was destroyed. (That >> code is not in this patch). During a large pg_dump, the vast majority >> of the resource owners had maximum locks of 2, with some more at 4 >> and 6. Then there was one resource owner, for the top-level >> transaction, at tens or hundreds of thousands (basically one for every >> lockable object). There was little between 6 and this top-level >> number, so I thought 10 was a good compromise, safely above 6 but not >> so large that searching through the list itself was likely to bog >> down. > >> Also, Tom independently suggested the same number. > > FYI, I had likewise suggested 10 on the basis of examining pg_dump's > behavior. It might be a good idea to examine a few other use-cases > before settling on a value. Looking at the logging output of a "make check" run, there are many cases where the list would have overflown (max locks was >10), but in all of them the number of locks held at the time of destruction was equal to, or only slightly less than, the size of the local lock hash table. So iterating over a large memorized list would not save much computational complexity over iterating over the entire hash table (although the constant factor in iterating over pointers in an array might be smaller the constant factor for using a hash-iterator). Looking at pg_dump with more complex structures (table with multiple toasted columns and multiple unique indexes, and inherited tables) does use more max locks, but the number doesn't seem to depend on how many toast and indexes exist. There are very frequently a max of 9 locks occurring when the lock table is large, so that is uncomfortably close to overflowing. Adding sequences (or at least, using a type of serial) doesn't seem to increase the max used. I don't know if there a more principle-based way of approaching this. There are probably cases where maintaining the list of locks is loss rather than a gain, but since I don't how to create them I can't evaluate what the trade off might be to increasing the max. I'm inclined to increase the max from 10 to 15 to reclaim a margin of safety, and leave it at that, unless someone can recommend a better test case. Cheers, Jeff
> I don't think so. C doesn't ref count its pointers. You are right I have misunderstood. > I don't think that lock tags have good human readable formats, and just > a pointer dump probably wouldn't be much use when something that can > never happen has happened. But I'll at least add a reference to the > resource owner if this stays in. I have checked in lock.c file for the message where lock tags have been used. elog(ERROR, "lock %s on object %u/%u/%u is already held", lockMethodTable->lockModeNames[lockmode], lock->tag.locktag_field1,lock->tag.locktag_field2, lock->tag.locktag_field3); This can give more information about erroneous lock. ________________________________________ From: Jeff Janes [jeff.janes@gmail.com] Sent: Saturday, June 16, 2012 3:21 AM To: Amit kapila Cc: pgsql-hackers Subject: Re: [HACKERS] Resource Owner reassign Locks On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote: >> Yes, that means the list has over-flowed. Once it is over-flowed, it >> is now invalid for the reminder of the life of the resource owner. > Don't we need any logic to clear the reference of locallock in owner->locks > array. I don't think so. C doesn't ref count its pointers. > MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any > specific reason for 10. I instrumented the code to record the maximum number of locks held by a resource owner, and report the max when it was destroyed. (That code is not in this patch). During a large pg_dump, the vast majority of the resource owners had maximum locks of 2, with some more at 4 and 6. Then there was one resource owner, for the top-level transaction, at tens or hundreds of thousands (basically one for every lockable object). There was little between 6 and this top-level number, so I thought 10 was a good compromise, safely above 6 but not so large that searching through the list itself was likely to bog down. Also, Tom independently suggested the same number. >> Should it emit a FATAL rather than an ERROR? I thought ERROR was >> sufficient to make the backend quit, as it is not clear how it could >> meaningfully recover. > > I am not able to visualize any valid scenario in which it can happen unless > some corruption happens. > If this happens, user can close all statements and abort its transactions. > According to me ERROR is okay. However in the message "Can't find lock to > remove", it could be better, > if there is information about resource owner and lock. I think we might end up changing that entirely once someone more familiar with the error handling mechanisms takes a look at it. I don't think that lock tags have good human readable formats, and just a pointer dump probably wouldn't be much use when something that can never happen has happened. But I'll at least add a reference to the resource owner if this stays in. Thanks, Jeff
On 16.06.2012 09:04, Amit kapila wrote: >> I don't think so. C doesn't ref count its pointers. > You are right I have misunderstood. > >> I don't think that lock tags have good human readable formats, and just >> a pointer dump probably wouldn't be much use when something that can >> never happen has happened. But I'll at least add a reference to the >> resource owner if this stays in. > > I have checked in lock.c file for the message where lock tags have been used. > elog(ERROR, "lock %s on object %u/%u/%u is already held", > lockMethodTable->lockModeNames[lockmode], > lock->tag.locktag_field1, lock->tag.locktag_field2, > lock->tag.locktag_field3); > > This can give more information about erroneous lock. A better error message would be nice, but I don't think it's worth that much. resowner.c doesn't currently know about the internals of LOCALLOCk struct or lock tags, and I'd prefer to keep it that way. Let's just print the pointer's address, that's what we do in many other corresponding error messages in other Forget* functions. On 11.06.2012 18:21, Jeff Janes wrote: > On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila<amit.kapila@huawei.com> wrote: >> 2. ResourceOwnerForgetLock(), it decrements the lock count before removing, >> so incase it doesn't find the lock in lockarray, the count will be >> decremented inspite the array still contains the same number of locks.> > Should it emit a FATAL rather than an ERROR? I thought ERROR was > sufficient to make the backend quit, as it is not clear how it could > meaningfully recover. ERROR seems right to me, it'll get promoted to FATAL or PANIC if we're in a context where we can't recover. But I agree with Amit that we should not decrement the lock count on error. I'm not sure if it makes any difference, as we'll abort out of the current transaction on error, but it certainly looks wrong. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 10.06.2012 23:39, Jeff Janes wrote: > The attached patch fixes that by remembering up to 10 local locks, and > pushing them up specifically rather than digging through the entire > lock table. If the list overflows, then it reverts to the old > behavior of digging through the entire local lock table. I'm a bit uncomfortable with having a fixed limit like that, after which you fall off the cliff. But I guess we can live with that, since we've had the quadratic behavior for years, and haven't heard complaints about it until recently. We can devise some more complex scheme later, if people still run into this. One idea for a more complex scheme, should we need one in the future, is to make the cache in the ResourceOwner expandable, like all the other arrays in ResourceOwner. It would not overflow when new entries are added it. However, if you try to forget a lock, and it's not found in the bottom 10 entries or so of the cache, mark the cache as overflowed at that point. That way reassigning the locks would be fast, even if the current resource owner holds a lot of locks, as longs as you haven't tried to release any of them out of LIFO order. > When it needs to forget a lock, it searches backwards in the list of > released lock and then just moves the last lock into the place of the > one to be forgotten. Other ResourceOwner Forget functions slide the > entire list down to close the gap, rather than using the > selection-sort-like method. I don't understand why they do that. If > Forgets are strictly LIFO, then it would not make a difference. If > they are mostly LIFO but occasionally not, maybe the insertion method > would win over the selection method. Yeah, I think that's the reason. We try to keep the arrays in LIFO order. If you move the last entry into the removed slot, then even a single out-of-order removal will spoil the heuristic that removals are done in LIFO order. Imagine that you insert 5 entries in order: 1 2 3 4 5 Now imagine that you first remove 1 (out-of-order), and then 5, 4, 3, and 2 (in order). The removal of 1 moves 5 to the beginning of the array. Then you remove 5, and that has to traverse the whole list because 5 is now at the beginning. Then you swap 4 into the beginning of the list, and so forth. Every subsequent removal scans the list from bottom to top, because of the single out-of-order removal. > From what I can tell, Locks are > mostly released in bulk anyway at transaction end, and rarely released > explicitly. Hmm, if they are released in bulk, then it doesn't matter either way. So the decision on whether to do the slide or swap has to be made on the assumption that some locks are released out-of-order. With an array of 10-15 entries, it probably doesn't make any difference either way, though. > For degrading performance in other cases, I think the best test case > is "pgbench -P" (implemented in another patch in this commitfest) > which has a loop which pushes one or two locks up from a portal to the > parent (which already owns them, due to previous rounds of the same > loop) very frequently. There might be a performance degradation of > 0.5% or so, but it is less than the run to run variation. I plan to > run some longer test to get a better estimate. If there is a > degradation in that range, how important is that? I think that would be acceptable. I found the interface between resowner.c and lock.c a bit confusing. resowner.c would sometimes call LockReassignCurrentOwner() to reassign all the locks, and sometimes it would call LockReassignCurrentOwner() on each individual lock, with the net effect that's the same as calling LockReleaseCurrentOwner(). And it doesn't seem right for ResourceOwnerRemember/ForgetLock to have to accept a NULL owner. I rearranged that so that there's just a single LockReassignCurrentOwner() function, like before this patch. But it takes as an optional argument a list of locks held by the current resource owner. If the caller knows it, it can pass them to make the call faster, but if it doesn't it can just pass NULL and the function will traverse the hash table the old-fashioned way. I think that's a better API. Please take a look to see if I broke something. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Attachment
> A better error message would be nice, but I don't think it's worth that > much. resowner.c doesn't currently know about the internals of LOCALLOCk > struct or lock tags, and I'd prefer to keep it that way. Let's just > print the pointer's address, that's what we do in many other > corresponding error messages in other Forget* functions. I have checked there also doesn't exist any functions which expose lock internal parameters like tag values. So we can modify as you suggested. -----Original Message----- From: Heikki Linnakangas [mailto:heikki.linnakangas@enterprisedb.com] Sent: Monday, June 18, 2012 4:25 PM To: Amit kapila; Jeff Janes Cc: pgsql-hackers Subject: Re: Resource Owner reassign Locks On 16.06.2012 09:04, Amit kapila wrote: >> I don't think so. C doesn't ref count its pointers. > You are right I have misunderstood. > >> I don't think that lock tags have good human readable formats, and just >> a pointer dump probably wouldn't be much use when something that can >> never happen has happened. But I'll at least add a reference to the >> resource owner if this stays in. > > I have checked in lock.c file for the message where lock tags have been used. > elog(ERROR, "lock %s on object %u/%u/%u is already held", > lockMethodTable->lockModeNames[lockmode], > lock->tag.locktag_field1, lock->tag.locktag_field2, > lock->tag.locktag_field3); > > This can give more information about erroneous lock. A better error message would be nice, but I don't think it's worth that much. resowner.c doesn't currently know about the internals of LOCALLOCk struct or lock tags, and I'd prefer to keep it that way. Let's just print the pointer's address, that's what we do in many other corresponding error messages in other Forget* functions. On 11.06.2012 18:21, Jeff Janes wrote: > On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila<amit.kapila@huawei.com> wrote: >> 2. ResourceOwnerForgetLock(), it decrements the lock count before removing, >> so incase it doesn't find the lock in lockarray, the count will be >> decremented inspite the array still contains the same number of locks.> > Should it emit a FATAL rather than an ERROR? I thought ERROR was > sufficient to make the backend quit, as it is not clear how it could > meaningfully recover. ERROR seems right to me, it'll get promoted to FATAL or PANIC if we're in a context where we can't recover. But I agree with Amit that we should not decrement the lock count on error. I'm not sure if it makes any difference, as we'll abort out of the current transaction on error, but it certainly looks wrong. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
> And it doesn't seem right for ResourceOwnerRemember/ForgetLock to have to accept a NULL owner. I am not sure, if it can ever enter into this flow without resowner as mentioned in jeff comments for session level locks. If it cannot enter then it is okay. > Please take a look to see if I broke something. In you previous mail you agreed with level as ERROR for elog message in function ResourceOwnerForgetLock(..) function, but in your modification you have used PANIC, is there any specific reason for it. With Regards, Amit Kapila.
On 19.06.2012 09:02, Amit Kapila wrote: >> Please take a look to see if I broke something. > In you previous mail you agreed with level as ERROR for elog message in > function ResourceOwnerForgetLock(..) function, > but in your modification you have used PANIC, is there any specific reason > for it. Ah, sorry, that was just a debugging aid. Before posting the patch, I had a bug (now fixed) where I got that error, and I changed it temporarily to PANIC to get a core dump, but forgot to change it back before posting the patch. It should indeed be ERROR, not PANIC. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 18.06.2012 13:59, Heikki Linnakangas wrote: > On 10.06.2012 23:39, Jeff Janes wrote: > I found the interface between resowner.c and lock.c a bit confusing. > resowner.c would sometimes call LockReassignCurrentOwner() to reassign > all the locks, and sometimes it would call LockReassignCurrentOwner() on > each individual lock, with the net effect that's the same as calling > LockReleaseCurrentOwner(). And it doesn't seem right for > ResourceOwnerRemember/ForgetLock to have to accept a NULL owner. > > I rearranged that so that there's just a single > LockReassignCurrentOwner() function, like before this patch. But it > takes as an optional argument a list of locks held by the current > resource owner. If the caller knows it, it can pass them to make the > call faster, but if it doesn't it can just pass NULL and the function > will traverse the hash table the old-fashioned way. I think that's a > better API. > > Please take a look to see if I broke something. I hear no complaints, so committed. Thanks! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, Jun 21, 2012 at 5:32 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 18.06.2012 13:59, Heikki Linnakangas wrote: >> >> On 10.06.2012 23:39, Jeff Janes wrote: >> I found the interface between resowner.c and lock.c a bit confusing. >> resowner.c would sometimes call LockReassignCurrentOwner() to reassign >> all the locks, and sometimes it would call LockReassignCurrentOwner() on >> each individual lock, with the net effect that's the same as calling >> LockReleaseCurrentOwner(). And it doesn't seem right for >> ResourceOwnerRemember/ForgetLock to have to accept a NULL owner. >> >> I rearranged that so that there's just a single >> LockReassignCurrentOwner() function, like before this patch. But it >> takes as an optional argument a list of locks held by the current >> resource owner. If the caller knows it, it can pass them to make the >> call faster, but if it doesn't it can just pass NULL and the function >> will traverse the hash table the old-fashioned way. I think that's a >> better API. Thank you, that does look much cleaner. >> >> Please take a look to see if I broke something. > > > I hear no complaints, so committed. Thanks! > Thanks. Just for the record, I'd previously promised some long running performance tests with my proposed -P option for pgbench, which are now done and showed a 0.2% degradation with my patch. With enough data collected, that difference is statistically significant, but probably not practically significant. It was with my original version, but I can't imagine your version being different in performance. Also, this test is very pessimistic. Since the primary key look-up in the pl/pgSQL look up runs in a portal each time, it pushes the locks to the parent each time. If the lookup was instead running as the inner side of a nested loop, it would not do the reassign on each loop. Cheers, Jeff
On Thu, Jun 21, 2012 at 5:32 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
On 18.06.2012 13:59, Heikki Linnakangas wrote:On 10.06.2012 23:39, Jeff Janes wrote:
I found the interface between resowner.c and lock.c a bit confusing.
resowner.c would sometimes call LockReassignCurrentOwner() to reassign
all the locks, and sometimes it would call LockReassignCurrentOwner() on
each individual lock, with the net effect that's the same as calling
LockReleaseCurrentOwner(). And it doesn't seem right for
ResourceOwnerRemember/ForgetLock to have to accept a NULL owner.
I rearranged that so that there's just a single
LockReassignCurrentOwner() function, like before this patch. But it
takes as an optional argument a list of locks held by the current
resource owner. If the caller knows it, it can pass them to make the
call faster, but if it doesn't it can just pass NULL and the function
will traverse the hash table the old-fashioned way. I think that's a
better API.
Please take a look to see if I broke something.
I hear no complaints, so committed. Thanks!
I'd like to advocate for back-patching this to 9.0, 9.1, and 9.2. It has run without problems for a while now, and it can be considered a bug that systems with a very large number of objects cannot be upgraded in a reasonable time.
There are possible ways to make the upgrade smoother by changing the new systems pg_upgrade rather the old systems server, but those methods will not be simpler, and not be as well tested.
I'll add this proposal to the commit fest.
Thanks,
Jeff
On 2015-06-07 13:44:08 -0700, Jeff Janes wrote: > I'd like to advocate for back-patching this to 9.0, 9.1, and 9.2. It has > run without problems for a while now, and it can be considered a bug that > systems with a very large number of objects cannot be upgraded in a > reasonable time. +1 Greetings, Andres Freund
On 2015-06-07 13:44:08 -0700, Jeff Janes wrote: > I'd like to advocate for back-patching this to 9.0, 9.1, and 9.2. It has > run without problems for a while now, and it can be considered a bug that > systems with a very large number of objects cannot be upgraded in a > reasonable time. In that case, how about working on a version for <= 9.2 (single one should suffice)? This will likely include a bunch of wrapper functions to avoid changing the API in the back branches. Greetings, Andres Freund
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 3, 2015 at 12:59 AM, Andres Freund <span dir="ltr"><<ahref="mailto:andres@anarazel.de" target="_blank">andres@anarazel.de</a>></span> wrote:<br /><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On2015-06-07 13:44:08 -0700, Jeff Janes wrote:<br /></span><span class="">> I'd like to advocate for back-patchingthis to 9.0, 9.1, and 9.2. It has<br /> > run without problems for a while now, and it can be considereda bug that<br /> > systems with a very large number of objects cannot be upgraded in a<br /> > reasonabletime.<br /><br /></span>In that case, how about working on a version for <= 9.2 (single one<br /> should suffice)?This will likely include a bunch of wrapper functions<br /> to avoid changing the API in the back branches.<br /><br/> Greetings,<br /><br /> Andres Freund<br /></blockquote></div><br /></div><div class="gmail_extra">Unfortunately Idon't know what that means about the API. Does it mean that none of the functions declared in any .h file can have theirsignatures changed? But new functions can be added?</div><div class="gmail_extra"><br /></div><div class="gmail_extra">Thanks,</div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Jeff</div><div class="gmail_extra"><br/></div></div>
On July 9, 2015 9:13:20 PM GMT+02:00, Jeff Janes <jeff.janes@gmail.com> wrote: >Unfortunately I don't know what that means about the API. Does it mean >that none of the functions declared in any .h file can have their >signatures changed? But new functions can be added? That's the safest way. Sometimes you can decide that a function can not sanely be called by external code and thus changethe signature. But I'd rather not risk or here, IRS quite possible that one pod these is used by a extension. Andres --- Please excuse brevity and formatting - I am writing this on my mobile phone.
On Fri, Jul 10, 2015 at 4:22 AM, Andres Freund <andres@anarazel.de> wrote: > On July 9, 2015 9:13:20 PM GMT+02:00, Jeff Janes <jeff.janes@gmail.com> wrote: > >>Unfortunately I don't know what that means about the API. Does it mean >>that none of the functions declared in any .h file can have their >>signatures changed? But new functions can be added? > > That's the safest way. Sometimes you can decide that a function can not sanely be called by external code and thus changethe signature. But I'd rather not risk or here, IRS quite possible that one pod these is used by a extension. Where are we on this? Could there be a version for <= 9.2? -- Michael
On Tue, Aug 25, 2015 at 5:48 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jul 10, 2015 at 4:22 AM, Andres Freund <andres@anarazel.de> wrote:
> On July 9, 2015 9:13:20 PM GMT+02:00, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>>Unfortunately I don't know what that means about the API. Does it mean
>>that none of the functions declared in any .h file can have their
>>signatures changed? But new functions can be added?
>
> That's the safest way. Sometimes you can decide that a function can not sanely be called by external code and thus change the signature. But I'd rather not risk or here, IRS quite possible that one pod these is used by a extension.
Where are we on this? Could there be a version for <= 9.2?
Once the code has to be rewritten, my argument that it has been working "in the field" for a while doesn't really apply anymore. It is beyond what I feel comfortable trying to do, especially as I have no "test case" of 3rd party code to verify I haven't broken it.
I still think is a good idea, but for someone who knows more about linkers and .so files than I do.
If I were faced with upgrading a 9.2 instance with many tens of thousands of objects, I would just backpatch the existing code and compile it to make a binary used only for the purposes of the upgrade.
Cheers,
Jeff
Jeff Janes <jeff.janes@gmail.com> writes: > On Tue, Aug 25, 2015 at 5:48 AM, Michael Paquier <michael.paquier@gmail.com> >> On Fri, Jul 10, 2015 at 4:22 AM, Andres Freund <andres@anarazel.de> wrote: >>> That's the safest way. Sometimes you can decide that a function can not >>> sanely be called by external code and thus change the signature. But I'd >>> rather not risk or here, IRS quite possible that one pod these is used by a >>> extension. >> Where are we on this? Could there be a version for <= 9.2? > Once the code has to be rewritten, my argument that it has been working "in > the field" for a while doesn't really apply anymore. Yeah. However, I'm not entirely following Andres' concern here. AFAICS, the only externally visible API change in commit eeb6f37d8 was that LockReleaseCurrentOwner and LockReassignCurrentOwner gained some arguments. That would certainly be an issue if there were any plausible reason for extension code to be calling either one --- but it seems to me that those functions are only meant to be called from resowner.c. What other use-case would there be for them? Were any follow-on commits needed to fix problems induced by eeb6f37d8? I couldn't find any in a quick trawl of the commit logs, but I could have missed something. regards, tom lane
On 2015-08-25 13:54:20 -0400, Tom Lane wrote: > Jeff Janes <jeff.janes@gmail.com> writes: > > Once the code has to be rewritten, my argument that it has been working "in > > the field" for a while doesn't really apply anymore. If rewriting involves adding two one line wrapper functions, I don't see the problem. > However, I'm not entirely following Andres' concern here. AFAICS, > the only externally visible API change in commit eeb6f37d8 was that > LockReleaseCurrentOwner and LockReassignCurrentOwner gained some > arguments. That would certainly be an issue if there were any plausible > reason for extension code to be calling either one --- but it seems to me > that those functions are only meant to be called from resowner.c. What > other use-case would there be for them? I don't think it's super likely, but I don't think it's impossible that somebody created their own resource owner. Say because they want to perform some operation and then release the locks without finishing the transaction. Adding a zero argument LockReleaseCurrentOwner()/LockReassignCurrentOwner() wrapper seems like a small enough effort to simply not bother looking for existing callers. > Were any follow-on commits needed to fix problems induced by eeb6f37d8? > I couldn't find any in a quick trawl of the commit logs, but I could have > missed something. I don't remember any at least. Andres
Andres Freund <andres@anarazel.de> writes: > On 2015-08-25 13:54:20 -0400, Tom Lane wrote: >> However, I'm not entirely following Andres' concern here. AFAICS, >> the only externally visible API change in commit eeb6f37d8 was that >> LockReleaseCurrentOwner and LockReassignCurrentOwner gained some >> arguments. That would certainly be an issue if there were any plausible >> reason for extension code to be calling either one --- but it seems to me >> that those functions are only meant to be called from resowner.c. What >> other use-case would there be for them? > I don't think it's super likely, but I don't think it's impossible that > somebody created their own resource owner. How would they have done that without major code surgery? We don't have any hooks or function pointers involved in the users of resowner.h. Certainly locks would not be getting passed to a nonstandard resowner. > Say because they want to > perform some operation and then release the locks without finishing the > transaction. Adding a zero argument > LockReleaseCurrentOwner()/LockReassignCurrentOwner() wrapper seems like > a small enough effort to simply not bother looking for existing callers. I agree that a wrapper is possible, but it's not without cost; both as to the time required to modify the patch, and as to possibly complicating future back-patching because the code becomes gratuitously different in the back branches. I really don't see that a wrapper is appropriate here. regards, tom lane
On 2015-08-25 14:12:37 -0400, Tom Lane wrote: > How would they have done that without major code surgery? We don't have > any hooks or function pointers involved in the users of resowner.h. > Certainly locks would not be getting passed to a nonstandard resowner. CurrentResourceOwner = myresowner; /* do some op */ ... ? > > Say because they want to > > perform some operation and then release the locks without finishing the > > transaction. Adding a zero argument > > LockReleaseCurrentOwner()/LockReassignCurrentOwner() wrapper seems like > > a small enough effort to simply not bother looking for existing callers. > > I agree that a wrapper is possible, but it's not without cost; both as to > the time required to modify the patch, and as to possibly complicating > future back-patching because the code becomes gratuitously different in > the back branches. I really don't see that a wrapper is appropriate here. Works for me.
Andres Freund <andres@anarazel.de> writes: > On 2015-08-25 14:12:37 -0400, Tom Lane wrote: >> How would they have done that without major code surgery? We don't have >> any hooks or function pointers involved in the users of resowner.h. >> Certainly locks would not be getting passed to a nonstandard resowner. > CurrentResourceOwner = myresowner; > /* do some op */ Yeah, but so what? GrantLockLocal does not contain any way that external code could change the way that a new lock is recorded. (IOW, yeah, certainly third-party code could create a new *instance* of the ResourceOwner data structure, but they would not have any knowledge of what's inside unless they had hacked the core code.) regards, tom lane
On 2015-08-25 14:33:25 -0400, Tom Lane wrote: > (IOW, yeah, certainly third-party code could create a new *instance* of > the ResourceOwner data structure, but they would not have any knowledge of > what's inside unless they had hacked the core code.) What I was thinking is that somebody created a new resowner, did something, and then called LockReleaseCurrentOwner() (because no locks are needed anymore), or LockReassignCurrentOwner() (say you want to abort a subtransaction, but do *not* want the locks to be released). Anyway, I slightly lean towards having wrappers, you strongly against, so that makes it an easy call.
Andres Freund <andres@anarazel.de> writes: > On 2015-08-25 14:33:25 -0400, Tom Lane wrote: >> (IOW, yeah, certainly third-party code could create a new *instance* of >> the ResourceOwner data structure, but they would not have any knowledge of >> what's inside unless they had hacked the core code.) > What I was thinking is that somebody created a new resowner, did > something, and then called LockReleaseCurrentOwner() (because no locks > are needed anymore), or LockReassignCurrentOwner() (say you want to > abort a subtransaction, but do *not* want the locks to be released). > Anyway, I slightly lean towards having wrappers, you strongly against, > so that makes it an easy call. Well, I'm not "strongly" against them, just trying to understand whether there's a plausible argument that someone is calling these functions from extensions. I'm not hearing one ... for one thing, I don't believe there are any extensions playing games with transaction/lock semantics. (My Salesforce colleagues have done some of that, and no you can't get far without changing the core code.) regards, tom lane
I went ahead and pushed this into 9.2 and 9.1. It did not apply at all to 9.0, though, as there evidently was some refactoring affecting LockReassignCurrentOwner() between 9.0 and 9.1. We could possibly have applied some additional patches to 9.0, but I think that would be stretching the argument that this is well-tested code. regards, tom lane