Thread: Resource Owner reassign Locks

Resource Owner reassign Locks

From
Jeff Janes
Date:
As discussed in several different email threads here and on
performance , when using pg_dump a on large number of objects, the
server has a quadratic behavior in LockReassignCurrentOwner where it
has to dig through the entire local lock table to push one or two
locks up from the portal being dropped to its parent.

The attached patch fixes that by remembering up to 10 local locks, and
pushing them up specifically rather than digging through the entire
lock table.  If the list overflows, then it reverts to the old
behavior of digging through the entire local lock table.

The same change was made to the case where the locks are being
released, rather than reassigned (i.e. subtransaction abort rather
than commit).  I have no evidence that digging through the local lock
table during bulk release was ever a bottleneck, but it seemed
inconsistent not to make that change as well.

When it needs to  forget a lock, it searches backwards in the list of
released lock and then just moves the last lock into the place of the
one to be forgotten.  Other ResourceOwner Forget functions slide the
entire list down to close the gap, rather than using the
selection-sort-like method.  I don't understand why they do that.  If
Forgets are strictly LIFO, then it would not make a difference.  If
they are mostly LIFO but occasionally not, maybe the insertion method
would win over the selection method.  From what I can tell, Locks are
mostly released in bulk anyway at transaction end, and rarely released
explicitly.


This patch reduces the time needed to dump 20,000 simple empty tables
from 3m50.903s to 20.695s, and of course larger gains at larger
numbers.

dropdb test; createdb test
for f in `seq 1 10000 4000000` ; do
  perl -le "print qq{create table foo\$_ (k integer , v integer);}
foreach ($f..$f+9999)" | psql test>& /dev/null
  time pg_dump test|wc -c
done

For degrading performance in other cases, I think the best test case
is "pgbench -P" (implemented in another patch in this commitfest)
which has a loop which pushes one or two locks up from a portal to the
parent (which already owns them, due to previous rounds of the same
loop) very frequently.  There might be a performance degradation of
0.5% or so, but it is less than the run to run variation.  I plan to
run some longer test to get a better estimate.  If there is a
degradation in that range, how important is that?

I wanted to test a real worst case, which would be a malicious
ordering of lock releases (take 10 locks, release the first lock
taken, then release the other 9 in reverse order), but with the
absence of UNLOCK TABLE command, I can't figure out how to engineer
that.

Cheers,

Jeff

Attachment

Re: Resource Owner reassign Locks

From
Amit Kapila
Date:
I have few doubts regarding logic of ResourceOwnerRememberLock() and
ResourceOwnerForgetLock():
1. In function ResourceOwnerRememberLock(), when lock count is
MAX_RESOWNER_LOCKS, it will not add the lock to lock array but increment the
count to make it 11.
Now in ResourceOwnerForgetLock(), it cannot enter the if loop until the
count is <= MAX_RESOWNER_LOCKS. So how it will forget the lock.

2. ResourceOwnerForgetLock(), it decrements the lock count before removing,
so incase it doesn't find the lock in lockarray, the count will be
decremented inspite the array still contains the same number of locks.

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jeff Janes
Sent: Monday, June 11, 2012 2:10 AM
To: pgsql-hackers
Subject: [HACKERS] Resource Owner reassign Locks

As discussed in several different email threads here and on performance ,
when using pg_dump a on large number of objects, the server has a quadratic
behavior in LockReassignCurrentOwner where it has to dig through the entire
local lock table to push one or two locks up from the portal being dropped
to its parent.

The attached patch fixes that by remembering up to 10 local locks, and
pushing them up specifically rather than digging through the entire lock
table.  If the list overflows, then it reverts to the old behavior of
digging through the entire local lock table.

The same change was made to the case where the locks are being released,
rather than reassigned (i.e. subtransaction abort rather than commit).  I
have no evidence that digging through the local lock table during bulk
release was ever a bottleneck, but it seemed inconsistent not to make that
change as well.

When it needs to  forget a lock, it searches backwards in the list of
released lock and then just moves the last lock into the place of the one to
be forgotten.  Other ResourceOwner Forget functions slide the entire list
down to close the gap, rather than using the selection-sort-like method.  I
don't understand why they do that.  If Forgets are strictly LIFO, then it
would not make a difference.  If they are mostly LIFO but occasionally not,
maybe the insertion method would win over the selection method.  From what I
can tell, Locks are mostly released in bulk anyway at transaction end, and
rarely released explicitly.


This patch reduces the time needed to dump 20,000 simple empty tables from
3m50.903s to 20.695s, and of course larger gains at larger numbers.

dropdb test; createdb test
for f in `seq 1 10000 4000000` ; do perl -le "print qq{create table foo\$_ (k integer , v integer);} foreach
($f..$f+9999)" | psql test>& /dev/null time pg_dump test|wc -c
done

For degrading performance in other cases, I think the best test case is
"pgbench -P" (implemented in another patch in this commitfest) which has a
loop which pushes one or two locks up from a portal to the parent (which
already owns them, due to previous rounds of the same
loop) very frequently.  There might be a performance degradation of 0.5% or
so, but it is less than the run to run variation.  I plan to run some longer
test to get a better estimate.  If there is a degradation in that range, how
important is that?

I wanted to test a real worst case, which would be a malicious ordering of
lock releases (take 10 locks, release the first lock taken, then release the
other 9 in reverse order), but with the absence of UNLOCK TABLE command, I
can't figure out how to engineer that.

Cheers,

Jeff



Re: Resource Owner reassign Locks

From
Jeff Janes
Date:
On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
> I have few doubts regarding logic of ResourceOwnerRememberLock() and
> ResourceOwnerForgetLock():
> 1. In function ResourceOwnerRememberLock(), when lock count is
> MAX_RESOWNER_LOCKS, it will not add the lock to lock array but increment the
> count to make it 11.

Yes, that means the list has over-flowed.  Once it is over-flowed, it
is now invalid for the reminder of the life of the resource owner.  At
one time I used simpler logic that stored the last lock even though it
could never be accessed, but I didn't like that and changed it to
three-valued logic: already over-flowed, just about to over-flow (do
not store, but still increment), and normal (store and increment).

I guess I could add a macro to test for overflow, rather than
explicitly comparing nlocks to MAX_RESOWNER_LOCKS, but I would either
need another macro for the "Not yet overflowed, but soon to be", or
would still need to do a manual test in that one spot.


> Now in ResourceOwnerForgetLock(), it cannot enter the if loop until the
> count is <= MAX_RESOWNER_LOCKS. So how it will forget the lock.

When it comes time to release or reassign, it will fall back to the
original behavior of looking through the local lock table.

>
> 2. ResourceOwnerForgetLock(), it decrements the lock count before removing,
> so incase it doesn't find the lock in lockarray, the count will be
> decremented inspite the array still contains the same number of locks.

Should it emit a FATAL rather than an ERROR?  I thought ERROR was
sufficient to make the backend quit, as it is not clear how it could
meaningfully recover.

Cheers,

Jeff


Re: Resource Owner reassign Locks

From
Amit Kapila
Date:
> Yes, that means the list has over-flowed.  Once it is over-flowed, it
> is now invalid for the reminder of the life of the resource owner.
Don't we need any logic to clear the reference of locallock in owner->locks
array.

MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any
specific reason for 10.


> Should it emit a FATAL rather than an ERROR?  I thought ERROR was
> sufficient to make the backend quit, as it is not clear how it could
> meaningfully recover.

I am not able to visualize any valid scenario in which it can happen unless
some corruption happens.
If this happens, user can close all statements and abort its transactions. 
According to me ERROR is okay. However in the message "Can't find lock to
remove",  it could be better,
if there is information about resource owner and lock.


-----Original Message-----
From: Jeff Janes [mailto:jeff.janes@gmail.com] 
Sent: Monday, June 11, 2012 8:52 PM
To: Amit Kapila
Cc: pgsql-hackers
Subject: Re: [HACKERS] Resource Owner reassign Locks

On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila <amit.kapila@huawei.com>
wrote:
> I have few doubts regarding logic of ResourceOwnerRememberLock() and
> ResourceOwnerForgetLock():
> 1. In function ResourceOwnerRememberLock(), when lock count is
> MAX_RESOWNER_LOCKS, it will not add the lock to lock array but increment
the
> count to make it 11.

Yes, that means the list has over-flowed.  Once it is over-flowed, it
is now invalid for the reminder of the life of the resource owner.  At
one time I used simpler logic that stored the last lock even though it
could never be accessed, but I didn't like that and changed it to
three-valued logic: already over-flowed, just about to over-flow (do
not store, but still increment), and normal (store and increment).

I guess I could add a macro to test for overflow, rather than
explicitly comparing nlocks to MAX_RESOWNER_LOCKS, but I would either
need another macro for the "Not yet overflowed, but soon to be", or
would still need to do a manual test in that one spot.


> Now in ResourceOwnerForgetLock(), it cannot enter the if loop until the
> count is <= MAX_RESOWNER_LOCKS. So how it will forget the lock.

When it comes time to release or reassign, it will fall back to the
original behavior of looking through the local lock table.

>
> 2. ResourceOwnerForgetLock(), it decrements the lock count before
removing,
> so incase it doesn't find the lock in lockarray, the count will be
> decremented inspite the array still contains the same number of locks.

Should it emit a FATAL rather than an ERROR?  I thought ERROR was
sufficient to make the backend quit, as it is not clear how it could
meaningfully recover.

Cheers,

Jeff



Re: Resource Owner reassign Locks

From
Jeff Janes
Date:
On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> Yes, that means the list has over-flowed.  Once it is over-flowed, it
>> is now invalid for the reminder of the life of the resource owner.

> Don't we need any logic to clear the reference of locallock in owner->locks
> array.

I don't think so.  C doesn't ref count its pointers.

> MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any
> specific reason for 10.

I instrumented the code to record the maximum number of locks held by
a resource owner, and report the max when it was destroyed.  (That
code is not in this patch).  During a large pg_dump, the vast majority
of the resource  owners had maximum locks of 2, with some more at 4
and 6.    Then there was one resource owner, for the top-level
transaction, at tens or hundreds of thousands (basically one for every
lockable object).  There was little between 6 and this top-level
number, so I thought 10 was a good compromise, safely above 6 but not
so large that searching through the list itself was likely to bog
down.

Also, Tom independently suggested the same number.


>> Should it emit a FATAL rather than an ERROR?  I thought ERROR was
>> sufficient to make the backend quit, as it is not clear how it could
>> meaningfully recover.
>
> I am not able to visualize any valid scenario in which it can happen unless
> some corruption happens.
> If this happens, user can close all statements and abort its transactions.
> According to me ERROR is okay. However in the message "Can't find lock to
> remove",  it could be better,
> if there is information about resource owner and lock.

I think we might end up changing that entirely once someone more
familiar with the error handling mechanisms takes a look at it.  I
don't think that lock tags have good human readable formats, and just
a pointer dump probably wouldn't be much use when something that can
never happen has happened.  But I'll at least add a reference to the
resource owner if this stays in.

Thanks,

Jeff


Re: Resource Owner reassign Locks

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any
>> specific reason for 10.

> I instrumented the code to record the maximum number of locks held by
> a resource owner, and report the max when it was destroyed.  (That
> code is not in this patch).  During a large pg_dump, the vast majority
> of the resource  owners had maximum locks of 2, with some more at 4
> and 6.    Then there was one resource owner, for the top-level
> transaction, at tens or hundreds of thousands (basically one for every
> lockable object).  There was little between 6 and this top-level
> number, so I thought 10 was a good compromise, safely above 6 but not
> so large that searching through the list itself was likely to bog
> down.

> Also, Tom independently suggested the same number.

FYI, I had likewise suggested 10 on the basis of examining pg_dump's
behavior.  It might be a good idea to examine a few other use-cases
before settling on a value.
        regards, tom lane


Re: Resource Owner reassign Locks

From
Jeff Janes
Date:
On Fri, Jun 15, 2012 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>> MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any
>>> specific reason for 10.
>
>> I instrumented the code to record the maximum number of locks held by
>> a resource owner, and report the max when it was destroyed.  (That
>> code is not in this patch).  During a large pg_dump, the vast majority
>> of the resource  owners had maximum locks of 2, with some more at 4
>> and 6.    Then there was one resource owner, for the top-level
>> transaction, at tens or hundreds of thousands (basically one for every
>> lockable object).  There was little between 6 and this top-level
>> number, so I thought 10 was a good compromise, safely above 6 but not
>> so large that searching through the list itself was likely to bog
>> down.
>
>> Also, Tom independently suggested the same number.
>
> FYI, I had likewise suggested 10 on the basis of examining pg_dump's
> behavior.  It might be a good idea to examine a few other use-cases
> before settling on a value.

Looking at the logging output of a "make check" run, there are many
cases where the list would have overflown (max locks was >10), but in
all of them the number of locks held at the time of destruction was
equal to, or only slightly less than, the size of the local lock hash
table.  So iterating over a large memorized list would not save much
computational complexity over iterating over the entire hash table
(although the constant factor in iterating over pointers in an array
might be smaller the constant factor for using a hash-iterator).

Looking at pg_dump with more complex structures (table with multiple
toasted columns and multiple unique indexes, and inherited tables)
does use more max locks, but the number doesn't seem to depend on how
many toast and indexes exist.  There are very frequently a max of 9
locks occurring when the lock table is large, so that is uncomfortably
close to overflowing.  Adding sequences (or at least, using a type of
serial) doesn't seem to increase the max used.

I don't know if there a more principle-based way of approaching this.

There are probably cases where maintaining the list of locks is loss
rather than a gain, but since I don't how to create them I can't
evaluate what the trade off might be to increasing the max.

I'm inclined to increase the max from 10 to 15 to reclaim a margin of
safety, and leave it at that, unless someone can recommend a better
test case.

Cheers,

Jeff


Re: Resource Owner reassign Locks

From
Amit kapila
Date:
> I don't think so.  C doesn't ref count its pointers.
You are right I have misunderstood.

> I don't think that lock tags have good human readable formats, and just
> a pointer dump probably wouldn't be much use when something that can
> never happen has happened.  But I'll at least add a reference to the
> resource owner if this stays in.

I have checked in lock.c file for the message where lock tags have been used.
elog(ERROR, "lock %s on object %u/%u/%u is already held",   lockMethodTable->lockModeNames[lockmode],
lock->tag.locktag_field1,lock->tag.locktag_field2,   lock->tag.locktag_field3); 

This can give more information about erroneous lock.

________________________________________
From: Jeff Janes [jeff.janes@gmail.com]
Sent: Saturday, June 16, 2012 3:21 AM
To: Amit kapila
Cc: pgsql-hackers
Subject: Re: [HACKERS] Resource Owner reassign Locks

On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> Yes, that means the list has over-flowed.  Once it is over-flowed, it
>> is now invalid for the reminder of the life of the resource owner.

> Don't we need any logic to clear the reference of locallock in owner->locks
> array.

I don't think so.  C doesn't ref count its pointers.

> MAX_RESOWNER_LOCKS - How did you arrive at number 10 for it. Is there any
> specific reason for 10.

I instrumented the code to record the maximum number of locks held by
a resource owner, and report the max when it was destroyed.  (That
code is not in this patch).  During a large pg_dump, the vast majority
of the resource  owners had maximum locks of 2, with some more at 4
and 6.    Then there was one resource owner, for the top-level
transaction, at tens or hundreds of thousands (basically one for every
lockable object).  There was little between 6 and this top-level
number, so I thought 10 was a good compromise, safely above 6 but not
so large that searching through the list itself was likely to bog
down.

Also, Tom independently suggested the same number.


>> Should it emit a FATAL rather than an ERROR?  I thought ERROR was
>> sufficient to make the backend quit, as it is not clear how it could
>> meaningfully recover.
>
> I am not able to visualize any valid scenario in which it can happen unless
> some corruption happens.
> If this happens, user can close all statements and abort its transactions.
> According to me ERROR is okay. However in the message "Can't find lock to
> remove",  it could be better,
> if there is information about resource owner and lock.

I think we might end up changing that entirely once someone more
familiar with the error handling mechanisms takes a look at it.  I
don't think that lock tags have good human readable formats, and just
a pointer dump probably wouldn't be much use when something that can
never happen has happened.  But I'll at least add a reference to the
resource owner if this stays in.

Thanks,

Jeff

Re: Resource Owner reassign Locks

From
Heikki Linnakangas
Date:
On 16.06.2012 09:04, Amit kapila wrote:
>> I don't think so.  C doesn't ref count its pointers.
> You are right I have misunderstood.
>
>> I don't think that lock tags have good human readable formats, and just
>> a pointer dump probably wouldn't be much use when something that can
>> never happen has happened.  But I'll at least add a reference to the
>> resource owner if this stays in.
>
> I have checked in lock.c file for the message where lock tags have been used.
> elog(ERROR, "lock %s on object %u/%u/%u is already held",
>      lockMethodTable->lockModeNames[lockmode],
>      lock->tag.locktag_field1, lock->tag.locktag_field2,
>      lock->tag.locktag_field3);
>
> This can give more information about erroneous lock.

A better error message would be nice, but I don't think it's worth that 
much. resowner.c doesn't currently know about the internals of LOCALLOCk 
struct or lock tags, and I'd prefer to keep it that way. Let's just 
print the pointer's address, that's what we do in many other 
corresponding error messages in other Forget* functions.

On 11.06.2012 18:21, Jeff Janes wrote:
> On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila<amit.kapila@huawei.com>  wrote:
>> 2. ResourceOwnerForgetLock(), it decrements the lock count before removing,
>> so incase it doesn't find the lock in lockarray, the count will be
>> decremented inspite the array still contains the same number of locks.>
> Should it emit a FATAL rather than an ERROR?  I thought ERROR was
> sufficient to make the backend quit, as it is not clear how it could
> meaningfully recover.

ERROR seems right to me, it'll get promoted to FATAL or PANIC if we're 
in a context where we can't recover. But I agree with Amit that we 
should not decrement the lock count on error. I'm not sure if it makes 
any difference, as we'll abort out of the current transaction on error, 
but it certainly looks wrong.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Resource Owner reassign Locks

From
Heikki Linnakangas
Date:
On 10.06.2012 23:39, Jeff Janes wrote:
> The attached patch fixes that by remembering up to 10 local locks, and
> pushing them up specifically rather than digging through the entire
> lock table.  If the list overflows, then it reverts to the old
> behavior of digging through the entire local lock table.

I'm a bit uncomfortable with having a fixed limit like that, after which
you fall off the cliff. But I guess we can live with that, since we've
had the quadratic behavior for years, and haven't heard complaints about
it until recently. We can devise some more complex scheme later, if
people still run into this.

One idea for a more complex scheme, should we need one in the future, is
to make the cache in the ResourceOwner expandable, like all the other
arrays in ResourceOwner. It would not overflow when new entries are
added it. However, if you try to forget a lock, and it's not found in
the bottom 10 entries or so of the cache, mark the cache as overflowed
at that point. That way reassigning the locks would be fast, even if the
current resource owner holds a lot of locks, as longs as you haven't
tried to release any of them out of LIFO order.

> When it needs to  forget a lock, it searches backwards in the list of
> released lock and then just moves the last lock into the place of the
> one to be forgotten.  Other ResourceOwner Forget functions slide the
> entire list down to close the gap, rather than using the
> selection-sort-like method.  I don't understand why they do that.  If
> Forgets are strictly LIFO, then it would not make a difference.  If
> they are mostly LIFO but occasionally not, maybe the insertion method
> would win over the selection method.

Yeah, I think that's the reason. We try to keep the arrays in LIFO
order. If you move the last entry into the removed slot, then even a
single out-of-order removal will spoil the heuristic that removals are
done in LIFO order. Imagine that you insert 5 entries in order:

1
2
3
4
5

Now imagine that you first remove 1 (out-of-order), and then 5, 4, 3,
and 2 (in order). The removal of 1 moves 5 to the beginning of the
array. Then you remove 5, and that has to traverse the whole list
because 5 is now at the beginning. Then you swap 4 into the beginning of
the list, and so forth. Every subsequent removal scans the list from
bottom to top, because of the single out-of-order removal.

>  From what I can tell, Locks are
> mostly released in bulk anyway at transaction end, and rarely released
> explicitly.

Hmm, if they are released in bulk, then it doesn't matter either way. So
the decision on whether to do the slide or swap has to be made on the
assumption that some locks are released out-of-order. With an array of
10-15 entries, it probably doesn't make any difference either way, though.

> For degrading performance in other cases, I think the best test case
> is "pgbench -P" (implemented in another patch in this commitfest)
> which has a loop which pushes one or two locks up from a portal to the
> parent (which already owns them, due to previous rounds of the same
> loop) very frequently.  There might be a performance degradation of
> 0.5% or so, but it is less than the run to run variation.  I plan to
> run some longer test to get a better estimate.  If there is a
> degradation in that range, how important is that?

I think that would be acceptable.

I found the interface between resowner.c and lock.c a bit confusing.
resowner.c would sometimes call LockReassignCurrentOwner() to reassign
all the locks, and sometimes it would call LockReassignCurrentOwner() on
each individual lock, with the net effect that's the same as calling
LockReleaseCurrentOwner(). And it doesn't seem right for
ResourceOwnerRemember/ForgetLock to have to accept a NULL owner.

I rearranged that so that there's just a single
LockReassignCurrentOwner() function, like before this patch. But it
takes as an optional argument a list of locks held by the current
resource owner. If the caller knows it, it can pass them to make the
call faster, but if it doesn't it can just pass NULL and the function
will traverse the hash table the old-fashioned way. I think that's a
better API.

Please take a look to see if I broke something.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachment

Re: Resource Owner reassign Locks

From
Amit Kapila
Date:
> A better error message would be nice, but I don't think it's worth that 
> much. resowner.c doesn't currently know about the internals of LOCALLOCk 
> struct or lock tags, and I'd prefer to keep it that way. Let's just 
> print the pointer's address, that's what we do in many other 
> corresponding error messages in other Forget* functions.

I have checked there also doesn't exist any functions which expose lock
internal parameters like tag values.
So we can modify as you suggested.

-----Original Message-----
From: Heikki Linnakangas [mailto:heikki.linnakangas@enterprisedb.com] 
Sent: Monday, June 18, 2012 4:25 PM
To: Amit kapila; Jeff Janes
Cc: pgsql-hackers
Subject: Re: Resource Owner reassign Locks

On 16.06.2012 09:04, Amit kapila wrote:
>> I don't think so.  C doesn't ref count its pointers.
> You are right I have misunderstood.
>
>> I don't think that lock tags have good human readable formats, and just
>> a pointer dump probably wouldn't be much use when something that can
>> never happen has happened.  But I'll at least add a reference to the
>> resource owner if this stays in.
>
> I have checked in lock.c file for the message where lock tags have been
used.
> elog(ERROR, "lock %s on object %u/%u/%u is already held",
>      lockMethodTable->lockModeNames[lockmode],
>      lock->tag.locktag_field1, lock->tag.locktag_field2,
>      lock->tag.locktag_field3);
>
> This can give more information about erroneous lock.

A better error message would be nice, but I don't think it's worth that 
much. resowner.c doesn't currently know about the internals of LOCALLOCk 
struct or lock tags, and I'd prefer to keep it that way. Let's just 
print the pointer's address, that's what we do in many other 
corresponding error messages in other Forget* functions.

On 11.06.2012 18:21, Jeff Janes wrote:
> On Sun, Jun 10, 2012 at 11:28 PM, Amit Kapila<amit.kapila@huawei.com>
wrote:
>> 2. ResourceOwnerForgetLock(), it decrements the lock count before
removing,
>> so incase it doesn't find the lock in lockarray, the count will be
>> decremented inspite the array still contains the same number of locks.>
> Should it emit a FATAL rather than an ERROR?  I thought ERROR was
> sufficient to make the backend quit, as it is not clear how it could
> meaningfully recover.

ERROR seems right to me, it'll get promoted to FATAL or PANIC if we're 
in a context where we can't recover. But I agree with Amit that we 
should not decrement the lock count on error. I'm not sure if it makes 
any difference, as we'll abort out of the current transaction on error, 
but it certainly looks wrong.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com



Re: Resource Owner reassign Locks

From
Amit Kapila
Date:
> And it doesn't seem right for ResourceOwnerRemember/ForgetLock to have to
accept a NULL owner. I am not sure, if it can ever enter into this flow without resowner as
mentioned in jeff comments  for session level locks. If it cannot enter then it is okay.

> Please take a look to see if I broke something. In you previous mail you agreed with level as ERROR for elog message
in
function ResourceOwnerForgetLock(..) function,    but in your modification you have used PANIC, is there any specific
reason
for it.

With Regards,
Amit Kapila.



Re: Resource Owner reassign Locks

From
Heikki Linnakangas
Date:
On 19.06.2012 09:02, Amit Kapila wrote:
>> Please take a look to see if I broke something.
>    In you previous mail you agreed with level as ERROR for elog message in
> function ResourceOwnerForgetLock(..) function,
>    but in your modification you have used PANIC, is there any specific reason
> for it.

Ah, sorry, that was just a debugging aid. Before posting the patch, I 
had a bug (now fixed) where I got that error, and I changed it 
temporarily to PANIC to get a core dump, but forgot to change it back 
before posting the patch. It should indeed be ERROR, not PANIC.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Resource Owner reassign Locks

From
Heikki Linnakangas
Date:
On 18.06.2012 13:59, Heikki Linnakangas wrote:
> On 10.06.2012 23:39, Jeff Janes wrote:
> I found the interface between resowner.c and lock.c a bit confusing.
> resowner.c would sometimes call LockReassignCurrentOwner() to reassign
> all the locks, and sometimes it would call LockReassignCurrentOwner() on
> each individual lock, with the net effect that's the same as calling
> LockReleaseCurrentOwner(). And it doesn't seem right for
> ResourceOwnerRemember/ForgetLock to have to accept a NULL owner.
>
> I rearranged that so that there's just a single
> LockReassignCurrentOwner() function, like before this patch. But it
> takes as an optional argument a list of locks held by the current
> resource owner. If the caller knows it, it can pass them to make the
> call faster, but if it doesn't it can just pass NULL and the function
> will traverse the hash table the old-fashioned way. I think that's a
> better API.
>
> Please take a look to see if I broke something.

I hear no complaints, so committed. Thanks!

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Resource Owner reassign Locks

From
Jeff Janes
Date:
On Thu, Jun 21, 2012 at 5:32 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 18.06.2012 13:59, Heikki Linnakangas wrote:
>>
>> On 10.06.2012 23:39, Jeff Janes wrote:
>> I found the interface between resowner.c and lock.c a bit confusing.
>> resowner.c would sometimes call LockReassignCurrentOwner() to reassign
>> all the locks, and sometimes it would call LockReassignCurrentOwner() on
>> each individual lock, with the net effect that's the same as calling
>> LockReleaseCurrentOwner(). And it doesn't seem right for
>> ResourceOwnerRemember/ForgetLock to have to accept a NULL owner.
>>
>> I rearranged that so that there's just a single
>> LockReassignCurrentOwner() function, like before this patch. But it
>> takes as an optional argument a list of locks held by the current
>> resource owner. If the caller knows it, it can pass them to make the
>> call faster, but if it doesn't it can just pass NULL and the function
>> will traverse the hash table the old-fashioned way. I think that's a
>> better API.

Thank you, that does look much cleaner.

>>
>> Please take a look to see if I broke something.
>
>
> I hear no complaints, so committed. Thanks!
>

Thanks.

Just for the record, I'd previously promised some long running
performance tests with my proposed -P option for pgbench, which are
now done and showed a 0.2% degradation with my patch.  With enough
data collected, that difference is statistically significant, but
probably not practically significant.  It was with my original
version, but I can't imagine your version being different in
performance.  Also, this test is very pessimistic.  Since the primary
key look-up in the pl/pgSQL look up runs in a portal each time, it
pushes the locks to the parent each time.  If the lookup was instead
running as the inner side of a nested loop, it would not do the
reassign on each loop.

Cheers,

Jeff


Re: Resource Owner reassign Locks

From
Jeff Janes
Date:
On Thu, Jun 21, 2012 at 5:32 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
On 18.06.2012 13:59, Heikki Linnakangas wrote:
On 10.06.2012 23:39, Jeff Janes wrote:
I found the interface between resowner.c and lock.c a bit confusing.
resowner.c would sometimes call LockReassignCurrentOwner() to reassign
all the locks, and sometimes it would call LockReassignCurrentOwner() on
each individual lock, with the net effect that's the same as calling
LockReleaseCurrentOwner(). And it doesn't seem right for
ResourceOwnerRemember/ForgetLock to have to accept a NULL owner.

I rearranged that so that there's just a single
LockReassignCurrentOwner() function, like before this patch. But it
takes as an optional argument a list of locks held by the current
resource owner. If the caller knows it, it can pass them to make the
call faster, but if it doesn't it can just pass NULL and the function
will traverse the hash table the old-fashioned way. I think that's a
better API.

Please take a look to see if I broke something.

I hear no complaints, so committed. Thanks!

I'd like to advocate for back-patching this to 9.0, 9.1, and 9.2.  It has run without problems for a while now, and it can be considered a bug that systems with a very large number of objects cannot be upgraded in a reasonable time.

There are possible ways to make the upgrade smoother by changing the new systems pg_upgrade rather the old systems server, but those methods will not be simpler, and not be as well tested.

I'll add this proposal to the commit fest.

Thanks,

Jeff

Re: Resource Owner reassign Locks

From
Andres Freund
Date:
On 2015-06-07 13:44:08 -0700, Jeff Janes wrote:
> I'd like to advocate for back-patching this to 9.0, 9.1, and 9.2.  It has
> run without problems for a while now, and it can be considered a bug that
> systems with a very large number of objects cannot be upgraded in a
> reasonable time.

+1

Greetings,

Andres Freund



Re: Resource Owner reassign Locks

From
Andres Freund
Date:
On 2015-06-07 13:44:08 -0700, Jeff Janes wrote:
> I'd like to advocate for back-patching this to 9.0, 9.1, and 9.2.  It has
> run without problems for a while now, and it can be considered a bug that
> systems with a very large number of objects cannot be upgraded in a
> reasonable time.

In that case, how about working on a version for <= 9.2 (single one
should suffice)? This will likely include a bunch of wrapper functions
to avoid changing the API in the back branches.

Greetings,

Andres Freund



Re: Resource Owner reassign Locks

From
Jeff Janes
Date:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 3, 2015 at 12:59 AM, Andres Freund <span
dir="ltr"><<ahref="mailto:andres@anarazel.de" target="_blank">andres@anarazel.de</a>></span> wrote:<br
/><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">On2015-06-07 13:44:08 -0700, Jeff Janes wrote:<br /></span><span class="">> I'd like to advocate for
back-patchingthis to 9.0, 9.1, and 9.2.  It has<br /> > run without problems for a while now, and it can be
considereda bug that<br /> > systems with a very large number of objects cannot be upgraded in a<br /> >
reasonabletime.<br /><br /></span>In that case, how about working on a version for <= 9.2 (single one<br /> should
suffice)?This will likely include a bunch of wrapper functions<br /> to avoid changing the API in the back branches.<br
/><br/> Greetings,<br /><br /> Andres Freund<br /></blockquote></div><br /></div><div class="gmail_extra">Unfortunately
Idon't know what that means about the API.  Does it mean that none of the functions declared in any .h file can have
theirsignatures changed?  But new functions can be added?</div><div class="gmail_extra"><br /></div><div
class="gmail_extra">Thanks,</div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Jeff</div><div
class="gmail_extra"><br/></div></div> 

Re: Resource Owner reassign Locks

From
Andres Freund
Date:
On July 9, 2015 9:13:20 PM GMT+02:00, Jeff Janes <jeff.janes@gmail.com> wrote:

>Unfortunately I don't know what that means about the API.  Does it mean
>that none of the functions declared in any .h file can have their
>signatures changed?  But new functions can be added?

That's the safest way. Sometimes you can decide that a function can not sanely be called by external code and thus
changethe signature. But I'd rather not risk or here, IRS quite possible that one pod these is used by a extension.
 

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: Resource Owner reassign Locks

From
Michael Paquier
Date:
On Fri, Jul 10, 2015 at 4:22 AM, Andres Freund <andres@anarazel.de> wrote:
> On July 9, 2015 9:13:20 PM GMT+02:00, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>>Unfortunately I don't know what that means about the API.  Does it mean
>>that none of the functions declared in any .h file can have their
>>signatures changed?  But new functions can be added?
>
> That's the safest way. Sometimes you can decide that a function can not sanely be called by external code and thus
changethe signature. But I'd rather not risk or here, IRS quite possible that one pod these is used by a extension.
 

Where are we on this? Could there be a version for <= 9.2?
-- 
Michael



Re: Resource Owner reassign Locks

From
Jeff Janes
Date:

On Tue, Aug 25, 2015 at 5:48 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jul 10, 2015 at 4:22 AM, Andres Freund <andres@anarazel.de> wrote:
> On July 9, 2015 9:13:20 PM GMT+02:00, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>>Unfortunately I don't know what that means about the API.  Does it mean
>>that none of the functions declared in any .h file can have their
>>signatures changed?  But new functions can be added?
>
> That's the safest way. Sometimes you can decide that a function can not sanely be called by external code and thus change the signature. But I'd rather not risk or here, IRS quite possible that one pod these is used by a extension.

Where are we on this? Could there be a version for <= 9.2?

Once the code has to be rewritten, my argument that it has been working "in the field" for a while doesn't really apply anymore.  It is beyond what I feel comfortable trying to do, especially as I have no "test case" of 3rd party code to verify I haven't broken it.

I still think is a good idea, but for someone who knows more about linkers and .so files than I do.

If I were faced with upgrading a 9.2 instance with many tens of thousands of objects, I would just backpatch the existing code and compile it to make a binary used only for the purposes of the upgrade.

Cheers,

Jeff

Re: Resource Owner reassign Locks

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Tue, Aug 25, 2015 at 5:48 AM, Michael Paquier <michael.paquier@gmail.com>
>> On Fri, Jul 10, 2015 at 4:22 AM, Andres Freund <andres@anarazel.de> wrote:
>>> That's the safest way. Sometimes you can decide that a function can not
>>> sanely be called by external code and thus change the signature. But I'd
>>> rather not risk or here, IRS quite possible that one pod these is used by a
>>> extension.

>> Where are we on this? Could there be a version for <= 9.2?

> Once the code has to be rewritten, my argument that it has been working "in
> the field" for a while doesn't really apply anymore.

Yeah.

However, I'm not entirely following Andres' concern here.  AFAICS,
the only externally visible API change in commit eeb6f37d8 was that
LockReleaseCurrentOwner and LockReassignCurrentOwner gained some
arguments.  That would certainly be an issue if there were any plausible
reason for extension code to be calling either one --- but it seems to me
that those functions are only meant to be called from resowner.c.  What
other use-case would there be for them?

Were any follow-on commits needed to fix problems induced by eeb6f37d8?
I couldn't find any in a quick trawl of the commit logs, but I could have
missed something.
        regards, tom lane



Re: Resource Owner reassign Locks

From
Andres Freund
Date:
On 2015-08-25 13:54:20 -0400, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > Once the code has to be rewritten, my argument that it has been working "in
> > the field" for a while doesn't really apply anymore.

If rewriting involves adding two one line wrapper functions, I don't see
the problem.

> However, I'm not entirely following Andres' concern here.  AFAICS,
> the only externally visible API change in commit eeb6f37d8 was that
> LockReleaseCurrentOwner and LockReassignCurrentOwner gained some
> arguments.  That would certainly be an issue if there were any plausible
> reason for extension code to be calling either one --- but it seems to me
> that those functions are only meant to be called from resowner.c.  What
> other use-case would there be for them?

I don't think it's super likely, but I don't think it's impossible that
somebody created their own resource owner. Say because they want to
perform some operation and then release the locks without finishing the
transaction.  Adding a zero argument
LockReleaseCurrentOwner()/LockReassignCurrentOwner() wrapper seems like
a small enough effort to simply not bother looking for existing callers.


> Were any follow-on commits needed to fix problems induced by eeb6f37d8?
> I couldn't find any in a quick trawl of the commit logs, but I could have
> missed something.

I don't remember any at least.

Andres



Re: Resource Owner reassign Locks

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2015-08-25 13:54:20 -0400, Tom Lane wrote:
>> However, I'm not entirely following Andres' concern here.  AFAICS,
>> the only externally visible API change in commit eeb6f37d8 was that
>> LockReleaseCurrentOwner and LockReassignCurrentOwner gained some
>> arguments.  That would certainly be an issue if there were any plausible
>> reason for extension code to be calling either one --- but it seems to me
>> that those functions are only meant to be called from resowner.c.  What
>> other use-case would there be for them?

> I don't think it's super likely, but I don't think it's impossible that
> somebody created their own resource owner.

How would they have done that without major code surgery?  We don't have
any hooks or function pointers involved in the users of resowner.h.
Certainly locks would not be getting passed to a nonstandard resowner.

> Say because they want to
> perform some operation and then release the locks without finishing the
> transaction.  Adding a zero argument
> LockReleaseCurrentOwner()/LockReassignCurrentOwner() wrapper seems like
> a small enough effort to simply not bother looking for existing callers.

I agree that a wrapper is possible, but it's not without cost; both as to
the time required to modify the patch, and as to possibly complicating
future back-patching because the code becomes gratuitously different in
the back branches.  I really don't see that a wrapper is appropriate here.
        regards, tom lane



Re: Resource Owner reassign Locks

From
Andres Freund
Date:
On 2015-08-25 14:12:37 -0400, Tom Lane wrote:
> How would they have done that without major code surgery?  We don't have
> any hooks or function pointers involved in the users of resowner.h.
> Certainly locks would not be getting passed to a nonstandard resowner.

CurrentResourceOwner = myresowner;
/* do some op */
...
?

> > Say because they want to
> > perform some operation and then release the locks without finishing the
> > transaction.  Adding a zero argument
> > LockReleaseCurrentOwner()/LockReassignCurrentOwner() wrapper seems like
> > a small enough effort to simply not bother looking for existing callers.
> 
> I agree that a wrapper is possible, but it's not without cost; both as to
> the time required to modify the patch, and as to possibly complicating
> future back-patching because the code becomes gratuitously different in
> the back branches.  I really don't see that a wrapper is appropriate here.

Works for me.



Re: Resource Owner reassign Locks

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2015-08-25 14:12:37 -0400, Tom Lane wrote:
>> How would they have done that without major code surgery?  We don't have
>> any hooks or function pointers involved in the users of resowner.h.
>> Certainly locks would not be getting passed to a nonstandard resowner.

> CurrentResourceOwner = myresowner;
> /* do some op */

Yeah, but so what?  GrantLockLocal does not contain any way that external
code could change the way that a new lock is recorded.

(IOW, yeah, certainly third-party code could create a new *instance* of
the ResourceOwner data structure, but they would not have any knowledge of
what's inside unless they had hacked the core code.)
        regards, tom lane



Re: Resource Owner reassign Locks

From
Andres Freund
Date:
On 2015-08-25 14:33:25 -0400, Tom Lane wrote:
> (IOW, yeah, certainly third-party code could create a new *instance* of
> the ResourceOwner data structure, but they would not have any knowledge of
> what's inside unless they had hacked the core code.)

What I was thinking is that somebody created a new resowner, did
something, and then called LockReleaseCurrentOwner() (because no locks
are needed anymore), or LockReassignCurrentOwner() (say you want to
abort a subtransaction, but do *not* want the locks to be released).

Anyway, I slightly lean towards having wrappers, you strongly against,
so that makes it an easy call.



Re: Resource Owner reassign Locks

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2015-08-25 14:33:25 -0400, Tom Lane wrote:
>> (IOW, yeah, certainly third-party code could create a new *instance* of
>> the ResourceOwner data structure, but they would not have any knowledge of
>> what's inside unless they had hacked the core code.)

> What I was thinking is that somebody created a new resowner, did
> something, and then called LockReleaseCurrentOwner() (because no locks
> are needed anymore), or LockReassignCurrentOwner() (say you want to
> abort a subtransaction, but do *not* want the locks to be released).

> Anyway, I slightly lean towards having wrappers, you strongly against,
> so that makes it an easy call.

Well, I'm not "strongly" against them, just trying to understand whether
there's a plausible argument that someone is calling these functions from
extensions.  I'm not hearing one ... for one thing, I don't believe there
are any extensions playing games with transaction/lock semantics.  (My
Salesforce colleagues have done some of that, and no you can't get far
without changing the core code.)
        regards, tom lane



Re: Resource Owner reassign Locks

From
Tom Lane
Date:
I went ahead and pushed this into 9.2 and 9.1.  It did not apply at all
to 9.0, though, as there evidently was some refactoring affecting
LockReassignCurrentOwner() between 9.0 and 9.1.  We could possibly have
applied some additional patches to 9.0, but I think that would be
stretching the argument that this is well-tested code.
        regards, tom lane