Re: bug in fast-path locking - Mailing list pgsql-hackers

From Robert Haas
Subject Re: bug in fast-path locking
Date
Msg-id CA+TgmoboxRNM=BAPPKcBDL-Jpm1301ijQr0aA_pkNTNZeuB6aQ@mail.gmail.com
Whole thread Raw
In response to Re: bug in fast-path locking  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: bug in fast-path locking  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Apr 9, 2012 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I looked at this more.  The above analysis is basically correct, but
>> the problem goes a bit beyond an error in LockWaitCancel().  We could
>> also crap out with an error before getting as far as LockWaitCancel()
>> and have the same problem.  I think that a correct statement of the
>> problem is this: from the time we bump the strong lock count, up until
>> the time we're done acquiring the lock (or give up on acquiring it),
>> we need to have an error-cleanup hook in place that will unbump the
>> strong lock count if we error out.   Once we're done updating the
>> shared and local lock tables, the special handling ceases to be
>> needed, because any subsequent lock release will go through
>> LockRelease() or LockReleaseAll(), which will do the appropriate
>> clenaup.
>
> Haven't looked at the code, but maybe it'd be better to not bump the
> strong lock count in the first place until the final step of updating
> the lock tables?

Well, unfortunately, that would break the entire mechanism.  The idea
is that we bump the strong lock count first.  That prevents anyone
from taking any more fast-path locks on the target relation.  Then, we
go through and find any existing fast-path locks that have already
been taken, and turn them into regular locks.  Finally, we resolve the
actual lock request and either grant the lock or block, depending on
whether conflicts exist.  So there's some necessary separation between
the action of bumping the strong lock count and updating the lock
tables; the entire mechanism relies on being able to do non-trivial
processing in between.  I thought that I had nailed down the error
exit cases in the original patch, but this test case, and some code
reading with fresh eyes, shows that I didn't do half so good a job as
I had thought.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Revisiting extract(epoch from timestamp)
Next
From: Andrew Dunstan
Date:
Subject: Re: why was the VAR 'optind' never changed in initdb?