Thread: FOR SHARE vs FOR UPDATE locks
I just realized that we have a bit of a problem with upgrading row locks. Consider the following sequence:

regression=# begin;
BEGIN
regression=# select * from int4_tbl where f1 = 0 for share;
 f1
----
  0
(1 row)

regression=# savepoint x;
SAVEPOINT
regression=# select * from int4_tbl where f1 = 0 for update;
 f1
----
  0
(1 row)

regression=# rollback to x;
ROLLBACK

The FOR UPDATE replaces the former shared row lock with an exclusive lock in the name of the subtransaction. After the ROLLBACK, the row appears not to be locked at all (it is ex-locked with XMAX = a failed transaction), so another backend could come along and modify it. That shouldn't happen --- we should act as though the outer transaction's FOR SHARE lock is still held.

Unfortunately, I don't think there is any good way to implement that, since we surely don't have room in the tuple header to track multiple locks. One possibility is to try to assign the ex-lock in the name of the highest subtransaction holding a row lock, but that seems messy, and it wouldn't really have the correct semantics anyway --- in the above example, the outer transaction would be left holding ex-lock, which would be surprising.

I'm tempted to just error out in this scenario rather than allow the lock upgrade. Thoughts?

			regards, tom lane
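The heart of the problem is that the tuple header has a single xmax slot plus lock-type infomask bits, so an upgrade simply overwrites the previous locker. A toy Python model reproduces the failure; all names and xid values here are invented for illustration, this is not PostgreSQL code:

```python
# Toy model of a heap tuple's single lock slot: one xmax plus a
# shared/exclusive flag.  Illustrative only -- not server code.

class TupleLockSlot:
    def __init__(self):
        self.xmax = None        # xid recorded as the locker
        self.exclusive = False

    def lock_share(self, xid):
        self.xmax = xid
        self.exclusive = False

    def lock_update(self, xid):
        # The FOR UPDATE upgrade overwrites the previous locker.
        self.xmax = xid
        self.exclusive = True

    def appears_locked(self, live_xids):
        # Other backends honor the lock only if xmax is still live.
        return self.xmax in live_xids

outer, subxact = 100, 101
t = TupleLockSlot()
t.lock_share(outer)      # outer xact: SELECT ... FOR SHARE
t.lock_update(subxact)   # subxact:    SELECT ... FOR UPDATE

live = {outer}           # ROLLBACK TO x: the subxact is now aborted
print(t.appears_locked(live))   # False -- the FOR SHARE lock is lost
```

The single slot is exactly why "we surely don't have room in the tuple header to track multiple locks."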
Tom Lane wrote:
> I'm tempted to just error out in this scenario rather than allow the
> lock upgrade. Thoughts?

Although this seems to be a technically hard problem, the above sentence does not sound like the PostgreSQL way to solve problems (rather like MySQL). ;-)

Now seriously, isn't this a perfectly feasible scenario? E.g. the outer transaction acquires a shared lock because of foreign key constraints, and the subtransaction later wants to update that row?

Best Regards
Michael Paesold
Michael Paesold <mpaesold@gmx.at> writes:
> Tom Lane wrote:
>> I'm tempted to just error out in this scenario rather than allow the
>> lock upgrade. Thoughts?

> Although this seems to be a technically hard problem, the above sentence
> does not sound like the PostgreSQL way to solve problems (rather like
> MySQL). ;-)

No, the MySQL way is to let it do something that's easy, fast, and sort of works most of the time ;-)

> Now seriously, isn't this a perfectly feasible scenario? E.g. the outer
> transaction acquires a shared lock because of foreign key constraints, and
> the sub transaction later wants to update that row?

Yeah, it's not implausible. But the only way I can see to implement that is to upgrade the outer xact's shared lock to exclusive, and that doesn't seem real cool either.

More to the point, we are up against a release deadline, and so the only options we have today are for things that are simple, bulletproof, and don't lock us out of doing something smarter later. Hence my suggestion to throw an error.

			regards, tom lane
On Thu, 2006-11-30 at 17:06 -0500, Tom Lane wrote:
> I just realized that we have a bit of a problem with upgrading row
> locks. Consider the following sequence:
>
> regression=# begin;
> BEGIN
> regression=# select * from int4_tbl where f1 = 0 for share;
>  f1
> ----
>   0
> (1 row)
>
> regression=# savepoint x;
> SAVEPOINT
> regression=# select * from int4_tbl where f1 = 0 for update;
>  f1
> ----
>   0
> (1 row)
>
> regression=# rollback to x;
> ROLLBACK
>
> The FOR UPDATE replaces the former shared row lock with an exclusive
> lock in the name of the subtransaction. After the ROLLBACK, the row
> appears not to be locked at all (it is ex-locked with XMAX = a failed
> transaction), so another backend could come along and modify it.
> That shouldn't happen --- we should act as though the outer
> transaction's FOR SHARE lock is still held.
>
> Unfortunately, I don't think there is any good way to implement that,
> since we surely don't have room in the tuple header to track multiple
> locks. One possibility is to try to assign the ex-lock in the name
> of the highest subtransaction holding a row lock, but that seems messy,
> and it wouldn't really have the correct semantics anyway --- in the
> above example, the outer transaction would be left holding ex-lock
> which would be surprising.

ISTM that multitrans could be used here. Two xids, one xmax. Maybe the semantics of that use are slightly different from the normal queueing mechanism, but it seems straightforward enough.

> I'm tempted to just error out in this scenario rather than allow the
> lock upgrade. Thoughts?

This close to release, I'll support you in choosing to just throw an error. This should be fairly rare. Lock upgrades are deadlock-prone anyhow, so not a recommended coding practice, and we would have a valid practical reason for not allowing them (at this time).

It is something to fix later, though: if I did need to do a lock upgrade, I would code it with a savepoint so that deadlocks can be trapped and retried. IMHO the savepoint-related locking semantics aren't documented at all, which is probably why such things have gone so long undetected.

--
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com
Simon Riggs wrote:
> ISTM that multitrans could be used here. Two xids, one xmax.

Hmm, yeah, this seems a reasonable suggestion. The problem is that we don't have a mechanism today for saying "this Xid holds a shared lock, this one holds an exclusive lock". So code-wise it wouldn't be simple to do. It's a single bit per Xid, but I don't see where to store such a thing.

I'm not sure we can use the simple "raise an ERROR" answer though, because for users that would be a regression.

--
Alvaro Herrera                 http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote:
> Simon Riggs wrote:
>
>> ISTM that multitrans could be used here. Two xids, one xmax.
>
> Hmm, yeah, this seems a reasonable suggestion. The problem is that we
> don't have a mechanism today for saying "this Xid holds a shared lock,
> this one holds an exclusive lock". So code-wise it wouldn't be simple
> to do. It's a single bit per Xid, but I don't see where to store such a
> thing.

We could store an extra byte in front of each xid in the multixact member file. That would screw up alignment, though, unless we pad it up to int32, but that would double the space taken by the members file.

Could we make the combination HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_IS_MULTI legal, with the meaning that the last member in the multixact is an exclusive locker, if it's still in-progress?

> I'm not sure we can use the simple "raise an ERROR" answer though,
> because for users that would be a regression.

Yeah, that's ugly. Though it doesn't behave correctly as it is either...

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I'm not sure we can use the simple "raise an ERROR" answer though,
> because for users that would be a regression.

I've reconsidered the idea of upgrading the outer xact's shared lock to exclusive: at first I thought that would be hard to implement correctly, but now I realize it's easy. Just re-use the XID that's in the multixact as the one to store as the exclusive locker, instead of storing our current subxact XID. In some cases this will be a subcommitted XID of the current subxact or a parent, but the locking semantics are the same, and even though we think such an XID is finished, everyone else will see it as still live, so the appearance of its XID in an XMAX field shouldn't be an issue.

So that's what I propose doing.

			regards, tom lane
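In terms of the same kind of toy model (illustrative Python only, not the actual patch), the proposal amounts to keeping the xid already recorded in the tuple when upgrading the lock:

```python
# Toy model of Tom's proposal: on upgrade, re-use the xid that already
# holds the share lock (a live member of our own transaction tree)
# instead of stamping the current subxact's xid.  Illustrative only.

class TupleLockSlot:
    def __init__(self):
        self.xmax = None
        self.exclusive = False

    def lock_share(self, xid):
        self.xmax = xid
        self.exclusive = False

    def lock_update(self, xid):
        if self.xmax is not None:   # re-use the existing locker's xid
            xid = self.xmax
        self.xmax = xid
        self.exclusive = True

outer, subxact = 100, 101
t = TupleLockSlot()
t.lock_share(outer)      # outer xact: FOR SHARE
t.lock_update(subxact)   # subxact:    FOR UPDATE

live = {outer}           # ROLLBACK TO x aborts the subxact
# The lock now survives the rollback, because xmax names a live xid.
print(t.xmax in live)    # True
```

The cost of the trick is visible in the model too: after the rollback the tuple stays exclusive-locked rather than reverting to shared.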
> > I'm not sure we can use the simple "raise an ERROR" answer though,
> > because for users that would be a regression.
>
> I've reconsidered the idea of upgrading the outer xact's shared lock to
> exclusive: at first I thought that would be hard to implement correctly,
> but now I realize it's easy. Just re-use the XID that's in the multixact
> as the one to store as the exclusive locker, instead of storing our
> current subxact XID. In some cases this will be a subcommitted XID of
> the current subxact or a parent, but the locking semantics are the same,
> and even though we think such an XID is finished everyone else will see
> it as still live so the appearance of its XID in an XMAX field shouldn't
> be an issue.

fwiw, I think that is good.

Andreas
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I'm not sure we can use the simple "raise an ERROR" answer though,
>> because for users that would be a regression.
>
> I've reconsidered the idea of upgrading the outer xact's shared lock to
> exclusive: at first I thought that would be hard to implement correctly,
> but now I realize it's easy. Just re-use the XID that's in the multixact
> as the one to store as the exclusive locker, instead of storing our
> current subxact XID. In some cases this will be a subcommitted XID of
> the current subxact or a parent, but the locking semantics are the same,
> and even though we think such an XID is finished everyone else will see
> it as still live so the appearance of its XID in an XMAX field shouldn't
> be an issue.

That way, the lock won't be downgraded back to a shared lock on "rollback to savepoint", right? Though it's still better than throwing an error, I think.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> That way, the lock won't be downgraded back to a shared lock on
> "rollback to savepoint", right? Though it's still better than throwing
> an error, I think.

Correct, a rollback would leave the tuple still exclusive-locked. So not perfect, but it's hard to see how to do better without a whole lot more infrastructure, which the case is probably not worth.

I've just finished coding up the patch --- untested as yet, but anyone see any problems?

			regards, tom lane

*** src/backend/access/heap/heapam.c.orig	Fri Nov 17 13:00:14 2006
--- src/backend/access/heap/heapam.c	Fri Dec  1 12:18:04 2006
***************
*** 2360,2365 ****
--- 2360,2366 ----
  	PageHeader	dp;
  	TransactionId xid;
  	TransactionId xmax;
+ 	TransactionId existing_subxact = InvalidTransactionId;
  	uint16		old_infomask;
  	uint16		new_infomask;
  	LOCKMODE	tuple_lock_type;
***************
*** 2398,2419 ****
  		LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);

  		/*
! 		 * If we wish to acquire share lock, and the tuple is already
! 		 * share-locked by a multixact that includes any subtransaction of the
! 		 * current top transaction, then we effectively hold the desired lock
! 		 * already.  We *must* succeed without trying to take the tuple lock,
! 		 * else we will deadlock against anyone waiting to acquire exclusive
! 		 * lock.  We don't need to make any state changes in this case.
  		 */
! 		if (mode == LockTupleShared &&
! 			(infomask & HEAP_XMAX_IS_MULTI) &&
! 			MultiXactIdIsCurrent((MultiXactId) xwait))
  		{
  			Assert(infomask & HEAP_XMAX_SHARED_LOCK);
! 			/* Probably can't hold tuple lock here, but may as well check */
! 			if (have_tuple_lock)
! 				UnlockTuple(relation, tid, tuple_lock_type);
! 			return HeapTupleMayBeUpdated;
  		}

  		/*
--- 2399,2430 ----
  		LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);

  		/*
! 		 * If the tuple is currently share-locked by a multixact, we have to
! 		 * check whether the multixact includes any live subtransaction of the
! 		 * current top transaction.  If so, then we effectively already hold
! 		 * share-lock, even if that XID isn't the current subxact.  That's
! 		 * because no such subtransaction could be aborted without also
! 		 * aborting the current subtransaction, and so its locks are as good
! 		 * as ours.
  		 */
! 		if (infomask & HEAP_XMAX_IS_MULTI)
  		{
  			Assert(infomask & HEAP_XMAX_SHARED_LOCK);
! 			existing_subxact = MultiXactIdGetCurrent((MultiXactId) xwait);
! 			/*
! 			 * Done if we have share lock and that's what the caller wants.
! 			 * We *must* check this before trying to take the tuple lock, else
! 			 * we will deadlock against anyone waiting to acquire exclusive
! 			 * lock.  We don't need to make any state changes in this case.
! 			 */
! 			if (mode == LockTupleShared &&
! 				TransactionIdIsValid(existing_subxact))
! 			{
! 				/* Probably can't hold tuple lock here, but check anyway */
! 				if (have_tuple_lock)
! 					UnlockTuple(relation, tid, tuple_lock_type);
! 				return HeapTupleMayBeUpdated;
! 			}
  		}

  		/*
***************
*** 2570,2593 ****
  	if (!(old_infomask & (HEAP_XMAX_INVALID |
  						  HEAP_XMAX_COMMITTED |
  						  HEAP_XMAX_IS_MULTI)) &&
- 		(mode == LockTupleShared ?
- 		 (old_infomask & HEAP_IS_LOCKED) :
- 		 (old_infomask & HEAP_XMAX_EXCL_LOCK)) &&
  		TransactionIdIsCurrentTransactionId(xmax))
  	{
! 		LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
! 		/* Probably can't hold tuple lock here, but may as well check */
! 		if (have_tuple_lock)
! 			UnlockTuple(relation, tid, tuple_lock_type);
! 		return HeapTupleMayBeUpdated;
  	}

  	/*
  	 * Compute the new xmax and infomask to store into the tuple.  Note we do
  	 * not modify the tuple just yet, because that would leave it in the wrong
  	 * state if multixact.c elogs.
  	 */
! 	xid = GetCurrentTransactionId();

  	new_infomask = old_infomask & ~(HEAP_XMAX_COMMITTED |
  									HEAP_XMAX_INVALID |
--- 2581,2621 ----
  	if (!(old_infomask & (HEAP_XMAX_INVALID |
  						  HEAP_XMAX_COMMITTED |
  						  HEAP_XMAX_IS_MULTI)) &&
  		TransactionIdIsCurrentTransactionId(xmax))
  	{
! 		/* The tuple is locked by some existing subxact ... */
! 		Assert(old_infomask & HEAP_IS_LOCKED);
! 		existing_subxact = xmax;
! 		/* ... but is it the desired lock type or stronger? */
! 		if (mode == LockTupleShared ||
! 			(old_infomask & HEAP_XMAX_EXCL_LOCK))
! 		{
! 			LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
! 			/* Probably can't hold tuple lock here, but check anyway */
! 			if (have_tuple_lock)
! 				UnlockTuple(relation, tid, tuple_lock_type);
! 			return HeapTupleMayBeUpdated;
! 		}
  	}

  	/*
  	 * Compute the new xmax and infomask to store into the tuple.  Note we do
  	 * not modify the tuple just yet, because that would leave it in the wrong
  	 * state if multixact.c elogs.
+ 	 *
+ 	 * If we are upgrading a shared lock held by another subxact to exclusive,
+ 	 * we need to mark the tuple as exclusively locked by the other subxact
+ 	 * not this one.  Otherwise, a rollback of this subxact would leave the
+ 	 * tuple apparently not locked at all.  We don't have enough
+ 	 * infrastructure to keep track of both types of tuple lock, so we
+ 	 * compromise by making the tuple appear to be exclusive-locked by the
+ 	 * other, possibly longer-lived subxact.  (Again, there are no cases where
+ 	 * a live subxact could be shorter-lived than the current one.)
  	 */
! 	if (TransactionIdIsValid(existing_subxact))
! 		xid = existing_subxact;
! 	else
! 		xid = GetCurrentTransactionId();

  	new_infomask = old_infomask & ~(HEAP_XMAX_COMMITTED |
  									HEAP_XMAX_INVALID |
*** src/backend/access/transam/multixact.c.orig	Fri Nov 17 13:00:15 2006
--- src/backend/access/transam/multixact.c	Fri Dec  1 12:17:57 2006
***************
*** 381,388 ****

  	/*
  	 * Checking for myself is cheap compared to looking in shared memory,
! 	 * so first do the equivalent of MultiXactIdIsCurrent().  This is not
! 	 * needed for correctness, it's just a fast path.
  	 */
  	for (i = 0; i < nmembers; i++)
  	{
--- 381,388 ----

  	/*
  	 * Checking for myself is cheap compared to looking in shared memory,
! 	 * so try that first.  This is not needed for correctness, it's just a
! 	 * fast path.
  	 */
  	for (i = 0; i < nmembers; i++)
  	{
***************
*** 418,437 ****
  }

  /*
!  * MultiXactIdIsCurrent
!  *		Returns true if the current transaction is a member of the MultiXactId.
   *
!  * We return true if any live subtransaction of the current top-level
!  * transaction is a member.  This is appropriate for the same reason that a
!  * lock held by any such subtransaction is globally equivalent to a lock
!  * held by the current subtransaction: no such lock could be released without
!  * aborting this subtransaction, and hence releasing its locks.  So it's not
!  * necessary to add the current subxact to the MultiXact separately.
   */
! bool
! MultiXactIdIsCurrent(MultiXactId multi)
  {
! 	bool		result = false;
  	TransactionId *members;
  	int			nmembers;
  	int			i;
--- 418,437 ----
  }

  /*
!  * MultiXactIdGetCurrent
!  *		If any live subtransaction of the current backend is a member of
!  *		the MultiXactId, return its XID; else return InvalidTransactionId.
   *
!  * If the MXACT contains multiple such subtransactions, it is unspecified
!  * which one is returned.  This doesn't matter in current usage because
!  * heap_lock_tuple takes care not to insert multiple subtransactions of the
!  * same backend into any MXACT.  If need be, we could modify this code to
!  * return the oldest such subxact, or some such rule.
   */
! TransactionId
! MultiXactIdGetCurrent(MultiXactId multi)
  {
! 	TransactionId result = InvalidTransactionId;
  	TransactionId *members;
  	int			nmembers;
  	int			i;
***************
*** 439,451 ****
  	nmembers = GetMultiXactIdMembers(multi, &members);

  	if (nmembers < 0)
! 		return false;

  	for (i = 0; i < nmembers; i++)
  	{
  		if (TransactionIdIsCurrentTransactionId(members[i]))
  		{
! 			result = true;
  			break;
  		}
  	}
--- 439,451 ----
  	nmembers = GetMultiXactIdMembers(multi, &members);

  	if (nmembers < 0)
! 		return result;

  	for (i = 0; i < nmembers; i++)
  	{
  		if (TransactionIdIsCurrentTransactionId(members[i]))
  		{
! 			result = members[i];
  			break;
  		}
  	}
*** src/include/access/multixact.h.orig	Fri Nov 17 13:00:15 2006
--- src/include/access/multixact.h	Fri Dec  1 12:17:49 2006
***************
*** 45,51 ****
  extern MultiXactId MultiXactIdCreate(TransactionId xid1, TransactionId xid2);
  extern MultiXactId MultiXactIdExpand(MultiXactId multi, TransactionId xid);
  extern bool MultiXactIdIsRunning(MultiXactId multi);
! extern bool MultiXactIdIsCurrent(MultiXactId multi);
  extern void MultiXactIdWait(MultiXactId multi);
  extern bool ConditionalMultiXactIdWait(MultiXactId multi);
  extern void MultiXactIdSetOldestMember(void);
--- 45,51 ----
  extern MultiXactId MultiXactIdCreate(TransactionId xid1, TransactionId xid2);
  extern MultiXactId MultiXactIdExpand(MultiXactId multi, TransactionId xid);
  extern bool MultiXactIdIsRunning(MultiXactId multi);
! extern TransactionId MultiXactIdGetCurrent(MultiXactId multi);
  extern void MultiXactIdWait(MultiXactId multi);
  extern bool ConditionalMultiXactIdWait(MultiXactId multi);
  extern void MultiXactIdSetOldestMember(void);
On Fri, 2006-12-01 at 12:18 -0500, Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> > That way, the lock won't be downgraded back to a shared lock on
> > "rollback to savepoint", right? Though it's still better than throwing
> > an error, I think.

So for us non-C programmers out there, may I clarify? If I have a long-running transaction that uses multiple savepoints, and I roll back to a savepoint, will the tuple that was being modified before the rollback still have an exclusive lock?

At what point is the exclusive lock released? When I create a new savepoint? On COMMIT of the entire transaction?

Joshua D. Drake

> Correct, a rollback would leave the tuple still exclusive-locked.
> So not perfect, but it's hard to see how to do better without a whole
> lot more infrastructure, which the case is probably not worth.
>
> I've just finished coding up the patch --- untested as yet, but anyone
> see any problems?
>
> 			regards, tom lane
>
> [patch quoted upthread in full; snipped]

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
"Joshua D. Drake" <jd@commandprompt.com> writes:
> At what point is the exclusive lock released? When I create a new
> savepoint? On COMMIT of the entire transaction?

COMMIT, or rollback to a savepoint before the original shared lock was taken.

			regards, tom lane
Actually ... wait a minute. The proposed hack covers the case of SELECT FOR SHARE followed by SELECT FOR UPDATE within a subtransaction. But what about SELECT FOR SHARE followed by an actual UPDATE (or DELETE)?

We certainly don't want to mark the UPDATE/DELETE as having been carried out by the upper transaction, but there's no way we can record the UPDATE while still remembering the previous share-lock.

So I think I'm back to the position that we should throw an error here.

			regards, tom lane
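The same toy-model reasoning (illustrative Python only, not server code) shows why the trick cannot be extended to a real UPDATE: the updater's xid must abort with the subtransaction, or the rolled-back update would remain in force:

```python
# Whether an update is undone depends on the fate of the xid recorded
# as its performer.  Xids and names here are illustrative only.

outer, subxact = 100, 101
aborted = {subxact}      # ROLLBACK TO x aborts only the subxact

def update_rolled_back(updater_xid):
    return updater_xid in aborted

# Correct attribution: the subxact's update dies with it.
print(update_rolled_back(subxact))   # True

# Borrowing the outer xid, as the lock-upgrade hack does, would make
# the update survive the rollback -- clearly unacceptable.
print(update_rolled_back(outer))     # False
```

So the one xmax slot must simultaneously say "the share lock belongs to the outer xact" and "the update belongs to the subxact", which it cannot.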
Tom Lane wrote:
> Actually ... wait a minute. The proposed hack covers the case of
> SELECT FOR SHARE followed by SELECT FOR UPDATE within a subtransaction.
> But what about SELECT FOR SHARE followed by an actual UPDATE (or DELETE)?
>
> We certainly don't want to mark the UPDATE/DELETE as having been carried
> out by the upper transaction, but there's no way we can record the
> UPDATE while still remembering the previous share-lock.
>
> So I think I'm back to the position that we should throw an error here.

Yeah. Even without a real update, I just figured you can't upgrade a shared lock held by an arbitrarily chosen parent to an exclusive lock. If that subxid aborts, and if any of the parents of that subtransaction held the shared lock, that lock would be released incorrectly. Which is essentially the same problem we began with.

Let's throw an error for now. We have to come back to this in 8.3, I think.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Yeah. Even without a real update, I just figured you can't upgrade a
> shared lock held by an arbitrarily chosen parent to an exclusive lock.
> If that subxid aborts, and if any of the parents of that subtransaction
> held the shared lock, that lock would be released incorrectly. Which is
> essentially the same problem we began with.

Well, there's nothing "arbitrarily chosen" about it, because the lock shown in the tuple would always be the most senior live subtransaction. So I don't think your concern above is a real problem.

Nonetheless, the proposed hack is definitely playing some games with the semantics, and while we might be able to get away with that for the question of "is this lock shared or exclusive", it certainly won't do for an actual update.

> Let's throw an error for now. We have to come back to this in 8.3, I think.

I think it's OK to fix it so that we allow the pre-existing lock to be held by a subtransaction of the current xact, but not a parent.

			regards, tom lane
On Fri, 2006-12-01 at 02:46 -0500, Tom Lane wrote:
> Michael Paesold <mpaesold@gmx.at> writes:
>> Now seriously, isn't this a perfectly feasible scenario? E.g. the outer
>> transaction acquires a shared lock because of foreign key constraints, and
>> the sub transaction later wants to update that row?
>
> Yeah, it's not implausible. But the only way I can see to implement
> that is to upgrade the outer xact's shared lock to exclusive, and that
> doesn't seem real cool either.

If it's a plausible enough sequence of events, is it worth adding a note to the "migration" section of the release notes?

Regards,
	Jeff Davis
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Let's throw an error for now. We have to come back to this in 8.3, I think.

After further thought I think we should also seriously consider plan C: do nothing for now. We now realize that there have been related bugs since 8.0, namely that

	begin;
	select some rows for update;
	savepoint x;
	update the same rows;
	rollback to x;

leaves the tuple(s) not locked. The lack of complaints about this from the field suggests that this isn't a huge problem in practice. If we do make it throw an error I'm afraid that we will break applications that aren't having a problem at the moment.

I'm also realizing that a fix along the throw-an-error line is nontrivial, eg, HeapTupleSatisfiesUpdate would need another return code.

So at this point we are facing three options:
- throw in a large and poorly tested "fix" at the last moment;
- postpone 8.2 until we can think of a real fix, which might
  be a major undertaking;
- ship 8.2 with the same behavior 8.0 and 8.1 had.

None of these are very attractive, but I'm starting to think the last is the least bad.

			regards, tom lane
On Fri, 2006-12-01 at 13:46 -0500, Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> > Let's throw an error for now. We have to come back to this in 8.3, I think.
>
> After further thought I think we should also seriously consider plan C:
> do nothing for now.  We now realize that there have been related bugs
> since 8.0, namely that
>
> 	begin;
> 	select some rows for update;
> 	savepoint x;
> 	update the same rows;
> 	rollback to x;
>
> leaves the tuple(s) not locked.  The lack of complaints about this from
> the field suggests that this isn't a huge problem in practice.  If we
> do make it throw an error I'm afraid that we will break applications
> that aren't having a problem at the moment.
>
> I'm also realizing that a fix along the throw-an-error line is
> nontrivial, eg, HeapTupleSatisfiesUpdate would need another return code.
>
> So at this point we are facing three options:
> 	- throw in a large and poorly tested "fix" at the last moment;
> 	- postpone 8.2 until we can think of a real fix, which might
> 	  be a major undertaking;
> 	- ship 8.2 with the same behavior 8.0 and 8.1 had.
> None of these are very attractive, but I'm starting to think the last
> is the least bad.

/me struggles... IMHO:

option 2 is the correct option, but the least favorable.
option 1 is probably bad.
option 3 is the lesser of the evils if we document it loudly.

Joshua D. Drake
Tom,

> So at this point we are facing three options:
> 	- throw in a large and poorly tested "fix" at the last moment;
> 	- postpone 8.2 until we can think of a real fix, which might
> 	  be a major undertaking;
> 	- ship 8.2 with the same behavior 8.0 and 8.1 had.
> None of these are very attractive, but I'm starting to think the last
> is the least bad.

Yes.  If it was earlier in the beta cycle I'd say no, but frankly this
behavior has existed for two years without examples of real-life data
loss.  Further, the TPC tests, which are supposed to give ACID properties
a workout, would not break this, so the industry doesn't consider it very
important either.

So, I think it needs to go on the list for 8.2.1 or 8.3 (depending on what
changes the fix requires) but I don't think we should hold up the release.
As PR maven, though, you know I'm biased about the release date.

I would suggest a last-minute doc patch documenting the behavior and
suggesting that locks should always be declared in the outer transaction
prior to any savepoints.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
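[Editor's note: a sketch of the coding rule Josh is suggesting we document,
with invented table names: take whatever row lock you need in the outer
transaction, before the first savepoint, and don't re-lock or modify those
rows inside subtransactions — then a subtransaction abort never disturbs
the lock.]

	begin;
	select * from foo where id = 1 for update;  -- lock taken by outer xact
	savepoint x;
	insert into foo_log values (1, now());      -- subxact leaves the
	rollback to x;                              -- locked row alone
	-- the row lock was never touched by the subtransaction,
	-- so it is still held here
	commit;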
On Fri, 1 Dec 2006, Tom Lane wrote: > > So at this point we are facing three options: > - throw in a large and poorly tested "fix" at the last moment; > - postpone 8.2 until we can think of a real fix, which might > be a major undertaking; > - ship 8.2 with the same behavior 8.0 and 8.1 had. > None of these are very attractive, but I'm starting to think the last > is the least bad. > > regards, tom lane I'd go with that last option; it's important to get this release out now, I think, as it has a lot of value add, and people get it that things aren't always perfect. I do, however, feel that the "real fix" is vital, whenever it can occur. It's attention to detail like this that elevates this group from good to great. Richard -- Richard Troy, Chief Scientist Science Tools Corporation 510-924-1363 or 202-747-1263 rtroy@ScienceTools.com, http://ScienceTools.com/
Josh Berkus <josh@agliodbs.com> writes: > So, I think it needs to go on the list for 8.2.1 or 8.3 (depending on what > changes the fix requires) but I don't think we should hold up the release. That's pretty much my feeling as well. The thing is that postponing 8.2 any further will deprive users of a lot of good stuff, in order to fix a problem that apparently isn't biting anyone in the field. And it's not clear that we can fix this on a shorter-than-8.3-ish timescale anyway. The only obvious solution involves adding another header field, which I'm sure is not very acceptable. regards, tom lane
Josh Berkus wrote:
> Tom,
>
> > So at this point we are facing three options:
> > 	- throw in a large and poorly tested "fix" at the last moment;
> > 	- postpone 8.2 until we can think of a real fix, which might
> > 	  be a major undertaking;
> > 	- ship 8.2 with the same behavior 8.0 and 8.1 had.
> > None of these are very attractive, but I'm starting to think the last
> > is the least bad.
>
> Yes.  If it was earlier in the beta cycle I'd say no, but frankly this
> behavior has existed for two years without examples of real-life data
> loss.  Further, the TPC tests, which are supposed to give ACID properties
> a workout, would not break this, so the industry doesn't consider it very
> important either.
>
> So, I think it needs to go on the list for 8.2.1 or 8.3 (depending on what
> changes the fix requires) but I don't think we should hold up the release.

We cannot add something this major in a minor release --- it would have
to be 8.3.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > Josh Berkus wrote: >> So, I think it needs to go on the list for 8.2.1 or 8.3 (depending on what >> changes the fix requires) but I don't think we should hold up the release. > We cannot add something this major in a minor release --- it would have > to be 8.3. If someone thinks of a brilliant solution that doesn't change on-disk layout, maybe we could implement it in 8.2.x, but right now I'm not feeling hopeful about that. The best idea I have at the moment is that we might be able to do something as part of the proposed plan to fold cmin/cmax into a single field. The thought there was that there could be some in-memory state for tuples that had been modified multiple times by a single xact --- perhaps that could be extended to cover this problem. This is just handwaving at the moment though. In particular, is-the-tuple-locked-or-not seems like it has to be externally visible state, so maybe it can't work. regards, tom lane
On 12/1/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> > Let's throw an error for now. We have to come back to this in 8.3, I think.
>
> After further thought I think we should also seriously consider plan C:
> do nothing for now.  We now realize that there have been related bugs
> since 8.0, namely that
>
> 	begin;
> 	select some rows for update;
> 	savepoint x;
> 	update the same rows;
> 	rollback to x;
>
> leaves the tuple(s) not locked.  The lack of complaints about this from
> the field suggests that this isn't a huge problem in practice.  If we
> do make it throw an error I'm afraid that we will break applications
> that aren't having a problem at the moment.

imo, the most likely scenario would be a begin/exception/end block in
pl/pgsql. i would venture to guess that very little true savepointing
happens in practice. maybe add a little note of caution to the pl/pgsql
error handling documentation?

merlin
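[Editor's note: to spell out Merlin's scenario — every pl/pgsql
BEGIN ... EXCEPTION block runs inside an implicit savepoint, so a function
like this hypothetical one (table, column, and function names invented)
hits exactly the sequence under discussion: the UPDATE upgrades the lock
taken by the earlier query, and the exception handler rolls the
subtransaction back.]

	-- assumes a table accounts(id int, balance numeric)
	create or replace function touch_account(acct int) returns void as $$
	begin
	    -- shared row lock taken outside the exception block
	    perform 1 from accounts where id = acct for share;
	    begin
	        -- implicit savepoint starts here
	        update accounts set balance = balance + 1 where id = acct;
	        raise exception 'force a rollback';
	    exception when others then
	        -- subtransaction aborted: the lock upgrade is rolled back
	        -- and the row is left effectively unlocked
	        null;
	    end;
	    -- the function goes on believing the FOR SHARE lock is held
	end;
	$$ language plpgsql;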
On Fri, 2006-12-01 at 13:46 -0500, Tom Lane wrote:
> I'm also realizing that a fix along the throw-an-error line is
> nontrivial, eg, HeapTupleSatisfiesUpdate would need another return code.

Yes, that's starting to get hairy. The fix could easily break something
else in another corner of MVCC.

> So at this point we are facing three options:
> 	- throw in a large and poorly tested "fix" at the last moment;
> 	- postpone 8.2 until we can think of a real fix, which might
> 	  be a major undertaking;
> 	- ship 8.2 with the same behavior 8.0 and 8.1 had.
> None of these are very attractive, but I'm starting to think the last
> is the least bad.

The functionality in this area isn't yet complete anyway; we still have
locking in the partitioned table case to consider. It's not that bad
just to leave it as is. So the last option gets my vote.

--
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > The functionality in this area isn't yet complete anyway; we still have > locking in the partitioned table case to consider. Hm? What does partitioning have to do with it? regards, tom lane
"Merlin Moncure" <mmoncure@gmail.com> writes: > imo, the most likely scenario would be a begin/exception/end block in > pg/sql. i would venture to guess that very little true savepointing > happens in practice. maybe add a little note of caution pg/sql error > handling documentation? I mentioned exception blocks as a risk factor, but I think the right place to document it is under FOR UPDATE/SHARE: http://developer.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/ref/select.sgml.diff?r1=1.93;r2=1.94 regards, tom lane
On Fri, 2006-12-01 at 15:52 -0500, Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > > The functionality in this area isn't yet complete anyway; we still have > > locking in the partitioned table case to consider. > > Hm? What does partitioning have to do with it? SELECT FOR UPDATE/SHARE is not supported for inheritance queries. My point was that the implementation of row locking is not yet complete, so the slight wrinkle around lock upgrading is not a solitary eyesore. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Fri, 2006-12-01 at 15:52 -0500, Tom Lane wrote: >> Hm? What does partitioning have to do with it? > SELECT FOR UPDATE/SHARE is not supported for inheritance queries. True, but that's a planner/executor issue not a question of the fundamental representation of the state on-disk. (I have some ideas about how to fix that one.) regards, tom lane
On Dec 1, 2006, at 10:46 AM, Tom Lane wrote:
> If we
> do make it throw an error I'm afraid that we will break applications
> that aren't having a problem at the moment.

What about throwing a warning? Shouldn't break anything, but at least
then anyone who's experiencing this and has just gotten lucky so far
will have a better idea that it's happening.

As for possibly using the in-memory store of multiple CIDs affecting a
tuple, could that not work if that store contained enough information
to 'rollback' the lock to its original state when restoring to a
savepoint? AFAIK other backends would only need to know what the
current lock being held was; they wouldn't need to know the history of
it themselves...
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
Jim Nasby <decibel@decibel.org> writes: > As for possibly using the in-memory store of multiple CIDs affecting > a tuple, could that not work if that store contained enough > information to 'rollback' the lock to it's original state when > restoring to a savepoint? AFAIK other backends would only need to > know what the current lock being held was, they wouldn't need to know > the history of it themselves... One of the interesting problems is that if you upgrade shared lock to exclusive and then want to roll that back, you might need to un-block other processes that tried to acquire share lock after you acquired exclusive. We have no way to do that in the current implementation. (Any such processes will be blocked on your transaction ID lock, which you can't release without possibly unblocking the wrong processes.) regards, tom lane
Added to TODO:

	* Fix problem when multiple subtransactions of the same outer
	  transaction hold different types of locks, and one subtransaction
	  aborts

	  http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php
	  http://archives.postgresql.org/pgsql-hackers/2006-12/msg00001.php

---------------------------------------------------------------------------

Tom Lane wrote:
> Jim Nasby <decibel@decibel.org> writes:
> > As for possibly using the in-memory store of multiple CIDs affecting
> > a tuple, could that not work if that store contained enough
> > information to 'rollback' the lock to its original state when
> > restoring to a savepoint? AFAIK other backends would only need to
> > know what the current lock being held was, they wouldn't need to know
> > the history of it themselves...
>
> One of the interesting problems is that if you upgrade shared lock to
> exclusive and then want to roll that back, you might need to un-block
> other processes that tried to acquire share lock after you acquired
> exclusive.  We have no way to do that in the current implementation.
> (Any such processes will be blocked on your transaction ID lock, which
> you can't release without possibly unblocking the wrong processes.)
>
> 			regards, tom lane