Thread: Referential Integrity and SHARE locks
I'm reading the SQL Standard and I can't find anywhere that says that we
need to place SHARE locks on rows in the referenced table.
RI_FKey_check() clearly does that. What I do see is this:

"4. For each row of the referenced table, its matching rows, unique
matching rows, and non-unique matching rows are determined immediately
prior to the execution of any <SQL procedure statement>. No new matching
rows are added during the execution of that <SQL procedure statement>.
The association between a referenced row and a non-unique matching row
is dropped during the execution of that SQL-statement if the referenced
row is either marked for deletion or updated to a distinct value on any
referenced column that corresponds to a non-null referencing column.
This occurs immediately after such a mark for deletion or update of the
referenced row. Unique matching rows and non-unique matching rows for a
referenced row are evaluated immediately after dropping the association
between that referenced row and a non-unique matching row."

(under General Rules for <referential constraint definition>)

That explicitly says "are determined immediately prior to the
execution". To me, that implies that a Read Committed snapshot is
sufficient to read referenced rows and that no lock is required.

Why do we need a SHARE lock at all, on the **referenc(ed)** table?

It sounds like if we don't put a SHARE lock on the referenced table then
we can end the transaction in an inconsistent state if the referenced
table has concurrent UPDATEs or DELETEs. BUT those operations do impose
locking rules back onto the referencing tables that would not be granted
until after any changes to the referencing table complete, whereupon
they would restrict or cascade. So an inconsistent state doesn't seem
possible to me.

What am I missing?

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Fri, 2007-02-02 at 10:51, Simon Riggs wrote:
[snip]
> Why do we need a SHARE lock at all, on the **referenc(ed)** table?
>
> It sounds like if we don't put a SHARE lock on the referenced table then
> we can end the transaction in an inconsistent state if the referenced
> table has concurrent UPDATEs or DELETEs. BUT those operations do impose
> locking rules back onto the referencing tables that would not be granted
> until after any changes to the referencing table complete, whereupon
> they would restrict or cascade. So an inconsistent state doesn't seem
> possible to me.
>
> What am I missing?

Well, here we do have a patch (deployed on production servers) which
does not put the shared lock on the referenced table, and it
occasionally lets rows into the referencing tables which do not have
parent rows in the referenced table. I'm not sure what the mechanism is,
but it does happen, I can assure you. It happens rarely enough that it
is not disturbing for us, compared to the deadlocks which happen without
the patch - but that's another matter...

In our application we never update any key ids, so only deletes/inserts
come into play, and I guess it happens when a referenced row is deleted
just after a newly inserted child row checks that the parent row exists
but before the child row is really inserted. Or something like that...

Cheers,
Csaba.
Csaba Nagy wrote: > On Fri, 2007-02-02 at 10:51, Simon Riggs wrote: > [snip] >> Why do we need a SHARE lock at all, on the **referenc(ed)** table? > Well, here we do have a patch (deployed on production servers) which > does not put the shared lock on the referenced table, and it lets in > occasionally rows in the referencing tables which do not have parent > rows in the referenced table. I'm not sure what is the mechanism, but it > does happen, I can assure you. It happens rare enough that is not > disturbing for us, compared to the deadlocks which happen without the > patch - that's another matter... You say below the cut that you're not updating keys, so presumably it's other columns. Which leads me to something I've wondered for a while - why do we lock the whole row? Is it just a matter of "not optimised that yet" or is there a good reason why locking just some columns isn't practical. -- Richard Huxton Archonet Ltd
> You say below the cut that you're not updating keys, so presumably it's
> other columns. Which leads me to something I've wondered for a while -
> why do we lock the whole row? Is it just a matter of "not optimised that
> yet" or is there a good reason why locking just some columns isn't
> practical.

For the conditions that generate the deadlock, see:

http://archives.postgresql.org/pgsql-general/2006-12/msg00029.php

The reason for the occasional orphan rows is not completely clear to me,
but it must be some kind of race condition while concurrently
inserting/deleting/?updating the parent/child tables.

Cheers,
Csaba.
Csaba Nagy wrote:
> The reason of the occasional orphan rows is not completely clear to me,
> but it must be some kind of race condition while
> inserting/deleting/?updating concurrently the parent/child tables.

I guess the following sequence would generate an orphaned row:

A: executes "insert into table_child parent_id=1"
B: executes "delete from table_parent where id=1"
A: RI trigger checks for matching row in table_parent
B: The row with id=1 is marked as deleted in table_parent
A: The new row with parent_id=1 is inserted into table_child
B: The delete is committed
A: The insert is committed.

Any ordering that marks the row as deleted between the execution of the
RI trigger and the insertion of the new row would lead to the same
result.

greetings, Florian Pflug
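[Editorial note: the interleaving above can be replayed as a toy,
single-threaded sketch. This is purely illustrative Python - in-memory
sets stand in for tables, and the table/column names are taken from the
example above, not from any real schema.]

```python
# Replay of the check-then-insert race: without a SHARE lock on the
# parent row, the RI check and the concurrent DELETE do not conflict,
# so the child insert slips through and leaves an orphan.

parent = {1}   # committed parent ids (stand-in for table_parent)
child = []     # committed child rows' parent_id values (table_child)

# A: RI trigger checks for a matching row in table_parent - it sees id=1.
ri_check_ok = 1 in parent

# B: DELETE FROM table_parent WHERE id=1; B commits.
# Nothing stops B, because A took no lock on the parent row.
parent.discard(1)

# A: the insert proceeds on the strength of the earlier check, and commits.
if ri_check_ok:
    child.append(1)

orphans = [pid for pid in child if pid not in parent]
print(orphans)  # → [1]: a child row whose parent no longer exists
```

Run the two middle steps in the opposite order and the RI check fails,
which is why the window only opens under this particular interleaving.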
On Fri, 2 Feb 2007, Simon Riggs wrote:

> It sounds like if we don't put a SHARE lock on the referenced table then
> we can end the transaction in an inconsistent state if the referenced
> table has concurrent UPDATEs or DELETEs. BUT those operations do impose
> locking rules back onto the referencing tables that would not be granted
> until after any changes to the referencing table complete, whereupon
> they would restrict or cascade. So an inconsistent state doesn't seem
> possible to me.

What locking back to the referencing table are you thinking about? The
row locks are insufficient because they don't prevent an insert of a new
row that matches the criteria previously locked against, AFAIK.
On Fri, 2007-02-02 at 12:01 +0100, Csaba Nagy wrote:
> For the conditions of generating the deadlock, see:
>
> http://archives.postgresql.org/pgsql-general/2006-12/msg00029.php
>
> The reason of the occasional orphan rows is not completely clear to me,
> but it must be some kind of race condition while
> inserting/deleting/?updating concurrently the parent/child tables.

Thanks very much to Csaba, Richard and Florian for insight on this.

As you might have guessed across my recent posts, I'm coping with a
locking problem that is occurring on RI checks. Similarly to Csaba's
example, this is a port from Oracle.

My earlier thinking was that Oracle appears to be able to avoid locking,
and my thought was that this was simply a rather dodgy interpretation of
the SQL Standard. Anyway, I'm not happy with simply forgetting the SHARE
lock; that clearly leads to invalid states in some cases, even if I need
to have a strong cup of coffee in the morning before I see them.

Using SELECT ... FOR SHARE in RI checks is better than using FOR UPDATE,
but it's still too heavy a lock for many scenarios. In particular, an
INSERT on the referencing table can be blocked because of an RI check
against the referenced table, when there is a concurrent UPDATE (on the
referenced table). So RI works well when the referenced tables are
mostly read-only, but we see lots of problems when we perform regular
updates against both referencing and referenced tables.

When we perform an INSERT into the referencing table, we want to be
certain that the FK value we are inserting still exists within the
referenced table.
A DELETE on the referenced table should prevent the INSERT, as should an
UPDATE which changes the Primary Key. However, an UPDATE which simply
updates a non-PK column on the referenced table should neither block nor
prevent the INSERT on the referencing table.

We might think of using SELECT ... FOR SHARE NOWAIT, but that will
return an ERROR. Even if we wrap the request in a subtransaction the
query will still fail even when a permissible non-PK UPDATE is taking
place, so that alone is not good enough. Various other thoughts about
new lock modes yield nothing useful either, after close analysis. So
changing the lock *strength* is not the right thing to do, but changing
the lock *footprint* does seem worthwhile.

My initial proposal is to change the way write locking works, so as to
implement simplified column-level locking. Rather than locking each
column individually, we can lock the columns in one of two groups, plus
the full row. Thus we have three types of write lock:

1. full row write lock

as well as two mutually exclusive groups of columns:

2a) PK cols
2b) all columns apart from the PK cols

So type (1) conflicts with either (2a) or (2b); (2a) and (2b) do not
conflict with one another. Shared and write locks conflict as before at
the various levels.

INSERT, DELETE - full row write lock

UPDATE - will place a write lock on either the full row or
all-non-PK-cols, depending upon whether the SET clause touches the PK
columns. (So you cannot UPDATE the PK while the non-PK cols are being
UPDATEd...) If no FK references this table, we will always take a full
row write lock.

SELECT FOR UPDATE - full row write lock

SELECT FOR SHARE - will place a full row lock by default, but in cases
where the SELECT doesn't reference anything apart from the PK columns,
we would place the lock only on the PK cols. (Might be easier to do this
only for RI check queries.)
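[Editorial note: a minimal sketch of the conflict rule for the three
write-lock footprints proposed above, in illustrative Python. Only the
footprint overlap logic is modelled; share-vs-exclusive semantics and
the on-disk encoding are deliberately left out.]

```python
# The three write-lock footprints: a full-row lock overlaps everything;
# the PK and non-PK column groups are disjoint sets of columns, so they
# never overlap each other.

FULL_ROW, PK_COLS, NON_PK_COLS = "full_row", "pk_cols", "non_pk_cols"

def conflicts(a: str, b: str) -> bool:
    """Two write locks conflict iff their column footprints overlap."""
    if FULL_ROW in (a, b):
        return True          # full row overlaps either column group
    return a == b            # same group -> same columns -> conflict

# DELETE (full row) vs an RI check's lock on the PK cols: blocks, as desired.
assert conflicts(FULL_ROW, PK_COLS)
# non-PK UPDATE vs an RI check's lock on the PK cols: no conflict - the win.
assert not conflicts(NON_PK_COLS, PK_COLS)
```

Under this rule a concurrent non-PK UPDATE and an RI check proceed in
parallel, while DELETEs and PK UPDATEs still wait, which is exactly the
behaviour the proposal is after.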
With this model, an INSERT or FK UPDATE on the referencing table that
generates a SELECT FOR SHARE onto the referenced table will conflict
with a DELETE or a PK UPDATE, but won't conflict at all with a non-PK
UPDATE.

This would be possible by using 2 additional tuple info bits to flag
that the lock held was either HEAP_LOCKED_PK_ONLY or
HEAP_LOCKED_NON_PK_COLS. When both lock types are held simultaneously we
will replace xmax with a multitransaction id, where we hold only two
transaction ids at most - the first Xid is the holder of the PK cols
lock, the second Xid is the holder of the non-PK cols lock. As a non-PK
UPDATE is carried out, the details of any share locks on the PK cols are
carried forward onto the new row version. So all of the machinery we
need is already in place; we just need to extend it somewhat.

Clearly an 8.4+ item, but seems worth getting onto the TODO list if we
agree to it:

TODO: "Develop non-conflicting locking scheme to allow RI checks to
co-exist peacefully with non-PK UPDATEs on the referenced table".

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Fri, 2007-02-02 at 10:35 -0800, Stephan Szabo wrote: > On Fri, 2 Feb 2007, Simon Riggs wrote: > > > It sounds like if we don't put a SHARE lock on the referenced table then > > we can end the transaction in an inconsistent state if the referenced > > table has concurrent UPDATEs or DELETEs. BUT those operations do impose > > locking rules back onto the referencing tables that would not be granted > > until after any changes to the referencing table complete, whereupon > > they would restrict or cascade. So an inconsistent state doesn't seem > > possible to me. > > What locking back to the referencing table are you thinking about? The row > locks are insufficient because that doesn't prevent an insert of a > new row that matches the criteria previously locked against AFAIK. Probably best to read the later posts; this one was at the beginning of my thought train, so is slightly off track, as later posters remind me. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > Thus we have three types of write lock: > 1. full row write lock > as well as two mutually exclusive groups of columns: > 2.a) PK cols > 2.b) all columns apart from the PK cols This appears to require that we add enough fields to row headers to represent *three* locking transactions instead of the current one. Given the amount of griping about row header overhead that normally flows across this list, I can't see such a proposal getting accepted. > This would be possible by using 2 additional tuple info bits to flag > that the lock held was either HEAP_LOCKED_PK_ONLY or > HEAP_LOCKED_NON_PK_COLS. When both lock types are held simultaneously we > will replace xmax with a multitransaction id, where we hold only two > transactionids at most - the first Xid is the holder of the PK cols > lock, the second Xid is the holder of the non-PK cols lock. You haven't thought that through: it fails to distinguish who holds which lock (mxact membership is not ordered), and it doesn't scale to more than two holders, and I don't think it works for combinations of share and exclusive lock. Also, what happened to the third type of lock? Implementation details aside, I'm a tad concerned about introducing deadlock failures that did not happen before, in scenarios where transactions touch the row multiple times and end up needing to acquire one of these locks while already holding another. regards, tom lane
On Fri, 2007-02-02 at 15:57 -0500, Tom Lane wrote: > "Simon Riggs" <simon@2ndquadrant.com> writes: > > Thus we have three types of write lock: > > 1. full row write lock > > as well as two mutually exclusive groups of columns: > > 2.a) PK cols > > 2.b) all columns apart from the PK cols > > This appears to require that we add enough fields to row headers to > represent *three* locking transactions instead of the current one. > Given the amount of griping about row header overhead that normally > flows across this list, I can't see such a proposal getting accepted. Yeh, I said "ouch" myself. > > This would be possible by using 2 additional tuple info bits to flag > > that the lock held was either HEAP_LOCKED_PK_ONLY or > > HEAP_LOCKED_NON_PK_COLS. When both lock types are held simultaneously we > > will replace xmax with a multitransaction id, where we hold only two > > transactionids at most - the first Xid is the holder of the PK cols > > lock, the second Xid is the holder of the non-PK cols lock. > > You haven't thought that through: it fails to distinguish who holds > which lock (mxact membership is not ordered) OK, sorry, I thought there was some ordering possible. > , and it doesn't scale to > more than two holders, and I don't think it works for combinations of > share and exclusive lock. Also, what happened to the third type of lock? Well, we just need to record the maximum two lock holders (given the semantics described). The third lock type is both locks at once. > Implementation details aside, I'm a tad concerned about introducing > deadlock failures that did not happen before, in scenarios where > transactions touch the row multiple times and end up needing to > acquire one of these locks while already holding another. Well, right now we have deadlocks and lots of locking. Updating PKs isn't a regular occurence, so I'd rather swap a common deadlock for an uncommon one, any day. 
Anyway, implementation aside, I wanted to agree the overall TODO, so we can think through the best way over a long period, if you agree in general. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes: > On Fri, 2007-02-02 at 15:57 -0500, Tom Lane wrote: >> , and it doesn't scale to >> more than two holders, and I don't think it works for combinations of >> share and exclusive lock. Also, what happened to the third type of lock? > Well, we just need to record the maximum two lock holders (given the > semantics described). The third lock type is both locks at once. You're not going to support shared locks? That will be a big step backwards ... > Anyway, implementation aside, I wanted to agree the overall TODO, so we > can think through the best way over a long period, if you agree in > general. No, I don't. I think knowledge of which columns are in a PK is quite a few levels away from the semantics of row locking. To point out just one problem, what happens when you add or drop a PK? Or drop and replace with a different column set? Yes, I know dropping one requires exclusive lock on the table, but the transaction doing it could hold row locks within the table, and now it's very unclear what they mean. regards, tom lane
Simon Riggs wrote:
> My earlier thinking was that Oracle appears to be able to avoid locking
> and my thought was that this was simply a rather dodgy interpretation of
> the SQL Standard. Anyway, I'm not happy with simply forgetting the SHARE
> lock; that clearly leads to invalid states in some cases, even if I need
> to have a strong cup of coffee in the morning before I see them.

I think Oracle is in a completely different situation here - Oracle
imposes limits on the maximum "size" of a transaction due to various
reasons, I believe - one being the size of the rollback segment. AFAIK,
postgres doesn't impose any such limits (apart from problems with
long-running transactions and vacuums, and possibly if you do "set
constraints all deferred") - which is why row locks have to be stored
on-disk in the tuple header, and not in some shared-memory segment.

Now, _if_ you're already imposing limits on transaction size, then it
becomes quite feasible IMHO to also limit the number of row locks a
transaction can take - and to just store them in memory. This again
makes column-level locking much easier, I'd think.

> Using SELECT ... FOR SHARE in RI checks is better than using FOR UPDATE,
> but it's still too heavy a lock for many scenarios.

I think it's not too heavy, but it's actually the wrong kind of lock for
RI checks. Both "SHARE" and "EXCLUSIVE" row locks are, well, _row_ locks
- they lock a specific tuple. This is fine if you want to guarantee that
a certain tuple stays as it is for as long as you still need it. But
it's not really what an RI constraint wants to ensure. An RI constraint
actually wants to force a specific _condition_ (namely, the existence of
a row with a certain value in a certain column) to be true, not prevent
a specific physical tuple from being modified.
Now, a generic mechanism for "condition locking" is probably impossible
to implement without large performance sacrifices - but AFAICS the cases
needed for RI checks are always of the form "Is there a row where
(field1, field2, ...) = (a, b, c, ...)?". And - at least for RI checks
done when updating the _referencing_ table - postgres already forces an
index to exist on (field1, field2, ...), I think.

The condition "There is a row where (field1, field2, ...) = (a, b, c,
...)" is the same as saying "There is an index entry for (a, b, c, ...)
that points to a live row". _If_ it was possible to somehow take a lock
on an index key (not a certain index entry, but rather _all_ entries
with a given key), then that could maybe be used for more efficient RI
locks. I guess this would need some sort of tuple visibility in index
entries, but it seems that there are a few people interested in making
this happen.

greetings, Florian Pflug
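[Editorial note: a sketch of this key-level "condition lock" idea,
assuming a hypothetical in-memory lock table keyed by (index name, key
value). None of these names or structures correspond to actual
PostgreSQL internals; the point is only to show how an RI check could
pin a key value while non-key UPDATEs of the same heap row never
consult the lock at all.]

```python
from collections import defaultdict

# (index_name, key tuple) -> set of transaction ids holding a share lock
# on that key value, i.e. on *all* index entries with that key.
key_share_locks = defaultdict(set)

def share_lock_key(index, key, xid):
    """RI check: pin 'a live row with this key exists'."""
    key_share_locks[(index, key)].add(xid)

def can_delete_key(index, key, xid):
    """A DELETE (or key-changing UPDATE) of the row behind `key` must
    wait while any *other* transaction share-locks that key value."""
    holders = key_share_locks[(index, key)] - {xid}
    return not holders

# RI check in xid 100 pins parent key (1,); xid 200's DELETE must wait,
# but a non-key UPDATE by xid 200 never touches the key lock at all.
share_lock_key("parent_pkey", (1,), xid=100)
assert not can_delete_key("parent_pkey", (1,), xid=200)
```

The design choice this illustrates: the lock attaches to the key value
in the unique index rather than to a heap tuple, so it survives non-key
row updates without any of the forwarding machinery row locks need.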
On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote:
> You're not going to support shared locks? That will be a big step
> backwards ...

I did say that shared locks were supported also. The lack of ordering of
multitransactions is a hole in my suggestion, so I need to reconsider.

> No, I don't. I think knowledge of which columns are in a PK is quite a
> few levels away from the semantics of row locking. To point out just
> one problem, what happens when you add or drop a PK? Or drop and
> replace with a different column set? Yes, I know dropping one requires
> exclusive lock on the table, but the transaction doing it could hold row
> locks within the table, and now it's very unclear what they mean.

There are issues, yes. Dropping PKs is a very irregular occurrence, nor
is it likely to be part of a complex transaction. It wouldn't bother me
to say that if a transaction already holds a RowExclusiveLock or a
RowShareLock it cannot upgrade to an AccessExclusiveLock.

For me, PKs are intimately tied to the rows they uniquely identify; we
should be mentally linking them and seeing them as a replacement for
Oids, which we still see as intimately linked at low level. We'll be
missing many optimizations if we don't; we've discussed others this week
already.
The TODO I was requesting you consider was this: "Develop non-conflicting locking scheme to allow RI checks to co-exist peacefully with non-PK UPDATEs on the referenced table". That is, IMHO, a general statement of an important unresolved issue with our Referential Integrity implementation. That is in no way intended as any form of negative commentary on the excellent detailed work that has got us so far already. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On Sat, 3 Feb 2007, Simon Riggs wrote: > On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote: > > No, I don't. I think knowledge of which columns are in a PK is quite a > > few levels away from the semantics of row locking. To point out just > > one problem, what happens when you add or drop a PK? Or drop and > > replace with a different column set? Yes, I know dropping one requires > > exclusive lock on the table, but the transaction doing it could hold row > > locks within the table, and now it's very unclear what they mean. > > There are issues, yes. Dropping PKs is a very irregular occurrence nor > is it likely to be part of a complex transaction. It wouldn't bother me > to say that if a transaction already holds a RowExclusiveLock or a > RowShareLock it cannot upgrade to an AccessExclusiveLock. The lock check seems like a strange constraint, given that it's not necessarily going to be anything that conflicts with the row locks. I'm not sure there'd be a better idea given this sort of scheme, but it still seems strange. > The TODO I was requesting you consider was this: > > "Develop non-conflicting locking scheme to allow RI checks to co-exist > peacefully with non-PK UPDATEs on the referenced table". > > That is, IMHO, a general statement of an important unresolved issue with > our Referential Integrity implementation. That is in no way intended as > any form of negative commentary on the excellent detailed work that has > got us so far already. Well, if we really want to solve that completely then we really need column locking, or at least locking at the level of arbitrary (possibly overlapping) unique constraints, not just the PK because foreign keys don't necessarily reference the primary key. But the PK case is certainly the most common and it'd certainly be nice to cover that case.
On 2/2/2007 4:51 AM, Simon Riggs wrote: > It sounds like if we don't put a SHARE lock on the referenced table then > we can end the transaction in an inconsistent state if the referenced > table has concurrent UPDATEs or DELETEs. BUT those operations do impose > locking rules back onto the referencing tables that would not be granted > until after any changes to the referencing table complete, whereupon > they would restrict or cascade. So an inconsistent state doesn't seem > possible to me. > > What am I missing? > You're missing MVCC. The newly inserted reference only becomes visible when it is committed. If the order of actions is insert and check for PK, other transaction deletes PK and commits, inserted FK commits ... the other transaction didn't see "it coming". Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Sat, 2007-02-03 at 09:43 -0800, Stephan Szabo wrote: > On Sat, 3 Feb 2007, Simon Riggs wrote: > > > On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote: > > > No, I don't. I think knowledge of which columns are in a PK is quite a > > > few levels away from the semantics of row locking. To point out just > > > one problem, what happens when you add or drop a PK? Or drop and > > > replace with a different column set? Yes, I know dropping one requires > > > exclusive lock on the table, but the transaction doing it could hold row > > > locks within the table, and now it's very unclear what they mean. > > > > There are issues, yes. Dropping PKs is a very irregular occurrence nor > > is it likely to be part of a complex transaction. It wouldn't bother me > > to say that if a transaction already holds a RowExclusiveLock or a > > RowShareLock it cannot upgrade to an AccessExclusiveLock. > > The lock check seems like a strange constraint, given that it's not > necessarily going to be anything that conflicts with the row locks. I'm > not sure there'd be a better idea given this sort of scheme, but it still > seems strange. > > > The TODO I was requesting you consider was this: > > > > "Develop non-conflicting locking scheme to allow RI checks to co-exist > > peacefully with non-PK UPDATEs on the referenced table". > > > > That is, IMHO, a general statement of an important unresolved issue with > > our Referential Integrity implementation. That is in no way intended as > > any form of negative commentary on the excellent detailed work that has > > got us so far already. > > Well, if we really want to solve that completely then we really need > column locking, or at least locking at the level of arbitrary (possibly > overlapping) unique constraints, not just the PK because foreign keys > don't necessarily reference the primary key. But the PK case is certainly > the most common and it'd certainly be nice to cover that case. 
IMHO generic column-level locking would hardly ever be used. Locking for
RI seems to be 99% of the use case, which means we'd be OK if we found a
way of locking only an arbitrary number of unique column groups. By
definition, each of these column groups is covered by a unique index.

It occurs to me that if we had visibility in unique indexes, this would
allow the index rows to be lockable separately from the main row. That's
exactly what we need here.

It also occurs to me that putting visibility in indexes doesn't prevent
us from optimizing away index inserts for UPDATEs. There is no
particular reason why the xmin and xmax of a unique index entry must
exactly match the xmin and xmax of the main row. [I said the opposite to
Jim Nasby a few days ago, regrettably]. The indexes would record the
xmin and xmax of the row, while the main heap would have the xmin and
xmax of the individual row versions.

If we did both HOT + visibility in unique indexes then we would be able
to eliminate the contention between INSERTs and UPDATEs with RI. As Tom
pointed out, this would complicate deadlock detection, but then so would
any column-level locking scheme, so that isn't an argument against any
one in particular.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
On Sat, 3 Feb 2007, Simon Riggs wrote: > There are issues, yes. Dropping PKs is a very irregular occurrence nor > is it likely to be part of a complex transaction. It wouldn't bother me > to say that if a transaction already holds a RowExclusiveLock or a > RowShareLock it cannot upgrade to an AccessExclusiveLock. Actually, since rearranging PKs is such a drastic change I would expect them only to be part of a large complex transaction. I know for apps I work on it would be part of a single transaction script that updated whole chunks of data and schema. Kris Jurka
On Sun, 2007-02-04 at 09:38 +0000, Simon Riggs wrote: > > > The TODO I was requesting you consider was this: > > > > > > "Develop non-conflicting locking scheme to allow RI checks to co-exist > > > peacefully with non-PK UPDATEs on the referenced table". > > > > > > That is, IMHO, a general statement of an important unresolved issue with > > > our Referential Integrity implementation. That is in no way intended as > > > any form of negative commentary on the excellent detailed work that has > > > got us so far already. > > > > Well, if we really want to solve that completely then we really need > > column locking, or at least locking at the level of arbitrary (possibly > > overlapping) unique constraints, not just the PK because foreign keys > > don't necessarily reference the primary key. But the PK case is certainly > > the most common and it'd certainly be nice to cover that case. ... > It occurs to me that if we had visibility in unique indexes, this would > allow the index rows to be separately lockable to the main row. That's > exactly what we need here. I've implemented a work-around using this principle, utilising RULEs and a duplicated PK column-only table. This still allows FK checks to work correctly, yet doesn't require the backend hack Csaba mentioned. My feeling is that more work in this area is required, even if we can't yet agree a TODO item. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > > It occurs to me that if we had visibility in unique indexes, this would > > allow the index rows to be separately lockable to the main row. That's > > exactly what we need here. > > I've implemented a work-around using this principle, utilising RULEs and > a duplicated PK column-only table. This still allows FK checks to work > correctly, yet doesn't require the backend hack Csaba mentioned. > > My feeling is that more work in this area is required, even if we can't > yet agree a TODO item. OK, please propose some wording so at least we can get agreement on that. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Mon, 5 Feb 2007, Simon Riggs wrote: > On Sun, 2007-02-04 at 09:38 +0000, Simon Riggs wrote: > > > > The TODO I was requesting you consider was this: > > > > > > > > "Develop non-conflicting locking scheme to allow RI checks to co-exist > > > > peacefully with non-PK UPDATEs on the referenced table". > > > > > > > > That is, IMHO, a general statement of an important unresolved issue with > > > > our Referential Integrity implementation. That is in no way intended as > > > > any form of negative commentary on the excellent detailed work that has > > > > got us so far already. > > > > > > Well, if we really want to solve that completely then we really need > > > column locking, or at least locking at the level of arbitrary (possibly > > > overlapping) unique constraints, not just the PK because foreign keys > > > don't necessarily reference the primary key. But the PK case is certainly > > > the most common and it'd certainly be nice to cover that case. > > ... > > > It occurs to me that if we had visibility in unique indexes, this would > > allow the index rows to be separately lockable to the main row. That's > > exactly what we need here. > > I've implemented a work-around using this principle, utilising RULEs and > a duplicated PK column-only table. This still allows FK checks to work > correctly, yet doesn't require the backend hack Csaba mentioned. > > My feeling is that more work in this area is required, even if we can't > yet agree a TODO item. I actually like the general idea your TODO item had, although I would say non-referenced column update rather than non-PK update. Even if we put it far out due to questions about what would be acceptable implementation.
"Bruce Momjian" <bruce@momjian.us> writes:

> OK, please propose some wording so at least we can get agreement on
> that.

How about something open-ended like "arrange for updates that do not
update columns referenced by foreign keys from other tables to avoid
being blocked by locks from concurrent RI checks"

--
  Gregory Stark
  EnterpriseDB   http://www.enterprisedb.com
"Gregory Stark" <stark@enterprisedb.com> writes:

> "Bruce Momjian" <bruce@momjian.us> writes:
>
>> OK, please propose some wording so at least we can get agreement on
>> that.
>
> How about something open-ended like "arrange for updates that do not update
> columns referenced by foreign keys from other tables to avoid being blocked by
> locks from concurrent RI checks"

Hum. Reading back in the thread it seems what I wrote is basically equivalent
to the wording Simon originally proposed.

--
  Gregory Stark
  EnterpriseDB   http://www.enterprisedb.com
On Mon, 2007-02-05 at 23:25 +0000, Gregory Stark wrote:
> "Gregory Stark" <stark@enterprisedb.com> writes:
>
> > "Bruce Momjian" <bruce@momjian.us> writes:
> >
> >> OK, please propose some wording so at least we can get agreement on
> >> that.
> >
> > How about something open-ended like "arrange for updates that do not update
> > columns referenced by foreign keys from other tables to avoid being blocked by
> > locks from concurrent RI checks"
>
> Hum. Reading back in the thread it seems what I wrote is basically equivalent
> to the wording Simon originally proposed.

I like your wording. It's clearer and includes Stephan's clarification.
Some minor mods...

TODO

"avoid blocking of updates because of concurrent RI checks when those
updates do not alter columns referenced by foreign keys from other
tables"

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
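The blocking that the TODO wording targets can be sketched with two concurrent sessions (table and column names invented; the FOR SHARE shown is roughly what the RI check trigger issues internally). Because row locks cover the whole row, the SHARE lock conflicts with any UPDATE of that row, even one touching only a non-referenced column:

```sql
-- Session 1: inserting a child row makes the RI trigger lock the
-- referenced parent row, approximately:
BEGIN;
INSERT INTO child (parent_id) VALUES (1);
-- internally, roughly: SELECT 1 FROM parent WHERE id = 1 FOR SHARE;

-- Session 2: blocks until session 1 commits or aborts, even though
-- "balance" is not referenced by any foreign key:
UPDATE parent SET balance = balance + 10 WHERE id = 1;
```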
Simon Riggs wrote:
> On Sat, 2007-02-03 at 09:43 -0800, Stephan Szabo wrote:
>> On Sat, 3 Feb 2007, Simon Riggs wrote:
>>
>>> On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote:
>>>> No, I don't. I think knowledge of which columns are in a PK is quite a
>>>> few levels away from the semantics of row locking. To point out just
>>>> one problem, what happens when you add or drop a PK? Or drop and
>>>> replace with a different column set? Yes, I know dropping one requires
>>>> exclusive lock on the table, but the transaction doing it could hold row
>>>> locks within the table, and now it's very unclear what they mean.
>>>
>>> There are issues, yes. Dropping PKs is a very irregular occurrence nor
>>> is it likely to be part of a complex transaction. It wouldn't bother me
>>> to say that if a transaction already holds a RowExclusiveLock or a
>>> RowShareLock it cannot upgrade to an AccessExclusiveLock.
>>
>> The lock check seems like a strange constraint, given that it's not
>> necessarily going to be anything that conflicts with the row locks. I'm
>> not sure there'd be a better idea given this sort of scheme, but it still
>> seems strange.
>>
>>> The TODO I was requesting you consider was this:
>>>
>>> "Develop non-conflicting locking scheme to allow RI checks to co-exist
>>> peacefully with non-PK UPDATEs on the referenced table".
>>>
>>> That is, IMHO, a general statement of an important unresolved issue with
>>> our Referential Integrity implementation. That is in no way intended as
>>> any form of negative commentary on the excellent detailed work that has
>>> got us so far already.
>>
>> Well, if we really want to solve that completely then we really need
>> column locking, or at least locking at the level of arbitrary (possibly
>> overlapping) unique constraints, not just the PK because foreign keys
>> don't necessarily reference the primary key. But the PK case is certainly
>> the most common and it'd certainly be nice to cover that case.
>
> IMHO generic column level locking would hardly ever be used. Locking for
> RI seems to be 99% of the use case, which means we'd be OK if we found a
> way of only locking an arbitrary number of unique col groups. By
> definition, each of these column groups is covered by a unique index.

Not saying this'll gain us anything but...

It has occurred to me that the lock could be reduced in another way. If we
had an "immutable" constraint that could be applied to pkey-columns then we
wouldn't have to worry about updates at all. If a pkey value was there
before an update, it would be there after too. The only thing you'd need to
prevent would be deletes.

--
  Richard Huxton
  Archonet Ltd
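There is no IMMUTABLE column constraint in PostgreSQL, but Richard's idea could be approximated today with a trigger that rejects key changes (a sketch with invented names; pre-8.3-era `EXECUTE PROCEDURE` syntax):

```sql
CREATE FUNCTION forbid_pk_change() RETURNS trigger AS $$
BEGIN
    IF NEW.id <> OLD.id THEN
        RAISE EXCEPTION 'parent.id is immutable';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER parent_id_immutable
    BEFORE UPDATE ON parent
    FOR EACH ROW EXECUTE PROCEDURE forbid_pk_change();
```

Of course, the trigger by itself only enforces the invariant; to gain anything, the RI check machinery would still have to be taught to trust such a declaration and skip the SHARE lock against UPDATEs, locking only against DELETEs.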