Thread: SIREAD lock versus ACCESS EXCLUSIVE lock
Composing my rather long-winded response to Simon got me thinking -- which just led me to realize there is probably a need to fix another thing related to SSI.

For correct serializable behavior in the face of concurrent DDL execution, I think that a request for a heavyweight ACCESS EXCLUSIVE lock might need to block until all SIREAD locks on the relation have been released. Picture, for example, what might happen if one transaction acquires some predicate locks, then commits (releasing its heavyweight lock on the table), and before concurrent READ WRITE transactions complete there is a CLUSTER on the table. Or a DROP INDEX. :-( Both require an ACCESS EXCLUSIVE lock.

Since an active transaction would already have an ACCESS SHARE lock when acquiring the SIREAD locks, this couldn't block in the other direction or with an active transaction. That means that it couldn't cause any deadlocks if we added blocking to the acquisition of an ACCESS EXCLUSIVE based on this.

If we don't do this I don't think that there is a more serious impact than inaccurate conflict detection for serializable transactions which are active when these operations are performed. Well, that and the possibility of seeing SIRead locks in the pg_locks view for indexes or tables which no longer exist. So far I don't see any crash modes or effects on non-serializable transactions. If this change is too destabilizing for this point in the release we could document it as a limitation and fix it in 9.2.

We're pretty aggressive about cleaning up SIREAD locks as soon as allowable after transaction completion, so this probably wouldn't happen very often.

-Kevin
On 27.04.2011 22:59, Kevin Grittner wrote: > For correct serializable behavior in the face of concurrent DDL > execution, I think that a request for a heavyweight ACCESS EXCLUSIVE > lock might need to block until all SIREAD locks on the relation have > been released. Picture, for example, what might happen if one > transaction acquires some predicate locks, then commits (releasing > its heavyweight lock on the table), and before concurrent READ WRITE > transactions complete there is a CLUSTER on the table. Or a DROP > INDEX. :-( Hmm, could we upgrade all predicate locks to relation-level predicate locks instead? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 27.04.2011 22:59, Kevin Grittner wrote:
>> For correct serializable behavior in the face of concurrent DDL
>> execution, I think that a request for a heavyweight ACCESS
>> EXCLUSIVE lock might need to block until all SIREAD locks on the
>> relation have been released. Picture, for example, what might
>> happen if one transaction acquires some predicate locks, then
>> commits (releasing its heavyweight lock on the table), and before
>> concurrent READ WRITE transactions complete there is a CLUSTER on
>> the table. Or a DROP INDEX. :-(
>
> Hmm, could we upgrade all predicate locks to relation-level
> predicate locks instead?

Tied to what backend? This only comes into play with transactions which have committed and are not yet able to release their predicate locks because of overlapping READ WRITE serializable transactions which are still active. Until that point the ACCESS SHARE lock (or stronger) is still there protecting against this.

One way we could push this entirely into the heavyweight locking system would be for a committing transaction with predicate locks to somehow cause all overlapping read write serializable transactions which are still active to acquire an ACCESS SHARE lock on each relation for which the committing transaction has any predicate lock(s). (Unless of course they already hold some lock on a relevant relation.) This would need to be done before the committing transaction released its lock on the relation (which it must have acquired to read something which would cause it to have a predicate lock). Is that feasible? It seems like a lot of work, and I'm not sure how one transaction would convince another process's transaction to acquire the locks.

As an alternative, perhaps we could find a way to leave the relation locks for a serializable transaction until it's safe to clean up the predicate locks? They could be downgraded to ACCESS SHARE if they are stronger. They would need to survive beyond not only the commit of the transaction, but also the termination of the connection. They would need to be immune to being chosen as deadlock victims.

Any other ideas?

-Kevin
On Wed, Apr 27, 2011 at 02:59:19PM -0500, Kevin Grittner wrote: > For correct serializable behavior in the face of concurrent DDL > execution, I think that a request for a heavyweight ACCESS EXCLUSIVE > lock might need to block until all SIREAD locks on the relation have > been released. Picture, for example, what might happen if one I think you're correct about this, but also agree that it would be reasonable to document the limitation for now and punt on a fix until 9.2. On Wed, Apr 27, 2011 at 04:09:38PM -0500, Kevin Grittner wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > > Hmm, could we upgrade all predicate locks to relation-level > > predicate locks instead? > > Tied to what backend? This only comes into play with transactions > which have committed and are not yet able to release their predicate > locks because of overlapping READ WRITE serializable transactions > which are still active. Until that point the ACCESS SHARE lock (or > stronger) is still there protecting against this. I think Heikki was suggesting to upgrade to relation-level SIREAD locks. This seems like it would work, but might cause a lot of aborts immediately afterwards. I do wonder if it might be necessary to upgrade index locks to relation-level locks on the heap relation, in cases of dropping the index. > One way we could push this entirely into the heavyweight locking > system would be for a committing transaction with predicate locks to > somehow cause all overlapping read write serializable transactions > which are still active to acquire an ACCESS SHARE lock on each > relation for which the committing transaction has any predicate > lock(s). I'm not entirely clear on how this would work, but it sounds like it could also have a significant performance cost. > As an alternative, perhaps we could find a way to leave the relation > locks for a serializable transaction until it's safe to clean up the > predicate locks? They could be downgraded to ACCESS SHARE if they > are stronger. They would need to survive beyond not only the commit > of the transaction, but also the termination of the connection. > They would need to be immune to being chosen as deadlock victims. This sounds like it would require major changes to the heavyweight lock manager. There's already a notion of keeping locks past backend termination for 2PC prepared transactions, but it would be hard to apply here. The most practical solutions I see here are to, when acquiring an AccessExclusive lock, either wait for any associated SIREAD locks to go away, or to promote them to relation level. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Dan Ports <drkp@csail.mit.edu> wrote: > On Wed, Apr 27, 2011 at 04:09:38PM -0500, Kevin Grittner wrote: >> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: >>> Hmm, could we upgrade all predicate locks to relation-level >>> predicate locks instead? >> >> Tied to what backend? > I think Heikki was suggesting to upgrade to relation-level SIREAD > locks. Oh, yeah. That *is* what he said, isn't it. That's a simpler solution which works just fine. Please forget my over-excited flights of fancy on the topic -- I was so focused on what I thought was the solution I misread Heikki's much better idea as some variation on my theme. :-( -Kevin
On Wed, Apr 27, 2011 at 8:59 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> For correct serializable behavior in the face of concurrent DDL
> execution, I think that a request for a heavyweight ACCESS EXCLUSIVE
> lock might need to block until all SIREAD locks on the relation have
> been released. Picture, for example, what might happen if one
> transaction acquires some predicate locks, then commits (releasing
> its heavyweight lock on the table), and before concurrent READ WRITE
> transactions complete there is a CLUSTER on the table. Or a DROP
> INDEX. :-(

Sorry, I can't picture it. What will happen?

> Both require an ACCESS EXCLUSIVE lock. Since an active transaction
> would already have an ACCESS SHARE lock when acquiring the SIREAD
> locks, this couldn't block in the other direction or with an active
> transaction. That means that it couldn't cause any deadlocks if we
> added blocking to the acquisition of an ACCESS EXCLUSIVE based on
> this.
>
> If we don't do this I don't think that there is a more serious
> impact than inaccurate conflict detection for serializable
> transactions which are active when these operations are performed.
> Well, that and the possibility of seeing SIRead locks in the
> pg_locks view for indexes or tables which no longer exist. So far I
> don't see any crash modes or effects on non-serializable
> transactions. If this change is too destabilizing for this point in
> the release we could document it as a limitation and fix it in 9.2.

I don't think this should wait for 9.2.

It either works, or it doesn't. Putting caveats in there will just detract from people's belief in it.

-- 
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs <simon@2ndQuadrant.com> wrote:
> On Wed, Apr 27, 2011 at 8:59 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>
>> For correct serializable behavior in the face of concurrent DDL
>> execution, I think that a request for a heavyweight ACCESS
>> EXCLUSIVE lock might need to block until all SIREAD locks on the
>> relation have been released. Picture, for example, what might
>> happen if one transaction acquires some predicate locks, then
>> commits (releasing its heavyweight lock on the table), and before
>> concurrent READ WRITE transactions complete there is a CLUSTER on
>> the table. Or a DROP INDEX. :-(
>
> Sorry, I can't picture it. What will happen?

Rather than get into a complex generalized discussion, I'll provide the simplest example I can picture. Let's say we have two concurrent transactions, T0 and T1. Up to this point T0 has read from table x and written to table y based on what was read from x. T1 has read from y -- but since the transactions are concurrent, it doesn't see T0's write. Let's assume each read was of a single tuple accessed through a btree index, so each transaction has one tuple lock on the heap and one page lock on the index.

Now T0 commits. T0 must hold its SIREAD locks because of concurrent transaction T1. Everything is fine so far.

Now a DBA runs CLUSTER against table x. The SIREAD locks held by T0 are probably now wrong, because the tuple and its index entry are likely to have moved.

Now T1 writes to table x based on what it read from y. It could incorrectly detect a conflict if it happens to write to a tuple at the locked block and tuple number when it's not the same row. Worse, it could miss detecting a conflict if it's really updating the same row that T0 read, and that's not detected because it's not at the locked location any more.

>> If this change is too destabilizing for this point in the release
>> we could document it as a limitation and fix it in 9.2.
>
> I don't think this should wait for 9.2
>
> It either works, or it doesn't. Putting caveats in there will just
> detract from people's belief in it.

I see your point. And this clearly is a bug. We failed to consider this category of problem and cover it.

Heikki's suggestion is clearly the best plan. In the example above, when the CLUSTER was run it would make a call to the predicate locking module telling it to promote all SIREAD locks for table x or any of its indexes into a relation level lock on table x. The CLUSTER would cause us to lose the finer granularity of the locks on the table, and in this example if T1 wrote to table x it would be rolled back with a serialization failure. This could be a false positive, but we expect to have some of those -- the transaction is retried and then succeeds. You can't have a false negative, so integrity is preserved.

I'll try to work up a detailed plan of which commands need what actions. For example, DROP INDEX needs to promote SIREAD locks on the dropped index to relation locks on the related table. TRUNCATE TABLE is a little confusing -- I think that if it's run in a serializable transaction we generate a rw-conflict out from that transaction to every transaction holding any SIREAD lock on that table or any of its indexes, and then clear those SIREAD locks. This'll take some study.

-Kevin
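As a concrete illustration of the promotion described above -- a hedged sketch only, not code from the eventual patch; the name TransferPredicateLocksToHeapRelation() is taken from later messages in this thread, and the helper function and its placement are hypothetical:

#include "postgres.h"

#include "storage/predicate.h"
#include "utils/rel.h"

/*
 * Hypothetical sketch: a heap-rewriting command such as CLUSTER, which
 * already holds ACCESS EXCLUSIVE on the old heap, asks the predicate-lock
 * manager to collapse fine-grained SIREAD locks (including locks still held
 * on behalf of committed transactions) into a single relation-level lock on
 * the heap before tuples are physically moved.
 */
static void
rewrite_heap_sketch(Relation OldHeap)
{
    /*
     * After this call, a later write anywhere in the rebuilt table by an
     * overlapping transaction still produces a rw-conflict, at the cost of
     * coarser granularity: possible false positives, never false negatives.
     */
    TransferPredicateLocksToHeapRelation(OldHeap);

    /* ... physically rewrite the heap and rebuild its indexes ... */
}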
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > This'll take some study. I've gone through the list of commands in the development docs with an eye toward exposing anything else we might have missed in dealing with the SSI predicate locking. Some of this needs further research, but I'm posting what I have so far so that people have a chance to argue about what I think needs doing or to point out anything else they can think of which I'm still missing. And, of course, so people know where we are with it. I tested file_fdw at the serializable isolation level and found that no predicate locks were taken and no problems showed up. I think that's all correct -- I don't see how we can expect to include FDW data in serializability. I haven't dug into ALTER INDEX enough to know whether it can ever cause an index to be rebuilt. If so, we need to treat it like DROP INDEX and REINDEX -- which should change all predicate locks of any granularity on the index into relation locks on the associated table. CLUSTER or an ALTER TABLE which causes a rewrite should change all predicate locks on the table and all indexes into relation locks on the associated table. (Obviously, an existing relation lock on the table doesn't require any action.) TRUNCATE TABLE and DROP TABLE should generate a rw-conflict *in* to the enclosing transaction (if it is serializable) from all transactions holding predicate locks on the table or its indexes. Note that this could cause a transactions which is running one of these statements to roll back with a serialization error. This seems correct to me, since these operations essentially delete all rows. If you don't want the potential rollback, these operations should be run at another isolation level. The difference between these two statements is that I think that TRUNCATE TABLE should also move the existing predicate locks to relation locks on the table while DROP TABLE (for obvious reasons) should just delete the predicate locks. DROP DATABASE should quietly clean up any predicate locks from committed transactions which haven't yet hit their cleanup point because of overlapping transactions in other databases. DROP OWNED or DROP SCHEMA could CASCADE to some of the above, in which case the above rules would apply. I need to do some testing with Large Objects to see what state those are in with SSI, unless someone else has already looked at that. SAVEPOINTs have been considered and there is a lot of coding to try to handle them correctly, but they probably merit a bit more attention and testing. On some quick testing everything seemed to be in line with previous discussions and with what seems logical to me, but our SSI regression tests in src/test/isolation lack any coverage for them and I don't know how much others have beat up on them. At a minimum we should add a couple tests. I couldn't find anything else which hadn't already been considered and covered. More to come as I finish investigating. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > I haven't dug into ALTER INDEX enough to know whether it can ever > cause an index to be rebuilt. If so, we need to treat it like > DROP INDEX and REINDEX -- which should change all predicate locks > of any granularity on the index into relation locks on the > associated table. > > CLUSTER or an ALTER TABLE which causes a rewrite should change all > predicate locks on the table and all indexes into relation locks > on the associated table. (Obviously, an existing relation lock on > the table doesn't require any action.) > > TRUNCATE TABLE and DROP TABLE should generate a rw-conflict *in* > to the enclosing transaction (if it is serializable) from all > transactions holding predicate locks on the table or its indexes. > Note that this could cause a transactions which is running one of > these statements to roll back with a serialization error. This > seems correct to me, since these operations essentially delete all > rows. If you don't want the potential rollback, these operations > should be run at another isolation level. The difference between > these two statements is that I think that TRUNCATE TABLE should > also move the existing predicate locks to relation locks on the > table while DROP TABLE (for obvious reasons) should just delete > the predicate locks. > > DROP DATABASE should quietly clean up any predicate locks from > committed transactions which haven't yet hit their cleanup point > because of overlapping transactions in other databases. I missed VACUUM FULL when pulling together the above, but I haven't found any other omissions. (Still happy to hear about any that anyone can spot.) I notice that most of these need to shift transfer locks to relation locks on the heap, either for a single index or for the heap and all indexes. I wrote a function to do this and called it from one place to be able to test it. Consider this a WIP patch on which I would appreciate review while I work on finding the other places to call it and the miscellaneous other fixes needed. Note that I had to expose one previously-static function from index.c to find the heap OID from the index OID. Also, I ran pgindent against predicate.c, as I generally like to do when I modify much code, and it found four comment blocks in predicate.c touched since the recent global pgindent run which it re-wrapped. I can manually exclude those from the final patch if people would prefer that; but if people can ignore those whitespace tweaks, it might not be all bad to get standard formatting onto them at this point. -Kevin
Attachment
Just a quick status update. I wrote:

> Consider this a WIP patch

The serializable branch on my git repo has a modified form of this which has been tested successfully with:

DROP INDEX
REINDEX
VACUUM FULL
CLUSTER
ALTER TABLE

I'm holding off on posting another version of the patch until I cover the remaining commands, which are:

TRUNCATE TABLE
DROP TABLE
DROP DATABASE

I'm having to work on this off-hours at the moment, so expect the patch to come sometime this weekend.

-Kevin
On 30.04.2011 01:04, Kevin Grittner wrote: > TRUNCATE TABLE and DROP TABLE should generate a rw-conflict *in* to > the enclosing transaction (if it is serializable) from all > transactions holding predicate locks on the table or its indexes. > Note that this could cause a transactions which is running one of > these statements to roll back with a serialization error. This seems > correct to me, since these operations essentially delete all rows. > If you don't want the potential rollback, these operations should be > run at another isolation level. The difference between these two > statements is that I think that TRUNCATE TABLE should also move the > existing predicate locks to relation locks on the table while DROP > TABLE (for obvious reasons) should just delete the predicate locks. Note that TRUNCATE has never been MVCC-safe anyway. Perhaps it's best to just treat it like DROP TABLE. Or can we use the predicate lock mechanism to abort serializable transactions that incorrectly see the table as empty? > DROP DATABASE should quietly clean up any predicate locks from > committed transactions which haven't yet hit their cleanup point > because of overlapping transactions in other databases. This is just an optimization, right? The predicate locks will eventually go away anyway. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
> Heikki Linnakangas wrote: > On 30.04.2011 01:04, Kevin Grittner wrote: >> TRUNCATE TABLE and DROP TABLE should generate a rw-conflict *in* >> to the enclosing transaction (if it is serializable) from all >> transactions holding predicate locks on the table or its indexes. >> Note that this could cause a transactions which is running one of >> these statements to roll back with a serialization error. This >> seems correct to me, since these operations essentially delete all >> rows. If you don't want the potential rollback, these operations >> should be run at another isolation level. The difference between >> these two statements is that I think that TRUNCATE TABLE should >> also move the existing predicate locks to relation locks on the >> table while DROP TABLE (for obvious reasons) should just delete >> the predicate locks. > > Note that TRUNCATE has never been MVCC-safe anyway. Yeah, Dan pointed out that another REPEATABLE READ or SERIALIZABLE transaction *will* see the work of a committed TRUNCATE TABLE statement, so it is not semantically identical to DELETE FROM with no WHERE clause, which was something I somehow had gotten into my head. > Perhaps it's best to just treat it like DROP TABLE. We had been leaning that way based on the above observation. > Or can we use the predicate lock mechanism to abort serializable > transactions that incorrectly see the table as empty? Predicate locks only interact with writes; it'd be a rather nasty wart to try to bend them to the above use. I think we just have to stick with the dictum which has controlled so far -- Serializable Snapshot Isolation can only serialize those things which follow the semantics of Snapshot Isolation. TRUNCATE TABLE doesn't. >> DROP DATABASE should quietly clean up any predicate locks from >> committed transactions which haven't yet hit their cleanup point >> because of overlapping transactions in other databases. > > This is just an optimization, right? The predicate locks will > eventually go away anyway. Yes, correct. The only way they could create a problem besides just taking up predicate lock slots is if a new database was created with an identical OID before overlapping serializable transactions in other databases completed; that seems rather far-fetched. Perhaps DROP DATABASE is so infrequent that it's not worth the extra code to do the early cleanup? The code to do that part might not carry its own weight on a cost/benefit basis. -Kevin
On 1 May 2011 I wrote: > Consider this a WIP patch Just so people know where this stands... By 8 May 2011 I had the attached. I didn't post it because I was not confident I had placed the calls to the SSI functions for DROP TABLE and TRUNCATE TABLE correctly. Then came PGCon and also the row versus tuple discussion -- the patch for which had conflicts with this one. I will be reviewing this on the weekend and making any final adjustments, but the patch is very likely to look just like this with the possible exception of placement of the DROP TABLE and TRUNCATE TABLE calls. If anyone wants to start reviewing this now, it would leave more time for me to make changes if needed. Also, if anyone has comments or hints about the placement of those calls, I'd be happy to receive them. This patch compiles with `make world`, passes `make check-world`, and passes both `make installcheck-world` and `make -C src/test/isolation installcheck` against a database with default_transaction_isolation = 'serializable'. It also passed a few ad hoc tests of the implemented functionality. -Kevin
Attachment
On 03.06.2011 21:04, Kevin Grittner wrote: > Also, if anyone has comments or hints about the placement of those > calls, I'd be happy to receive them. heap_drop_with_catalog() schedules the relation for deletion at the end of transaction, but it's still possible that the transaction aborts and the heap doesn't get dropped after all. If you put the DropAllPredicateLocksFromTable() call there, and the transaction later aborts, you've lost all the locks already. I think you'll need to just memorize the lock deletion command in a backend-local list, and perform the deletion in a post-commit function. Something similar to the PendingRelDelete stuff in storage.c. In fact, hooking into smgrDoPendingDeletes would work, but that seems like a modularity violation. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
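A hedged sketch of the kind of backend-local pending list described here, loosely modeled on the PendingRelDelete pattern in storage.c. All of the names (PendingPredLockDrop, SchedulePredicateLockDrop, AtEOXact_PredicateLockDrops) are hypothetical and not from any posted patch; subtransaction handling is omitted:

#include "postgres.h"

#include "access/xact.h"
#include "utils/memutils.h"

typedef struct PendingPredLockDrop
{
    Oid         dbId;           /* database of the dropped relation */
    Oid         relId;          /* relation whose SIREAD locks should go */
    int         nestLevel;      /* xact nesting level of the DROP */
    struct PendingPredLockDrop *next;
} PendingPredLockDrop;

static PendingPredLockDrop *pendingPredLockDrops = NULL;

/*
 * Called where heap_drop_with_catalog() runs: just remember the request in
 * backend-local memory; don't touch shared predicate-lock state yet, since
 * the enclosing (sub)transaction may still abort.
 */
static void
SchedulePredicateLockDrop(Oid dbId, Oid relId)
{
    PendingPredLockDrop *pending;

    pending = (PendingPredLockDrop *)
        MemoryContextAlloc(TopMemoryContext, sizeof(PendingPredLockDrop));
    pending->dbId = dbId;
    pending->relId = relId;
    pending->nestLevel = GetCurrentTransactionNestLevel();
    pending->next = pendingPredLockDrops;
    pendingPredLockDrops = pending;
}

/*
 * Called after commit (or at abort, to discard the list).  Runs post-commit,
 * so it must have negligible chance of error: it only walks a local list and
 * (conceptually) frees predicate-lock entries the lock manager already owns.
 */
static void
AtEOXact_PredicateLockDrops(bool isCommit)
{
    PendingPredLockDrop *pending;
    PendingPredLockDrop *next;

    for (pending = pendingPredLockDrops; pending != NULL; pending = next)
    {
        next = pending->next;
        if (isCommit)
        {
            /* drop all SIREAD locks tagged with (dbId, relId) here */
        }
        pfree(pending);
    }
    pendingPredLockDrops = NULL;
}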
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 03.06.2011 21:04, Kevin Grittner wrote: >> Also, if anyone has comments or hints about the placement of >> those calls, I'd be happy to receive them. > > heap_drop_with_catalog() schedules the relation for deletion at > the end of transaction, but it's still possible that the > transaction aborts and the heap doesn't get dropped after all. If > you put the DropAllPredicateLocksFromTable() call there, and the > transaction later aborts, you've lost all the locks already. > > I think you'll need to just memorize the lock deletion command in > a backend-local list, and perform the deletion in a post-commit > function. Something similar to the PendingRelDelete stuff in > storage.c. In fact, hooking into smgrDoPendingDeletes would work, > but that seems like a modularity violation. Thanks. That's helpful. Will look at that code and do something similar. I knew it didn't look right in the place it was, but couldn't quite see what to do instead.... -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: >> I think you'll need to just memorize the lock deletion command in >> a backend-local list, and perform the deletion in a post-commit >> function. Something similar to the PendingRelDelete stuff in >> storage.c. In fact, hooking into smgrDoPendingDeletes would work, >> but that seems like a modularity violation. > Thanks. That's helpful. Will look at that code and do something > similar. Keep in mind that it's too late to throw any sort of error post-commit. Any code you add there will need to have negligible probability of failure. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>>> I think you'll need to just memorize the lock deletion command
>>> in a backend-local list, and perform the deletion in a
>>> post-commit function. Something similar to the PendingRelDelete
>>> stuff in storage.c. In fact, hooking into smgrDoPendingDeletes
>>> would work, but that seems like a modularity violation.
>
>> Thanks. That's helpful. Will look at that code and do something
>> similar.
>
> Keep in mind that it's too late to throw any sort of error
> post-commit. Any code you add there will need to have negligible
> probability of failure.

Thanks for the heads-up. I think we're OK in that regard, though.

We have two commit-time functions in SSI: PreCommit_CheckForSerializationFailure(void) which is the "last chance for failure" before actual commit, and ReleasePredicateLocks(const bool isCommit) which is for resource cleanup after rollback or commit is effective. We pretty much can't fail on the latter except for Assert statements, and this new logic would be the same. It's hard to fail when freeing resources unless something is quite seriously hosed already.

This is where I was planning to put the freeing of the locks, based on queuing up the request at the time the DROP TABLE statement is encountered.

-Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > I think you'll need to just memorize the lock deletion command in > a backend-local list, and perform the deletion in a post-commit > function. Hmm. As mentioned earlier in the thread, cleaning these up doesn't actually have any benefit beyond freeing space in the predicate locking collections. I'm not sure that benefit is enough to justify this much new mechanism. Maybe I should just leave them alone and let them get cleaned up in due course with the rest of the locks. Any opinions on that? -Kevin
On 03.06.2011 23:44, Kevin Grittner wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: > >> I think you'll need to just memorize the lock deletion command in >> a backend-local list, and perform the deletion in a post-commit >> function. > > Hmm. As mentioned earlier in the thread, cleaning these up doesn't > actually have any benefit beyond freeing space in the predicate > locking collections. I'm not sure that benefit is enough to justify > this much new mechanism. Maybe I should just leave them alone and > let them get cleaned up in due course with the rest of the locks. > Any opinions on that? Is there a chance of false positives if oid wraparound happens and a new table gets the same oid as the old one? It's also possible for a heap to get the OID of an old index or vice versa, will that confuse things? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 03.06.2011 23:44, Kevin Grittner wrote: >> Hmm. As mentioned earlier in the thread, cleaning these up >> doesn't actually have any benefit beyond freeing space in the >> predicate locking collections. I'm not sure that benefit is >> enough to justify this much new mechanism. Maybe I should just >> leave them alone and let them get cleaned up in due course with >> the rest of the locks. Any opinions on that? > > Is there a chance of false positives if oid wraparound happens and > a new table gets the same oid as the old one? It's also possible > for a heap to get the OID of an old index or vice versa, will that > confuse things? Tuple locks should be safe from that because we use the tuple xmin as part of the target key, but page and heap locks could result in false positive serialization failures on OID wraparound unless we do the cleanup. I guess that tips the scales in favor of it being worth the extra code. I think it's still in that gray area where it's a judgment call because it would be very infrequent and it would be recoverable; but the fewer mysterious rollbacks, the fewer posts about it we'll get. So any costs in maintaining the extra code will probably be saved in reduced support, even if the performance benefit is negligible. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Tuple locks should be safe from that because we use the tuple xmin > as part of the target key, but page and heap locks That should have read "page and relation locks". > I guess that tips the scales in favor of it being worth the extra > code. I think it's still in that gray area I just thought of something which takes it out of the gray area for me: pg_locks. Even though it would be extremely rare for a false positive to actually occur if we let this go, people would be sure to get confused by seeing locks on the dropped objects in the pg_locks view.. They've got to be cleaned up. -Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > On 03.06.2011 21:04, Kevin Grittner wrote: >> Also, if anyone has comments or hints about the placement of those >> calls, I'd be happy to receive them. > heap_drop_with_catalog() schedules the relation for deletion at the end > of transaction, but it's still possible that the transaction aborts and > the heap doesn't get dropped after all. If you put the > DropAllPredicateLocksFromTable() call there, and the transaction later > aborts, you've lost all the locks already. But on the third thought: is that wrong? Surely locks taken by an aborted transaction can be discarded. regards, tom lane
On 04.06.2011 19:19, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> On 03.06.2011 21:04, Kevin Grittner wrote: >>> Also, if anyone has comments or hints about the placement of those >>> calls, I'd be happy to receive them. > >> heap_drop_with_catalog() schedules the relation for deletion at the end >> of transaction, but it's still possible that the transaction aborts and >> the heap doesn't get dropped after all. If you put the >> DropAllPredicateLocksFromTable() call there, and the transaction later >> aborts, you've lost all the locks already. > > But on the third thought: is that wrong? Surely locks taken by an > aborted transaction can be discarded. These are predicate locks - there can be "locks" on the table belonging to transactions that have already committed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote:
> On 04.06.2011 19:19, Tom Lane wrote:
>> Heikki Linnakangas writes:
>>> On 03.06.2011 21:04, Kevin Grittner wrote:
>>>> Also, if anyone has comments or hints about the placement of
>>>> those calls, I'd be happy to receive them.
>>
>>> heap_drop_with_catalog() schedules the relation for deletion at
>>> the end of transaction, but it's still possible that the
>>> transaction aborts and the heap doesn't get dropped after all. If
>>> you put the DropAllPredicateLocksFromTable() call there, and the
>>> transaction later aborts, you've lost all the locks already.
>>
>> But on the third thought: is that wrong? Surely locks taken by an
>> aborted transaction can be discarded.
>
> These are predicate locks - there can be "locks" on the table
> belonging to transactions that have already committed.

It took me a while to think this through, but if the transaction (T1) which reads the table to create the SIREAD lock overlaps another transaction (T2) with which it might interact to create a dangerous structure, and T2 has not yet accessed the table in any way, then after T1 commits a third transaction (T3) could try to drop the table but roll back, and T2 could still proceed to do a write which conflicts with the predicate lock.

That certainly sounds like a low frequency combination of events, but one which can't be ignored if we want correct behavior (i.e., no false negatives).

-Kevin
> Heikki Linnakangas wrote:
> On 03.06.2011 23:44, Kevin Grittner wrote:
>> Heikki Linnakangas wrote:
>>
>>> I think you'll need to just memorize the lock deletion command in
>>> a backend-local list, and perform the deletion in a post-commit
>>> function.
>>
>> Hmm. As mentioned earlier in the thread, cleaning these up doesn't
>> actually have any benefit beyond freeing space in the predicate
>> locking collections. I'm not sure that benefit is enough to
>> justify this much new mechanism. Maybe I should just leave them
>> alone and let them get cleaned up in due course with the rest of
>> the locks. Any opinions on that?
>
> Is there a chance of false positives if oid wraparound happens and
> a new table gets the same oid as the old one? It's also possible
> for a heap to get the OID of an old index or vice versa, will that
> confuse things?

Some additional thoughts after working on the new code...

The risk we're trying to cover is that a table can be dropped while committed transactions still have predicate locks against it, which won't be cleared until overlapping transactions which are still active, and which can form a dangerous structure with the committed transactions, commit. If we just leave those predicate locks to be cleaned up in the normal course of transaction completion, there could be a new object created with the old OID which, if a complicated set of conditions is met, could lead to a serialization failure of a transaction which would otherwise succeed.

We can't clean the predicate locks up immediately when the DROP TABLE appears to succeed, because the enclosing transaction (or subtransaction) could still be rolled back. That last bit seems pretty hard to dodge, because premature predicate lock cleanup could lead to false negatives -- which would allow data inconsistencies. So cleanup before commit is definitely out.

But some additional time meditating on the dangerous structure diagram today has me wondering whether we might be OK just ignoring cleanup prior to what would normally happen as transactions complete.

Tin - - -> Tpivot - - -> Tout

As mentioned above, this whole issue is based around concern about false positives when a transaction has read from a table and committed before the table is dropped. We don't need to worry about that for Tout -- its position in the dangerous structure is just based around what it *writes*; it can't be responsible for the read-and-commit before the table is dropped. To have a problem, Tpivot can't be the transaction which commits before the DROP TABLE either -- since it reads what Tout writes and Tout has to be first to commit, then it cannot read and commit before the DROP TABLE and still have Tout write afterward. That means that the transaction which did the read and commit would have to be Tin, and Tpivot would need to be the transaction which later wrote to an object with the same OID as the dropped table.

For a problem to manifest itself, this sequence must occur:

- The timing of transaction starts doesn't matter (other than the RO optimizations) as long as Tpivot overlaps both Tin and Tout.
- Tout commits first.
- Any time before Tin commits, it reads from relation X.
- Tin commits second.
- Tpivot develops a rw-conflict out to Tout on a relation *other than* X, any time after both have started and before Tpivot commits.
- Tpivot writes to a relation which has the same OID as relation X, but which isn't relation X.

This requires that a new relation be created with the same OID which had been used by X, and that this new relation is visible to Tpivot -- which acquired its snapshot *before X was dropped*. Is this possible? If a transaction gets its snapshot while OID N is assigned to relation X, can that transaction wind up seeing OID N as a reference to relation Y? If not, there aren't any false positives possible.

Given the quantity and complexity of code which I've needed to generate to cover this, I'm wondering again whether it is justified *even if* such a false positive is possible. It certainly seems like it would be a *very* infrequent occurrence. There's also the question of what shows in pg_locks, but I'm wondering whether even that is enough to justify this much code.

Thoughts? Maybe I should submit a patch without added complexity of the scheduled cleanup and we can discuss that as a separate patch?

-Kevin
"Kevin Grittner" wrote: > Maybe I should submit a patch without added complexity of the > scheduled cleanup and we can discuss that as a separate patch? Here's a patch which adds the missing support for DDL. Cleanup of predicate locks at commit time for transactions which ran DROP TABLE or TRUNCATE TABLE can be added as a separate patch. I consider those to be optimizations which are of dubious benefit, especially compared to the complexity of the extra code required. Also, Heikki pointed out that explicitly releasing the predicate locks after a DROP DATABASE is an optimization, which on reflection also seems to be of dubious value compared to the code needed. Unless someone can find a way in which any of these early cleanups are needed for correctness, I suggest they are better left as enhancements in some future release, where there can be some demonstration that they matter enough for performance to justify the extra code, and there can be more testing to ensure the new code doesn't break anything. I stashed the partial work on the more aggressive cleanup, so if there's a consensus that we want it, I can post a patch pretty quickly. In making sure that the new code for this patch was in pgindent format, I noticed that the ASCII art and bullet lists recently added to OnConflict_CheckForSerializationFailure() were mangled badly by pgindent, so I added the dashes to protect those and included a pgindent form of that function. That should save someone some trouble sorting things out after the next global pgindent run. -Kevin
Attachment
On Sun, Jun 05, 2011 at 12:45:41PM -0500, Kevin Grittner wrote: > Is this possible? If a transaction gets its snapshot while OID of N > is assigned to relation X, can that transaction wind up seeing an OID > of N as a reference to relation Y? If not, there aren't any false > positives possible. This ought to be possible, assuming that the transaction doesn't try to read (and take an AccessShareLock on) X. On the other hand, given all the timing constraints you've noted, I think it's reasonable to ignore the risk of false positives here. What you're saying is that we might inadvertently roll back a transaction, if a table gets dropped and a new table created and assigned the same OID, and there's an uncommitted transaction that started before the DROP, *and* certain other conditions hold about the reads and timing of overlapping transactions. This combination of conditions seems quite unlikely and I have a hard time getting too worked up about it. Occasional false positives are already a fact of life when using SSI. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On 06.06.2011 05:13, Kevin Grittner wrote:
> "Kevin Grittner" wrote:
>
>> Maybe I should submit a patch without added complexity of the
>> scheduled cleanup and we can discuss that as a separate patch?
>
> Here's a patch which adds the missing support for DDL.

It makes me a bit uncomfortable to do catalog cache lookups while holding all the lwlocks. We've also already removed the reserved entry for scratch space while we do that - if a cache lookup errors out, we'll leave behind quite a mess. I guess it shouldn't fail, but it seems a bit fragile.

When TransferPredicateLocksToHeapRelation() is called for a heap, do we really need to transfer all the locks on its indexes too? When a heap is dropped or rewritten or truncated, surely all its indexes are also dropped or reindexed or truncated, so you'll get separate Transfer calls for each index anyway.

I think the logic is actually wrong at the moment: When you reindex a single index, DropAllPredicateLocksFromTableImpl() will transfer all locks belonging to any index on the same table, and any finer-granularity locks on the heap. It would be enough to transfer only locks on the index being reindexed, so you risk getting some unnecessary false positives.

As a bonus, if you dumb down DropAllPredicateLocksFromTableImpl() to only transfer locks on the given heap or index, and not any other indexes on the same table, you won't need IfIndexGetRelation() anymore, making the issue of catalog cache lookups moot.

Seems weird to call SkipSplitTracking() for heaps. I guess it's doing the right thing, but all the comments and the name of that refer to indexes.

> Cleanup of
> predicate locks at commit time for transactions which ran DROP TABLE
> or TRUNCATE TABLE can be added as a separate patch. I consider those
> to be optimizations which are of dubious benefit, especially compared
> to the complexity of the extra code required.

Ok.

> In making sure that the new code for this patch was in pgindent
> format, I noticed that the ASCII art and bullet lists recently added
> to OnConflict_CheckForSerializationFailure() were mangled badly by
> pgindent, so I added the dashes to protect those and included a
> pgindent form of that function. That should save someone some
> trouble sorting things out after the next global pgindent run.

Thanks, committed that part.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > It makes me a bit uncomfortable to do catalog cache lookups while > holding all the lwlocks. We've also already removed the reserved > entry for scratch space while we do that - if a cache lookup errors > out, we'll leave behind quite a mess. I guess it shouldn't fail, > but it seems a bit fragile. > > When TransferPredicateLocksToHeapRelation() is called for a heap, > do we really need to transfer all the locks on its indexes too? > When a heap is dropped or rewritten or truncated, surely all its > indexes are also dropped or reindexed or truncated, so you'll get > separate Transfer calls for each index anyway. Probably. Will confirm and simplify based on that. > I think the logic is actually wrong at the moment: When you reindex > a single index, DropAllPredicateLocksFromTableImpl() will transfer > all locks belonging to any index on the same table, and any > finer-granularity locks on the heap. It would be enough to transfer > only locks on the index being reindexed, so you risk getting some > unnecessary false positives. It seemed like a good idea at the time -- a relation lock on the heap makes any other locks on the heap or any of its indexes redundant. So it was an attempt at "cleaning house". Since we don't do anything for an index request unless there is a lock on that index, it couldn't cause false positives. But this probably fits into the category of premature optimizations, since the locks can't cause any difference in when you get a serialization failure -- it's only a matter of "taking up space". I could revert that. > As a bonus, if you dumb down DropAllPredicateLocksFromTableImpl() > to only transfer locks on the given heap or index, and not any > other indexes on the same table, you won't need > IfIndexGetRelation() anymore, making the issue of catalog cache > lookups moot. Which really makes it look like simplifying here to avoid the attempt to clean house is a good idea. If there's a benefit to be had from it, it should be demonstrated before attempting (in some later release), the same as any other optimization. > Seems weird to call SkipSplitTracking() for heaps. I guess it's > doing the right thing, but all the comments and the name of that > refer to indexes. Yeah, I think it's the right thing, but the macro name should probably be changed. It was originally created to do the right thing during index split operations and became useful in other cases where a transaction was doing things to predicate locks for all transactions. Most of this is simplifying, plus one search-and-replace in a single file. I'll try to post a new patch this evening. Thanks for the review. -Kevin
"Kevin Grittner" wrote: > Heikki Linnakangas wrote: > >> I think the logic is actually wrong at the moment: When you >> reindex a single index, DropAllPredicateLocksFromTableImpl() will >> transfer all locks belonging to any index on the same table, and >> any finer-granularity locks on the heap. It would be enough to >> transfer only locks on the index being reindexed, so you risk >> getting some unnecessary false positives. > > It seemed like a good idea at the time -- a relation lock on the > heap makes any other locks on the heap or any of its indexes > redundant. So it was an attempt at "cleaning house". Since we > don't do anything for an index request unless there is a lock on > that index, it couldn't cause false positives. But this probably > fits into the category of premature optimizations, since the locks > can't cause any difference in when you get a serialization failure > -- it's only a matter of "taking up space". I could revert that. On reflection, Heikki was dead-on right here; I had some fuzzy thinking going. Just because one transaction has a lock in the index doesn't mean that all transactions need lock promotion. That still leaves an opportunity for cleanup, but it's much narrower -- only locks from transactions which held locks on the reorganized index can be replaced by the heap relation lock. That one is narrow enough to be very unlikely to be worthwhile. As usual, Heikki was right on target. :-) -Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > It makes me a bit uncomfortable to do catalog cache lookups while > holding all the lwlocks. We've also already removed the reserved entry > for scratch space while we do that - if a cache lookup errors out, we'll > leave behind quite a mess. I guess it shouldn't fail, but it seems a bit > fragile. The above scares the heck out of me. If you don't believe that a catcache lookup will ever fail, I will contract to break the patch. You need to rearrange the code so that this is less fragile. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If you don't believe that a catcache lookup will ever fail, I will
> contract to break the patch.

As you probably know by now, having reached the end of the thread, this code is going away based on Heikki's arguments; but for my understanding, so that I don't make a bad assumption in this area again, what could cause the following function to throw an exception if the current process is holding an exclusive lock on the relation passed in to it? (It could be a heap or an index relation.) It seemed safe to me, and I can't spot the risk on a scan of the called functions. What am I missing?

static Oid
IfIndexGetRelation(Oid indexId)
{
    HeapTuple   tuple;
    Form_pg_index index;
    Oid         result;

    tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexId));
    if (!HeapTupleIsValid(tuple))
        return InvalidOid;

    index = (Form_pg_index) GETSTRUCT(tuple);
    Assert(index->indexrelid == indexId);
    result = index->indrelid;

    ReleaseSysCache(tuple);

    return result;
}

Thanks for any clues,

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If you don't believe that a catcache lookup will ever fail, I will >> contract to break the patch. > As you probably know by now by reaching the end of the thread, this > code is going away based on Heikki's arguments; but for my > understanding, so that I don't make a bad assumption in this area > again, what could cause the following function to throw an exception > if the current process is holding an exclusive lock on the relation > passed in to it? (I could be a heap or an index relation.) It > seemed safe to me, and I can't spot the risk on a scan of the called > functions. What am I missing? Out-of-memory. Query cancel. The attempted catalog access failing because it results in a detected deadlock. I could probably think of several more if I spent ten minutes on it; and that's not even considering genuine problem conditions such as a corrupted catalog index, which robustness demands that we not fall over completely for. You should never, ever assume that an operation as complicated as a catalog lookup can't fail. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> What am I missing? > > Out-of-memory. Query cancel. The attempted catalog access > failing because it results in a detected deadlock. I could > probably think of several more if I spent ten minutes on it; and > that's not even considering genuine problem conditions such as a > corrupted catalog index, which robustness demands that we not fall > over completely for. > > You should never, ever assume that an operation as complicated as > a catalog lookup can't fail. Got it. Thanks. -Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> We've also already removed the reserved entry for scratch space

This and Tom's concerns have me wondering if we should bracket the two sections of code where we use the reserved lock target entry with HOLD_INTERRUPTS() and RESUME_INTERRUPTS(). In an assert-enabled build we wouldn't really recover from a transaction canceled while it was checked out (although if that were the only problem, that could be fixed), but besides that, a cancellation while it's checked out could cause these otherwise-safe functions to throw exceptions due to a full hash table.

-Kevin
On 07.06.2011 20:03, Kevin Grittner wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: > >> We've also already removed the reserved entry for scratch space > > This and Tom's concerns have me wondering if we should bracket the > two sections of code where we use the reserved lock target entry > with HOLD_INTERRUPTS() and RESUME_INTERRUPTS(). That's not necessary. You're holding a lwlock, which implies that interrupts are held off already. There's a HOLD_INTERRUPTS() call in LWLockAcquire and RESUME_INTERRUPTS() in LWLockRelease. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
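For reference, a heavily condensed, hedged sketch of the shape being referred to in lwlock.c; the real functions do far more, and only the interrupt-holdoff bracketing is shown here:

#include "postgres.h"

#include "miscadmin.h"
#include "storage/lwlock.h"

/* Condensed sketch, not the real implementation. */
static void
LWLockAcquire_sketch(LWLockId lockid, LWLockMode mode)
{
    /*
     * Cancel/die interrupts are held off for as long as the lwlock is held,
     * so a query cancel can't fire while shared data structures are being
     * manipulated under the lock.
     */
    HOLD_INTERRUPTS();

    /* ... spinlock and wait-queue logic to actually acquire the lock ... */
}

static void
LWLockRelease_sketch(LWLockId lockid)
{
    /* ... release the lock and wake up any waiters ... */

    /* Now it is safe to service cancel/die interrupts again. */
    RESUME_INTERRUPTS();
}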
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > It makes me a bit uncomfortable to do catalog cache lookups while > holding all the lwlocks. I think I've caught up with the rest of the class on why this isn't sane in DropAllPredicateLocksFromTableImpl, but I wonder about CheckTableForSerializableConflictIn. We *do* expect to be throwing errors in here, and we need some way to tell whether an index is associated with a particular heap relation. Is the catalog cache the right way to check that here, or is something else more appropriate? -Kevin
On 07.06.2011 20:42, Kevin Grittner wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: > >> It makes me a bit uncomfortable to do catalog cache lookups while >> holding all the lwlocks. > > I think I've caught up with the rest of the class on why this isn't > sane in DropAllPredicateLocksFromTableImpl, but I wonder about > CheckTableForSerializableConflictIn. We *do* expect to be throwing > errors in here, and we need some way to tell whether an index is > associated with a particular heap relation. Is the catalog cache > the right way to check that here, or is something else more > appropriate? Hmm, it's not as dangerous there, as you're not in the middle of modifying stuff, but it doesn't feel right there either. Predicate locks on indexes are only needed to lock key ranges, to notice later insertions into the range, right? For locks on tuples that do exist, we have locks on the heap. If we're just about to delete every tuple in the heap, that doesn't need to conflict with any locks on indexes, because we're deleting, not inserting. So I don't think we need to care about index locks here at all, only locks on the heap. Am I missing something? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > I think I've caught up with the rest of the class on why this isn't > sane in DropAllPredicateLocksFromTableImpl, but I wonder about > CheckTableForSerializableConflictIn. We *do* expect to be throwing > errors in here, and we need some way to tell whether an index is > associated with a particular heap relation. Is the catalog cache > the right way to check that here, or is something else more > appropriate? Just to answer the question (independently of Heikki's concern about whether this is needed at all): it depends on the information you have. If all you have is the index OID, then yeah a catcache lookup in pg_index is probably the best thing. If you have an open Relation for the index, you could instead look into its cached copy of its pg_index row. If you have an open Relation for the table, I'd think that looking for a match in RelationGetIndexList() would be the cheapest, since more than likely that information is already cached. regards, tom lane
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Predicate locks on indexes are only needed to lock key ranges, to > notice later insertions into the range, right? For locks on tuples > that do exist, we have locks on the heap. If we're just about to > delete every tuple in the heap, that doesn't need to conflict with > any locks on indexes, because we're deleting, not inserting. So I > don't think we need to care about index locks here at all, only > locks on the heap. Am I missing something? You're right again. My brain must be turning to mush. This function can also become simpler, and there is now no reason at all to add catalog cache lookups to predicate.c. I think that leaves me with all the answers I need to get a new patch out this evening (U.S. Central Time). Thanks, -Kevin
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Just to answer the question (independently of Heikki's concern > about whether this is needed at all): it depends on the > information you have. If all you have is the index OID, then yeah > a catcache lookup in pg_index is probably the best thing. If you > have an open Relation for the index, you could instead look into > its cached copy of its pg_index row. If you have an open Relation > for the table, I'd think that looking for a match in > RelationGetIndexList() would be the cheapest, since more than > likely that information is already cached. Thanks, I wasn't aware of RelationGetIndexList(). That's a good one to remember. The issue here was: given a particular heap relation, going through a list of locks looking for matches, where the lock targets use OIDs. So if I really *did* need to check whether an index was related to that heap relation, it sounds like the catcache approach would have been right. But as Heikki points out, the predicate locks on the indexes only need to guard scanned *gaps* against *insert* -- so a DROP TABLE or TRUNCATE TABLE can just skip looking at any locks which aren't against the heap relation. I think maybe I need a vacation -- you know, one where I'm not using my vacation time to attend a PostgreSQL conference. Thanks for the overview of what to use when; it takes a while, but I think I'm gradually coming up to speed on the million lines of code which comprise PostgreSQL. Tips like that do help. -Kevin
On 07.06.2011 21:10, Kevin Grittner wrote: > I think that leaves me > with all the answers I need to get a new patch out this evening > (U.S. Central Time). Great, I'll review it in my morning (in about 12h) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 07.06.2011 21:10, Kevin Grittner wrote: >> I think that leaves me with all the answers I need to get a new >> patch out this evening (U.S. Central Time). > > Great, I'll review it in my morning (in about 12h) Attached. Passes all the usual regression tests I run, plus light ad hoc testing. All is working fine as far as this patch itself goes, although more testing is needed to really call it sound. If anyone is interested in the differential from version 3 of the patch, it is the result of these two commits to my local repo: http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=018b0fcbeba05317ba7066e552efe9a04e6890d9 http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=fc651e2721a601ea806cf6e5d53d0676dfd26dca During testing I found two annoying things not caused by this patch which should probably be addressed in 9.1 if feasible, although I don't think they rise to the level of blockers. More on those in separate threads. -Kevin
Attachment
On 08.06.2011 03:16, Kevin Grittner wrote:
> + /*
> + * It's OK to remove the old lock first because of the ACCESS
> + * EXCLUSIVE lock on the heap relation when this is called. It is
> + * desirable to do so because it avoids any chance of running out
> + * of lock structure entries for the table.
> + */

That assumption is wrong, for REINDEX. REINDEX INDEX holds an AccessExclusive on the index, but only a ShareLock on the heap.

The code is still OK, though. As long as you hold an AccessExclusiveLock on the index, no-one will care about predicate locks on the index, so we can remove them before acquiring the lock on the heap. And we hold lwlocks on all the predicate lock partitions, so all the operations will appear atomic to any outside observer anyway.

Committed after adjusting that comment. I did a lot of other cosmetic changes too, please double-check that I didn't screw up anything. I also made rewriting ALTER TABLE behave like CLUSTER, by adding a call to TransferPredicateLocksToHeapRelation() there.

I just looked back at your old email where you listed the different DDL operations, and notice that we missed VACUUM FULL as well (http://archives.postgresql.org/message-id/4DBD7E91020000250003D0D6@gw.wicourts.gov). I'll look into that. ALTER INDEX was also still an open question in that email.

BTW, truncating the tail of a heap in vacuum probably also should drop any locks on the truncated part of the heap. Or is it not possible for any such locks to exist, because none of the tuples are visible to anyone anymore? A comment in lazy_truncate_heap() might be in order.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 08.06.2011 14:18, Heikki Linnakangas wrote: > I just looked back > at your old email where you listed the different DDL operations, and > notice that we missed VACUUM FULL as well > (http://archives.postgresql.org/message-id/4DBD7E91020000250003D0D6@gw.wicourts.gov). > I'll look into that. Never mind that, VACUUM FULL uses cluster_rel(), so that's covered already. > ALTER INDEX was also still an open question in that email. None of the variants of ALTER INDEX do anything that affects predicate locks, so that's OK too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
(sorry for repeatedly replying to self. I'll go for a coffee after this...) On 08.06.2011 14:18, Heikki Linnakangas wrote: > Committed after adjusting that comment. I did a lot of other cosmetic > changes too, please double-check that I didn't screw up anything. Also, it would be nice to have some regression tests for this. I don't think any of the existing tests exercise these new functions (except for the fast-exit, which isn't very interesting). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > (sorry for repeatedly replying to self. I'll go for a coffee after > this...) That's so nice of you to try to make me feel better for the serious brain fade I suffered yesterday. ;-) > On 08.06.2011 14:18, Heikki Linnakangas wrote: >> Committed after adjusting that comment. I did a lot of other >> cosmetic changes too, please double-check that I didn't screw up >> anything. On a first read-through, all your changes look like improvements to me. I'll do some testing and give it a closer read. > Also, it would be nice to have some regression tests for this. I > don't think any of the existing tests exercise these new functions > (except for the fast-exit, which isn't very interesting). I'll see about posting regression tests for this, as well as looking into the questions in your earlier posts -- particularly about the heap truncation in vacuum. I'm pretty sure that vacuum can't clean up anything where we're still holding predicate locks because of visibility rules, but I'll take another look to be sure. I hope to do all that today, but I don't want to push to the point where I'm making dumb mistakes again. -Kevin