Thread: Promise index tuples for UPSERT
Summary of algorithm to use "promise tuples" for concurrency control during UPSERT

1. Perform btree search to location of key, if it exists.

a) If an unkilled index tuple exists, we decide this is an UPDATE and drop straight thru to step 2

b) If it does not exist, insert a "promise" tuple into unique index(es) - marked with the xid of the inserting transaction, but using the key. This happens while the page is locked, so it is not possible to insert a second promise tuple concurrently. Record the btree blockid on the index scan and move to step 3.

When later insert scans see the promise tuple they perform XactLockTableWait() and when they get control they look again for the key. If they find a promise tuple with an aborted xid they replace that value with their own xid and continue as a). Otherwise b).

2. Find existing heap tuple

Find heap tuple. Check it is actually valid. If not, go back to (1), kill the prior tuple and follow the 1b) path. If it is valid, perform heap_update as normal.

3. Insert new heap tuple

Perform heap_insert. Re-find the index tuple using the btree blockid recorded at step 1; this may require moving right until we find the actual value we are looking for, so block splits don't negatively affect this approach. Once re-found, we change the index tuple from a promise tuple to a normal index tuple, by setting the tid and removing the promise flag. The tuple remains the same length because the value was known when the promise tuple was inserted, so this is an in-place update. Insert other index values normally.

If a transaction that inserted a promise tuple dies, how is that cleaned up? Any user that sees a dead promise tuple with a value they want will clean it up. Dead promise tuples can be removed as needed, just as killed tuples currently are. VACUUM can remove dead transactions from the index as it scans, no problems.

Index bloat is less of a problem than with normal inserts since there are additional ways of removing promise tuples. Only one index tuple at a time can have a specific value, so we would actually reduce heap bloat caused by concurrent inserts.

It's possible we find existing rows that we can't see according to our snapshot. That is handled in exactly the same way as we treat UPDATEs.

There is a potential loop here in that we might find an index tuple, follow it, find out the tuple is actually dead, then return to insert a promise tuple, only to find that someone else just did that and we have to wait/re-follow the link to update the new tuple. That is an extremely unlikely race condition, though possible; there is no heavy locking to prevent that in this approach. In principle deadlocks are possible from that, but that is not any different from the current case where people that insert the same key at the same time might cause deadlocks. It's not a common application pattern and not something we should be protecting against.

All of this is only needed for unique indexes.

It can be handled by a new API call for acquire_value_lock() and release_value_lock(), or similar.
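As a rough sketch of the intended behaviour, the kind of two-session interleaving the steps above are meant to handle looks like this (the table and the ON CONFLICT UPDATE spelling are purely illustrative; the syntax is not final):

  CREATE TABLE accounts (id integer PRIMARY KEY, balance numeric);

  -- session 1:
  BEGIN;
  INSERT INTO accounts VALUES (1, 100.00) ON CONFLICT UPDATE SET balance = 100.00;
  -- no index tuple for id = 1 exists, so a promise tuple carrying
  -- session 1's xid is inserted (step 1b), then the heap tuple (step 3)

  -- session 2, concurrently:
  INSERT INTO accounts VALUES (1, 200.00) ON CONFLICT UPDATE SET balance = 200.00;
  -- sees the promise tuple and waits on session 1's xid

  -- session 1:
  COMMIT;

  -- session 2 wakes up, searches for the key again, now finds a live
  -- index tuple (step 1a) and performs an UPDATE rather than an INSERT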
Advantages
* We don't do anything weird in the heap - if this breaks, data is not corrupt
* There is no heap bloat or index bloat above existing levels
* The approach leverages existing mechanism for transaction waiting
* Optimistic approach to value locking will improve performance over heavy weight locking

Disadvantages
* Not written yet - <1 month to code, still possible for 9.5, doesn't look hard

Other stated possible disadvantages

* Violates existing invariant that every index tuple has a corresponding heap tuple, which goes back to the Berkeley days. Currently, we always create heap tuples first, and physically delete them last. [Why is that a problem? Could be, but why?] ("Deleting them last" implies something is being touched in that regard by this suggestion. I'm not sure what you are referring to.)

* Relatedly, concern about breaking VACUUM. VACUUMing of btrees occurs with a list of TIDs to kill in index, built from the heap. Current bulk delete callback cannot be used for this. Super-exclusive lock is needed to delete tuples in btree page (i.e. VACUUM). Normally skipping of LockBufferForCleanup() (as added by bbb6e559) works okay in heap because it tends to imply that list of tids won't be bulk deleted in index, where we currently cannot skip for failure to quickly acquire super exclusive lock. So this could make the problem addressed by bbb6e559 return to some extent. [Don't see any problems; just test the xid as we scan a promise tuple and remove it if needed.]

* "Index-only" bloat becomes a possibility. Implications are not well understood. [FUD, no reason to believe there is a problem.]

* We have to re-find any items using an ordinary index scan, not a tid. Must match our xid to that. [Explained above, easy and efficient.]

* Doesn't have a strategy for dealing with unprincipled deadlocks, at least at the moment. Presumably some aspects of #2 could be adopted here. [FUD. Unprincipled deadlock still not properly explained as yet. No way of telling whether it will happen with this approach, or not.]

Comments please.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10/01/2014 02:34 PM, Simon Riggs wrote:
> Summary of algorithm to use "promise tuples" for concurrency control
> during UPSERT
>
> 1. Perform btree search to location of key, if it exists.
> a) If an unkilled index tuple exists, we decide this is an UPDATE and
> drop straight thru to step 2
> b) If it does not exist, insert a "promise" tuple into unique index(es)
> - marked with the xid of the inserting transaction, but using the key.
> This happens while the page is locked, so it is not possible to insert
> a second promise tuple concurrently.
> Record the btree blockid on the index scan and move to step 3
> When later insert scans see the promise tuple they perform
> XactLockTableWait() and when they get control they look again for the
> key. If they find a promise tuple with an aborted xid they replace
> that value with their own xid and continue as a). Otherwise b).

XactLockTableWait() waits until the end of transaction, and that's not what you want here. If the backend that inserted the promise tuple decides to not proceed with the insertion, and removes the promise tuple, the backend waiting on it needs to be woken up more or less immediately, not when the transaction completes.

I had this exact same issue in my earlier patch versions, fixed it with a new kind of heavy-weight lock, and a new field in PGPROC (http://www.postgresql.org/message-id/52D00D2D.6030307@vmware.com). That wasn't very pretty, but it got the job done. Some other design might be better.

- Heikki
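For context, waiting until end of transaction is what an ordinary duplicate-key insert already does today (illustrative table):

  CREATE TABLE t (k integer PRIMARY KEY);

  -- session 1:
  BEGIN;
  INSERT INTO t VALUES (1);

  -- session 2, concurrently:
  INSERT INTO t VALUES (1);
  -- blocks, waiting on session 1's xid

  -- session 1:
  COMMIT;

  -- session 2 only now gets a unique violation; had session 1 aborted
  -- instead, its insert would have succeeded. Either way the wait lasts
  -- until session 1's transaction ends.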
On Wed, Oct 1, 2014 at 4:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Summary of algorithm to use "promise tuples" for concurrency control
> during UPSERT

> Index bloat is less of a problem than with normal inserts since there
> are additional ways of removing promise tuples. Only one index tuple
> at a time can have a specific value, so we would actually reduce heap
> bloat caused by concurrent inserts.

I am not all that concerned about bloat. I didn't think it was a major advantage of #1 relative to #2 when I investigated the differences, and looked at the numbers. And I'm speaking as the advocate of the design with zero bloat.

> It's possible we find existing rows that we can't see according to our
> snapshot. That is handled in exactly the same way as we treat
> UPDATEs.

This isn't a special case in the patch. It's more like REPEATABLE READ and SERIALIZABLE have a special responsibility to make sure they're not updating an invisible-to-MVCC-snapshot row (not "instantaneously invisible", by which I mean invisible in the HeapTupleSatisfiesUpdate() sense). Otherwise, we don't care if it's MVCC-visible or not.

I imagine that by "That is handled in exactly the same way as we treat UPDATEs", you mean that we do the EvalPlanQual() stuff. We only do that in the event of a concurrent UPDATE/DELETE, though. Otherwise, we trust the underlying relation scan to accurately represent that tuples pulled up are visible. So I don't understand the comparison (but tell me if I've misunderstood).

> There is a potential loop here in that we might find an index tuple,
> follow it, find out the tuple is actually dead then return to insert a
> promise tuple, only to find that someone else just did that and we
> have to wait/re-follow the link to update the new tuple. That is an
> extremely unlikely race condition, though possible; there is no heavy
> locking to prevent that in this approach. In principle deadlocks are
> possible from that, but that is not any different from the current
> case where people that insert the same key at the same time might cause
> deadlocks. It's not a common application pattern and not something we
> should be protecting against.

People are going to use this feature in cases where they could almost use an UPDATE. It will happen if you let it happen. But even if you don't, a guarantee is infinitely more useful than a non-guarantee to app developers.

I'll be up-front about this: you have very little chance of getting me to budge on this point. I *really* hate the idea of allowing this kind of thing. I don't think I'm alone here.

> All of this is only needed for unique indexes.
>
> It can be handled by a new API call for acquire_value_lock() and
> release_value_lock(), or similar.
>
> Advantages
> * We don't do anything weird in the heap - if this breaks, data is not corrupt

Index corruption leads to wrong answers. I don't think this is a very good argument, fwiw.

> * There is no heap bloat or index bloat above existing levels

Fair enough.

> * The approach leverages existing mechanism for transaction waiting

That's not really an advantage. That more or less applies to all designs.

> * Optimistic approach to value locking will improve performance over
> heavy weight locking

There is no evidence that promise index tuples will perform better. #2 didn't perform better than #1.
> Disadvantages
> * Not written yet - <1 month to code, still possible for 9.5, doesn't look hard

Maybe, but that is beside the point, which is: If there were any fundamental problems with either #1 or #2, I think I'd have figured them out by now. They are less risky today - it might be that #3 turns out to have the same properties, but I cannot be sure. That counts for something. I feel perfectly entitled to hold that kind of uncertainty against any design, tbh.

> Other stated possible disadvantages
> * Violates existing invariant that every index tuple has a
> corresponding heap tuple, which goes back to the Berkeley days.
> Currently, we always create heap tuples first, and physically delete
> them last. [Why is that a problem? Could be, but why?]

Unknown unknowns. Huge amounts of code must be audited to figure out that it isn't an issue. So I don't know, but then I'm not the one making the proposal.

> ("Deleting them last" implies something is being touched in that
> regard by this suggestion. I'm not sure what you are referring to)

The uncertain scope of the problem is a big part of the problem.

> * Relatedly, concern about breaking VACUUM. VACUUMing of btrees occurs
> with a list of TIDs to kill in index, built from the heap. Current
> bulk delete callback cannot be used for this. Super-exclusive lock is
> needed to delete tuples in btree page (i.e. VACUUM). Normally skipping
> of LockBufferForCleanup() (as added by bbb6e559) works okay in heap
> because it tends to imply that list of tids won't be bulk deleted in
> index, where we currently cannot skip for failure to quickly acquire
> super exclusive lock. So this could make the problem addressed by
> bbb6e559 return to some extent.
> [Don't see any problems; just test the xid as we scan a promise tuple
> and remove it if needed]
>
> "Index-only" bloat becomes a possibility. Implications are not well understood.
> [FUD, no reason to believe there is a problem.]

"no reason to believe there is a problem" isn't that good a defense. I know this from experience.

> Doesn't have a strategy for dealing with unprincipled deadlocks, at
> least at the moment. Presumably some aspects of #2 could be adopted
> here.
> [FUD. Unprincipled deadlock still not properly explained as yet. No
> way of telling whether it will happen with this approach, or not].

What's unclear about unprincipled deadlocks? Heikki went to considerable effort to have his design meet my standard of not doing that. I think that there is almost no controversy about whether or not we need to avoid that at this stage.

If you want a more practical definition, I think it's very unlikely that I'll accuse any implementation of exhibiting this problem once it doesn't deadlock with the test-case I presented Heikki with last year [1].

[1] http://www.postgresql.org/message-id/CAM3SWZShbE29KpoD44cVc3vpZJGmDer6k_6FGHiSzeOZGmTFSQ@mail.gmail.com

--
Peter Geoghegan
On Wed, Oct 1, 2014 at 12:54 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > XactLockTableWait() waits until the end of transaction, that's not you want > here. If the backend that inserted the promise tuple decides to not proceed > with the insertion, and removes the promise tuple, the backend waiting on it > needs to be woken up more or less immediately, not when the transaction > completes. Simon has not been inconsistent here: he has said that deadlocks may be possible. I happen to think that allowing them would be a major mistake on our part, but that's another story. -- Peter Geoghegan
On Wed, Oct 1, 2014 at 12:59 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 1, 2014 at 12:54 PM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> XactLockTableWait() waits until the end of transaction, that's not you want >> here. If the backend that inserted the promise tuple decides to not proceed >> with the insertion, and removes the promise tuple, the backend waiting on it >> needs to be woken up more or less immediately, not when the transaction >> completes. > > Simon has not been inconsistent here: he has said that deadlocks may > be possible. I happen to think that allowing them would be a major > mistake on our part, but that's another story. Don't forget that not waiting on XactLockTableWait(), but rather waiting on a "speculative insertion token" wasn't just the thing that made your prototype not deadlock - it was also the thing that made its performance more or less comparable to that of my original value locking design. Your prototype performed only a fraction as well as my design before that last revision. So that's two *excellent* reasons to not use XactLockTableWait() here. -- Peter Geoghegan
On 1 October 2014 20:54, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 10/01/2014 02:34 PM, Simon Riggs wrote:
>> ...
>> When later insert scans see the promise tuple they perform
>> XactLockTableWait() and when they get control they look again for the
>> key. If they find a promise tuple with an aborted xid they replace
>> that value with their own xid and continue as a). Otherwise b).
>
> XactLockTableWait() waits until the end of transaction, and that's not
> what you want here. If the backend that inserted the promise tuple
> decides to not proceed with the insertion, and removes the promise
> tuple, the backend waiting on it needs to be woken up more or less
> immediately, not when the transaction completes.

There is no "remove promise tuple" part of the above design.

> I had this exact same issue in my earlier patch versions, fixed it with a
> new kind of heavy-weight lock, and a new field in PGPROC
> (http://www.postgresql.org/message-id/52D00D2D.6030307@vmware.com). That
> wasn't very pretty, but it got the job done. Some other design might be
> better.

Currently, if some fool developer decides to insert rows that already duplicate values in the table, he inserts a heap row, then sees a potential conflict in the index and waits for the conflicting xact to end. His fool insert, his wait. That's how btree does this now. I don't see any reason why we need to do better for UPSERT.

If the row already exists we need to be able to quickly change into an update, but we still only support one write xact at a time. The value in the index needs to be protected by a block-level lock, so we can check it quickly, but the eventual heap work is serialized by transactional semantics.

I think a little perspective is due here and we should stick to the main use case, not cater for bizarre edge cases. What we are looking for is something that can decide whether it is an insert or an update and progress quickly with that. It needs to be correct, but there is no requirement to be faster than currently possible, in the case of concurrency.

Any form of tuple locking that uses the general lock manager will not be usable. I can't see it is worth the overhead of doing that to protect against deadlocks that would only be experienced by people doing foolish things.

Please let's not over-think and over-engineer this. No flying cars please.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
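For reference, this is the kind of deadlock that plain inserts can already produce today, with no UPSERT involved, when two sessions insert the same keys in opposite orders (illustrative table and values):

  CREATE TABLE t (k integer PRIMARY KEY);

  -- session 1:
  BEGIN;
  INSERT INTO t VALUES (1);

  -- session 2:
  BEGIN;
  INSERT INTO t VALUES (2);

  -- session 1:
  INSERT INTO t VALUES (2);
  -- blocks, waiting for session 2's transaction to end

  -- session 2:
  INSERT INTO t VALUES (1);
  -- blocks, waiting for session 1; the deadlock detector
  -- aborts one of the two transactions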
On 10/03/2014 11:07 AM, Simon Riggs wrote: > On 1 October 2014 20:54, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >> On 10/01/2014 02:34 PM, Simon Riggs wrote: >>> > ... >>> When later insert scans see the promise tuple they perform >>> XactLockTableWait() and when they get control they look again for the >>> key. If they find a promise tuple with an aborted xid they replace >>> that value with their own xid and continue as a). Otherwise b). >> >> >> XactLockTableWait() waits until the end of transaction, that's not you want >> here. If the backend that inserted the promise tuple decides to not proceed >> with the insertion, and removes the promise tuple, the backend waiting on it >> needs to be woken up more or less immediately, not when the transaction >> completes. > > There is no "remove promise tuple" part of the above design. > >> I had this exact same issue in my earlier patch versions, fixed it with a >> new kind of heavy-weight lock, and a new field in PGPROC >> (http://www.postgresql.org/message-id/52D00D2D.6030307@vmware.com). That >> wasn't very pretty, but it got the job done. Some other design might be >> better. > > Currently, if some fool developer decides to insert rows that already > duplicate values in the table, then currently he inserts a heap row, > then sees a potential conflict in the index and waits for the > conflicting xact to end. His fool insert, his wait. That's how btree > does this now. > I don't see any reason why we need to do better for Upsert. > > If the row already exists we need to be able to quickly change into an > update, but we still only support one write xact at a time. That lowers the bar from what I thought everyone agreed on. Namely, if two backends run a similar UPSERT command concurrently on a table that has more than one unique constraint, they might deadlock, causing one of them to throw an error instead of INSERTing or UPDATEing anything. I'm sure that's useful enough in many applications, but I'd like to have a more robust implementation. The shorter we can keep the list of caveats, the better. > The value in the index needs to be protected by a block level lock, so > we can check it quickly, but the eventual heap work is serialized by > transactional semantics. > > I think a little perspective is due here and we should stick to the > main use case, not cater for bizarre edge cases. I'm trying to bisect your thoughts on exactly what use cases you think we must support, and which ones you consider bizarre edge cases, and what exactly is acceptable behavior in those edge cases. > What we are looking for is something that can decide whether it is an > insert or an update and progress quickly with that. It needs to be > correct, but there is no requirement to be faster than currently > possible, in the case of concurrency. I believe we all agree on that. > Any form of tuple locking that uses the general lock manager will not > be usable. I can't see it is worth the overhead of doing that to > protect against deadlocks that would only be experienced by people > doing foolish things. Maybe, maybe not, but let's define the acceptable behavior first, and think about the implementation second. I'm pretty sure all of the approaches discussed so far can be made fast enough, and the bloat issues can be made small enough, that it doesn't matter much which one we choose from a performance point of view. The differences are in what use cases they can support, and the maintainability of the code. - Heikki
On Fri, Oct 3, 2014 at 2:03 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > That lowers the bar from what I thought everyone agreed on. Namely, if two > backends run a similar UPSERT command concurrently on a table that has more > than one unique constraint, they might deadlock, causing one of them to > throw an error instead of INSERTing or UPDATEing anything. It lowers the bar to a level that I am not willing to limbo dance under. You don't even need two unique constraints. Nothing as "complicated" as that is required. When this happens with MySQL, they have the good sense to call it a bug [1], and even fix it. I find the comparison with conventional insertion entirely unconvincing. > I'm sure that's useful enough in many applications, but I'd like to have a > more robust implementation. The shorter we can keep the list of caveats, the > better. INSERT and UPDATE are supposed to be fairly well balanced here. Conflicts are the norm. >> The value in the index needs to be protected by a block level lock, so >> we can check it quickly, but the eventual heap work is serialized by >> transactional semantics. >> >> I think a little perspective is due here and we should stick to the >> main use case, not cater for bizarre edge cases. > > > I'm trying to bisect your thoughts on exactly what use cases you think we > must support, and which ones you consider bizarre edge cases, and what > exactly is acceptable behavior in those edge cases. "Lots of concurrency" is not an edge-case. >> Any form of tuple locking that uses the general lock manager will not >> be usable. I can't see it is worth the overhead of doing that to >> protect against deadlocks that would only be experienced by people >> doing foolish things. > > Maybe, maybe not, but let's define the acceptable behavior first, and think > about the implementation second. +1. Updating a lot with UPSERT is not foolish. That's all it took to make earlier prototypes deadlock. > I'm pretty sure all of the approaches > discussed so far can be made fast enough, and the bloat issues can be made > small enough, that it doesn't matter much which one we choose from a > performance point of view. The differences are in what use cases they can > support, and the maintainability of the code. +1 What do we get for giving up on not having unprincipled deadlocks here? What's the advantage? Assuming that this is a bizarre edge-case (note: it isn't), what do we get in return for giving up on fixing it? [1] MySQL bug #52020 -- Peter Geoghegan
On 3 October 2014 10:03, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> That lowers the bar from what I thought everyone agreed on. Namely, if two
> backends run a similar UPSERT command concurrently on a table that has more
> than one unique constraint, they might deadlock, causing one of them to
> throw an error instead of INSERTing or UPDATEing anything.

Now we get to a productive discussion, this is good.

When we first make requirements, obviously everyone agrees on a long list of things, since there is initially not much reason to say no to any of it. As we go towards implementation we begin to understand the true price of meeting each requirement. It was good that this detail was raised and sensible to attempt to avoid unprincipled deadlocks. If the price of avoiding them is high, it is worth reconsidering how important that is.

My view is that I can't see the above use case happening in real situations, except by infrequent mistake. In most cases, unique indexes represent some form of object identity and those don't change frequently in the real world. So I would be surprised to see two unique fields changed at the same time without it representing some form of business process error that people would like to see fail anyway. If someone has an example of that in a real, common case then I would like to see it and I would revise my view accordingly.

We are frequently hampered by trying to design something that can sing and dance at the same time. That thought is exactly how we are looking at upsert now, not merge. So trimming our objectives to what makes sense is an accepted part of this project already.

>> Any form of tuple locking that uses the general lock manager will not
>> be usable. I can't see it is worth the overhead of doing that to
>> protect against deadlocks that would only be experienced by people
>> doing foolish things.
>
> Maybe, maybe not, but let's define the acceptable behavior first, and think
> about the implementation second.

Hand in hand, I think, given the other constraints of time, review, maintainability etc.

> I'm pretty sure all of the approaches
> discussed so far can be made fast enough, and the bloat issues can be made
> small enough, that it doesn't matter much which one we choose from a
> performance point of view. The differences are in what use cases they can
> support, and the maintainability of the code.

The discussion of approaches has up to now focused only on what impossibilities exist, with a "we must do this because feature A can't do aspect X". I haven't yet seen much discussion of maintainability of code, but I would guess simpler is better, overall.

Realistically, I won't be coding any separate approaches, so this is down to Peter and maybe yourself, Heikki. I hope only to avoid foreclosing viable and simple approaches for the wrong reasons. There are many other considerations that make up the final view.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Oct 3, 2014 at 2:50 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > My view is that I can't see the above use case from happening in real > situations, except by infrequent mistake. In most cases, unique > indexes represent some form of object identity and those don't change > frequently in the real world. So to be changing two unique fields at > the same time and it not representing some form of business process > error that people would like to see fail anyway, I'd be surprised by. Are we talking about two different things here? Unprincipled deadlocks can be seen without updating any constrained column in the UPSERT. The test-case that originally highlighted the issue only had one unique index, and it was *not* in the update's targetlist. See: https://wiki.postgresql.org/wiki/Value_locking#.22Unprincipled_Deadlocking.22_and_value_locking -- Peter Geoghegan
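The test case referred to has roughly this shape (the schema and statement here are only a sketch based on its description, not the actual attached script): a single unique index, many concurrent sessions upserting overlapping key values, and an UPDATE part that never touches the constrained column.

  CREATE TABLE upsert_race (merge_key integer PRIMARY KEY, datum text);

  -- run concurrently from many sessions with overlapping merge_key values
  -- (for example via pgbench, :key being a pgbench variable); note that
  -- the UPDATE part never modifies merge_key:
  INSERT INTO upsert_race VALUES (:key, 'foo')
    ON CONFLICT UPDATE SET datum = 'bar';

With value-locking schemes that hold locks across heap insertion, interleavings of value locks and row locks in a workload like this could deadlock even though no constrained column is ever updated; that is the behaviour being called "unprincipled" here.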
On 3 October 2014 10:32, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Oct 3, 2014 at 2:03 AM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> That lowers the bar from what I thought everyone agreed on. Namely, if two >> backends run a similar UPSERT command concurrently on a table that has more >> than one unique constraint, they might deadlock, causing one of them to >> throw an error instead of INSERTing or UPDATEing anything. > > It lowers the bar to a level that I am not willing to limbo dance > under. You don't even need two unique constraints. Nothing as > "complicated" as that is required. > > When this happens with MySQL, they have the good sense to call it a > bug [1], and even fix it. I find the comparison with conventional > insertion entirely unconvincing. Is there a test case that demonstrates the problem? -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Oct 3, 2014 at 3:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Is there a test case that demonstrates the problem? Yes. See my e-mail to Heikki here: http://www.postgresql.org/message-id/CAM3SWZShbE29KpoD44cVc3vpZJGmDer6k_6FGHiSzeOZGmTFSQ@mail.gmail.com Testcase is attached. -- Peter Geoghegan
On 3 October 2014 10:57, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Oct 3, 2014 at 2:50 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> My view is that I can't see the above use case from happening in real >> situations, except by infrequent mistake. In most cases, unique >> indexes represent some form of object identity and those don't change >> frequently in the real world. So to be changing two unique fields at >> the same time and it not representing some form of business process >> error that people would like to see fail anyway, I'd be surprised by. > > Are we talking about two different things here? > > Unprincipled deadlocks can be seen without updating any constrained > column in the UPSERT. The test-case that originally highlighted the > issue only had one unique index, and it was *not* in the update's > targetlist. See: > > https://wiki.postgresql.org/wiki/Value_locking#.22Unprincipled_Deadlocking.22_and_value_locking I followed that to a wiki page, then clicked again to an old email. No test case. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 10/03/2014 01:05 PM, Peter Geoghegan wrote: > On Fri, Oct 3, 2014 at 3:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Is there a test case that demonstrates the problem? > > Yes. See my e-mail to Heikki here: > > http://www.postgresql.org/message-id/CAM3SWZShbE29KpoD44cVc3vpZJGmDer6k_6FGHiSzeOZGmTFSQ@mail.gmail.com > > Testcase is attached. Simon's approach would actually pass that test case just fine. It inserts the (promise) index tuple first, and heap tuple only after that. It will fail the test case with more than one unique index, however. - Heikki
On Fri, Oct 3, 2014 at 3:54 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > Simon's approach would actually pass that test case just fine. It inserts > the (promise) index tuple first, and heap tuple only after that. It will > fail the test case with more than one unique index, however. Oh, I see. Still, I don't think you need to UPDATE a uniquely-constrained attribute - even if updating constrained attributes is rare (dubious), non-HOT updates will have the same effect, no? I still think that's unacceptable. In any case, I still don't see what this buys us over the other two designs. What's the pay-off for giving up on the general avoidance of unprincipled deadlocks? -- Peter Geoghegan
On 3 October 2014 11:54, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> Simon's approach would actually pass that test case just fine. It inserts
> the (promise) index tuple first, and heap tuple only after that. It will
> fail the test case with more than one unique index, however.

Please explain what you mean by "fail" here?

My understanding of what you're saying is that if

* we have a table with >1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets

then we get occasional deadlocks, just as we would do using current UPDATE/INSERT.

Is there a business use case that requires that? (Or exactly what you meant, if that isn't it?)

My view is if we are going to base the whole design on this point, then we need to have it very clearly accessible for all to understand.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10/06/2014 03:05 PM, Simon Riggs wrote:
> On 3 October 2014 11:54, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>
>> Simon's approach would actually pass that test case just fine. It inserts
>> the (promise) index tuple first, and heap tuple only after that. It will
>> fail the test case with more than one unique index, however.
>
> Please explain what you mean by "fail" here?

I meant that the test case will sometimes deadlock, and some transactions will therefore be rolled back.

> My understanding of what you're saying is that if
>
> * we have a table with >1 unique index
> * and we update the values of the uniquely indexed columns (e.g. PK update)
> * on both of the uniquely indexed column sets
> then we get occasional deadlocks, just as we would do using current
> UPDATE/INSERT.

Right. To be precise: you don't need to update both of the columns in the same transaction, it's enough that some of the concurrent transactions update one column, while other transactions update the other column.

> Is there a business use case that requires that?

I don't know. Conceivably any use case where you have two unique constraints to begin with.

- Heikki
On 10/06/2014 03:21 PM, Heikki Linnakangas wrote: > On 10/06/2014 03:05 PM, Simon Riggs wrote: >> My understanding of what you're saying is that if >> >> * we have a table with >1 unique index >> * and we update the values of the uniquely index columns (e.g. PK update) >> * on both of the uniquely indexed column sets >> then we get occaisonal deadlocks, just as we would do using current >> UPDATE/INSERT. > > Right. To be precise: you don't need to update both of the columns in > the same transaction, it's enough that some of the concurrent > transactions update one column, while other transactions update the > other column. Ok, that didn't make much sense. With UPSERT, you have to specify values for both columns. But it's sufficient that you have a mix of transactions where only some are UPSERTs, and others are regular UPDATEs on one of the columns. - Heikki
On 6 October 2014 13:21, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> My understanding of what you're saying is that if
>>
>> * we have a table with >1 unique index
>> * and we update the values of the uniquely indexed columns (e.g. PK update)
>> * on both of the uniquely indexed column sets
>> then we get occasional deadlocks, just as we would do using current
>> UPDATE/INSERT.
>
> Right. To be precise: you don't need to update both of the columns in the
> same transaction, it's enough that some of the concurrent transactions
> update one column, while other transactions update the other column.

CREATE TABLE foo
(id1 integer not null primary key
,id2 integer not null unique
,val integer);

Given the table above, which one do we mean?

1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y; and UPDATE foo SET id1 = Y WHERE id2 = X; we can deadlock
2. When we mix UPDATE foo SET val = Z WHERE id1 = Y; and UPDATE foo SET val = W WHERE id2 = X; we can deadlock

(2) is a common use case, (1) is a very rare use case and most likely a poor design.

If the user wishes to protect against such deadlocks they retain the option to use row locking. Yes?

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10/06/2014 04:44 PM, Simon Riggs wrote: > On 6 October 2014 13:21, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > >>> My understanding of what you're saying is that if >>> >>> * we have a table with >1 unique index >>> * and we update the values of the uniquely index columns (e.g. PK update) >>> * on both of the uniquely indexed column sets >>> then we get occaisonal deadlocks, just as we would do using current >>> UPDATE/INSERT. >> >> >> Right. To be precise: you don't need to update both of the columns in the >> same transaction, it's enough that some of the concurrent transactions >> update one column, while other transactions update the other column. > > CREATE TABLE foo > (id1 integer not null primary key > ,id2 integer not null unique > ,val integer); > > Given the table above, which one do we mean? > > 1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y; and UPDATE foo > SET id1 = Y WHERE id2 = X; we can deadlock > 2. When we mix UPDATE foo SET val = Z WHERE id1 = Y; and UPDATE foo > SET val = W WHERE id2 = X; we can deadlock > > (2) is a common use case, (1) is a very rare use case and most likely > a poor design Well, at least one of the statements has to be an UPSERT, and at least one of them has to update a column with a unique constraint on it. This pair of transactions could deadlock, for example: Transaction 1: INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE; Transaction 2: UPDATE foo SET id2 = X WHERE id1 = Y; That's made-up syntax, but the idea is that the first transaction attempts to insert a row with values id1=Y, id2=X, val=Z. If that fails because of a row with id1=Y or id2=X already exists, then it's supposed to do nothing. > If the user wishes to protect against such deadlocks they retain the > option to use row locking. Yes? Sorry, I didn't understand that. Row locking? In general, this is of course a lot easier to implement if we restrict it so that it only works in some limited cases. That may be fine, but then we have to be able to document clearly what the limitations are, and throw an error if you violate those limitations. - Heikki
On 6 October 2014 15:04, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 10/06/2014 04:44 PM, Simon Riggs wrote:
>> On 6 October 2014 13:21, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>>
>>>> My understanding of what you're saying is that if
>>>>
>>>> * we have a table with >1 unique index
>>>> * and we update the values of the uniquely indexed columns (e.g. PK update)
>>>> * on both of the uniquely indexed column sets
>>>> then we get occasional deadlocks, just as we would do using current
>>>> UPDATE/INSERT.
>>>
>>> Right. To be precise: you don't need to update both of the columns in the
>>> same transaction, it's enough that some of the concurrent transactions
>>> update one column, while other transactions update the other column.
>>
>> CREATE TABLE foo
>> (id1 integer not null primary key
>> ,id2 integer not null unique
>> ,val integer);
>>
>> Given the table above, which one do we mean?
>>
>> 1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y; and UPDATE foo
>> SET id1 = Y WHERE id2 = X; we can deadlock
>> 2. When we mix UPDATE foo SET val = Z WHERE id1 = Y; and UPDATE foo
>> SET val = W WHERE id2 = X; we can deadlock
>>
>> (2) is a common use case, (1) is a very rare use case and most likely
>> a poor design
>
> Well, at least one of the statements has to be an UPSERT, and at least one
> of them has to update a column with a unique constraint on it. This pair of
> transactions could deadlock, for example:
>
> Transaction 1:
> INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE;
> Transaction 2:
> UPDATE foo SET id2 = X WHERE id1 = Y;
>
> That's made-up syntax, but the idea is that the first transaction attempts
> to insert a row with values id1=Y, id2=X, val=Z. If that fails because of a
> row with id1=Y or id2=X already exists, then it's supposed to do nothing.

Let's look at a real world example.

CREATE TABLE citizen
(ssn integer not null primary key
,email text not null unique
,tax_amount decimal);

Transaction 1:
INSERT INTO citizen VALUES (555123456, 'simon@2ndQuadrant.com', 1000.00) ON CONFLICT IGNORE;
Transaction 2:
UPDATE citizen SET email = 'simon@2ndQuadrant.com', tax_amount = 1000.00 WHERE ssn = 555123456;

OK, now I understand how a deadlock is possible. Thanks for your help.

Again I note that there is no isolation test that refers to this situation, nor any documentation, internal or user facing, that describes the situation or its workaround.

My feeling is that this is an unlikely situation: two actors concurrently updating the same data AND in different ways, from two different angles. How likely is it that we would issue those two transactions concurrently AND we would be concerned because this caused an error? If the tax_amount was the same, it wouldn't matter that one failed. If the tax_amount differed, we would want to know about the error, not accept it in silence.

Are any of those things substantially worse than the current situation using INSERT/UPDATE loops?

It might be nice if the above never deadlocked. What is the price of ensuring that in terms of code maintainability and performance? What would this do to COPY performance?

>> If the user wishes to protect against such deadlocks they retain the
>> option to use row locking. Yes?
>
> Sorry, I didn't understand that. Row locking?

I think that thought doesn't apply here.

> In general, this is of course a lot easier to implement if we restrict it so
> that it only works in some limited cases.
> That may be fine, but then we have to be able to document clearly what
> the limitations are, and throw an error if you violate those limitations.

Seems reasonable.

My point here is to establish that...

a) there are multiple ways to implement the UPSERT feature and none should be thrown away too quickly
b) the current patch does not implement something we all agree on yet
c) not all requirements have been properly documented, understood or agreed by hackers

If we want to move forwards we need to agree things based upon clarity and real world usage.

It may be that people on reading this now believe Peter's HW locking approach is the best. I'm happy to go with consensus.

My feeling is that substantially more work is required on explaining the details around multiple unique index constraints, trigger behaviour and various other corner cases.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Oct 6, 2014 at 5:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > Lets look at a real world example > > CREATE TABLE citizen > (ssn integer not null primary key > ,email text not null unique > ,tax_amount decimal); > > Transaction 1: > INSERT INTO citizen VALUES (555123456, 'simon@2ndQuadrant.com', > 1000.00) ON CONFLICT IGNORE; > Transaction 2: > UPDATE foo SET email = 'simon@2ndQuadrant.com', tax_amount = 1000.00 > WHERE ssn = 555123456; > > OK, now I understand how a deadlock is possible. Thanks for your help. > Again I note that there is no isolation test that refers to this > situation, nor any documentation, internal or user facing that > describes the situation or its workaround. This seems like a concern specific to other approaches to value locking. But fair enough. > My feeling is that is an unlikely situation. To have two actors > concurrently updating the same data AND in different ways from two > different angles. Hard to say for sure. > How likely is it that we would issue those two transactions > concurrently AND we would be concerned because this caused an error? > If the tax_amount was the same, it wouldn't matter that one failed. > If the tax_amount differeed, we would want to know about the error, > not accept it in silence. > > Are any of those things substantially worse than the current situation > using INSERT/UPDATE loops? Yes, because the new feature is supposed to make it so that you yourself don't have to put your UPSERT statement in a loop with subxacts. Taking away the burden of having to think about this stuff is something I'm striving for here. > It might be nice if the above never deadlocked. What is the price of > ensuring that in terms of code maintainability and performance? I am going to reserve judgement on that, at least for a little while. It seems like the person proposing an alternative ought to have a better sense of what the price of avoiding this is. I'd understand what you were getting at more here if it immediately made our lives easier in some obvious way. I don't see that it does, though I admit that I may simply not understand where you're coming from. So sure, let's not be prejudiced about what's important, but at the same time I don't see that either Heikki or I have actually been inflexible to a degree that hurts things WRT not giving up on important high-level-ish goals. I am not completely inflexible on "never error". I am very close to totally inflexible, though. I think I could live with an error that literally no one would ever see. For example, we could error if there was an excessive number of retries, which I find acceptable because it will never happen in the real world. I tend to think that what you're talking about is pretty far from that, though. > My point here is to establish that... > > a) there are multiple ways to implement the UPSERT feature and none > should be thrown away too quickly > b) the current patch does not implement something we all agree on yet > c) not all requirements have been properly documented, understood or > agreed by hackers > > If we want to move forwards we need to agree things based upon clarity > and real world usage. I certainly agree with that. > It may be that people on reading this now believe Peter's HW locking > approach is the best. I'm happy to go with consensus. I bet you didn't think that you'd say that a week ago. :-) I hope I don't sound smug when I say that. I just mean, as you say, that we all need to keep an open mind on this. 
A healthy respect for the problem is recommended. I think it's still possible that there are problems with design #1, even on its own terms. > My feeling is that substantially more work is required on explaining > the details around multiple unique index constraints, trigger > behaviour and various other corner cases. Probably. Ideally, we should do that in a way driven by real-world prototypes. In that spirit, I attach a new version of my patch, but now implemented using approach #2 to value locking. I haven't spent all that much time testing this (at least recently, in this form), but it does pass all existing tests, including my stress-tests when run for half an hour. A lot of those corner cases you mention are big concerns. It's much easier to identify these issues by breaking real implementations. So surprisingly, to a certain extent (with something like this) it makes sense to have requirements driven by actual implementations. If we cannot do this iteratively, we are likely to fail. That's just how it is, I think. -- Peter Geoghegan
Attachment
- 0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch.gz
- 0006-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch.gz
- 0005-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch.gz
- 0004-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch.gz
- 0003-CONFLICTING-expressions-within-ON-CONFLICT-UPDATE.patch.gz
- 0002-Support-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch.gz
On 7 October 2014 03:31, Peter Geoghegan <pg@heroku.com> wrote:
>> It may be that people on reading this now believe Peter's HW locking
>> approach is the best. I'm happy to go with consensus.
>
> I bet you didn't think that you'd say that a week ago. :-)

You're right, because last week I thought heavyweight locking sucks and I still think that; I haven't said it is the best.

What we've just discovered is that we're locking 100% of the time, but it's not needed in 99.9% of cases and is arguable in the 0.1% case - not "required" at all. The price of avoiding that rare and possibly erroneous condition seems high to me.

Is there a way of detecting that we are updating a unique constraint column and then applying the HW locking only in that case? Or can we only apply locking when we have multiple unique constraints on a table? If so, I would withdraw any objection to the HW locking technique.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Oct 7, 2014 at 8:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 7 October 2014 03:31, Peter Geoghegan <pg@heroku.com> wrote: >>> It may be that people on reading this now believe Peter's HW locking >>> approach is the best. I'm happy to go with consensus. >> >> I bet you didn't think that you'd say that a week ago. :-) > > You're right, because last week I thought heavyweight locking sucks > and I still think that; I haven't said it is the best. > > What we've just discovered is that we're locking 100% of the time, but > its not needed in 99.9% of cases and is arguable in the 0.1% case - > not "required" at all. > > The price of avoiding that rare and possibly erroneous condition seems > high to me. > > Is there a way of detecting that we are updating a unique constraint > column and then applying the HW locking only in that case? Or can we > only apply locking when we have multiple unique constraints on a > table? > If so, I would withdraw any objection to the HW locking technique. I'm not up on the details of what Peter's patch does with heavyweight locking, but I'd say it this way: if the patch uses heavyweight locking routinely, that's probably not going to scale well[1]. If the patch detects possible conflicts and uses heavyweight locking only in those cases and for the specific purpose of untangling those conflicts, then that might well be OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] As evidence, I offer the fact that removing 1 of 2 places where heavyweight locking is used in hash indexes resulted in a large performance gain at high client counts. See commit 76837c1507cb5a5a0048046233568447729e66dd.
On 7 October 2014 14:06, Robert Haas <robertmhaas@gmail.com> wrote: >> Is there a way of detecting that we are updating a unique constraint >> column and then applying the HW locking only in that case? Or can we >> only apply locking when we have multiple unique constraints on a >> table? >> If so, I would withdraw any objection to the HW locking technique. > > I'm not up on the details of what Peter's patch does with heavyweight > locking, but I'd say it this way: if the patch uses heavyweight > locking routinely, that's probably not going to scale well[1]. If > the patch detects possible conflicts and uses heavyweight locking only > in those cases and for the specific purpose of untangling those > conflicts, then that might well be OK. Yes, what I meant, just more clearly phrased. Thanks for the link. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Oct 7, 2014 at 6:06 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I'm not up on the details of what Peter's patch does with heavyweight > locking, but I'd say it this way: if the patch uses heavyweight > locking routinely, that's probably not going to scale well[1]. If > the patch detects possible conflicts and uses heavyweight locking only > in those cases and for the specific purpose of untangling those > conflicts, then that might well be OK. The patch opportunistically tries to use shared buffer locks when a conflict is expected, when we restart (but only on the unique index where a conflict was detected). So in the event of a lot of near-conflicts, the hwlock traffic is quite modest. That, combined with the fact that it uses what I've called "an index scan with an identity crisis" (could be a near-insertion + hwlock in advance of insertion proper, or just something akin to a regular index scan) makes it perform best (at least with one or two unique indexes, which is what I tested a few months back). It does not have a pre-check that is always wasted with insertion-heavy workloads. Now, we're not talking about a huge advantage here (I should re-test that). And, in case I wasn't clear: I have misgivings about all 3 designs. Like Simon, I think it is appropriate that we figure out our exact requirements using the two working prototype patches. Although, right now #1 and #2 (the prototypes) seem quite comparable, that might just be down to a failure of imagination. It's hard to be completely confident about something like that. -- Peter Geoghegan
On Tue, Oct 7, 2014 at 11:25 AM, Peter Geoghegan <pg@heroku.com> wrote:
> Now, we're not talking about a huge advantage here (I should re-test
> that).

I attach raw output from running the bash scripts insert.sh and update.sh. These are benchmarks that concern performance in terms of total system throughput (TPS). The scripts are available from my stress-test suite: https://github.com/petergeoghegan/upsert

These scripts were originally designed to compare UPSERT with an unsympathetic "gold standard" for performance: "equivalent" INSERTs and UPDATEs. I looked at a few runs of 60 seconds, on unlogged tables, making the most direct comparison between UPSERTs and "equivalent" INSERTs and UPDATEs that is possible. To be clear, by "equivalent" I mean UPSERTs where we know we'll only UPDATE (in the case of update.sh), and UPSERTs where we know we'll only INSERT (in the case of insert.sh).

Both #1 and #2 do respectably as compared to "equivalent" INSERTs and UPDATEs. There may be even less sympathetic though more representative cases, but certainly for these simple cases, performance is solid across the board. I got these numbers on my laptop, and it may be necessary to devise a more rigorous benchmark later, but performance is quite consistent between the runs shown here.

Approach #1 wins out with UPDATEs. The heavyweight-lock avoidance stuff is enough to compensate for the fact that we never INSERT (and never need B-Tree leaf page heavyweight locks). Median TPS was 19,310.79 for #1, whereas for #2 it was 18,872.63. This is the case even though the "pre-check" for #2 is always appropriate, while we still acquire page-level heavyweight locks sometimes with #1.

INSERTs see #2 win, and by a wider margin than #1 beat #2 with UPDATEs. However, insert.sh is by design very unsympathetic towards #1. It uses a serial primary key, so every INSERT uselessly obtains a HW lock on the same leaf page for the duration of heap insertion. Anyway, the median INSERT TPS numbers are 17,759.53 for #1, and 20,441.57 for #2. So you're pretty much seeing the full brunt of page heavyweight locking, and it isn't all that bad. However, Heikki has said something about being more clever with when and how #2 is made to pre-check (which is always wasted here); so it's possible to imagine INSERTs becoming faster for #2, while that probably isn't the case for #1.

I think that if I wanted to, I could get #1 to do much better on another case where page heavyweight locking is more varied. My goal here was to do the opposite, though.

--
Peter Geoghegan
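For anyone who doesn't want to dig into the scripts, the comparison is roughly of this shape (table, column and variable names here are made up; the real statements live in the linked stress-test suite):

  -- update.sh-style run: every key already exists, so the UPSERT always
  -- takes the UPDATE path; it is measured against a plain UPDATE:
  UPDATE upsert_test SET val = 'x' WHERE merge_key = :key;
  INSERT INTO upsert_test VALUES (:key, 'x') ON CONFLICT UPDATE SET val = 'x';

  -- insert.sh-style run: a serial primary key means every row is new, so
  -- the UPSERT always takes the INSERT path; it is measured against a
  -- plain INSERT:
  INSERT INTO upsert_serial_test (val) VALUES ('x');
  INSERT INTO upsert_serial_test (val) VALUES ('x') ON CONFLICT UPDATE SET val = 'x';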
On Tue, 2014-10-07 at 13:33 +0100, Simon Riggs wrote:
> Is there a way of detecting that we are updating a unique constraint
> column and then applying the HW locking only in that case? Or can we
> only apply locking when we have multiple unique constraints on a
> table?

What is the use case of doing an UPSERT into a table with multiple unique constraints?

Consider a table user with unique columns name and email and a non-unique column age. If it has the data

  Jack | jack@example.com | 33
  Tom  | tom@example.com  | 35

and the user does an UPSERT of the values (Jack, tom@example.com, 34), the proposed specification will pick a random unique index (either the name or the email index) and do an update of that row. First, this can cause a unique violation; second, doing the UPSERT on a random index seems confusing.

The MySQL documentation says that "you should try to avoid using an ON DUPLICATE KEY UPDATE clause on tables with multiple unique indexes"[1]. The proposed feature's documentation has the same suggestion[2]. Still, the feature defaults to this behavior. Why is the default something the documentation says you shouldn't do?

Going a bit further, I am wondering what the use case is of doing an UPSERT against multiple unique indexes? If UPSERT against multiple unique indexes could be dropped, that might allow for faster or cleaner index locking techniques.

 - Anssi

1: http://dev.mysql.com/doc/refman/5.6/en/insert-on-duplicate.html
2: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html
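Spelled out as SQL (the table is named users here only because user is a reserved word, and the ON CONFLICT UPDATE / conflicting() spelling follows the proposed, not final, syntax):

  CREATE TABLE users
  (name text not null unique
  ,email text not null unique
  ,age integer);

  INSERT INTO users VALUES ('Jack', 'jack@example.com', 33);
  INSERT INTO users VALUES ('Tom',  'tom@example.com',  35);

  -- Which row should this statement update?
  INSERT INTO users VALUES ('Jack', 'tom@example.com', 34)
    ON CONFLICT UPDATE SET email = conflicting(email), age = conflicting(age);
  -- Resolved against the name index, it would try to set Jack's email to
  -- tom@example.com and hit the email unique constraint; resolved against
  -- the email index, it would silently update Tom's row instead.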
On Wed, Oct 8, 2014 at 12:41 AM, Anssi Kääriäinen <anssi.kaariainen@thl.fi> wrote: > The MySQL documentation says that "you should try to avoid using an ON > DUPLICATE KEY UPDATE clause on tables with multiple unique indexes"[1]. > The proposed feature's documentation has the same suggestion[2]. Still, > the feature defaults to this behavior. Why is the default something the > documentation says you shouldn't do? MySQL started saying that when they realized it broke their statement-based replication. They have their own weird reasons for disliking it. Now, that's not to minimize the risks. The reasoning behind making the unique index specification optional is: We cannot easily cover corner cases with another syntax - unique indexes must be named directly to cover every case, and make the user's intent absolutely clear. That's not obviously the case, but consider partial unique indexes, for example. Or consider uniquely constrained columns, with an almost equivalent uniquely constrained expression on those same columns. On the one hand I am not comfortable failing to support those, but on the other hand it could get very messy to do it another way. As we all know, naming a unique index in DML is ugly, and has poor support in ORMs. It seems likely that we're better off making it optional - it hasn't been much of a problem with the existing subxact looping pattern. A lot of the time, there will only be a single unique index anyway, or there will be a trivial serial PK unique index and the unique index we always want to treat would-be conflicts within as triggering the alternative UPDATE/IGNORE path. > Going a bit further, I am wondering what is the use case of doing an > UPSERT against multiple unique indexes? If multiple unique indexes > UPSERT could be dropped that might allow for faster or cleaner index > locking techniques. I see no reason to think that. I don't think it buys us much at all. -- Peter Geoghegan
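To make the expression/partial index point concrete, here is a hypothetical schema in which naming columns alone cannot identify a single unique index:

  CREATE TABLE accounts (email text not null);

  -- Two distinct unique indexes, both involving the column "email":
  CREATE UNIQUE INDEX accounts_email_idx ON accounts (email);
  CREATE UNIQUE INDEX accounts_email_lower_idx ON accounts (lower(email));

  -- A column-based specification such as "the unique index on email" is
  -- ambiguous here: the two indexes can disagree about what counts as a
  -- duplicate ('Foo@example.com' vs 'foo@example.com'), and a partial
  -- unique index (... WHERE NOT deleted) would raise the same question.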
On 10/08/2014 11:10 AM, Peter Geoghegan wrote:
> The reasoning behind making the unique index specification optional is:
>
> We cannot easily cover corner cases with another syntax - unique
> indexes must be named directly to cover every case, and make the
> user's intent absolutely clear. That's not obviously the case, but
> consider partial unique indexes, for example. Or consider uniquely
> constrained columns, with an almost equivalent uniquely constrained
> expression on those same columns. On the one hand I am not comfortable
> failing to support those, but on the other hand it could get very
> messy to do it another way.
>
> As we all know, naming a unique index in DML is ugly, and has poor
> support in ORMs.

I vehemently object to naming indexes in the UPSERT statement. That mixes logical and physical database design, which is a bad idea. This is not ISAM.

Instead of naming the index, you should name the columns, and the system can look up the index or indexes that match those columns.

(Remind me again, where did this need to name an index come from in the first place?)

- Heikki
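To be concrete, something like the following, where the system resolves the named column to a matching unique index (the spelling is purely illustrative, not a settled syntax, and reuses the "user" example from upthread):

    INSERT INTO "user" (name, email, age) VALUES ('Jack', 'tom@example.com', 34)
        ON CONFLICT (name)
        UPDATE SET email = conflicting(email), age = conflicting(age);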
On Wed, Oct 8, 2014 at 1:25 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> Instead of naming the index, you should name the columns, and the system can
> look up the index or indexes that match those columns.

It's not totally clear that we need *any* WITHIN clause, BTW. I'm not dead set on it. It was something I mainly added at Kevin's request. I do see the risks, though.

> (Remind me again, where did this need to name an index come from in the
> first place?)

I agree that naming indexes is ugly, and best avoided. It's tricky, though. The first thing you might try is to look up the index during parse analysis and/or planning. That could work in simple cases (which are probably the vast majority), but I'm worried about stuff like before row triggers that change the values being inserted out from under us, in a way that interferes with partial unique indexes. Maybe you could choose the wrong one if you didn't leave it until the last minute. But it's not very convenient to leave it until the last minute.

If you're willing to live with the command conservatively failing when there isn't a clean specification (although not in a way that can make the command fail when the user innocently adds a unique index later), then I think I can do it. Otherwise, it could be surprisingly hard to cover all the cases non-surprisingly.

I freely admit that putting a unique index in a DML statement is weird, but it is sort of the most direct way of expressing what we want. Oracle actually has an INSERT IGNORE-like hint that names a unique index (yes, really - see the UPSERT wiki page). That's really bizarre, but the point is that they may have felt that there was no reasonable alternative.

--
Peter Geoghegan
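A contrived illustration of the trigger concern (the schema and trigger are made up, just to show the hazard): a BEFORE ROW trigger rewrites the value that decides which partial unique index a row falls under, so resolving a column list to one index at parse/plan time could pick the wrong one.

    CREATE TABLE account (
        email  text,
        active boolean NOT NULL
    );
    CREATE UNIQUE INDEX account_email_active   ON account (email) WHERE active;
    CREATE UNIQUE INDEX account_email_inactive ON account (email) WHERE NOT active;

    CREATE FUNCTION force_inactive() RETURNS trigger AS $$
    BEGIN
        NEW.active := false;  -- flips the row into the other partial index's domain
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER account_force_inactive
        BEFORE INSERT ON account
        FOR EACH ROW EXECUTE PROCEDURE force_inactive();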
On Wed, 2014-10-08 at 02:22 -0700, Peter Geoghegan wrote:
> On Wed, Oct 8, 2014 at 1:25 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
> > Instead of naming the index, you should name the columns, and the system can
> > look up the index or indexes that match those columns.
>
> It's not totally clear that we need *any* WITHIN clause, BTW. I'm not
> dead set on it. It was something I mainly added at Kevin's request. I
> do see the risks, though.

To be usable in the Django ORM's .save() method there must be an option to use the primary key index, and only the primary key index, for conflict resolution. Assume an author table with id SERIAL PRIMARY KEY, email TEXT UNIQUE, age INTEGER. When saving an object, Django must update the row with the matching primary key value, or otherwise do an insert. Doing an update of the row with the matching email column isn't allowed. So, Django can't use the query:

    INSERT INTO author VALUES (1, 'tom@example.com', 35)
        ON CONFLICT UPDATE SET email = conflicting(email), age = conflicting(age);

If the table contains the row (id=2, email='tom@example.com', age=34), the query would update Tom's age to 35 when it should have resulted in a unique constraint violation.

Other ORMs have similar save/persist implementations, that is, objects are persisted on primary key identity alone.

 - Anssi
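For comparison, what Django would need is something scoped to the primary key alone - for example, with the WITHIN clause discussed upthread (spelling illustrative only; author_pkey is just the assumed name of the primary key index):

    INSERT INTO author VALUES (1, 'tom@example.com', 35)
        ON CONFLICT WITHIN author_pkey
        UPDATE SET email = conflicting(email), age = conflicting(age);

    -- With conflict handling scoped to the primary key, a duplicate email
    -- would still raise a unique violation instead of updating Tom's row.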
On Wed, 2014-10-08 at 01:10 -0700, Peter Geoghegan wrote:
> On Wed, Oct 8, 2014 at 12:41 AM, Anssi Kääriäinen
> <anssi.kaariainen@thl.fi> wrote:
> > The MySQL documentation says that "you should try to avoid using an ON
> > DUPLICATE KEY UPDATE clause on tables with multiple unique indexes"[1].
> > The proposed feature's documentation has the same suggestion[2]. Still,
> > the feature defaults to this behavior. Why is the default something the
> > documentation says you shouldn't do?
>
> As we all know, naming a unique index in DML is ugly, and has poor
> support in ORMs. It seems likely that we're better off making it
> optional - it hasn't been much of a problem with the existing subxact
> looping pattern.

The subxact approach is a bit different from the proposed UPSERT command. It loops:

    try:
        INSERT INTO author VALUES ('Jack', 'tom@example.com', 34)
    except UniqueConstraintViolation:
        UPDATE author SET ... WHERE name = 'Jack'

while the UPSERT command does something like:

    try:
        INSERT INTO author VALUES ('Jack', 'tom@example.com', 34)
    except UniqueConstraintViolation:
        UPDATE author SET ... WHERE name = 'Jack' OR email = 'tom@example.com' LIMIT 1;

 - Anssi
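For reference, the retry-loop pattern the subxact approach refers to looks roughly like the merge_db example in the PL/pgSQL documentation, adapted here to the author(name, email, age) table used in the pseudocode above. Note that the application names exactly one key, name, up front:

    CREATE FUNCTION upsert_author(k text, e text, a integer) RETURNS void AS $$
    BEGIN
        LOOP
            -- The application decides which key identifies the row: name.
            UPDATE author SET email = e, age = a WHERE name = k;
            IF found THEN
                RETURN;
            END IF;
            BEGIN
                INSERT INTO author (name, email, age) VALUES (k, e, a);
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                -- Another session inserted the same name concurrently;
                -- loop back and retry the UPDATE.
            END;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;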
On 8 October 2014 00:34, Peter Geoghegan <pg@heroku.com> wrote:
> INSERTs see #2 win, and by a wider margin than #1 beat #2 with
> UPDATEs. However, insert.sh is by design very unsympathetic towards
> #1. It uses a serial primary key, so every INSERT uselessly obtains a
> HW lock on the same leaf page for the duration of heap insertion.
> Anyway, the median INSERT TPS number is 17,759.53 for #1, and
> 20,441.57 for #2. So you're pretty much seeing the full brunt of
> page heavyweight locking, and it isn't all that bad.

Let's see the results of running a COPY please.

--
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Oct 8, 2014 at 1:25 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> Instead of naming the index, you should name the columns, and
>> the system can look up the index or indexes that match those
>> columns.

+1

That is what I have been consistently requesting instead of index names, every time I notice it mentioned.

> It's not totally clear that we need *any* WITHIN clause, BTW.
> I'm not dead set on it. It was something I mainly added at
> Kevin's request. I do see the risks, though.

What I said was in response to your assertion that your technique would *never* generate a duplicate key error. As others have again been pointing out, when there is more than one unique index you can have rows to apply which cannot be applied accurately without causing such an error. What I requested was that the behavior in those cases be clear and documented. I didn't take a position on whether you pick an index or ignore the input row (with a warning?), but we need to decide how it is handled.

I have made the same point Heikki is making, though -- we have no business referencing an index name in the statement.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Oct 8, 2014 at 2:50 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> What I said was in response to your assertion that your technique
> would *never* generate a duplicate key error.

That is strictly true: the INSERT cannot raise a unique violation error, because to do so would cause it to take the UPDATE path (assuming there is no WITHIN clause). The UPDATE may get one, though. But it doesn't have to get one for your statement to have surprising effects.

> As others have again
> been pointing out, when there is more than one unique index you can
> have rows to apply which cannot be applied accurately without
> causing such an error.

It's simpler than that. You want to make sure that the right condition - the right unique index having a would-be duplicate violation - is the one taken *in practice*, the condition on which the UPDATE path is taken. What happens after that is not that interesting (e.g. whether or not there is an UPDATE-path duplicate violation), because either way it's broken.

> What I requested was that the behavior in
> those cases be clear and documented. I didn't take a position on
> whether you pick an index or ignore the input row (with a
> warning?), but we need to decide how it is handled. I have made
> the same point Heikki is making, though -- we have no business
> referencing an index name in the statement.

I think that's fair enough. That's all fine - but actually doing it is quite tricky. Look at what I've said in relation to doing that, and consider the corner cases I name. They're certainly only a small minority of cases in practice, but as an implementer I still need to worry about them (maybe even just as much). Yes, I would like to just name columns, but it's hard to make that fully general.

If you want me to do what you say, fine. But in order to do that, I need support for having the corner cases make parse analysis throw up its hands and error out. Is that a price you're willing to pay? If so, then I'll implement it. Or, alternatively, we could do WITHIN PRIMARY KEY / NOT WITHIN PRIMARY KEY.

--
Peter Geoghegan
On Wed, Oct 8, 2014 at 6:29 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Let's see the results of running a COPY please.

Not exactly sure what you mean here. A concurrent COPY?

--
Peter Geoghegan