Re: Promise index tuples for UPSERT - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Promise index tuples for UPSERT
Msg-id: CAM3SWZTU188Oti4FaVv010zj-TiY_naZq2ORcX-WJ4Lfi_9b9w@mail.gmail.com
In response to: Promise index tuples for UPSERT (Simon Riggs <simon@2ndquadrant.com>)
List: pgsql-hackers
On Wed, Oct 1, 2014 at 4:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Summary of algorithm to use "promise tuples" for concurrency control
> during UPSERT
>
> Index bloat is less of a problem than with normal inserts since there
> are additional ways of removing promise tuples. Only one index tuple
> at a time can have a specific value, so we would actually reduce heap
> bloat caused by concurrent inserts.

I am not all that concerned about bloat. I didn't think it was a major
advantage of #1 relative to #2 when I investigated the differences and
looked at the numbers. And I'm speaking as the advocate of the design
with zero bloat.

> It's possible we find existing rows that we can't see according to our
> snapshot. That is handled in exactly the same way as we treat
> UPDATEs.

This isn't a special case in the patch. It's more like REPEATABLE READ
and SERIALIZABLE have a special responsibility to make sure they're not
updating a tuple that is invisible to their MVCC snapshot (as opposed
to "instantaneously invisible", by which I mean invisible in the
HeapTupleSatisfiesUpdate() sense). Otherwise, we don't care if it's
MVCC-visible or not.

I imagine that by "handled in exactly the same way as we treat
UPDATEs", you mean that we do the EvalPlanQual() stuff. We only do that
in the event of a concurrent UPDATE/DELETE, though. Otherwise, we trust
the underlying relation scan to accurately represent that the tuples
pulled up are visible. So I don't understand the comparison (but tell
me if I've misunderstood).

> There is a potential loop here in that we might find an index tuple,
> follow it, find out the tuple is actually dead, then return to insert
> a promise tuple, only to find that someone else just did that and we
> have to wait/re-follow the link to update the new tuple. That is an
> extremely unlikely race condition, though possible; there is no heavy
> locking to prevent that in this approach.
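The retry loop described in that last quoted paragraph might be
sketched like this (a hypothetical Python toy model, using a dict as a
stand-in for a unique index and invented "state" fields; not actual
PostgreSQL code):

```python
def upsert_with_promise_tuples(index, key, new_value, insert_heap_tuple):
    # Toy model: `index` is a dict standing in for a unique index; the
    # "state" field is invented for illustration only.
    while True:
        entry = index.get(key)
        if entry is None:
            # No conflict: insert a promise index tuple first, then the
            # heap tuple it points to, then mark the index entry live.
            index[key] = {"state": "promise", "value": None}
            index[key] = {"state": "live", "value": insert_heap_tuple(new_value)}
            return "inserted"
        if entry["state"] == "dead":
            # The tuple died since we last looked: reclaim it and retry.
            # A concurrent session may slip in its own promise tuple at
            # this point, sending us around the loop again - the race
            # described above.
            del index[key]
            continue
        if entry["state"] == "promise":
            # Another session promised this value; a real system would
            # wait on its xid here before retrying.
            continue
        # A live conflicting tuple: take the UPDATE path instead.
        entry["value"] = new_value
        return "updated"
```

The loop terminates only when it either inserts (promise then live) or
updates a live conflicting tuple; the dead-tuple and promise branches
are where the wait/re-follow behavior lives.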
> In principle deadlocks are possible from that, but that is not any
> different from the current case where people that insert the same key
> at the same time might cause deadlocks. It's not a common application
> pattern and not something we should be protecting against.

People are going to use this feature in cases where they could almost
use an UPDATE. It will happen if you let it happen. But even if you
don't, a guarantee is infinitely more useful than a non-guarantee to
app developers. I'll be up-front about this: you have very little
chance of getting me to budge on this point. I *really* hate the idea
of allowing this kind of thing. I don't think I'm alone here.

> All of this is only needed for unique indexes.
>
> It can be handled by a new API call for acquire_value_lock() and
> release_value_lock(), or similar.
>
> Advantages
> * We don't do anything weird in the heap - if this breaks, data is
> not corrupt

Index corruption leads to wrong answers. I don't think this is a very
good argument, fwiw.

> * There is no heap bloat or index bloat above existing levels

Fair enough.

> * The approach leverages existing mechanism for transaction waiting

That's not really an advantage. That more or less applies to all
designs.

> * Optimistic approach to value locking will improve performance over
> heavy weight locking

There is no evidence that promise index tuples will perform better. #2
didn't perform better than #1.

> Disadvantages
> * Not written yet - <1 month to code, still possible for 9.5, doesn't
> look hard

Maybe, but that is beside the point, which is: if there were any
fundamental problems with either #1 or #2, I think I'd have figured
them out by now. They are less risky today - it might be that #3 turns
out to have the same properties, but I cannot be sure. That counts for
something. I feel perfectly entitled to hold that kind of uncertainty
against any design, tbh.
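For what it's worth, the deadlock case conceded in the first quoted
paragraph above (concurrent sessions inserting the same keys in
different orders) is a cycle in the wait-for graph. A toy detector
(hypothetical Python, not PostgreSQL's actual deadlock checker) makes
the shape concrete:

```python
def find_deadlock(waits_for):
    """Detect a cycle in a wait-for graph mapping session -> session it waits on."""
    for start in waits_for:
        seen = set()
        node = start
        while node in waits_for:
            if node in seen:
                return True  # cycle: these sessions deadlock
            seen.add(node)
            node = waits_for[node]
    return False

# Session A holds the value lock on key 1 and waits for B's lock on
# key 2; session B holds key 2 and waits for A's lock on key 1.
print(find_deadlock({"A": "B", "B": "A"}))  # True
# A simple blocked wait with no cycle is not a deadlock.
print(find_deadlock({"A": "B"}))  # False
```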
> Other stated possible disadvantages
> * Violates existing invariant that every index tuple has a
> corresponding heap tuple, which goes back to the Berkeley days.
> Currently, we always create heap tuples first, and physically delete
> them last. [Why is that a problem? Could be, but why?]

Unknown unknowns. Huge amounts of code must be audited to figure out
that it isn't an issue. So I don't know, but then I'm not the one
making the proposal.

> ("Deleting them last" implies something is being touched in that
> regard by this suggestion. I'm not sure what you are referring to)

The uncertain scope of the problem is a big part of the problem.

> * Relatedly, concern about breaking VACUUM. VACUUMing of btrees
> occurs with a list of TIDs to kill in index, built from the heap.
> Current bulk delete callback cannot be used for this. Super-exclusive
> lock is needed to delete tuples in btree page (i.e. VACUUM). Normally
> skipping of LockBufferForCleanup() (as added by bbb6e559) works okay
> in heap because it tends to imply that list of tids won't be bulk
> deleted in index, where we currently cannot skip for failure to
> quickly acquire super exclusive lock. So this could make the problem
> addressed by bbb6e559 return to some extent.
> [Don't see any problems; just test the xid as we scan a promise tuple
> and remove it if needed]
>
> "Index-only" bloat becomes a possibility.

Implications are not well understood.

> [FUD, no reason to believe there is a problem.]

"No reason to believe there is a problem" isn't that good a defense. I
know this from experience.

> Doesn't have a strategy for dealing with unprincipled deadlocks, at
> least at the moment. Presumably some aspects of #2 could be adopted
> here.
> [FUD. Unprincipled deadlock still not properly explained as yet. No
> way of telling whether it will happen with this approach, or not.]

What's unclear about unprincipled deadlocks?
Heikki went to considerable effort to have his design meet my standard
of not doing that. I think that there is almost no controversy about
whether or not we need to avoid that at this stage. If you want a more
practical definition, I think it's very unlikely that I'll accuse any
implementation of exhibiting this problem once it doesn't deadlock with
the test-case I presented Heikki with last year [1].

[1] http://www.postgresql.org/message-id/CAM3SWZShbE29KpoD44cVc3vpZJGmDer6k_6FGHiSzeOZGmTFSQ@mail.gmail.com

--
Peter Geoghegan