Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 |
Date | |
Msg-id | 54E1A398.6080304@vmware.com Whole thread Raw |
In response to | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 |
List | pgsql-hackers |
On 02/16/2015 02:44 AM, Peter Geoghegan wrote: > On Sat, Feb 14, 2015 at 2:06 PM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> I did not solve the potential for livelocks (discussed at >> http://www.postgresql.org/message-id/CAM3SWZTfTt_fehet3tU3YKCpCYPYnNaUqUZ3Q+NAASnH-60teA@mail.gmail.com). >> The patch always super-deletes the already-inserted tuple on conflict, and >> then waits for the other inserter. That would be nice to fix, using one of >> the ideas from that thread, or something else. > > How about we don't super-delete at all with exclusion constraints? I'd > be willing to accept unprincipled deadlocks for exclusion constraints, > because they already exist today for UPSERT/NOSERT type use cases, and > with idiomatic usage seem far less likely for the IGNORE variant > (especially with exclusion constraints). So INSERT ON CONFLICT IGNORE on a table with an exclusion constraint might fail. I don't like that. The point of having the command in the first place is to deal with concurrency issues. If it sometimes doesn't work, it's broken. > I can see people using ON > CONFLICT UPDATE where they'd almost or actually be better off using a > plain UPDATE - that's quite a different situation. I find livelocks to > be a *very* scary prospect, and I don't think the remediations that > were mentioned are going to fly. It's just too messy, and too much of > a niche use case. TBH I am skeptical of the idea that we can fix > exclusion constraints properly in this way at all, at least not > without the exponential backoff thing, which just seems horrible. The idea of comparing the TIDs of the tuples as a tie-breaker seems most promising to me. If the conflicting tuple's TID is smaller than the inserted tuple's, super-delete and wait. Otherwise, wait without super-deletion. That's really simple. Do you see a problem with that? > Although you understood what I was on about when I first talked about > unprincipled deadlocks, I think that acceptance of that idea came > much later from other people, because it's damn complicated. BTW, it would good to explain somewhere in comments or a README the term "unprincipled deadlock". It's been a useful concept, and hard to grasp. If you defined it once, with examples and everything, then you could just have "See .../README" in other places that need to refer it. > It's not worth it to add > some weird "Dining philosophers" exponential backoff thing to make > sure that the IGNORE variant when used with exclusion constraints can > never deadlock in an unprincipled way, but if it is worth it then it > seems unreasonable to suppose that this patch needs to solve that > pre-existing problem. No? The point of solving the existing problem is that it allows us to split the patch, for easier review. >> Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock for >> every insertion? That seems bad for performance reasons. Also, are we happy >> with adding the new fields to the proc array? Another alternative that was >> discussed was storing the speculative insertion token on the heap tuple >> itself. (See >> http://www.postgresql.org/message-id/52D00D2D.6030307@vmware.com) > > Whatever works, really. I can't say that the performance implications > of acquiring that hwlock are at the forefront of my mind. I never > found that to be a big problem on an 8 core box, relative to vanilla > INSERTs, FWIW - lock contention is not normal, and may be where any > heavweight lock costs would really be encountered. Oh, cool. I guess the fast-path in lmgr.c kicks ass, then :-). > I don't see how storing the promise token in heap tuples buys us not > having to involve heavyweight locking of tokens. (I think you may have > made a thinko in suggesting otherwise) It wouldn't get rid of heavyweight locks, but it would allow getting rid of the procarray changes. The inserter could generate a token, then acquire the hw-lock on the token, and lastly insert the heap tuple with the correct token. >> Are we happy with the way super-deletion works? Currently, the xmin field is >> set to invalid to mark the tuple as super-deleted. That required checks in >> HeapTupleSatisfies* functions. One alternative would be to set xmax equal to >> xmin, and teach the code currently calls XactLockTableWait() to check if >> xmax=xmin, and not consider the tuple as conflicting. > > That couldn't work without further HeapTupleSatisfiesDirty() logic. Why not? > Besides, why should one transaction be entitled to insert a > conflicting value tuple just because a still running transaction > deleted it (having also inserted the tuple itself)? Didn't one > prototype version of value locking #2 have that as a bug [1]? In fact, > originally, didn't the "xmin set to invalid" thing come immediately > from realizing that that wasn't workable? Ah, right. So the problem was that some code might assume that if you insert a row, delete it in the same transaction, and then insert the same value again, the 2nd insertion will always succeed because the previous insertion effectively acquired a value-lock on the key. Ok, so we can't unconditionally always ignore tuples with xmin==xmax. We would need an infomask bit to indicate super-deletion, and only ignore it if the bit is set. I'm starting to think that we should bite the bullet and consume an infomask bit for this. The infomask bits are a scarce resource, but we should use them when it makes sense. It would be good for forensic purposes too, to leave a trace that a super-deletion happened. > I too think the tqual.c changes are ugly. I can't see a way around > using a token lock, though - I would only consider (and only consider) > refining the interface by which a waiter becomes aware that it must > wait on the outcome of the inserting xact's speculative > insertion/value lock (and in particular, that is should not wait on > xact outcome). We clearly need something to wait on that isn't an > XID...heavyweight locking of a token value is the obvious way of doing > that. Yeah. > Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit. > Maybe he is right...if that can be made to be reliable (always > WAL-logged), it could be marginally better than setting xmin to > invalidTransactionId. I'm not a big fan of that. The xmin-invalid bit is currently always just a hint. > But only marginally; it seems like your > discomfort is basically inherent to playing these new games with > visibility, which is inherent to this design. No, I get that we're playing games with visibility. I want to find the best way to implement those games. > As I said, I am having a > hard time seeing a way to do anything more than polish what we have > here. That's probably fine, though...it hasn't proven to be too > problematic (exclusion constraints aside). > > Not having to change HeapTupleSatisfiesMVCC() and so on (to check if > xmin = InvalidTransactionId) is not obviously a useful goal here, > since ISTM that any alternative would have to *logically* do the same > thing. If I'm off the mark about your thinking here, please correct > me....are you just worried about extra cycles in the > HeapTupleSatisfies* routines? Extra cycles yes, but even more importantly, code readability and maintainability. - Heikki
pgsql-hackers by date: