Re: Making joins involving ctid work for the benefit of UPSERT - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Making joins involving ctid work for the benefit of UPSERT
Date:
Msg-id: CAM3SWZQXBWR6UaxcR7fNkUP-bss0zRO2dp7UWofxh1tWO8SoeA@mail.gmail.com
In response to: Re: Making joins involving ctid work for the benefit of UPSERT (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Making joins involving ctid work for the benefit of UPSERT
List: pgsql-hackers
On Wed, Jul 23, 2014 at 9:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> This all seems completely to one side of Andres's point. I think what
> he's saying is: don't implement an SQL syntax of the form INSERT ON
> CONFLICT and let people use that to implement upsert. Instead,
> directly implement an SQL command called UPSERT or similar. That's
> long been my intuition as well. I think generality here is not your
> friend.

Sure, I was just making one fairly narrow point in relation to Andres' concern about the executor structuring of the UPSERT.

> I'd suggest something like:
>
> UPSERT table SET col = value [, ...] WHERE predicate;

Without dismissing the overall concern shared by you and Andres, that particular update-driven syntax doesn't strike me as something that should be pursued. I would like to be able to insert or update more than a single row at a time, for one thing. For another, what exactly appears in the predicate? Is it more or less the same as an UPDATE's predicate?

> with semantics like this:
>
> 1. Search the table, using any type of scan you like, for a row
> matching the given predicate.

Perhaps I've misunderstood, but this is fundamentally different from what I'd always thought would need to happen. I always understood that the UPSERT should be insert-driven, and so not really have an initial scan at all, in the same sense that every INSERT lacks one. Moreover, I've always thought that everyone agreed on that. We go to insert, and then in the course of inserting index tuples do something with dirty snapshots. That's how we get information on conflicts. For one thing, we cannot use any kind of scan unless there is a new mechanism for seeing not-yet-visible tuples, like some kind of non-MVCC snapshot that those scan nodes can selectively use. Now, I guess that you intend that in this scenario you go straight to step 5, and then your insert finds the not-yet-visible conflict.
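To make the insert-driven shape concrete, here is a toy model of the control flow I have in mind (purely illustrative Python, not PostgreSQL code; the dict standing in for a unique index, and all names, are invented). The point is that the insertion attempt itself reports the conflicting TID, so the update step needs no separate scan:

```python
# Toy model: a unique index as a dict mapping key -> TID, and a heap as
# a dict mapping TID -> row. The insertion attempt detects the conflict
# and hands back the conflicting TID directly.

def upsert(index, heap, key, value, next_tid):
    conflict_tid = index.get(key)  # "dirty" lookup made during insertion
    if conflict_tid is None:
        # No conflict: complete the insert.
        index[key] = next_tid
        heap[next_tid] = (key, value)
        return ("inserted", next_tid)
    # Conflict: update the existing row via the TID the insert found,
    # with no additional index scan.
    heap[conflict_tid] = (key, value)
    return ("updated", conflict_tid)

index, heap = {}, {}
print(upsert(index, heap, "k1", "v1", 1))  # ('inserted', 1)
print(upsert(index, heap, "k1", "v2", 2))  # ('updated', 1)
```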
But it's not clear that when step 5 fails and we pick up from step 1, we then get a new MVCC snapshot or anything else that gives us a hope of succeeding this time around. And it seems quite wasteful not to use knowledge of conflicts from the insert at all; that's one thing I'm trying to address here: removing unnecessary index scans when we actually know what our conflict TIDs are.

> 2. If you find more than one tuple that is visible to your scan, error.

This point seems to concern making the UPSERT's UPDATE only affect zero or one rows. Is that right? If so, why do you think that's a useful constraint to impose?

> 3. If you find exactly one tuple that is visible to your scan, update it. Stop.
>
> 4. If you find no tuple that is visible to your scan, but you do find
> at least one tuple whose xmin has committed and whose xmax has not
> committed, then (4a) if the isolation level is REPEATABLE READ or
> higher, error; (4b) if there is more than one such tuple, error; else
> (4b) update it, in violation of normal MVCC visibility rules.

Point 4b ("if there is more than one such tuple...") seems like it needs some practical or theoretical justification. Do you just not want to follow an update chain? If so, why? What would the error message actually be? And, to repeat myself: why should an MVCC snapshot see more than one such tuple in the ordinary course of scanning a relation, since there is by definition a unique constraint that prevents this? Or maybe I'm wrong; maybe you don't even require that. That isn't clear.

As you know, I've always opposed any type of READ COMMITTED serialization failure. If you're worried about lock starvation hazards - although I don't think strong concerns here are justified - we can always put in a fail-safe number of retries before throwing an error. I'm comfortable with that because, based on experimentation, I think it's going to be virtually impossible to hit.
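To be concrete about the fail-safe, what I have in mind is nothing more elaborate than a bounded loop. A toy sketch (illustrative only; the limit and all names are invented, and the real thing would live in the executor, not look like this):

```python
# Toy sketch of a bounded retry loop: re-attempt the command a fixed
# number of times before raising, rather than ever surfacing a
# serialization failure at READ COMMITTED.

MAX_RETRIES = 100  # fail-safe only; expected to be virtually impossible to hit

def upsert_with_retries(attempt):
    """attempt() returns True on success, or False if the whole attempt
    must be restarted (e.g. the conflicting tuple vanished before it
    could be locked)."""
    for _ in range(MAX_RETRIES):
        if attempt():
            return
    raise RuntimeError("too many upsert retries")  # starvation fail-safe

# A toy attempt that succeeds on the third try:
tries = iter([False, False, True])
upsert_with_retries(lambda: next(tries))
```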
Apparently other snapshot isolation databases acquire a new snapshot and re-do the command, rather than using an EvalPlanQual()-like mechanism and thereby violating snapshot isolation. Those other systems have just such a hard limit on retries before erroring, and it seems to work out okay for them.

> 5. Construct a new tuple using the col/value pairs from the SET clause
> and try to insert it. If this would fail with a unique index
> violation, back out and go back to step 1.

Again, I can't see why you'd do this step last and not first, since this is naturally the place to initially "see into the future" with a dirty snapshot.

> Having said all that, I believe the INSERT ON CONFLICT syntax is more
> easily comprehensible than previous proposals. But I still tend to
> agree with Andres that an explicit UPSERT syntax or something like it,
> that captures all of the MVCC games inside itself, is likely
> preferable from a user standpoint, whatever the implementation ends up
> looking like.

Okay then. If you both feel that way, I will come up with something closer to what you sketch. But for now I still strongly feel it ought to be driven by an insert. Perhaps I've misunderstood you entirely, though.

--
Peter Geoghegan