Re: UPSERT wiki page, and SQL MERGE syntax - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: UPSERT wiki page, and SQL MERGE syntax |
Date | |
Msg-id | CA+TgmoY7DYLtq=qaB9CajF=_qFy-yAcwO5qipbH30BsDcE1c+g@mail.gmail.com Whole thread Raw |
In response to | Re: UPSERT wiki page, and SQL MERGE syntax (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: UPSERT wiki page, and SQL MERGE syntax
|
List | pgsql-hackers |
On Fri, Oct 10, 2014 at 4:41 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Oct 10, 2014 at 12:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I think what's realistic here is that the patch isn't going to get >> committed the way you have it today, because nobody else likes that >> design. That may be harsh, but I think it's accurate. > > I do think that's harsh, because it's unnecessary: I am looking for > the best design. If you want to propose alternatives, great, but there > is a reason why I've done things that way, and that should be > acknowledged. I too think naming the unique index is ugly as sin, and > have said as much multiple times. We're almost on the same page here. Come on. You've had the syntax that way for a while, multiple people have objected to it, and it didn't get changed. When people renewed their objections, you said that they were not having a "realistic conversation". I'm tired of getting critiqued because there's some fine point of one of the scores of emails you've sent on this topic that you feel I haven't adequately appreciated. I would like to avoid getting into a flame-war here where we spend a lot of time arguing about who should have said what in exactly what way, but I think it is fair to say that your emails have come across to me, and to a number of other people here, as digging in your heels and refusing to budge. I would go so far as to say that said perception is the primary reason why this patch is making so little progress, far outstripping whatever the actual technical issues are. >> Whatever the actual behavior >> of your patch is today, it seems to me that the conflict is, >> fundamentally, over a set of column values, NOT over a particular >> index. The index is simply a mechanism for noticing that conflict. > > I think that this is the kernel of our disagreement. The index is not > simply a mechanism for noticing that conflict. The user had better > have one unique index in mind to provoke taking the alternative path > in the event of a would-be unique violation. Clearly it doesn't matter > much in this particular example. But what if there were two partial > unique indexes, that were actually distinct, but only in terms of the > constant appearing in their predicate? And what about having that > changed by a before insert row-level trigger? Are we to defer deciding > which unique index to use at the last possible minute? I don't immediately see why this would cause a problem. IIUC, the scenario we're talking about is: create table foo (a int, b int); create unique index foo1 on foo (a) where b = 1; create unique index foo2 on foo (a) where b = 2; If the user issues INSERT .. ON DUPLICATE (a) UPDATE, we'll fire before-insert triggers and then inspect the tuple to be inserted. If b is neither 1 nor 2, then we'll fail with an error saying that we can't support ON DUPLICATE without a relevant index to enforce uniqueness. (This can presumably re-use the same error message that would have popped out if the user done ON DUPLICATE (b), which is altogether un-indexed.) But if b is 1 or 2, then we'll search the matching index for a conflicting tuple; finding none, we'll insert; finding one, we'll do the update as per the user's instructions. > As always with this stuff, the devil is in the details. If we work out > the details, then I can come up with something that's acceptable to > everyone. Would you be okay with this never working with partial > unique indexes? That gives me something to work with. If it can't be made to work with them in a reasonable fashion, I would probably be OK with that, but so far I see no reason for such pessimism. >>>> Also, how about making the SET clause optional, with the semantics >>>> that we just update all of the fields for which a value is explicitly >>>> specified: >>>> >>>> INSERT INTO overwrite_with_abandon (key, value) >>>> VALUES (42, 'meaning of life') >>>> ON DUPLICATE (key) UPDATE; > >> Your syntax allows the exact same thing; it simply require the user to >> be more verbose in order to get that behavior. If we think that >> people wanting that behavior will be rare, then it's fine to require >> them to spell it out when they want it. If we think it will be the >> overwhelming common application of this feature, and I do, then making >> people spell it out when we could just as well infer it is pointless. > > Did you consider my example? I think that people will like this idea, > too - that clearly isn't the only consideration, though. As you say, > it would be very easy to implement this. However, IMV, we shouldn't, > because it is hazardous. MySQL doesn't allow this, and they tend to > find expedients like this useful. I'm considering your points in this area as well as I can, but as far as I can tell you haven't actually set what's bad about it, just that it might be hazardous in some way that you don't appear to have specified, and that MySQL doesn't allow it. I am untroubled by the latter point; it is not our goal to confine our implementation to a subset of MySQL. But it would be fine to leave this kind of shortcut out of the initial patch. If other people agree that it's useful, it will get re-proposed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: