Re: MERGE vs REPLACE - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: MERGE vs REPLACE |
Date | |
Msg-id | 1132092694.3388.82.camel@holly Whole thread Raw |
In response to | Re: MERGE vs REPLACE (Josh Berkus <josh@agliodbs.com>) |
List | pgsql-hackers |
On Tue, 2005-11-15 at 10:27 -0800, Josh Berkus wrote: > > The UPSERT concept is also supported by Teradata, who simply append an > > ELSE INSERT clause onto the standard UPDATE syntax. MySQL REPLACE seems > > to me to be a fairly small subset of MERGE functionality and we ought to > > be able to offer that functionality as a side branch of the main work. > > Yes, I guess my hesitation on the full-table-lock strategy is that it > doesn't really fulfill the mandate for why people want REPLACE-like > statements ... to give them an INSERT-or-UPDATE with *higher* efficiency > and concurrency than doing two statements. Agreed... OK, what I said was "...Where there is doubt, we should fall back to table locking just like the rest of the world, IMHO." I didn't mean we should use full table locking all of the time. Sorry if I wasn't clear. What I meant, in context was, IMHO - we do no need predicate locking - we should use row level locks when these can be used - we should use table level locks other times On Fri, 2005-11-11 at 14:40 -0800, Josh Berkus wrote: > Scenario: > > session1: REPLACE .... 1 > session2: REPLACE ..... 1 > session1: check to see that "1" exists .... no > session2: check to see that "1" exists .... no > session1: INSERT 1 > session2: INSERT 1 .... ERROR > > Get the picture? The only way to avoid a race condition is to be able to > do "predicate locking", that is to lock the table against any data write > matching that predicate. The above situation happens now if you have two people doing SELECT then INSERT on the same row. Or even if you do UPDATE then INSERT. Currently if two people INSERT a row with a duplicate PK, we may find that both INSERTs made it into a data block, but one has its txn aborted because of a unique index violation. Nobody complains about that, nor says we should have predicate locking for that case, so why worry about it for MERGE? I don't see we need a special effort to avoid: if you try to do the same thing more than once, only the first one will work. At least you get a nice error message to let you know its happened. Seems like we - run the query in the USING clause - join it to the target table using pseudo outer join logic - if it matches we UPDATE - if it doesn't we INSERT ...but we do the last two in a single step, to minimise the window of opportunity for wierd situations. > That being said, I've > personally designed more than a dozen web applications and have not yet > been faced with a single circumstance of not knowing whether I wanted to > INSERT or UPDATE. I've even ported MySQL apps and found it easy to > re-code them to do "if $id = 0, then insert ..." without even needing to > use a pl/pgsql hack. OK, but not everybody knows what they're doing as well as you do. :-) > So we thus have two seperate use cases. The first, for bulk loading/ETL is > what MERGE fulfills rather neatly and for that full table locking is > perfectly OK, even desirable. You really don't want to MERGE-load the > same table on two threads at once. Probably true, but I do want to avoid full table locking for MERGE whenever possible. Concurrency is important even in data warehousing. Best Regards, Simon Riggs
pgsql-hackers by date: