Thread: on duplicate key
Hello. I was just asked by a mysql-user how do you do insert ..... on duplicate key update .... (or however they have it in mysql) in postgresql, if you are going to run commands from the command line? My solution up till now has been a function with the BEGIN insert .... EXCEPTION WHEN OTHERS THEN update ... END; I think this is not possible to run this kind of commands on the command line. Correct? As I see it, there are three ways 1) a function 2) try to rewrite it as two separate queries insert ... ; update ... ; where the insert will fail sometimes 3) try to search and see if there were any result back, but that would require the IF THEN construct which is also not available outside of functions, right? Is it correct to assume that a function that is searching for the key and then choosing to insert or update depending on what it found, is about as fast as doing an insert within a begin- exception-end statement or are there some inherent speed differences? Comparing method 1 and 2, are there any noticable speed differences when the exception is triggered? What should I answer the mysql-user? By the way, is there any work done on getting this functionality? I must admit that it would be handy some times ;-) Shouldn't this kind of question be added to the FAQ?
On Thu, Sep 25, 2008 at 10:25 AM, A B <gentosaker@gmail.com> wrote: > Hello. > I was just asked by a mysql-user how do you do > insert ..... on duplicate key update .... > (or however they have it in mysql) in postgresql, if you are going to > Is it correct to assume that a function that is searching for the key > and then choosing to insert or update depending on what it found, is > about as fast as doing an insert within a begin- exception-end > statement or are there some inherent speed differences? none really. They're both usually quite fast. > What should I answer the mysql-user? You should ask him what happens if you setup a benchmark on mysql and postgresq that runs 500 of these queries in parallel thousands of times, and halfway through pull the plug. Oh wait, I forgot, he's running mysql, so he probably doesn't actually care about data integrity. :) P.s. I'm mostly just kidding around, but in all seriousness, he's putting the cart ahead of the horse here. > By the way, is there any work done on getting this functionality? I > must admit that it would be handy some times ;-) I don't know, I find running an update followed by an insert if the update returns 0 rows is pretty easy to do. From external code or your favorite pl/lang.
On Thu, 2008-09-25 at 18:25 +0200, A B wrote: > My solution up till now has been a function with the > > BEGIN > insert .... > EXCEPTION WHEN OTHERS THEN > update ... > END; Here is the appropriate documentation link, where they have an example: http://www.postgresql.org/docs/8.3/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING The primary difference is that they use a loop, which is more robust. In theory, if you delete the record between when the INSERT happens and when the UPDATE happens, you will get no effect, which isn't what you're looking for. The loop will correct for this by ensuring that something happens before it terminates. > I think this is not possible to run this kind of commands on the > command line. Correct? This specific construct is in the language PL/pgSQL, which only exists as a procedural language, and can only be used to create functions. However, you can do the same thing on the command line using subtransactions, a.k.a. SAVEPOINTs: http://www.postgresql.org/docs/8.3/static/sql-savepoint.html This is how PL/pgSQL works internally, and how you can use other procedural languages to accomplish the same thing. Using SQL on the command line doesn't allow you to loop directly, so the best solution is usually to use a function (and PL/pgSQL is a good language for this). > As I see it, there are three ways > 1) a function This is probably the right approach, if you use the BEGIN...EXCEPTION method in the first link I provided. > 2) try to rewrite it as two separate queries insert ... ; update > ... ; where the insert will fail sometimes There's a race in the case of a DELETE happening between the INSERT and the UPDATE. PostgreSQL is designed for high concurrency, and this is great for performance (especially scalable and consistent performance), but race conditions are more likely with many things happening at once. > 3) try to search and see if there were any result back, but that would > require the IF THEN construct which is also not available outside of > functions, right? The same race condition exists here. > Is it correct to assume that a function that is searching for the key > and then choosing to insert or update depending on what it found, is > about as fast as doing an insert within a begin- exception-end > statement or are there some inherent speed differences? Again, be careful of race conditions, but the speed should be comparable. > What should I answer the mysql-user? To ask questions on pgsql-general, just like you did ;) > By the way, is there any work done on getting this functionality? I > must admit that it would be handy some times ;-) > Shouldn't this kind of question be added to the FAQ? Here's a starting place: http://archives.postgresql.org/pgsql-hackers/2008-04/msg01475.php You can find other threads about the status of the work by browsing the mailing list archives. Regards, Jeff Davis
> Here is the appropriate documentation link, where they have an example: > http://www.postgresql.org/docs/8.3/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING > > The primary difference is that they use a loop, which is more robust. In > theory, if you delete the record between when the INSERT happens and > when the UPDATE happens, you will get no effect, which isn't what you're > looking for. The loop will correct for this by ensuring that something > happens before it terminates. Yes, it could have been removed. I seem to remember that it is not possible to lock the table with a transaction from within a function, this has to be done on the level from which I call the function? > However, you can do the same thing on the command line using > subtransactions, a.k.a. SAVEPOINTs: > http://www.postgresql.org/docs/8.3/static/sql-savepoint.html Oh, fun to learn new stuff :-) Thank you for your reply