Thread: savepoint problems
Hello,

I have recently migrated from MaxDB to PostgreSQL and I am having a speed problem with large transactions over slow links, because of PostgreSQL's rollback-on-error behaviour. I create data in some tables with triggers on other tables, and I do large inserts of the data created in these tables into other PostgreSQL servers (for replication purposes); for this example let's say 20000 rows. I want to do this in a transaction so I can roll back on certain errors, but I have a fallback: if a duplicate key is found, I re-send the last insert's data as an UPDATE to the existing row. So I have to set a savepoint before each insert and release it after the insert has succeeded, and my traffic flow looks something like this:

client                        server
begin ----------------------->
      <----------------------- ok
savepoint ------------------->
      <----------------------- ok
insert ---------------------->
      <----------------------- ok
release savepoint ----------->
      <----------------------- ok
savepoint ------------------->
      <----------------------- ok
insert ---------------------->
      <----------------------- error: duplicate key
rollback to savepoint ------->
      <----------------------- ok
update ---------------------->
      <----------------------- ok
release savepoint ----------->
      <----------------------- ok
... 20000 rows later ...
commit ---------------------->
      <----------------------- ok

Obviously over a slow link this is slow as hell. I posted this same question in Spanish on pgsql-es-ayuda, where Alvaro Herrera replied with some solutions (thanks Alvaro for your great support on the Spanish mailing list!), mainly two:

1- Create a function that uses EXCEPTION to save network traffic, like the upsert function in example 38-1 at http://www.postgresql.org/docs/8.3/static/plpgsql-control-structures.html -- but with this I still have the savepoint overhead (the EXCEPTION block sets an implicit savepoint on the server), though no longer the network round trips.

2- Create a function that does a SELECT, locking the table, before deciding whether to do an INSERT or an UPDATE.

I would like to know if anyone can help with any other idea or any notes on this problem. Another question I have is how I could create a function without knowing in advance the number of columns to insert/update. Thanks in advance.
Best Regards, Miguel Angel.
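[For reference, the EXCEPTION-based upsert that Alvaro pointed to is example 38-1 in the linked docs page; it looks roughly like this, using the docs' example table db(a, b). Each pass through the EXCEPTION block still costs an implicit savepoint on the server, but the savepoint/insert/release chatter no longer crosses the network.]

```sql
-- Example table from the docs' upsert example.
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- Not there: try to insert.  If another transaction inserted
        -- the same key first, fall through and retry the UPDATE.
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing; loop back and try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;
```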
On Fri, Aug 01, 2008 at 06:30:36PM +0200, Linos wrote:
> Hello,
>    i have migrated from Maxdb to Postgresql recently and i am having a
> speed problem in large transactions over slow links because of autorollback
> on error postgresql feature, [...] so i have to
> set savepoint and release after the insert has been successful, so my
> traffic flow is anything like this.

If the goal is to reduce latency costs, the best way could be:

1. Use COPY to transfer all the data in one stream to the server, into a
   temporary table.
2. Use an UPDATE and an INSERT to merge the temporary table into the old
   one. SQL has a MERGE statement, but PostgreSQL doesn't support it, so
   you'll have to do it by hand.

That would be a total of 5 round trips, including transaction start/end.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
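[Sketched in SQL, those five round trips could look like the following; the table real_table, its columns id and val, and the assumption that id is the key are all hypothetical, just to make the hand-rolled merge concrete.]

```sql
BEGIN;

-- 1. Stage everything server-side in one COPY stream.
CREATE TEMP TABLE staging (LIKE real_table) ON COMMIT DROP;
COPY staging FROM STDIN;                     -- all 20000 rows, one stream

-- 2. Hand-rolled MERGE: update the rows that already exist ...
UPDATE real_table r
   SET val = s.val
  FROM staging s
 WHERE r.id = s.id;

-- ... then insert the ones that don't.
INSERT INTO real_table (id, val)
SELECT s.id, s.val
  FROM staging s
 WHERE NOT EXISTS (SELECT 1 FROM real_table r WHERE r.id = s.id);

COMMIT;
```

[Without taking a lock, a concurrent writer could still sneak a duplicate in between the UPDATE and the INSERT; for a single-writer replication stream that shouldn't matter.]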
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Fri, Aug 01, 2008 at 06:30:36PM +0200, Linos wrote:
>> i have migrated from Maxdb to Postgresql recently and i am having a
>> speed problem in large transactions over slow links because of autorollback

> If the goal is to reduce latency costs, the best way could be:
> [ move it to the server side ]

Or move the logic into a server-side function, if you prefer to stick
with your existing procedural approach.

			regards, tom lane
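[One way to keep the existing procedural per-row logic but move it server-side is to pass each batch as arrays and loop inside a function, so the client pays one round trip per batch instead of several per row. This is a hypothetical sketch reusing the db(a, b) example table from the docs; with no concurrent writers on the same keys, no EXCEPTION block (and hence no implicit savepoint) is needed.]

```sql
CREATE FUNCTION merge_batch(keys INT[], vals TEXT[]) RETURNS VOID AS
$$
DECLARE
    i INT;
BEGIN
    -- One UPDATE-or-INSERT per array element, all inside one call.
    FOR i IN 1 .. array_upper(keys, 1) LOOP
        UPDATE db SET b = vals[i] WHERE a = keys[i];
        IF NOT FOUND THEN
            INSERT INTO db (a, b) VALUES (keys[i], vals[i]);
        END IF;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

-- Usage: one round trip for the whole batch, e.g.
-- SELECT merge_batch(ARRAY[1, 2, 3], ARRAY['x', 'y', 'z']);
```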
David Wilson wrote:
> On Fri, Aug 1, 2008 at 12:30 PM, Linos <info@linos.es> wrote:
>
>> Well i would like to know if every can help with any other idea or any notes
>> on this problem? Other question i have it is how i could create a function
>> without be sure the number of columns to insert/update. Thanks in advance.
>
> you could do:
>
> begin;
> create temporary table tmp (...);
> [insert (or better yet, COPY) into tmp table]
> [delete from real table where exists in temporary table];
> insert into real_table select * from tmp;
> drop table tmp;
> commit;
>
> Your client <--> server communication should be extremely small.

I think this is probably the better solution, if I can get JDBC to use the COPY command. But I still don't know how to write a function with a variable number of columns; maybe I can simply pass all the columns and let the missing ones insert/update as NULL.

Regards,
Miguel Angel.
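[Filled in concretely, David's recipe might look like this; real_table and the id key column are hypothetical stand-ins. Note that unlike an UPDATE-based merge, the DELETE + INSERT variant will fire any ON DELETE triggers on the real table and replace the rows wholesale.]

```sql
BEGIN;

CREATE TEMPORARY TABLE tmp (LIKE real_table);
COPY tmp FROM STDIN;                         -- or a batch of INSERTs

-- Drop the rows we are about to replace ...
DELETE FROM real_table r USING tmp t WHERE r.id = t.id;

-- ... and load the new versions wholesale.
INSERT INTO real_table SELECT * FROM tmp;

DROP TABLE tmp;
COMMIT;
```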
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> If the goal is to reduce latency costs, the best way could be:
>> [ move it to the server side ]
>
> Or move the logic into a server-side function, if you prefer to stick
> with your existing procedural approach.

When you say "move the logic into a server-side function", do you mean sending the data with a COPY command (or many INSERTs) to a temporary table and loading it from there with a server-side function, as David and Martijn suggested, or do you have another way in mind? Could you elaborate, please? Thanks.

Regards,
Miguel Angel.