Thread: savepoint problems

From: Linos
Date:
Hello,
     I have migrated from MaxDB to PostgreSQL recently, and I am having a speed problem
with large transactions over slow links because of PostgreSQL's rollback-on-error
behavior. I create data in some tables, with triggers on other tables, and I do large
inserts of the data created in these tables into other PostgreSQL servers (for
replication purposes); for this example let's say 20000 rows. I want to do this in a
transaction so I can roll back on certain errors, but I use a fallback: if a duplicate
key is found, I resend the last insert's data as an UPDATE to the existing row. So I
have to set a savepoint before each insert and release it after the insert succeeds,
and my traffic flow looks something like this:

client                        server
begin -------------------->
      <-------------------- ok
savepoint ----------------->
      <-------------------- ok
insert -------------------->
      <-------------------- ok
release savepoint --------->
      <-------------------- ok
savepoint ----------------->
      <-------------------- ok
insert -------------------->
      <-------------------- error: duplicate key
rollback to savepoint ----->
      <-------------------- ok
update -------------------->
      <-------------------- ok
release savepoint --------->
      <-------------------- ok
... 20000 rows later ...
commit -------------------->
      <-------------------- ok

Obviously, over a slow link this is slow as hell. I posted this same email in Spanish on
pgsql-es-ayuda, where Alvaro Herrera replied with some solutions (thanks, Alvaro, for
your great support on the Spanish mailing list!), mainly two:

1- Create a function that uses an EXCEPTION block to save data traffic, like the upsert
function in Example 38-1 at
http://www.postgresql.org/docs/8.3/static/plpgsql-control-structures.html, but this
still has the savepoint overhead, since each EXCEPTION block starts an implicit
savepoint.
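For reference, the merge_db function from that example looks roughly like this:

```sql
-- Roughly Example 38-1 from the 8.3 docs: an upsert that retries the
-- UPDATE if a concurrent insert raises unique_violation. The EXCEPTION
-- block is what sets the implicit savepoint on every call.
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update an existing row
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- no row yet: try to insert; if another session inserted it
        -- in the meantime, loop back and try the UPDATE again
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, loop and retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```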

2- Create a function that locks the table with a SELECT before deciding whether to do an
INSERT or an UPDATE.
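A sketch of this second idea, reusing the db table from the docs example (the exact lock
mode is my guess at what was meant):

```sql
-- Sketch of option 2: take a lock that keeps out concurrent writers,
-- so a plain SELECT can safely decide between INSERT and UPDATE and
-- no savepoint is needed.
CREATE FUNCTION merge_db_locked(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    -- blocks other INSERT/UPDATE/DELETE, but still allows reads
    LOCK TABLE db IN SHARE ROW EXCLUSIVE MODE;
    IF EXISTS (SELECT 1 FROM db WHERE a = key) THEN
        UPDATE db SET b = data WHERE a = key;
    ELSE
        INSERT INTO db (a, b) VALUES (key, data);
    END IF;
END;
$$ LANGUAGE plpgsql;
```

The trade-off is that the lock serializes all writers on the table for the duration of
the transaction.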

Well, I would like to know if anyone can help with any other idea or any notes on this
problem. Another question I have is how I could create a function without knowing the
number of columns to insert/update in advance. Thanks in advance.

Best Regards,
Miguel Angel.

Re: savepoint problems

From: Martijn van Oosterhout
Date:
On Fri, Aug 01, 2008 at 06:30:36PM +0200, Linos wrote:
> Hello,
>     I have migrated from MaxDB to PostgreSQL recently, and I am having a
> speed problem with large transactions over slow links because of PostgreSQL's
> rollback-on-error behavior. I create data in some tables, with triggers on
> other tables, and I do large inserts of the data created in these tables into
> other PostgreSQL servers (for replication purposes); for this example let's
> say 20000 rows. I want to do this in a transaction so I can roll back on
> certain errors, but I use a fallback: if a duplicate key is found, I resend
> the last insert's data as an UPDATE to the existing row. So I have to set a
> savepoint before each insert and release it after the insert succeeds, and my
> traffic flow looks something like this.

If the goal is to reduce latency costs, the best way could be:

1. Use COPY to transfer all the data in one stream to the server into a
temporary table.
2. Use an UPDATE and an INSERT to merge the temporary table into the old
one. SQL has a MERGE statement, but PostgreSQL doesn't support it, so
you'll have to do it by hand.

That would be a total of five round trips, including transaction start/end.
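In rough SQL, assuming a target table real_table with primary key id and a single
payload column data (made-up names for the sketch):

```sql
-- Sketch of the COPY + manual merge, for a made-up table
-- real_table(id INT PRIMARY KEY, data TEXT).
BEGIN;

CREATE TEMPORARY TABLE tmp (LIKE real_table) ON COMMIT DROP;

COPY tmp FROM STDIN;            -- all 20000 rows in one stream

-- update the rows that already exist...
UPDATE real_table r
SET    data = t.data
FROM   tmp t
WHERE  r.id = t.id;

-- ...and insert the ones that don't
INSERT INTO real_table
SELECT t.*
FROM   tmp t
WHERE  NOT EXISTS (SELECT 1 FROM real_table r WHERE r.id = t.id);

COMMIT;
```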

hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: savepoint problems

From: Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Fri, Aug 01, 2008 at 06:30:36PM +0200, Linos wrote:
>> I have migrated from MaxDB to PostgreSQL recently, and I am having a speed
>> problem with large transactions over slow links because of PostgreSQL's
>> rollback-on-error behavior.

> If the goal is to reduce latency costs, the best way could be:
> [ move it to the server side ]

Or move the logic into a server-side function, if you prefer to stick
with your existing procedural approach.

            regards, tom lane

Re: savepoint problems

From: Linos
Date:
David Wilson escribió:
> On Fri, Aug 1, 2008 at 12:30 PM, Linos <info@linos.es> wrote:
>
>> Well, I would like to know if anyone can help with any other idea or any
>> notes on this problem. Another question I have is how I could create a
>> function without knowing the number of columns to insert/update in advance.
>> Thanks in advance.
>
> you could do:
>
> begin;
> create temporary table tmp (...);
> [insert (or better yet, COPY) into tmp table]
> [delete from real_table where its key exists in tmp];
> insert into real_table select * from tmp;
> drop table tmp;
> commit;
>
> Your client <--> server communication should be extremely small.
>

I think this is probably the best solution if I can get JDBC to use the COPY command,
but I still don't know how to make a function with a variable number of columns; maybe
I can simply list all the columns and let the missing ones insert/update as NULL.
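For example, something like this (table and column names are made up), naming only the
columns I actually send and letting the rest take their defaults:

```sql
-- Sketch of David's recipe with a fixed column list; omitted columns
-- fall back to their DEFAULT (usually NULL). Names are hypothetical.
BEGIN;

CREATE TEMPORARY TABLE tmp (id INT, col1 TEXT, col2 INT) ON COMMIT DROP;

-- COPY (or many inserts) into tmp goes here ...

DELETE FROM real_table
WHERE  id IN (SELECT id FROM tmp);

INSERT INTO real_table (id, col1, col2)
SELECT id, col1, col2 FROM tmp;

COMMIT;
```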

Regards,
Miguel Angel.

Re: savepoint problems

From: Linos
Date:
Tom Lane escribió:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> On Fri, Aug 01, 2008 at 06:30:36PM +0200, Linos wrote:
>>> I have migrated from MaxDB to PostgreSQL recently, and I am having a speed
>>> problem with large transactions over slow links because of PostgreSQL's
>>> rollback-on-error behavior.
>
>> If the goal is to reduce latency costs, the best way could be:
>> [ move it to the server side ]
>
> Or move the logic into a server-side function, if you prefer to stick
> with your existing procedural approach.
>
>             regards, tom lane
>

When you say "move the logic into a server-side function", do you mean sending the data
with a COPY command (or many inserts) to a temporary table and loading it from there
with a server-side function, as David and Martijn suggest, or are you suggesting some
other way to do it? Could you elaborate on this, please? Thanks.

Regards,
Miguel Angel.