Thread: postgresql multiple insert slow
Hello, I've got a table in an oracle database with approx. 100000 records, that I'd like to put into a table in a postgresql database. (This should be done a couple of times per week) I have written a short perl script, on a server that has remote access to both the oracle database as well as the postgresql database. I am running postgresql 7.4.1 on FreeBSD. My perl script looks something like this: [...] my $sth2 = $cnx2->prepare('SELECT * FROM oracle_table'); my $res2 = $sth2->execute(); while(my($field2,$field5,$field6) = ($sth2->fetchrow_array)) {if(defined($field2)) { my $sth = $cnx->prepare('INSERT INTO the_pg_table(field1, field2) VALUES(?,?)'); my $result = $sth->execute($field2,$field5); $sth->finish; } } [...] I runs fine - and I get no errors - but it takes almost 25 minutes to complete.. I tried running the script while just grabbing the rows from the oracle database and writing to a text file - and then it only takes a couple of minutes .. So it must be the INSERT command that chokes - is there a better way to do it ? Any advise much appreciated. /mich -- Best Regards,Michael L. Hostbaek */ PGP-key available upon request /*
"Michael L. Hostbaek" <mich@freebsdcluster.org> writes: > I runs fine - and I get no errors - but it takes almost 25 minutes to > complete.. I tried running the script while just grabbing the rows from > the oracle database and writing to a text file - and then it only takes > a couple of minutes .. So it must be the INSERT command that chokes - is > there a better way to do it ? You probably want to add BEGIN/COMMIT operations around the loop. See the documentation's tips on bulk data loading: http://www.postgresql.org/docs/7.4/static/populate.html regards, tom lane
Centuries ago, Nostradamus foresaw when mich@freebsdcluster.org ("Michael L. Hostbaek") would write: > I runs fine - and I get no errors - but it takes almost 25 minutes to > complete.. I tried running the script while just grabbing the rows from > the oracle database and writing to a text file - and then it only takes > a couple of minutes .. So it must be the INSERT command that chokes - is > there a better way to do it ? > > Any advise much appreciated. The problem here is that each insert implicitly sets up a transaction BEGIN and COMMIT. If you were to wrap the loop with a BEGIN at the start and a COMMIT at the end, you would likely find it would complete much, much faster. -- output = ("cbbrowne" "@" "acm.org") http://www3.sympatico.ca/cbbrowne/wp.html "Java and C++ make you think that the new ideas are like the old ones. Java is the most distressing thing to hit computing since MS-DOS." -- Alan Kay
In article <20040219163238.GD10913@mich2.itxmarket.com>, "Michael L. Hostbaek" <mich@freebsdcluster.org> writes: > Hello, > I've got a table in an oracle database with approx. 100000 records, that > I'd like to put into a table in a postgresql database. (This should be > done a couple of times per week) > I have written a short perl script, on a server that has remote access > to both the oracle database as well as the postgresql database. I am > running postgresql 7.4.1 on FreeBSD. > My perl script looks something like this: > [...] > my $sth2 = $cnx2->prepare('SELECT * FROM oracle_table'); > my $res2 = $sth2->execute(); > while(my($field2,$field5,$field6) = ($sth2->fetchrow_array)) { > if(defined($field2)) { > my $sth = $cnx->prepare('INSERT INTO > the_pg_table(field1, field2) VALUES(?,?)'); > my $result = $sth->execute($field2,$field5); > $sth->finish; > } > } > [...] > I runs fine - and I get no errors - but it takes almost 25 minutes to > complete.. I tried running the script while just grabbing the rows from > the oracle database and writing to a text file - and then it only takes > a couple of minutes .. So it must be the INSERT command that chokes - is > there a better way to do it ? First of all, you should prepare the insert statement only once, outside of the loop. Then you could use fetchrow_arrarref instead of fetchrow_array; this should eliminate a copy operation. But the biggest win would be not to use INSERT at all. Instruct Oracle to dump the rows into a CSV file, and then do just $cnx->do ("COPY the_pg_table FROM 'csv.file'")