Thread: Best way to import data in PostgreSQL (not "COPY")

Best way to import data in PostgreSQL (not "COPY")

From: Denis BUCHER

Hello,

I have a system that must import a lot of data from another system every day.
Our system runs on PostgreSQL, and we connect to the other one via ODBC.

Currently we do something like:

SELECT ... FROM ODBC source
foreach row {
INSERT INTO postgresql
}

The problem is that this method is very slow.

Does anyone have a better suggestion?

Thanks a lot in advance!

Denis

Re: Best way to import data in PostgreSQL (not "COPY")

From: Andy Colson

Denis BUCHER wrote:
> Hello,
>
> I have a system that must each day import lots of data from another one.
> Our system is in postgresql and we connect to the other via ODBC.
>
> Currently we do something like :
>
> SELECT ... FROM ODBC source
> foreach row {
> INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...
>
> Does someone has a better suggestion ?
>
> Thanks a lot in advance !
>
> Denis
>

If you can prepare your statement, it should run a lot faster; I have no
idea whether ODBC supports such things, though.

So:

select ... from odbc...;
$q = prepare('insert into pg...')
foreach row {
   $q.params[0] = ..
   $q.params[1] = ..
   $q.execute;
}
commit;

(* If possible, make sure you are not committing each insert statement;
do all the inserts, then commit once at the end. *)
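
The prepare-once, commit-once pattern above can be sketched in Python as follows. This is only an illustration under assumptions, not what anyone in the thread ran: SQLite from Python's standard library stands in for PostgreSQL so the sketch runs without a server, and the table name, columns and sample rows are hypothetical. With PostgreSQL the same shape would apply through whatever driver is in use.

```python
import sqlite3

def bulk_insert(conn, rows):
    """Insert every row with one parameterized statement and a single commit."""
    # Hypothetical table and columns; the statement text is the same for all rows,
    # only the bound parameters change.
    sql = "INSERT INTO pgtable (col1, col2) VALUES (?, ?)"
    # Used as a context manager, the connection commits once if the block
    # succeeds and rolls back if it raises.
    with conn:
        conn.executemany(sql, rows)

# SQLite in memory is a stand-in for the real destination database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pgtable (col1 INTEGER, col2 TEXT)")

source_rows = [(13, "hello"), (42, "text with,comma")]  # stand-in for the ODBC result set
bulk_insert(conn, source_rows)

count = conn.execute("SELECT COUNT(*) FROM pgtable").fetchone()[0]
print(count)
```

Because the values are bound as parameters, no manual quote escaping is needed for them.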


If you can't prepare, you should try to build multi-value insert statements:

insert into pgtable (col1, col2, col3)
values ('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'h', 'i'), ...;
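
A multi-value insert like the one above could be generated in batches roughly as in the Python sketch below. The helper names `chunked` and `multi_value_insert` are hypothetical, and the numbered `$n` placeholder style is an assumption that would need adapting to the driver in use. The table and column names are interpolated directly into the SQL, so they must come from trusted code, never from the imported data.

```python
from itertools import islice

def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def multi_value_insert(table, columns, batch):
    """Return (sql, params) for one INSERT carrying every row of `batch`."""
    width = len(columns)
    groups = []
    for r in range(len(batch)):
        # Placeholders are numbered consecutively across the whole statement.
        nums = ", ".join(f"${r * width + c + 1}" for c in range(width))
        groups.append(f"({nums})")
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {', '.join(groups)}"
    # Flatten the rows so the parameters line up with the placeholders.
    params = [value for row in batch for value in row]
    return sql, params

rows = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]
for batch in chunked(rows, 2):
    sql, params = multi_value_insert("pgtable", ["col1", "col2", "col3"], batch)
    print(sql, params)
```

Keeping each batch to a modest number of rows avoids building one enormous statement.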

Or you could look into dblink; I don't know whether it would be faster.

-Andy

Re: Best way to import data in PostgreSQL (not "COPY")

From: Sam Mason

On Wed, Jul 22, 2009 at 08:24:22PM +0200, Denis BUCHER wrote:
> SELECT ... FROM ODBC source
> foreach row {
> INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...
>
> Does someone has a better suggestion ?

Using COPY[1] is normally the preferred way of getting data into PG
fast.  Some languages make this easier than others; if you can generate
SQL that looks like:

  COPY table (col1,col2) FROM STDIN WITH CSV;
  13,hello
  42,"text with,comma"
  \.

then you should be in luck: just send this off to the ODBC driver
as is and all should be good.  If you need to copy more than will fit
in a string, put a few thousand rows in each batch, and generate the
batches and send them one at a time inside a transaction.  Using
tab-delimited mode (drop the WITH CSV) is possible, but most languages
provide library code for generating CSV files, so CSV will probably be
easier to get correct.
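
As a rough illustration of generating such a COPY payload with library code, here is a minimal Python sketch using the standard `csv` module. The helper name `copy_payload` is hypothetical, the table and column names simply mirror the placeholders above, and whether a given ODBC driver accepts the resulting text as a single statement is an assumption to check against the driver in use.

```python
import csv
import io

def copy_payload(table, columns, rows):
    """Build the text of a COPY ... FROM STDIN WITH CSV command plus its data."""
    buf = io.StringIO()
    buf.write(f"COPY {table} ({','.join(columns)}) FROM STDIN WITH CSV;\n")
    # csv.writer quotes fields containing commas, quotes or newlines as needed.
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    # End-of-data marker: a backslash followed by a period on its own line.
    buf.write("\\.\n")
    return buf.getvalue()

payload = copy_payload("table", ["col1", "col2"],
                       [(13, "hello"), (42, "text with,comma")])
print(payload)
```

For larger imports, the same function could be called once per batch of a few thousand rows.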

--
  Sam  http://samason.me.uk/

 [1] http://www.postgresql.org/docs/current/static/sql-copy.html

Re: Best way to import data in PostgreSQL (not "COPY")

From: Denis BUCHER

Hello everyone,

Denis BUCHER wrote:
> I have a system that must each day import lots of data from another one.
> Our system is in postgresql and we connect to the other via ODBC.
>
> Currently we do something like :
>
> SELECT ... FROM ODBC source
> foreach row {
> INSERT INTO postgresql
> }
>
> The problem is that this method is very slow...
> Does someone has a better suggestion ?

Thanks a lot to everyone for the help!

Here are the first results of my tests; they are very interesting!

a) ON THE DESTINATION (PHP/PostgreSQL)

1. Preparing the INSERT statements (to Postgres) was already an improvement
2. Then wrapping the inserts in BEGIN WORK ... COMMIT improved things even more
3. At first I didn't realise that, thanks to prepare, I could remove the
quote escaping; this improved things a little more
4. Then I found something very interesting: pg_send_execute!
(asynchronous)

Inserted lines : 134297
Required time : 292 seconds ([0] without prepare)
Required time : 253 seconds ([1] with prepare) (13% better)
Required time : 224 seconds ([2] with prepare and BEGIN COMMIT) (12% better)
Required time : 221 seconds ([3] removed escaping)
Required time : 214 seconds ([4] pg_send_execute) (4% better)

b) ON THE SOURCE (PHP/ODBC)

5. Believe it or not, switching from PHP ODBC to PHP PDO ODBC
From : http://us2.php.net/manual/en/ref.uodbc.php
To :   http://fr.php.net/manual/en/class.pdostatement.php
...helped a LOT:

Inserted lines : 134297
Required time : 25 seconds ([1] [2] [3] [4] [5] + PDO)
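
To put the timings above on a common footing, the short Python sketch below recomputes the time saved at each step relative to the previous one, the resulting throughput, and the overall speed-up. The row count and the seconds are copied from the results in this message; the labels and helper names are only illustrative. Small differences from the percentages quoted above are possible, depending on rounding and on which baseline was used.

```python
ROWS = 134297

# (label, seconds) pairs copied from the results reported in this message.
timings = [
    ("[0] without prepare", 292),
    ("[1] with prepare", 253),
    ("[2] prepare + BEGIN/COMMIT", 224),
    ("[3] removed escaping", 221),
    ("[4] pg_send_execute", 214),
    ("[5] PDO ODBC on the source", 25),
]

def improvement(before, after):
    """Percentage of run time saved going from `before` to `after` seconds."""
    return (before - after) / before * 100

def rows_per_second(rows, seconds):
    """Average insert throughput for a run."""
    return rows / seconds

# Compare each step with the one just before it.
for (_, prev), (label, cur) in zip(timings, timings[1:]):
    print(f"{label}: {improvement(prev, cur):.1f}% less time than the previous step, "
          f"{rows_per_second(ROWS, cur):.0f} rows/s")

overall = timings[0][1] / timings[-1][1]
print(f"overall speed-up: {overall:.2f}x")
```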

I hope this will help other people!

Thanks a lot again to everyone who helped me :-)

Denis