Thread: really quick multiple inserts can use COPY?

really quick multiple inserts can use COPY?

From
"Jens Schipkowski"
Date:
Hello!

In our JAVA application we do multiple inserts to a table by data from a
Hash Map. Due to poor database access implemention - done by another
company (we got the job to enhance the software) - we cannot use prepared
statements. (We are not allowed to change code at database access!)
First, we tried to fire one INSERT statement per record to insert. This
costs 3 ms per row which is to slow because normally we insert 10.000
records which results in 30.000 ms just for inserts.

for(){
sql = "INSERT INTO tblfoo(foo,bar) VALUES("+it.next()+","+CONST.BAR+");";
}

I was searching for an quicker way - MSSQL offers Array Inserts - at
PostgreSQL. The only solution seem to be "INSERT INTO foo SELECT" and this
is really dirty.
I improved the inserts using the subselect with union.

sql = "INSERT INTO tblfoo(foo,bar) ";
for(){
sql += "SELECT "+it.next()+","+CONST.BAR+" UNION " ...
}

This results in a really long INSERT INTO SELECT UNION statement and works
correct and quick but looks dirty.

When I heard about COPY I thought this will be the right way. But it does
not work using JDBC. Testing via psql does it perfect but sending the same
SQL statements via JDBC throws an error.
-> BEGIN
sql = "COPY tblfoo(foo,bar) FROM STDIN;\n1  'foobar'\n2  'foobar'\n\\.";
-> COMMIT
ERROR:  syntax error at or near "1" at character 34

So, my questions:
Is it possible to use COPY FROM STDIN with JDBC?
Will it bring performance improvement compared to SELECT UNION solution?

many thanks in advance,
Jens Schipkowski

--
**
APUS Software GmbH

Re: really quick multiple inserts can use COPY?

From
Andreas Kretschmer
Date:
Jens Schipkowski <jens.schipkowski@apus.co.at> schrieb:

> Hello!
>
> In our JAVA application we do multiple inserts to a table by data from a
> Hash Map. Due to poor database access implemention - done by another
> company (we got the job to enhance the software) - we cannot use prepared
> statements. (We are not allowed to change code at database access!)
> First, we tried to fire one INSERT statement per record to insert. This
> costs 3 ms per row which is to slow because normally we insert 10.000
> records which results in 30.000 ms just for inserts.

Can you change this from INSERT-Statements to COPY? Copy is *much*
faster than INSERT.

If no, do you using all INSERTs in one transaction? I believe, a 'BEGIN'
and a 'COMMIT' around all INSERTs may increase the speed.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."    (unknow)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: really quick multiple inserts can use COPY?

From
"Rajesh Kumar Mallah"
Date:
> So, my questions:
> Is it possible to use COPY FROM STDIN with JDBC?

Should be. Its at least possible using DBI and DBD::Pg (perl)


 my $copy_sth = $dbh -> prepare( "COPY
general.datamining_mailing_lists (query_id,email_key) FROM STDIN;") ;
  $copy_sth -> execute();
  while (my ($email_key ) = $fetch_sth -> fetchrow_array ()) {
          $dbh -> func("$query_id\t$email_key\n", 'putline');
  }
  $fetch_sth -> finish();
  $dbh -> func("\\.\n", 'putline');
  $dbh -> func('endcopy');
  $copy_sth->finish();

Some JDBC expert would tell better how its done with JDBC.


> Will it bring performance improvement compared to SELECT UNION solution?

COPY is quite faast.

Regds
mallah.

>
> many thanks in advance,
> Jens Schipkowski
>
> --
> **
> APUS Software GmbH
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>

Re: really quick multiple inserts can use COPY?

From
Tom Lane
Date:
"Jens Schipkowski" <jens.schipkowski@apus.co.at> writes:
> Is it possible to use COPY FROM STDIN with JDBC?

You should be asking the pgsql-jdbc list, not here.  (I know I've seen
mention of a JDBC patch to support COPY, but I dunno if it's made it into
any official version.)

> Will it bring performance improvement compared to SELECT UNION solution?

Please, at least be smart enough to use UNION ALL not UNION.

If you're using 8.2 you could also consider using INSERT with multiple
VALUES-lists.

            regards, tom lane

Re: really quick multiple inserts can use COPY?

From
Guillaume Cottenceau
Date:
"Jens Schipkowski" <jens.schipkowski 'at' apus.co.at> writes:

> Hello!
>
> In our JAVA application we do multiple inserts to a table by data from
> a  Hash Map. Due to poor database access implemention - done by
> another  company (we got the job to enhance the software) - we cannot
> use prepared  statements. (We are not allowed to change code at
> database access!)
> First, we tried to fire one INSERT statement per record to insert.
> This  costs 3 ms per row which is to slow because normally we insert
> 10.000  records which results in 30.000 ms just for inserts.
>
> for(){
> sql = "INSERT INTO tblfoo(foo,bar) VALUES("+it.next()+","+CONST.BAR+");";
> }

You should try to wrap that into a single transaction. PostgreSQL
waits for I/O write completion for each INSERT as it's
implicitely in its own transaction. Maybe the added performance
would be satisfactory for you.

--
Guillaume Cottenceau
Create your personal SMS or WAP Service - visit http://mobilefriends.ch/

Re: really quick multiple inserts can use COPY?

From
imad
Date:
On 12/11/06, Andreas Kretschmer <akretschmer@spamfence.net> wrote:
> Jens Schipkowski <jens.schipkowski@apus.co.at> schrieb:
>
> > Hello!
> >
> > In our JAVA application we do multiple inserts to a table by data from a
> > Hash Map. Due to poor database access implemention - done by another
> > company (we got the job to enhance the software) - we cannot use prepared
> > statements. (We are not allowed to change code at database access!)
> > First, we tried to fire one INSERT statement per record to insert. This
> > costs 3 ms per row which is to slow because normally we insert 10.000
> > records which results in 30.000 ms just for inserts.
>
> Can you change this from INSERT-Statements to COPY? Copy is *much*
> faster than INSERT.
>
> If no, do you using all INSERTs in one transaction? I believe, a 'BEGIN'
> and a 'COMMIT' around all INSERTs may increase the speed.


Performance increment can also be gained by disabling constraints in the
transaction. These disabled constraints are invoked at the end of the
transaction according to the SQL standard, so no worries about data
consistency.

Hmmm... PG currently supports disabling foreign constraints only. But that
can still be significant.

--Imad
www.EnterpriseDB.com

Re: really quick multiple inserts can use COPY?

From
"Jens Schipkowski"
Date:
Thanks a lot to all for your tips.

Of course, I am doing all the INSERTs using a transaction. So the cost per
INSERT dropped from 30 ms to 3 ms.
The improvement factor matches with the hint by Brian Hurt.
Sorry, I forgot to mention we are using PostgreSQL 8.1.4.
Thanks for the code snippet posted by mallah. It looks like you are using
prepared statements, which are not available to us.
But I will check our database access if its possible to do a workaround,
because this looks clean and quick to me.

regards
Jens Schipkowski


On Mon, 11 Dec 2006 17:53:52 +0100, Guillaume Cottenceau <gc@mnc.ch> wrote:

> "Jens Schipkowski" <jens.schipkowski 'at' apus.co.at> writes:
>
>> Hello!
>>
>> In our JAVA application we do multiple inserts to a table by data from
>> a  Hash Map. Due to poor database access implemention - done by
>> another  company (we got the job to enhance the software) - we cannot
>> use prepared  statements. (We are not allowed to change code at
>> database access!)
>> First, we tried to fire one INSERT statement per record to insert.
>> This  costs 3 ms per row which is to slow because normally we insert
>> 10.000  records which results in 30.000 ms just for inserts.
>>
>> for(){
>> sql = "INSERT INTO tblfoo(foo,bar)
>> VALUES("+it.next()+","+CONST.BAR+");";
>> }
>
> You should try to wrap that into a single transaction. PostgreSQL
> waits for I/O write completion for each INSERT as it's
> implicitely in its own transaction. Maybe the added performance
> would be satisfactory for you.
>



--
**
APUS Software GmbH

Re: really quick multiple inserts can use COPY?

From
nicky
Date:



Jens Schipkowski wrote:
> Thanks a lot to all for your tips.
>
> Of course, I am doing all the INSERTs using a transaction. So the cost
> per INSERT dropped from 30 ms to 3 ms.
> The improvement factor matches with the hint by Brian Hurt.
> Sorry, I forgot to mention we are using PostgreSQL 8.1.4.
> Thanks for the code snippet posted by mallah. It looks like you are
> using prepared statements, which are not available to us.
> But I will check our database access if its possible to do a
> workaround, because this looks clean and quick to me.
>
> regards
> Jens Schipkowski
>
>
> On Mon, 11 Dec 2006 17:53:52 +0100, Guillaume Cottenceau <gc@mnc.ch>
> wrote:
>
>> "Jens Schipkowski" <jens.schipkowski 'at' apus.co.at> writes:
>>
>>> Hello!
>>>
>>> In our JAVA application we do multiple inserts to a table by data from
>>> a  Hash Map. Due to poor database access implemention - done by
>>> another  company (we got the job to enhance the software) - we cannot
>>> use prepared  statements. (We are not allowed to change code at
>>> database access!)
>>> First, we tried to fire one INSERT statement per record to insert.
>>> This  costs 3 ms per row which is to slow because normally we insert
>>> 10.000  records which results in 30.000 ms just for inserts.
>>>
>>> for(){
>>> sql = "INSERT INTO tblfoo(foo,bar)
>>> VALUES("+it.next()+","+CONST.BAR+");";
>>> }
>>
>> You should try to wrap that into a single transaction. PostgreSQL
>> waits for I/O write completion for each INSERT as it's
>> implicitely in its own transaction. Maybe the added performance
>> would be satisfactory for you.
>>
>
>
>
> --**
> APUS Software GmbH
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                http://www.postgresql.org/about/donate
>
>
>


This link might be what you are looking for, it has some information
about implementing COPY in the JDBC driver. Check the reply message as
well.

http://archives.postgresql.org/pgsql-jdbc/2005-04/msg00134.php


Another solution might be to have Java dump the contents of the HashMap
to a CVS file and have it load through psql with COPY commands.

Good luck,

Nick