Re: Fastest method to insert data. - Mailing list pgsql-jdbc

From Dennis R. Gesker
Subject Re: Fastest method to insert data.
Date
Msg-id 3CC1BD5A.6020808@gesker.com
Whole thread Raw
In response to Fastest method to insert data.  (Sam Varshavchik <mrsam@courier-mta.com>)
Responses Re: Fastest method to insert data.  (Sam Varshavchik <mrsam@courier-mta.com>)
List pgsql-jdbc
I'd be interested to hear thoughts on a "COPY TABLE" method or set of libraries. In my case I'm reading the table structure of tables in a Microsoft SQL database, using this information to recreate these tables in PostgreSql (some mapping of data types was needed of course) then pulling the data contained in the MS tables and storing them in PG.

In the sequences of actually transferring the data I encountered some memory problems when using the .addBatch() methods. I probably should have expected this since the tables I'm seeking to draw data from are kind of large. I was thinking that using the .addBatch() would be a good approach since I could treat the whole batch as a transaction helping to ensure that I pulled everything I intended.

Now I'm reading a record from the MS database and then executing the insert statement on the PG. database right away. My routines are pretty rough at this point (I'm still new to the language) but I would have to imagine that a more sophisticated programmers have probably already worked this out.

Earlier I was thinking that I would loop through the MS resultset from and then use and updatable resultset on the PG. side using the movetoinsertrow and like methods available to an updatable resultset. I don't think that updatable result sets are supported by the PG. JDBC driver yet. (This may be in there now) But from a performance point of view I don't know if it would make a huge difference.

Now, if there were classes that could GULP whole tables from a database (PG., MS or otherwise) at one shot and recreate these tables in PG. this would be great!

Sam: Is the direction in which you were thinking?

If this was doable I would have to wonder how much of a difference in speed would be say in a scenario that is closer to Sam's where a a tables is being copied both to and from a PG. database versus just a copy table from the command line of the command interpreter. Maybe Sam's is a scenario where a stored procedure would be useful.

Dennis

Sam Varshavchik wrote:
On Fri, Apr 19, 2002 at 12:49:38PM -0700, Barry Lind wrote:

COPY will always be your best option if speed is the most important factor.

Would COPY issued through JDBC actually work? My understanding is that for
non-postgres uid clients the COPY command is restricted to using stdin/stdout
only. Or perhaps it's an artificial limitation imposed by psql. I don't think
I've seen a COPY TABLE JDBC extension in the documentation.

Resetting System.cin/System.cout to a file, then issuing a COPY TABLE might
be worth investigating, but that's a hack.

--Barry

Sam Varshavchik wrote:
What would be the fastest way to insert a bunch of data (in 7.1)?  
Currently, a bunch of addBatch()es followed by executeBatch() is pretty
fast, but not as fast as COPY TABLE via psql. I suppose I can use a
single executeBatch() to insert the entire table, but AFAIK addBatch()
just buffers the SQL in memory, and I prefer to have some kind of an
upper limit on my memory footprint.



---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly


--  _________|~~      @||  ====   | Dennis Roman Gesker		ICQ: 194047|  ====   | mailto:dennis@gesker.com		Fax: 413.740.4653|_________| http://www.gesker.com

pgsql-jdbc by date:

Previous
From: Dave Cramer
Date:
Subject: Re: pgsql and JNDI failures
Next
From: Sam Varshavchik
Date:
Subject: Re: Fastest method to insert data.