On 06/28/2012 07:53 PM, Robert Buckley wrote:
> Hi,
> I have to create a script which imports csv data into postgresql ...and have a few questions about the best way to do it.
The advice already given is pretty good. Remember that you can always load your CSV into a scratch table as-is, then create a clean new table and populate it from the scratch table with INSERT INTO ... SELECT. That way you don't have to do your cleanups/transformations on the CSV or during the COPY itself.
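As a rough illustration of the scratch-table approach, here is a minimal sketch. It uses Python's built-in sqlite3 module purely so that it runs without a server, which is an assumption of the sketch; against PostgreSQL you would load the scratch table with COPY instead of row-by-row inserts, and the cast/trim expressions may need adjusting. The table names (scratch, clean), the columns, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; a real script would read a file instead.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,
3, Carol ,7
"""


def load_via_scratch_table(conn, csv_text):
    """Load raw CSV into a scratch table, then clean it with INSERT INTO ... SELECT."""
    cur = conn.cursor()

    # Scratch table: every column is text, so messy rows still load.
    cur.execute("CREATE TABLE scratch (id TEXT, name TEXT, amount TEXT)")

    # Clean target table with the real column types and constraints.
    cur.execute(
        "CREATE TABLE clean ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, amount REAL)"
    )

    # Load the CSV untouched (in PostgreSQL this step would be a COPY).
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # skip the header row
    cur.executemany("INSERT INTO scratch VALUES (?, ?, ?)", rows)

    # All cleanup happens in SQL: trim whitespace, turn '' into NULL, cast types.
    cur.execute(
        """
        INSERT INTO clean (id, name, amount)
        SELECT CAST(id AS INTEGER),
               TRIM(name),
               CAST(NULLIF(TRIM(amount), '') AS REAL)
        FROM scratch
        """
    )

    cur.execute("DROP TABLE scratch")
    conn.commit()
    return cur.execute("SELECT id, name, amount FROM clean ORDER BY id").fetchall()


conn = sqlite3.connect(":memory:")
result = load_via_scratch_table(conn, RAW_CSV)
print(result)
```

The point of the design is that the load step never rejects a row for formatting reasons; every fix-up lives in one SELECT that you can rerun and adjust without touching the CSV.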
If it's a big job, it's going to run regularly, you're going to have to merge it with more imports later, etc, consider an ETL tool like Pentaho.
http://kettle.pentaho.com/
For very, very fast loading of bulk data, consider pg_bulkload:
http://pgbulkload.projects.postgresql.org/
It's only worth the hassle if your load would take many, many hours without it.
--
Craig Ringer