Better way to bulk-load millions of CSV records into postgres? - Mailing list pgsql-novice

From Ron Johnson
Subject Better way to bulk-load millions of CSV records into postgres?
Date
Msg-id 1022013600.16609.61.camel@rebel
Responses Re: Better way to bulk-load millions of CSV records into postgres?  (Josh Berkus <josh@agliodbs.com>)
Re: Better way to bulk-load millions of CSV records into postgres?  ("Joel Burton" <joel@joelburton.com>)
List pgsql-novice
Hi,

Currently, I've got a python script using pyPgSQL that
parses each CSV record, builds a big
"INSERT INTO ... VALUES (...)" string, and then execute()s it.

top shows postmaster running at ~70% CPU utilization and
python at ~15% while this runs.

Still, it's only inserting ~190 recs/second.  Is there a
better way to do this, or am I constrained by the hardware?

Instead of python and postmaster having to do a ton of data
xfer over sockets, I'm wondering if there's a way to send a
large number of csv records (4000, for example) in one big
chunk to a stored procedure and have the engine process it
all.
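
Even without a stored procedure, I imagine batching a few thousand
INSERT statements into a single execute() would cut down on the
round trips. Something like this sketch is what I have in mind
(I don't actually know whether pyPgSQL will accept several
semicolon-separated statements in one execute() call, so that part
is an assumption, and the table/columns are again made up):

    from pyPgSQL import PgSQL

    BATCH = 4000
    conn = PgSQL.connect(database='mydb')    # made-up connection details
    cur = conn.cursor()
    batch = []
    for line in open('data.csv').readlines():
        cols = line.strip().split(',')       # naive CSV split, illustration only
        vals = ", ".join(["'%s'" % c.replace("'", "''") for c in cols])
        batch.append("INSERT INTO mytable VALUES (%s)" % vals)
        if len(batch) >= BATCH:
            cur.execute(";\n".join(batch))   # ship 4000 rows in one chunk
            batch = []
    if batch:
        cur.execute(";\n".join(batch))
    conn.commit()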

Linux 2.4.18
PostgreSQL 7.2.1
python 2.1.3
csv file on /dev/hda
table on /dev/hde  (ATA/100)

--
+---------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net     |
| Jefferson, LA  USA      http://ronandheather.dhs.org:81 |
|                                                         |
| "I have created a government of whirled peas..."        |
|   Maharishi Mahesh Yogi, 12-May-2002,                   |
|   CNN, Larry King Live                                  |
+---------------------------------------------------------+

