Re: Improve COPY performance for large data sets - Mailing list pgsql-performance

From	Ryan Hansen
Subject	Re: Improve COPY performance for large data sets
Date	September 10, 2008 14:14:23
Msg-id	48C8006F.50706@brightbuilders.com
In response to	Improve COPY performance for large data sets (Ryan Hansen <ryan.hansen@brightbuilders.com>)
Responses	Re: [PERFORM] Improve COPY performance for large data sets
List	pgsql-performance

NEVERMIND!!

I found it.  Turns out there was still a constraint on the table.  Once
I dropped that, the time went down to 44 minutes.
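
For anyone landing on this thread later, here is a rough sketch of the
pattern described above: drop the constraint, run the COPY, then put the
constraint back so it is checked once over the whole table rather than row
by row during the load.  The message does not say which kind of constraint
was involved, so the table name, constraint name, foreign-key definition
and file path below are all hypothetical, and wrapping the steps in a single
transaction is just one reasonable choice, not something stated in the
original post.  The helper only builds the SQL text; it does not quote
identifiers or talk to a server.

```python
def bulk_load_statements(table, constraint, constraint_def, data_path):
    """Return the SQL statements, in order, for loading a file with COPY
    while a constraint is temporarily dropped.

    All arguments are illustrative placeholders; nothing is escaped or
    validated here, so only pass trusted values.
    """
    return [
        "BEGIN;",
        # Remove the constraint so it is not checked for every loaded row.
        f"ALTER TABLE {table} DROP CONSTRAINT {constraint};",
        # Server-side COPY: the path is read by the PostgreSQL server itself.
        f"COPY {table} FROM '{data_path}';",
        # Re-create the constraint; existing rows are validated at this point.
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} {constraint_def};",
        "COMMIT;",
    ]


if __name__ == "__main__":
    # Hypothetical names, chosen only for the example.
    for stmt in bulk_load_statements(
        "orders",
        "orders_customer_fk",
        "FOREIGN KEY (customer_id) REFERENCES customers (id)",
        "/tmp/orders.dat",
    ):
        print(stmt)
```

The printed statements could be saved to a file and run with psql.  If the
data might violate the constraint, the final ALTER TABLE will fail and the
whole transaction will roll back, so it is worth trying this on a copy of
the table first.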

Maybe I am an idiot after all. :)

-Ryan

Greetings,

I'm relatively new to PostgreSQL but I've been in the IT applications
industry for a long time, mostly in the LAMP world.

One thing I'm experiencing some trouble with is running a COPY of a
large file (20+ million records) into a table in a reasonable amount of
time.  Currently it's taking about 12 hours to complete on a 64-bit
server with 3 GB of memory allocated (shared_buffers) and a single
320 GB SATA drive.  I don't seem to get any improvement running the same
operation on a dual-Opteron, dual-core server with 16 GB.

I'm not asking for someone to solve my problem, just some direction in
the best ways to tune for faster bulk loading, since this will be a
fairly regular operation for our application (assuming it can work this
way).  I've toyed with the maintenance_work_mem and some of the other
params, but it's still way slower than it seems like it should be.
So any contributions are much appreciated.

Thanks!

P.S. Assume I've done a ton of reading and research into PG tuning,
which I have.  I just can't seem to find anything beyond the basics that
talks about really speeding up bulk loads.
