From: Craig Ringer
Subject: Re: bulk insert performance problem
Msg-id: 47FAE418.3000406@postnewspapers.com.au
In response to: bulk insert performance problem ("Christian Bourque" <christian.bourque@gmail.com>)
List: pgsql-performance

Christian Bourque wrote:
> Hi,
>
> I have a performance problem with a script that does massive bulk
> inserts into 6 tables. When the script starts, performance is really
> good, but it degrades minute by minute and the script takes almost a
> day to finish!
>
Would I be correct in guessing that there are foreign key relationships
between those tables, and that a significant number of indexes are in
use?

The cost of the foreign key checks will go up as the tables grow, and
AFAIK the indexes also get a bit more expensive to maintain.

If possible, you should drop your foreign key constraints and your
indexes, insert your data, then re-create the indexes and foreign keys.
The foreign keys will still be checked, because they're re-checked in
one pass when you re-create them, and it's *vastly* faster to do it
that way. Similarly, building an index from scratch is quite a bit
faster than adding entries to it progressively. Of course, dropping the
indexes only helps if you aren't querying the tables while you load
them.
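
For example, something along these lines. The table, column, index and
constraint names here are just placeholders for whatever your schema
actually uses:

    -- Before the load: drop the foreign key and the index
    ALTER TABLE child_table DROP CONSTRAINT child_table_parent_id_fkey;
    DROP INDEX child_table_parent_id_idx;

    -- Load the data (COPY is much faster than row-at-a-time INSERTs)
    COPY child_table FROM '/path/to/data.csv' WITH CSV;

    -- After the load: rebuild the index and re-check the FK in one pass
    CREATE INDEX child_table_parent_id_idx ON child_table (parent_id);
    ALTER TABLE child_table
        ADD CONSTRAINT child_table_parent_id_fkey
        FOREIGN KEY (parent_id) REFERENCES parent_table (id);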

Also, if you're loading data using stored procedures, avoid exception
blocks. I had major problems with my own bulk data conversion code
because overused exception blocks were creating large numbers of
subtransactions behind the scenes and slowing everything to a crawl.
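
To illustrate, here's a simplified (and entirely made-up) example of
the pattern that bit me. The inner BEGIN ... EXCEPTION block sets up a
subtransaction on every pass through the loop, i.e. once per row:

    CREATE OR REPLACE FUNCTION load_rows() RETURNS void AS $$
    DECLARE
        rec record;
    BEGIN
        FOR rec IN SELECT id, val FROM staging_table LOOP
            BEGIN
                -- each entry into this block starts a subtransaction
                INSERT INTO target_table (id, val)
                    VALUES (rec.id, rec.val);
            EXCEPTION WHEN unique_violation THEN
                NULL;  -- skip duplicates
            END;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

If you can, handle the expected failure in the SQL instead, e.g. insert
with a WHERE NOT EXISTS (...) test against the target table, so no
exception block is needed at all.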

--
Craig Ringer
