Re: bulk insert performance problem - Mailing list pgsql-performance

From Chris
Subject Re: bulk insert performance problem
Date
Msg-id 47FAE768.80200@gmail.com
In response to Re: bulk insert performance problem  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-performance
Craig Ringer wrote:
> Christian Bourque wrote:
>> Hi,
>>
>> I have a performance problem with a script that does massive bulk
>> insert in 6 tables. When the script starts the performance is really
>> good but will degrade minute after minute and take almost a day to
>> finish!
>>
> Would I be correct in guessing that there are foreign key relationships
> between those tables, and that there are significant numbers of indexes
> in use?
>
> The foreign key checking costs will go up as the tables grow, and AFAIK
> the indexes get a bit more expensive to maintain too.
>
> If possible you should probably drop your foreign key relationships and
> drop your indexes, insert your data, then re-create the indexes and
> foreign keys. The foreign keys will be rechecked when you recreate them,
> and it's *vastly* faster to do it that way. Similarly, building an index
> from scratch is quite a bit faster than progressively adding to it. Of
> course, dropping the indices is only useful if you aren't querying the
> tables as you build them.

If you are querying them as you go, run ANALYZE periodically through the
import, e.g. every 10,000 rows, so the planner has up-to-date statistics.
Then your checks should be a bit faster.

The other suggestion would be to do block commits:

begin;
do stuff for 5000 rows;
commit;

repeat until finished.
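
One batch could look like this (table name and batch size are again just
illustrative), with the periodic ANALYZE from above mixed in:

BEGIN;
INSERT INTO child (parent_id, some_col)
VALUES (1, 'a'), (2, 'b');   -- the next ~5000 rows of the load
COMMIT;

-- every so often (e.g. every 10,000 rows) refresh the statistics
ANALYZE child;

-- then repeat with the next batch until the load is done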

--
Postgresql & php tutorials
http://www.designmagick.com/
