Re: Alternative to drop index, load data, recreate index? - Mailing list pgsql-general

From Scott Marlowe
Subject Re: Alternative to drop index, load data, recreate index?
Date
Msg-id dcc563d10709101733p3cdc8fd4v906bfdaace793652@mail.gmail.com
In response to Alternative to drop index, load data, recreate index?  ("Jason L. Buberel" <jason@buberel.org>)
List pgsql-general
On 9/10/07, Jason L. Buberel <jason@buberel.org> wrote:
>
>  When loading very large data exports (> 1 million records) I have found it
> necessary to use the following sequence to achieve even reasonable import
> performance:
>
>  1. Drop all indices on the recipient table
>  2. Use "copy recipient_table from '/tmp/input.file';"
>  3. Recreate all indices on the recipient table
>
>  However, I now have tables so large that even the 'recreate all indices'
> step is taking too long (15-20 minutes on 8.2.4).

Well, that's pretty much the fastest way to import data.
If you've got foreign keys and they're slowing things down, you can
look into disabling them while importing.
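Something along these lines is one way to do it (a rough sketch -- the
constraint, column and referenced table names are invented, and
DISABLE TRIGGER ALL needs superuser rights and skips FK checking
entirely, so only use it on data you trust):

    -- skip FK checks during the bulk load, then turn them back on
    ALTER TABLE recipient_table DISABLE TRIGGER ALL;
    COPY recipient_table FROM '/tmp/input.file';
    ALTER TABLE recipient_table ENABLE TRIGGER ALL;

Or drop the constraint and re-add it afterwards, which at least
re-checks the loaded rows:

    ALTER TABLE recipient_table DROP CONSTRAINT recipient_customer_fk;
    COPY recipient_table FROM '/tmp/input.file';
    ALTER TABLE recipient_table
        ADD CONSTRAINT recipient_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customers (id);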

I've got indexes that individually take 20 to 30 minutes to create on
one large reporting table.  All 8 or so indexes together take well over
2 hours to create.  But I don't load it very often.

Note that with 8.2 you can create your new indexes with CREATE INDEX
CONCURRENTLY, so the table stays accessible while the indexes are
being built.
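For example (the index and column names here are just placeholders):

    -- Builds the index without locking out writes on the table.
    -- It can't run inside a transaction block and takes longer
    -- overall than a plain CREATE INDEX.
    CREATE INDEX CONCURRENTLY recipient_table_created_at_idx
        ON recipient_table (created_at);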

>  I am considering moving to date-based partitioned tables (each table = one
> month-year of data, for example). Before I go that far - are there any other
> tricks I can or should be using to speed up my bulk data loading?

Partitioning would help.  I've used it for tables where I've had to do
bulk loads like that, and it definitely helps.
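Roughly what an 8.2-style monthly partition looks like with
inheritance and constraint exclusion (table and column names are made
up for the example):

    -- one child table per month, hanging off the parent
    CREATE TABLE recipient_table_2007_09 (
        CHECK (created_at >= DATE '2007-09-01'
           AND created_at <  DATE '2007-10-01')
    ) INHERITS (recipient_table);

    -- load straight into the child, then index just that partition
    COPY recipient_table_2007_09 FROM '/tmp/input.file';
    CREATE INDEX recipient_table_2007_09_created_at_idx
        ON recipient_table_2007_09 (created_at);

    -- let the planner skip partitions whose CHECKs can't match
    SET constraint_exclusion = on;

That way each bulk load only has to rebuild the indexes on one month's
worth of data instead of on the whole table.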
