Re: Tuning massive UPDATES and GROUP BY's? - Mailing list pgsql-performance

From Greg Spiegelberg
Subject Re: Tuning massive UPDATES and GROUP BY's?
Date
Msg-id AANLkTikqxB3EmnvgxYEGPwOg50r0K1JD=u651puWOzqW@mail.gmail.com
In response to Re: Tuning massive UPDATES and GROUP BY's?  (Marti Raudsepp <marti@juffo.org>)
Responses Re: Tuning massive UPDATES and GROUP BY's?  (runner <runner@winning.com>)
List pgsql-performance
On Mon, Mar 14, 2011 at 4:17 AM, Marti Raudsepp <marti@juffo.org> wrote:
On Sun, Mar 13, 2011 at 18:36, runner <runner@winning.com> wrote:
> Other than being very inefficient, and consuming
> more time than necessary, is there any other down side to importing
> into an indexed table?

Doing so will result in somewhat larger (more bloated) indexes, but
generally the performance impact of this is minimal.


I've done bulk data imports of this size with minimal pain by simply breaking the raw data into chunks (10M records becomes 10 files of 1M records each), kept on a spindle separate from the database, and then running multiple COPY commands in parallel, but no more than one COPY per server core.  I tested this a while back on a 4-core server: when I attempted 5 COPYs at a time, the time to complete went up almost 30%.  I don't recall any benefit to having fewer than 4 in this case, but the server was only processing my data at the time.  Indexes were left on the target table, though I dropped all constraints.  The UNIX split command is handy for breaking the data up into individual files.
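A rough sketch of what that looks like in practice (all paths, the chunk sizes, and the database/table names are made up for illustration; the stand-in data and the `wc -l` placeholder take the place of the real 10M-row file and the real psql COPY):

```shell
#!/bin/sh
# Sketch of the chunked parallel-load approach described above.
mkdir -p /tmp/staging
seq 1 1000000 > /tmp/staging/raw.dat        # stand-in for the raw import file

# Break the file into equal-sized chunks (chunk_aa, chunk_ab, ...);
# 250k lines each gives 4 chunks here, matching a 4-core server.
split -l 250000 /tmp/staging/raw.dat /tmp/staging/chunk_

# Load the chunks with at most 4 concurrent workers (one per core).
# Against a real database, the per-chunk command would be something like:
#   psql -d mydb -c "\copy mytable FROM '{}'"
ls /tmp/staging/chunk_* | xargs -n 1 -P 4 -I {} wc -l {}
```

xargs -P caps the number of concurrent loaders, which is what keeps you from oversubscribing the cores the way the 5-COPY test did.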

Greg
