Re: general questions postgresql performance config - Mailing list pgsql-general

From Craig Ringer
Subject Re: general questions postgresql performance config
Date
Msg-id 4B5E4F8A.2060008@postnewspapers.com.au
In response to general questions postgresql performance config  (Dino Vliet <dino_vliet@yahoo.com>)
Responses Re: general questions postgresql performance config  (Andy Colson <andy@squeakycode.net>)
List pgsql-general
On 26/01/2010 12:15 AM, Dino Vliet wrote:

> 5) Other considerations?


To get optimal performance for bulk loading you'll want to do concurrent
data loading over several connections - up to as many as you have disk
spindles. Each connection will individually be slower, but the overall
throughput will be much greater.

Just how many connections you'll want to use depends on your I/O
subsystem and to a lesser extent your CPU capacity.

Inserting data via multiple connections isn't something every data
loading tool supports, so make sure to consider this carefully.
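If your tool doesn't support it, the splitting itself is simple to do yourself. Here is a minimal sketch in Python: the round-robin chunking is plain stdlib code, while the per-connection loading step is only indicated in a comment, since it assumes a driver such as psycopg2 and a reachable server (neither is part of this post).

```python
def split_rows(rows, n_workers):
    """Distribute rows round-robin into n_workers chunks,
    one chunk per loader connection."""
    chunks = [[] for _ in range(n_workers)]
    for i, row in enumerate(rows):
        chunks[i % n_workers].append(row)
    return chunks

rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]
chunks = split_rows(rows, 2)
print(chunks[0])  # [('a', 1), ('c', 3), ('e', 5)]

# Each chunk would then be loaded over its own connection, e.g. with
# concurrent.futures.ThreadPoolExecutor and one connection per worker.
```

Round-robin keeps the chunks roughly equal in size, so no one connection finishes long before the others.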


Another consideration is how you insert the data. It's vital to insert
your data in large transaction-batched chunks (or all in one
transaction); even with synchronous_commit = off you'll still see
rather poor performance if you do each INSERT in its own transaction.
Doing your inserts as prepared statements where each INSERT is multi-row
valued will help too.
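To make the multi-row-valued idea concrete, here is a small sketch that builds one parameterized INSERT for a whole batch. The table and column names are hypothetical; the statement it produces is the kind you'd prepare once and execute per batch.

```python
def multirow_insert(table, columns, n_rows):
    """Build a single parameterized INSERT covering n_rows rows,
    instead of issuing n_rows separate single-row INSERTs."""
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row] * n_rows))

sql = multirow_insert("measurements", ["ts", "value"], 3)
print(sql)
# INSERT INTO measurements (ts, value) VALUES (%s, %s), (%s, %s), (%s, %s)
```

Executed inside one transaction per batch, this cuts both the per-statement round trips and the per-transaction commit overhead.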

Even better is to use COPY to load large chunks of data. libpq provides
access to the COPY interface if you feel like some C coding. The JDBC
driver (dev version only so far) now provides access to the COPY API, so
you can bulk-load very efficiently from Java as well.
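Scripting languages expose COPY too. As a sketch, here is how you might prepare data for COPY's text format in Python: the buffer-building is plain stdlib code, while the actual copy_from() call is only shown in a comment, since it assumes a psycopg2 connection and a hypothetical "measurements" table. (Real data would also need COPY's escaping rules applied for embedded tabs, newlines and backslashes; this sketch skips that.)

```python
import io

def rows_to_copy_buffer(rows):
    """Serialize rows as tab-separated lines, the shape COPY's
    text format expects on STDIN."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(str(v) for v in row) + "\n")
    buf.seek(0)
    return buf

buf = rows_to_copy_buffer([(1, "alpha"), (2, "beta")])
print(buf.read())

# With a live connection (illustrative, psycopg2-style):
#   cur.copy_from(buf, "measurements", columns=("id", "label"))
```

One COPY of a large buffer replaces thousands of INSERT round trips, which is where the big win comes from.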

If your data needs little/no transformation and is externally validated
you can look into pg_bulkload as an alternative to all this.

--
Craig Ringer
