Re: Very poor performance loading 100M of sql data using copy - Mailing list pgsql-performance

From Heikki Linnakangas
Subject Re: Very poor performance loading 100M of sql data using copy
Date
Msg-id 48160F05.2090207@enterprisedb.com
Whole thread Raw
In response to Very poor performance loading 100M of sql data using copy  (John Rouillard <rouilj@renesys.com>)
Responses Re: Very poor performance loading 100M of sql data using copy
List pgsql-performance
John Rouillard wrote:
> We are running postgresql-8.1.3 under Centos 4

You should upgrade, at least to the latest minor release of the 8.1
series (8.1.11), as there has been a bunch of important bug and security
fixes. Or even better, upgrade to 8.3, which has reduced the storage
size of especially variable length datatypes like text/char/varchar in
particular. As your COPY is I/O bound, reducing storage size will
translate directly to improved performance.

> dm-6 is where the data files reside and dm-4 is where the WAL archives
> are kept. Note all the DM's are on the same RAID 0 device /dev/sda2.

Another reason to upgrade to 8.3: if you CREATE or TRUNCATE the table in
the same transaction as you COPY into it, you can avoid WAL logging of
the loaded data, which will in the best case double your performance as
your WAL is on the same physical drives as the data files.

> The only indexes we have to drop are the ones on the primary keys
> (there is one non-primary key index in the database as well).
>
> Can you drop an index on the primary key for a table and add it back
> later?  Am I correct in saying: the primary key index is what enforces
> the unique constraint in the table? If the index is dropped and
> non-unique primary key data has been added, what happens when you
> re-add the index?

Yes, the index is what enforces the uniqueness. You can drop the primary
key constraint, and add it back after the load with ALTER TABLE. If the
load introduces any non-unique primary keys, adding the primary key
constraint will give you an error and fail.

Dropping and recreating the indexes is certainly worth trying.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-performance by date:

Previous
From: Greg Smith
Date:
Subject: Re: Replication Syatem
Next
From: John Rouillard
Date:
Subject: Re: Very poor performance loading 100M of sql data using copy