Re: Postgresql - performance of using array in big database - Mailing list pgsql-performance

From Craig Ringer
Subject Re: Postgresql - performance of using array in big database
Date
Msg-id 5022081E.5080900@ringerc.id.au
Whole thread Raw
List pgsql-performance
On 08/03/2012 05:14 PM, roberthanco@o2.pl wrote:

> It is read-only table so every integer column have an index.

First tip: Define the table without the indexes. INSERT your data, and only after it is inserted create your indexes.

Similarly, if you're making huge changes to the table you should consider dropping the indexes, making the changes, and re-creating the indexes. You might not have to drop the indexes if you aren't changing indexed fields, since HOT might save you, but it depends a lot on the specifics of the table's on-disk layout etc.

The server is: 12 GB RAM, 1,5 TB SATA, 4 CORES. All server for postgres.
There are many more tables in this database so RAM do not cover all database.

OK, in that case more info on the disk subsystem is generally helpful. Disk spin speed, type? RAID configuration if any? eg:

  4 x 750GB 7200RPM Western Digital  Black SATA 3 HDDs in RAID 10 using the Linux 'md' raid driver

or

  2 x 1.5TB 7200RPM "Enterprise/near-line" SATA3 HDDs in RAID 1 using a Dell PARC xxxx controller with BBU in write-back cache mode.

... though if you're only bulk-inserting the BBU doesn't matter much.

I wonder what option would be better in performance point of view.

I would advise you to test on a subset of your data. Try loading the same 50,000 records into different databases, one with each structure. Measure how long the load takes for each design, and how long the queries you need to run take to execute. Repeat the process with 500,000 records and see if one design slows down more than the other design does. Etc.

I need to make a good decision because import of this data will take me a 20 days.

For the sheer size of data you have you might want to think about using pg_bulkload. If you can't or don't want to do that, then at least use COPY to load big batches of your data.

--
Craig Ringer

pgsql-performance by date:

Previous
From: "Rajiv Kasera"
Date:
Subject: Postgres Upgrade from 8.4 to 9.1
Next
From: Craig Ringer
Date:
Subject: Re: Postgres Upgrade from 8.4 to 9.1