Re: TPC-H Scaling Factors X PostgreSQL Cluster Command - Mailing list pgsql-performance

From Nelson Kotowski
Subject Re: TPC-H Scaling Factors X PostgreSQL Cluster Command
Date
Msg-id d34b24380704230852p2fe52a05qbd7397a4293ce5a9@mail.gmail.com
In response to Re: TPC-H Scaling Factors X PostgreSQL Cluster Command  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: TPC-H Scaling Factors X PostgreSQL Cluster Command
List pgsql-performance
Hi Heikki,

Thanks for answering! :)

I don't understand how creating only the indexes I cluster on would improve the performance of my CLUSTER command. I believed the other indexes wouldn't interfere, because so far they are created in a reasonable time and they don't refer to any column in the orders/lineitem tables. Could you explain that again?

As for the load, when you say "in the right order to start with", do you mean I should sort the load file by the indexed column before loading it?

Thanks in advance,
Nelson P Kotowski Filho.

On 4/23/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
Nelson Kotowski wrote:
> So far, I need to do it in three different scale factors (1, 2 and 5GB
> databases).
>
> My build process consists of creating the tables without any foreign keys,
> indexes, etc. - Running OK!
> Then, I load the data from the flat files generated through DBGEN software
> into these tables. - Running OK!
>
> Finally, I run an "optimize" script that does the following:
>
> - Alter the tables to add the mandatory foreign keys;
> - Create all mandatory indexes;
> - Cluster the orders table by the orders table index;
> - Cluster the lineitem table by the lineitem table index;
> - Vacuum the database;
> - Analyze statistics.

Cluster will completely rewrite the table and indexes. On step 2, you
should only create the indexes you're clustering on, and create the rest
of them after clustering.
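
The ordering suggested above can be sketched as a small Python helper that emits the SQL statements in sequence. This is only an illustration, not the actual "optimize" script from the thread: the function name and the index names in the usage note are hypothetical, and foreign-key creation is left out. The statements use the `CLUSTER table USING index` form; older releases (8.2 and earlier) spell it `CLUSTER index ON table`.

```python
def build_optimize_statements(cluster_indexes, other_indexes):
    """Return SQL statements in the suggested order.

    cluster_indexes: list of (table, index_name, column) tuples to cluster on.
    other_indexes:   list of (table, index_name, column) tuples built afterwards.
    All names passed in are supplied by the caller (hypothetical here).
    """
    statements = []

    # 1. Create only the index each table is clustered on, then cluster.
    #    CLUSTER rewrites the table and rebuilds every index that already
    #    exists on it, so fewer existing indexes means less work.
    for table, index, column in cluster_indexes:
        statements.append(f"CREATE INDEX {index} ON {table} ({column});")
        statements.append(f"CLUSTER {table} USING {index};")

    # 2. Build the remaining indexes once, on the already-clustered tables.
    for table, index, column in other_indexes:
        statements.append(f"CREATE INDEX {index} ON {table} ({column});")

    # 3. Vacuum and refresh planner statistics last, as in the original steps.
    statements.append("VACUUM;")
    statements.append("ANALYZE;")
    return statements
```

For example, `build_optimize_statements([("orders", "orders_orderkey_idx", "o_orderkey")], [("orders", "orders_custkey_idx", "o_custkey")])` yields the clustering index and the CLUSTER call first, and the second index only afterwards.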

Or even better, generate and load the data in the right order to start
with, so you don't need to cluster at all.
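
One possible way to get the data "in the right order" is to sort each flat file by the clustering column before loading it. The sketch below assumes the DBGEN output is pipe-delimited with one row per newline-terminated line and an integer key in the first column (as with the order key in the orders and lineitem files); the function names are hypothetical. It sorts in memory, so for the larger scale factors an external sort may be a better fit.

```python
def sort_tbl_lines(lines, key_column=0, delimiter="|"):
    """Return the lines sorted numerically by one delimited column.

    sorted() is stable, so rows sharing a key keep their original
    relative order (useful for lineitem, which has several rows per order).
    """
    return sorted(lines, key=lambda line: int(line.split(delimiter)[key_column]))


def sort_tbl_file(src_path, dst_path, key_column=0):
    """Read a flat file, sort it by key_column, and write the result."""
    with open(src_path, encoding="utf-8") as src:
        # Skip blank lines so int() never sees an empty field.
        lines = [line for line in src if line.strip()]
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(sort_tbl_lines(lines, key_column))
```

The sorted file can then be loaded as before; whether this makes the CLUSTER step unnecessary depends on the queries relying on that physical order.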

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
