Re: Load experimentation - Mailing list pgsql-performance

From Dimitri Fontaine
Subject Re: Load experimentation
Date
Msg-id 87ocm9rfxe.fsf@hi-media-techno.com
In response to Re: Load experimentation  (Ben Brehmer <benbrehmer@gmail.com>)
Responses Re: Load experimentation  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
Hi,

Ben Brehmer <benbrehmer@gmail.com> writes:
> By "Loading data" I am implying: "psql -U postgres -d somedatabase -f sql_file.sql".  The sql_file.sql contains table
> creates and insert statements. There are no indexes present nor created during the load.
>
> OS: x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44)
>
> PostgreSQL: I will try upgrading to latest version.
>
> COPY command: Unfortunately I'm stuck with INSERTS due to the nature
> this data was generated (Hadoop/MapReduce).

What I think you could do is the following:

 - switch to using 8.4
 - load your files in a *local* database
 - pg_dump -Fc
 - now pg_restore -j X on the amazon setup

That way you will be using COPY rather than INSERTs, plus the parallel
loading built into pg_restore (and its optimisations of when to add the
indexes and constraints). Choose X depending on the I/O power and the
number of CPUs...
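
The steps above could be scripted roughly as follows. This is only a sketch under assumptions: the staging database name, the dump file name, the remote host and the default job count are all placeholders I made up, not values from this thread, and the run() wrapper is just a hypothetical helper so that the commands can be printed before being run for real.

```shell
#!/bin/sh
# Sketch of "load locally, dump, parallel restore remotely".
# Placeholders (assumptions, not from the thread): staging, load.dump,
# remote.example.com, and the default of 4 jobs.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

load_and_restore() {
    jobs="${JOBS:-4}"                            # the "X" in pg_restore -j X
    remote="${REMOTE_HOST:-remote.example.com}"  # hypothetical Amazon host

    # 1. Replay the INSERT script into a local staging database.
    run createdb -U postgres staging
    run psql -U postgres -d staging -f sql_file.sql

    # 2. Dump in custom format (-Fc), which pg_restore -j requires.
    run pg_dump -U postgres -Fc -f load.dump staging

    # 3. Restore on the remote side with several parallel jobs (8.4+).
    run pg_restore -U postgres -h "$remote" -d somedatabase -j "$jobs" load.dump
}
```

Calling `DRY_RUN=1 JOBS=8 load_and_restore` only prints the four commands, which is a cheap way to check them before running the real thing. The target database (somedatabase here) has to exist on the remote side before the restore.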

Regards,
--
dim
