Re: Performance Concern - Mailing list pgsql-performance

From Sean Shanny
Subject Re: Performance Concern
Date
Msg-id 3F996FDF.1040307@earthlink.net
Whole thread Raw
In response to Performance Concern  ("John Pagakis" <john@pagakis.com>)
Responses Re: Performance Concern  ("John Pagakis" <thebfh@toolsmythe.com>)
Re: Performance Concern  ("John Pagakis" <john@pagakis.com>)
List pgsql-performance
John,

Are you treating each insertion as a separate transaction?  If so the
performance will suffer.  I am doing the same thing in building a data
warehouse using PG.  I have to load millions of records each night.  I
do two different things:

1) If I need to keep the insertions inside the java process I turn off
auto-commit and every n insertions (5000 seems to give me the best
performance for my setup) issue a commit.  Make sure you do a final
commit in a finally block so you don't miss anything.

2) Dump all the data to a file and then use a psql COPY <table>
(columns) FROM 'file path' call to load it.  Very fast.

--sean

John Pagakis wrote:

>Greetings.
>
>I have a table that will require 100,000 rows initially.
>
>Assume the following (some of the field names have been changed for
>confidentiality reasons):
>
>CREATE TABLE baz (
>    baz_number CHAR(15) NOT NULL,
>    customer_id CHAR(39),
>    foobar_id INTEGER,
>    is_cancelled BOOL DEFAULT false NOT NULL,
>    create_user VARCHAR(60) NOT NULL,
>    create_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>    last_update_user VARCHAR(60) NOT NULL,
>    last_update_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>    CONSTRAINT PK_baz PRIMARY KEY (baz_number)
>);
>
>ALTER TABLE baz
>    ADD FOREIGN KEY (customer_id) REFERENCES purchase (customer_id);
>
>ALTER TABLE baz
>    ADD FOREIGN KEY (foobar_id) REFERENCES foobar (foobar_id);
>
>
>Using JDBC, it took approximately one hour to insert 100,000 records.  I
>have an algorithm to generate a unique baz_number - it is a mixture of alpha
>and numerics.
>
>There is a purchase table; one purchase can have many associated baz
>records, but the baz records will be pre-allocated - baz.customer_id allows
>null.  The act of purchasing a baz will cause baz.customer_id to be
>populated from the customer_id (key) field in the purchase table.
>
>If it took an hour to insert 100,000 records, I can only imagine how much
>time it will take if one customer were to attempt to purchase all 100,000
>baz.  Certainly too long for a web page.
>
>I've not had to deal with this kind of volume in Postgres before; I have my
>suspicions on what is wrong here (could it be using a CHAR( 15 ) as a key?)
>but I'd *LOVE* any thoughts.
>
>Would I be better off making the key an identity field and not indexing on
>baz_number?
>
>Thanks in advance for any help.
>
>__________________________________________________________________
>John Pagakis
>Email: ih8spam_thebfh@toolsmythe.com
>
>
>"The best way to make your dreams come true is to wake up."
>        -- Paul Valery
>
>This signature generated by
>     ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc.
>                                              www.spazmodicfrog.com
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>
>


pgsql-performance by date:

Previous
From: Rod Taylor
Date:
Subject: Re: Performance Concern
Next
From: Christopher Browne
Date:
Subject: Re: Performance Concern