Re: Performance Concern - Mailing list pgsql-performance

From Sean Shanny
Subject Re: Performance Concern
Date
Msg-id 3F99CEAF.9000709@earthlink.net
Whole thread Raw
In response to Performance Concern  ("John Pagakis" <john@pagakis.com>)
List pgsql-performance
John,

One other thing I forgot to mention with solution 2.  If you are going
to be adding a fair number of records to the table on an ongoing basis
you will want to drop indexes first and re-create them after the load is
complete.  I have tried it both ways and dropping is faster overall.

--sean

John Pagakis wrote:

>Sean -
>I believe auto-commit was off (not at the box right now).  I'll play with
>the commit interval; I know commits are expensive operations.
>
>Thanks for item 2.  I was toying with the notion of pre-creating 100000
>bazes off-loading them and then seeing if the COPY would be any faster; you
>saved me the effort of experimenting.  Thanks for the benefit of your
>experience.
>
>__________________________________________________________________
>John Pagakis
>Email: ih8spam_thebfh@toolsmythe.com
>
>
>"Oh, you hate your job?  Why didn't you say so?
> There's a support group for that. It's called
> EVERYBODY, and they meet at the bar."
>        -- Drew Carey
>
>This signature generated by
>     ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc.
>                                              www.spazmodicfrog.com
>
>
>-----Original Message-----
>From: Sean Shanny [mailto:shannyconsulting@earthlink.net]
>Sent: Friday, October 24, 2003 11:31 AM
>To: john@pagakis.com
>Cc: pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] Performance Concern
>
>
>John,
>
>Are you treating each insertion as a separate transaction?  If so the
>performance will suffer.  I am doing the same thing in building a data
>warehouse using PG.  I have to load millions of records each night.  I
>do two different things:
>
>1) If I need to keep the insertions inside the java process I turn off
>auto-commit and every n insertions (5000 seems to give me the best
>performance for my setup) issue a commit.  Make sure you do a final
>commit in a finally block so you don't miss anything.
>
>2) Dump all the data to a file and then use a psql COPY <table>
>(columns) FROM 'file path' call to load it.  Very fast.
>
>--sean
>
>John Pagakis wrote:
>
>
>
>>Greetings.
>>
>>I have a table that will require 100,000 rows initially.
>>
>>Assume the following (some of the field names have been changed for
>>confidentiality reasons):
>>
>>CREATE TABLE baz (
>>   baz_number CHAR(15) NOT NULL,
>>   customer_id CHAR(39),
>>   foobar_id INTEGER,
>>   is_cancelled BOOL DEFAULT false NOT NULL,
>>   create_user VARCHAR(60) NOT NULL,
>>   create_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>>   last_update_user VARCHAR(60) NOT NULL,
>>   last_update_datetime TIMESTAMP DEFAULT 'now()' NOT NULL,
>>   CONSTRAINT PK_baz PRIMARY KEY (baz_number)
>>);
>>
>>ALTER TABLE baz
>>   ADD FOREIGN KEY (customer_id) REFERENCES purchase (customer_id);
>>
>>ALTER TABLE baz
>>   ADD FOREIGN KEY (foobar_id) REFERENCES foobar (foobar_id);
>>
>>
>>Using JDBC, it took approximately one hour to insert 100,000 records.  I
>>have an algorithm to generate a unique baz_number - it is a mixture of
>>
>>
>alpha
>
>
>>and numerics.
>>
>>There is a purchase table; one purchase can have many associated baz
>>records, but the baz records will be pre-allocated - baz.customer_id allows
>>null.  The act of purchasing a baz will cause baz.customer_id to be
>>populated from the customer_id (key) field in the purchase table.
>>
>>If it took an hour to insert 100,000 records, I can only imagine how much
>>time it will take if one customer were to attempt to purchase all 100,000
>>baz.  Certainly too long for a web page.
>>
>>I've not had to deal with this kind of volume in Postgres before; I have my
>>suspicions on what is wrong here (could it be using a CHAR( 15 ) as a key?)
>>but I'd *LOVE* any thoughts.
>>
>>Would I be better off making the key an identity field and not indexing on
>>baz_number?
>>
>>Thanks in advance for any help.
>>
>>__________________________________________________________________
>>John Pagakis
>>Email: ih8spam_thebfh@toolsmythe.com
>>
>>
>>"The best way to make your dreams come true is to wake up."
>>       -- Paul Valery
>>
>>This signature generated by
>>    ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc.
>>                                             www.spazmodicfrog.com
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>
>>
>>
>>
>>
>
>
>
>


pgsql-performance by date:

Previous
From: "John Pagakis"
Date:
Subject: Re: Performance Concern
Next
From: "John Pagakis"
Date:
Subject: Re: Performance Concern