Re: Performance Concern - Mailing list pgsql-performance
From | Sean Shanny |
---|---|
Subject | Re: Performance Concern |
Date | |
Msg-id | 3F99CEAF.9000709@earthlink.net Whole thread Raw |
In response to | Performance Concern ("John Pagakis" <john@pagakis.com>) |
List | pgsql-performance |
John, One other thing I forgot to mention with solution 2. If you are going to be adding a fair number of records to the table on an ongoing basis you will want to drop indexes first and re-create them after the load is complete. I have tried it both ways and dropping is faster overall. --sean John Pagakis wrote: >Sean - >I believe auto-commit was off (not at the box right now). I'll play with >the commit interval; I know commits are expensive operations. > >Thanks for item 2. I was toying with the notion of pre-creating 100000 >bazes off-loading them and then seeing if the COPY would be any faster; you >saved me the effort of experimenting. Thanks for the benefit of your >experience. > >__________________________________________________________________ >John Pagakis >Email: ih8spam_thebfh@toolsmythe.com > > >"Oh, you hate your job? Why didn't you say so? > There's a support group for that. It's called > EVERYBODY, and they meet at the bar." > -- Drew Carey > >This signature generated by > ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc. > www.spazmodicfrog.com > > >-----Original Message----- >From: Sean Shanny [mailto:shannyconsulting@earthlink.net] >Sent: Friday, October 24, 2003 11:31 AM >To: john@pagakis.com >Cc: pgsql-performance@postgresql.org >Subject: Re: [PERFORM] Performance Concern > > >John, > >Are you treating each insertion as a separate transaction? If so the >performance will suffer. I am doing the same thing in building a data >warehouse using PG. I have to load millions of records each night. I >do two different things: > >1) If I need to keep the insertions inside the java process I turn off >auto-commit and every n insertions (5000 seems to give me the best >performance for my setup) issue a commit. Make sure you do a final >commit in a finally block so you don't miss anything. > >2) Dump all the data to a file and then use a psql COPY <table> >(columns) FROM 'file path' call to load it. Very fast. > >--sean > >John Pagakis wrote: > > > >>Greetings. >> >>I have a table that will require 100,000 rows initially. >> >>Assume the following (some of the field names have been changed for >>confidentiality reasons): >> >>CREATE TABLE baz ( >> baz_number CHAR(15) NOT NULL, >> customer_id CHAR(39), >> foobar_id INTEGER, >> is_cancelled BOOL DEFAULT false NOT NULL, >> create_user VARCHAR(60) NOT NULL, >> create_datetime TIMESTAMP DEFAULT 'now()' NOT NULL, >> last_update_user VARCHAR(60) NOT NULL, >> last_update_datetime TIMESTAMP DEFAULT 'now()' NOT NULL, >> CONSTRAINT PK_baz PRIMARY KEY (baz_number) >>); >> >>ALTER TABLE baz >> ADD FOREIGN KEY (customer_id) REFERENCES purchase (customer_id); >> >>ALTER TABLE baz >> ADD FOREIGN KEY (foobar_id) REFERENCES foobar (foobar_id); >> >> >>Using JDBC, it took approximately one hour to insert 100,000 records. I >>have an algorithm to generate a unique baz_number - it is a mixture of >> >> >alpha > > >>and numerics. >> >>There is a purchase table; one purchase can have many associated baz >>records, but the baz records will be pre-allocated - baz.customer_id allows >>null. The act of purchasing a baz will cause baz.customer_id to be >>populated from the customer_id (key) field in the purchase table. >> >>If it took an hour to insert 100,000 records, I can only imagine how much >>time it will take if one customer were to attempt to purchase all 100,000 >>baz. Certainly too long for a web page. >> >>I've not had to deal with this kind of volume in Postgres before; I have my >>suspicions on what is wrong here (could it be using a CHAR( 15 ) as a key?) >>but I'd *LOVE* any thoughts. >> >>Would I be better off making the key an identity field and not indexing on >>baz_number? >> >>Thanks in advance for any help. >> >>__________________________________________________________________ >>John Pagakis >>Email: ih8spam_thebfh@toolsmythe.com >> >> >>"The best way to make your dreams come true is to wake up." >> -- Paul Valery >> >>This signature generated by >> ... and I Quote!!(tm) Copyright (c) 1999 SpaZmodic Frog Software, Inc. >> www.spazmodicfrog.com >> >> >>---------------------------(end of broadcast)--------------------------- >>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >> >> >> >> >> > > > >
pgsql-performance by date: