Re: optimising data load - Mailing list pgsql-novice
From | Ron Johnson |
---|---|
Subject | Re: optimising data load |
Date | |
Msg-id | 1022121607.19121.81.camel@rebel Whole thread Raw |
In response to | optimising data load (John Taylor <postgres@jtresponse.co.uk>) |
Responses |
Re: optimising data load
|
List | pgsql-novice |
On Wed, 2002-05-22 at 16:05, John Taylor wrote: > On Wednesday 22 May 2002 18:40, Ron Johnson wrote: > > On Wed, 2002-05-22 at 08:45, John Taylor wrote: > > > > > > Hi all, > > > > > > I'm (still) working on an application to regularly populate my database with some data provided from > > > a third party DB. > > > > > > I'm still having really bad performance problems. > > > There are 300,000 records to be inserted, but I'm only getting 10,000/hour. > > > 30 hours to populate a single table is just not on. There must be something seriously bad going on. > > > > From what I see of your INSERT statement, you are joining orderheader > > to orderlines and then inserting those records into orderlines. > > Is that correct? > > Yes, unfortunately I get incomplete data, so I need to select some details before inserting. > > > > > I think I'd make a view from the SELECT portion. Then, nightly: > > 1. COPY out the view. > > 2. Drop the index(es) on orderlines > > 3. COPY into orderlines > > 4. recreate indexes on orderlines. > > So, how would I merge the data from the view with the data to copy in ? ??? Unless I am misunderstanding you: 0. create view v_new_orderlines (theorder, type, stock, line, ordercurrent, sellingquant, price, discount, vatrate, comment) as SELECT oh.theorder, 'P', '0310', coalesce(ol.line+1,1), 5, 0, .52, 0, 0, '' FROM orderheader oh LEFT OUTER JOIN orderlines ol ON oh.theorder = ol.theorder WHERE oh.account=' MILN1' AND oh.delivery=1 AND oh.thedate='2002-06-01' AND oh.ordertype='O' ORDER BY ol.line ; 1. COPY v_new_orderlines to '/tmp/new_orderlines.tsv'; 2. DROP each index from orderlines 3. COPY orderlines from '/tmp/new_orderlines.tsv'; 4. CREATE each index on orderlines > I was thinking of generating the copy file by performing a select for each row, > but a selecting each row into a file, and then copy in might be more expensive than > just inserting it like I do now. > > > > > Do the same for orderlineupdates. The thing is is that > > modifying indexes is a _very_ expensive operation, no matter > > what RDBMS or hardware configuration you have... > > > > Yes, I was thinking of that for my next step. > Once I figure how to get copy working, I'll also try dropping indexes. -- +---------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://ronandheather.dhs.org:81 | | | | "I have created a government of whirled peas..." | | Maharishi Mahesh Yogi, 12-May-2002, | ! CNN, Larry King Live | +---------------------------------------------------------+
pgsql-novice by date: