Re: COPY v. java performance comparison - Mailing list pgsql-general

From Jeff Janes
Subject Re: COPY v. java performance comparison
Date
Msg-id CAMkU=1z461BxP_K-Lp6WxyxMmiGyX9x27yBK6o16BSpRg78Bnw@mail.gmail.com
In response to Re: COPY v. java performance comparison  (Rob Sargent <robjsargent@gmail.com>)
List pgsql-general
On Wed, Apr 2, 2014 at 3:46 PM, Rob Sargent <robjsargent@gmail.com> wrote:
On 04/02/2014 04:36 PM, Jeff Janes wrote:
 
Are you sure you actually dropped the indices?  (And the primary key?)

I get about 375,000 lines per second with no indexes, triggers, constraints.

perl -le 'my $x="000000000000"; foreach(1..37e6) {$x++; print join "\t", "a0eebc99-9c0b-4ef8-bb6d-$x",$_,$_,"A","T"}'|time psql -c 'truncate oldstyle; copy oldstyle from stdin;'

(More if I truncate it in the same transaction as the copy)

If you can't drop the pk constraint, can you at least generate the values in sort-order?

Cheers,

Jeff
No, I'll leave the pk in at the very least.  My example load (37M records) will not be the last word by any means.  That's one experiment, if you will.  My goal is not to see how fast I can get records in, but rather to see what I can expect going forward.

You will probably want to pre-load the unindexed (including no PK) table with dummy values until you anticipate that at least one index will be larger than RAM.  Then build the indexes and PK, then load some more values and time that load.

If you just test on a small table, you will get answers that are unrealistic for the long term.  If you try to build up the table from scratch with the indexes in place, it could take 6 months to simulate 12 months of growth.
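
Roughly, the sequence I have in mind could be scripted like the sketch below. It reuses the oldstyle table and row layout from my earlier one-liner; the pk column name "id", the function names, and the row counts are made up for illustration, so adjust them to your schema and your RAM.

```shell
#!/bin/sh
# Sketch only. "oldstyle" and the row layout come from the earlier one-liner;
# the pk column name "id" and the helper names are assumptions.

# gen_rows START COUNT: print COUNT tab-separated rows, numbered from START.
# Keys come out in sorted order, which is the easy case for an index. For the
# timed load, substitute keys distributed like the real data.
gen_rows() {
    awk -v start="$1" -v count="$2" 'BEGIN {
        for (i = start; i < start + count; i++)
            printf "a0eebc99-9c0b-4ef8-bb6d-%012d\t%d\t%d\tA\tT\n", i, i, i
    }'
}

# run_benchmark PRELOAD EXTRA: needs a reachable database with an empty,
# unindexed oldstyle table.
run_benchmark() {
    preload=$1   # rows loaded while the table has no indexes or pk
    extra=$2     # rows loaded, and timed, with indexes and pk in place

    # 1. Fill the bare table until the future index should exceed RAM.
    gen_rows 1 "$preload" | psql -c 'copy oldstyle from stdin'

    # 2. Build the pk (and any other indexes) in one pass.
    psql -c 'alter table oldstyle add primary key (id)'

    # 3. Time only the incremental load against the large index.
    gen_rows $((preload + 1)) "$extra" | time psql -c 'copy oldstyle from stdin'
}
```

Something like `run_benchmark 500000000 37000000` would then time a 37M-row load on top of a table that is already big. The number from step 3 is the one that tells you what to expect going forward.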

Cheers,

Jeff
