Re: Best COPY Performance - Mailing list pgsql-performance

From Worky Workerson
Subject Re: Best COPY Performance
Date
Msg-id ce4072df0610311313w6bee6b7cwf04796d838a0766a@mail.gmail.com
In response to Re: Best COPY Performance  ("Luke Lonergan" <llonergan@greenplum.com>)
List pgsql-performance
> >>>> 1 0 345732 29304 770272 12946764  0  0 16 16428 1192 3105 12  2 85  1
> >>>> 1 0 345732 30840 770060 12945480  0  0 20 16456 1196 3151 12  2 84  1
> >>>> 1 0 345732 32760 769972 12943528  0  0 12 16460 1185 3103 11  2 86  1
> >>
> >> iirc, he is running quad opteron 885 (8 cores), so if my math is
> >> correct he can split up his process for an easy gain.
> >
> > Are you saying that I should be able to issue multiple COPY commands
> > because my I/O wait is low?  I was under the impression that I am I/O
> > bound, so multiple simultaneous loads would have a detrimental
> > effect ...
>
> The reason I asked how many CPUs was to make sense of the 12% usr CPU time
> in the above.  That means you are CPU bound and are fully using one CPU.  So
> you aren't being limited by the I/O in this case, it's the CPU.
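To spell out the arithmetic behind that reading: vmstat reports CPU percentages for the whole machine, so on an 8-core box one fully busy core shows up as about 12.5%. The sketch below just does that conversion; the function name is made up for illustration, and the 12% and 8-core figures are the ones quoted above.

```python
def cores_busy(cpu_pct: float, n_cores: int) -> float:
    """Convert a machine-wide vmstat CPU percentage into an equivalent
    number of fully busy cores."""
    return cpu_pct / 100.0 * n_cores


# From the quoted vmstat rows: "us" is about 12 on an 8-core box
# (quad Opteron 885, as mentioned earlier in the thread).
usr_cores = cores_busy(12, 8)
print(f"user time keeps roughly {usr_cores:.2f} of 8 cores busy")
```

A result close to 1.0 is what a single-threaded, CPU-bound COPY would look like, which is consistent with the low I/O wait in the same rows.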
... snip ...
> For now, you could simply split the file in two pieces and load two copies
> at once, then watch the same "vmstat 1" for 10 seconds and look at your "bo"
> rate.

The "bo" rate was significantly higher on average, and the parallel loads
were ~30% faster than a single load with index builds (240s vs 340s) and
about 45% faster (150s vs 230s) without the PK index.  I'll definitely look
into the bizgres Java loader.
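For anyone who wants to try the same split-and-load approach, here is a rough sketch rather than the exact commands I ran. It assumes the input is in COPY text format with one row per line (CSV with embedded newlines would need a smarter splitter), and the table and database names passed in are placeholders. It deals rows out round-robin so the file is read only once, then starts one psql \copy per piece so each load gets its own backend.

```python
import subprocess
from pathlib import Path
from typing import List


def split_round_robin(src: Path, parts: int) -> List[Path]:
    """Stream src once and deal its lines out to `parts` files.

    Row order across the pieces is not preserved, which does not matter
    for a bulk load. Assumes one row per line (COPY text format).
    """
    outs = [src.with_name(f"{src.name}.part{i}") for i in range(parts)]
    handles = [p.open("wb") for p in outs]
    try:
        with src.open("rb") as f:
            for i, line in enumerate(f):
                handles[i % parts].write(line)
    finally:
        for h in handles:
            h.close()
    return outs


def load_parallel(chunks: List[Path], table: str, dbname: str) -> List[int]:
    """Start one psql \\copy per chunk and wait for all of them.

    `table` and `dbname` are hypothetical names supplied by the caller.
    Returns the exit code of each psql process.
    """
    procs = [
        subprocess.Popen(
            ["psql", "-d", dbname, "-c", f"\\copy {table} from '{chunk}'"]
        )
        for chunk in chunks
    ]
    return [p.wait() for p in procs]


# Example (not run here): two pieces, two concurrent loads.
# chunks = split_round_robin(Path("/data/load.txt"), 2)
# load_parallel(chunks, "my_table", "my_db")
```

While the loads run, "vmstat 1" in another terminal shows whether the "bo" rate and user CPU go up compared with the single-stream case.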

Thanks!
